from:"Michael Ellerman"

Re: [PATCH v2 3/6] powerpc/smp: Move shared_processor static key to smp.h

2023-10-20 Thread Michael Ellerman

Srikar Dronamraju  writes:
> * Michael Ellerman  [2023-10-19 15:41:40]:
>
>> Srikar Dronamraju  writes:
>> > The ability to detect if the system is running in a shared processor
>> > mode is helpful in few more generic cases not just in
>> > paravirtualization.
>> > For example: At boot time, different scheduler/ topology flags may be
>> > set based on the processor mode. Hence move it to a more generic file.
>> 
>> I'd rather you just included paravirt.h in the few files where you need it.
>
> I thought, detecting if a Processor was shared or not was more a
> smp/processor related than a paravirt related.

It's both really :)

It's definitely paravirt related though, because if we weren't
*para*virt then we wouldn't know there was a hypervisor at all :)

But having smaller more focused headers is preferable in general just
for mechanical reasons.

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v2 3/6] powerpc/smp: Move shared_processor static key to smp.h

2023-10-18 Thread Michael Ellerman

Srikar Dronamraju  writes:
> The ability to detect if the system is running in a shared processor
> mode is helpful in few more generic cases not just in
> paravirtualization.
> For example: At boot time, different scheduler/ topology flags may be
> set based on the processor mode. Hence move it to a more generic file.

I'd rather you just included paravirt.h in the few files where you need it.

cheers

> diff --git a/arch/powerpc/include/asm/paravirt.h 
> b/arch/powerpc/include/asm/paravirt.h
> index 0372b0093f72..cf83e837a571 100644
> --- a/arch/powerpc/include/asm/paravirt.h
> +++ b/arch/powerpc/include/asm/paravirt.h
> @@ -15,13 +15,6 @@
>  #include 
>  #include 
>  
> -DECLARE_STATIC_KEY_FALSE(shared_processor);
> -
> -static inline bool is_shared_processor(void)
> -{
> - return static_branch_unlikely(_processor);
> -}
> -
>  #ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
>  extern struct static_key paravirt_steal_enabled;
>  extern struct static_key paravirt_steal_rq_enabled;
> @@ -77,11 +70,6 @@ static inline bool is_vcpu_idle(int vcpu)
>   return lppaca_of(vcpu).idle;
>  }
>  #else
> -static inline bool is_shared_processor(void)
> -{
> - return false;
> -}
> -
>  static inline u32 yield_count_of(int cpu)
>  {
>   return 0;
> diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
> index 576d0e15..08631b2a4528 100644
> --- a/arch/powerpc/include/asm/smp.h
> +++ b/arch/powerpc/include/asm/smp.h
> @@ -34,6 +34,20 @@ extern bool coregroup_enabled;
>  extern int cpu_to_chip_id(int cpu);
>  extern int *chip_id_lookup_table;
>  
> +#ifdef CONFIG_PPC_SPLPAR
> +DECLARE_STATIC_KEY_FALSE(shared_processor);
> +
> +static inline bool is_shared_processor(void)
> +{
> + return static_branch_unlikely(_processor);
> +}
> +#else
> +static inline bool is_shared_processor(void)
> +{
> + return false;
> +}
> +#endif
> +
>  DECLARE_PER_CPU(cpumask_var_t, thread_group_l1_cache_map);
>  DECLARE_PER_CPU(cpumask_var_t, thread_group_l2_cache_map);
>  DECLARE_PER_CPU(cpumask_var_t, thread_group_l3_cache_map);
> -- 
> 2.31.1
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v2] powerpc/paravirt: Improve vcpu_is_preempted

2023-10-18 Thread Michael Ellerman

Hi Srikar,

Srikar Dronamraju  writes:
> PowerVM Hypervisor dispatches on a whole core basis. In a shared LPAR, a
> CPU from a core that is CEDED or preempted may have a larger latency. In
> such a scenario, its preferable to choose a different CPU to run.
>
> If one of the CPUs in the core is active, i.e neither CEDED nor
> preempted, then consider this CPU as not preempted.
>
> Also if any of the CPUs in the core has yielded but OS has not requested
> CEDE or CONFER, then consider this CPU to be preempted.

I think the change is OK, but the change log and comments are slightly
confusing IMHO.

In several places you use "this CPU", but that usually means "the CPU
the code is currently executing on".

I think it would be clearer if you used eg. "target CPU" or something to
make it clear that you're not talking about the currently executing CPU.

cheers

> Correct detection of preempted CPUs is important for detecting idle
> CPUs/cores in task scheduler.
>
> Changelog:
> v1 -> v2: Handle lppaca_of(cpu) in !PPC_SPLPAR case.
> v1: 
> https://lore.kernel.org/r/20231009051740.17683-1-srikar%40linux.vnet.ibm.com
> 1. Fixed some compilation issues reported by kernelbot
> a. https://lore.kernel.org/oe-kbuild-all/202310102341.k0sgoqql-...@intel.com/
> b.  https://lore.kernel.org/oe-kbuild-all/202310091636.lelmjkyv-...@intel.com/
> 2. Resolved comments from Shrikanth

That change log should appear below the break "---".

> Tested-by: Aboorva Devarajan 
> Reviewed-by: Shrikanth Hegde 
> Signed-off-by: Srikar Dronamraju 
> ---
>  arch/powerpc/include/asm/paravirt.h | 42 ++---
>  1 file changed, 39 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/paravirt.h 
> b/arch/powerpc/include/asm/paravirt.h
> index e08513d73119..0372b0093f72 100644
> --- a/arch/powerpc/include/asm/paravirt.h
> +++ b/arch/powerpc/include/asm/paravirt.h
> @@ -71,6 +71,11 @@ static inline void yield_to_any(void)
>  {
>   plpar_hcall_norets_notrace(H_CONFER, -1, 0);
>  }
> +
> +static inline bool is_vcpu_idle(int vcpu)
> +{
> + return lppaca_of(vcpu).idle;
> +}
>  #else
>  static inline bool is_shared_processor(void)
>  {
> @@ -100,6 +105,10 @@ static inline void prod_cpu(int cpu)
>   ___bad_prod_cpu(); /* This would be a bug */
>  }
>  
> +static inline bool is_vcpu_idle(int vcpu)
> +{
> + return false;
> +}
>  #endif
>  
>  #define vcpu_is_preempted vcpu_is_preempted
> @@ -121,9 +130,19 @@ static inline bool vcpu_is_preempted(int cpu)
>   if (!is_shared_processor())
>   return false;
>  
> + if (!(yield_count_of(cpu) & 1))
> + return false;

Would be nice for that to have a short comment too.

> +
> + /*
> +  * If CPU has yielded to Hypervisor but OS has not requested idle
> +  * then this CPU is definitely preempted.

eg. If the target CPU has yielded to the Hypervisor, but the OS has not
requested idle then the target CPU has definitely been preempted.

> +  */
> + if (!is_vcpu_idle(cpu))
> + return true;
> +
>  #ifdef CONFIG_PPC_SPLPAR
>   if (!is_kvm_guest()) {
> - int first_cpu;
> + int first_cpu, i;
>  
>   /*
>* The result of vcpu_is_preempted() is used in a
> @@ -149,11 +168,28 @@ static inline bool vcpu_is_preempted(int cpu)
>*/
>   if (cpu_first_thread_sibling(cpu) == first_cpu)
>   return false;
> +
> + /*
> +  * If any of the threads of this core is not preempted or
> +  * ceded, then consider this CPU to be non-preempted
> +  */

eg. If any of the threads of the target CPU's core are not preempted or
ceded, then consider that the target CPU is also not preempted.

> + first_cpu = cpu_first_thread_sibling(cpu);
> + for (i = first_cpu; i < first_cpu + threads_per_core; i++) {
> + if (i == cpu)
> + continue;
> + if (!(yield_count_of(i) & 1))
> + return false;
> + if (!is_vcpu_idle(i))
> + return true;
> + }
>   }
>  #endif
>  
> - if (yield_count_of(cpu) & 1)
> - return true;
> + /*
> +  * None of the threads in this core are running but none of
> +  * them were preempted too. Hence assume the thread to be
> +  * non-preempted.
> +  */
>   return false;
>  }
>  
>
> base-commit: eddc90ea2af5933249ea1a78119f2c8ef8d07156
> -- 
> 2.31.1
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] virtio_pci: use irq to detect interrupt support

2022-10-12 Thread Michael Ellerman

"Michael S. Tsirkin"  writes:
> commit 71491c54eafa ("virtio_pci: don't try to use intxif pin is zero")
> breaks virtio_pci on powerpc, when running as a qemu guest.
>
> vp_find_vqs() bails out because pci_dev->pin == 0.
>
> But pci_dev->irq is populated correctly, so vp_find_vqs_intx() would
> succeed if we called it - which is what the code used to do.
>
> This seems to happen because pci_dev->pin is not populated in
> pci_assign_irq().
>
> Which is absolutely a bug in the relevant PCI code, but it
> may also affect other platforms that use of_irq_parse_and_map_pci().
>
> However Linus said:
>   The correct way to check for "no irq" doesn't use NO_IRQ at all, it 
> just does
>   if (dev->irq) ...
> so let's just check irq and be done with it.
>
> Suggested-by: Linus Torvalds 
> Reported-by: Michael Ellerman 
> Fixes: 71491c54eafa ("virtio_pci: don't try to use intxif pin is zero")
> Cc: "Angus Chen" 
> Signed-off-by: Michael S. Tsirkin 
> ---
>
> Build tested only - very late here. Angus any chance you could
> help test this? Thanks!

This works for me on powerpc.

Tested-by: Michael Ellerman 

cheers

> diff --git a/drivers/virtio/virtio_pci_common.c 
> b/drivers/virtio/virtio_pci_common.c
> index 4df77eeb4d16..a6c86f916dbd 100644
> --- a/drivers/virtio/virtio_pci_common.c
> +++ b/drivers/virtio/virtio_pci_common.c
> @@ -409,8 +409,8 @@ int vp_find_vqs(struct virtio_device *vdev, unsigned int 
> nvqs,
>   err = vp_find_vqs_msix(vdev, nvqs, vqs, callbacks, names, false, ctx, 
> desc);
>   if (!err)
>   return 0;
> - /* Is there an interrupt pin? If not give up. */
> - if (!(to_vp_device(vdev)->pci_dev->pin))
> + /* Is there an interrupt? If not give up. */
> + if (!(to_vp_device(vdev)->pci_dev->irq))
>   return err;
>   /* Finally fall back to regular interrupts. */
>   return vp_find_vqs_intx(vdev, nvqs, vqs, callbacks, names, ctx);
> -- 
> MST
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [GIT PULL] virtio: fixes, features

2022-10-12 Thread Michael Ellerman

Michael Ellerman  writes:
> [ Cc += Bjorn & linux-pci ]
>
> "Michael S. Tsirkin"  writes:
>> On Wed, Oct 12, 2022 at 05:21:24PM +1100, Michael Ellerman wrote:
>>> "Michael S. Tsirkin"  writes:
> ...
>>> > 
>>> > virtio: fixes, features
>>> >
>>> > 9k mtu perf improvements
>>> > vdpa feature provisioning
>>> > virtio blk SECURE ERASE support
>>> >
>>> > Fixes, cleanups all over the place.
>>> >
>>> > Signed-off-by: Michael S. Tsirkin 
>>> >
>>> > 
>>> > Alvaro Karsz (1):
>>> >   virtio_blk: add SECURE ERASE command support
>>> >
>>> > Angus Chen (1):
>>> >   virtio_pci: don't try to use intxif pin is zero
>>> 
>>> This commit breaks virtio_pci for me on powerpc, when running as a qemu
>>> guest.
>>> 
>>> vp_find_vqs() bails out because pci_dev->pin == 0.
>>> 
>>> But pci_dev->irq is populated correctly, so vp_find_vqs_intx() would
>>> succeed if we called it - which is what the code used to do.
>>> 
>>> I think this happens because pci_dev->pin is not populated in
>>> pci_assign_irq().
>>> 
>>> I would absolutely believe this is bug in our PCI code, but I think it
>>> may also affect other platforms that use of_irq_parse_and_map_pci().
>>
>> How about fixing this in of_irq_parse_and_map_pci then?
>> Something like the below maybe?
>> 
>> diff --git a/drivers/pci/of.c b/drivers/pci/of.c
>> index 196834ed44fe..504c4d75c83f 100644
>> --- a/drivers/pci/of.c
>> +++ b/drivers/pci/of.c
>> @@ -446,6 +446,8 @@ static int of_irq_parse_pci(const struct pci_dev *pdev, 
>> struct of_phandle_args *
>>  if (pin == 0)
>>  return -ENODEV;
>>  
>> +pdev->pin = pin;
>> +
>>  /* Local interrupt-map in the device node? Use it! */
>>  if (of_get_property(dn, "interrupt-map", NULL)) {
>>  pin = pci_swizzle_interrupt_pin(pdev, pin);

Backing up a bit. Should the virtio code be looking at pci_dev->pin in
the first place?

Shouldn't it be checking pci_dev->irq instead?

The original commit talks about irq being 0 and colliding with the timer
interrupt.

But all (most?) platforms have converged on 0 meaning NO_IRQ since quite
a fews ago AFAIK.

And the timer irq == 0 is a special case AIUI:
  
https://lore.kernel.org/all/ca+55afwilp1z+2mzkrfsid1wzq0tqkcn8f2e6nl_avr+m1f...@mail.gmail.com/

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] virtio_pci: read interrupt pin directly

2022-10-12 Thread Michael Ellerman

"Michael S. Tsirkin"  writes:
> commit 71491c54eafa ("virtio_pci: don't try to use intxif pin is zero")
> breaks virtio_pci on powerpc, when running as a qemu guest.
>
> vp_find_vqs() bails out because pci_dev->pin == 0.
>
> But pci_dev->irq is populated correctly, so vp_find_vqs_intx() would
> succeed if we called it - which is what the code used to do.
>
> This seems to happen because pci_dev->pin is not populated in
> pci_assign_irq().
>
> Which is absolutely a bug in the relevant PCI code, but it
> may also affect other platforms that use of_irq_parse_and_map_pci().
>
> Work around the issue in virtio for now, and let's try to fix
> all affected pci systems and then we can revert this.
>
> Reported-by: Michael Ellerman 
> Fixes: 71491c54eafa ("virtio_pci: don't try to use intxif pin is zero")
> Cc: "Angus Chen" 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  drivers/virtio/virtio_pci_common.c | 10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/virtio/virtio_pci_common.c 
> b/drivers/virtio/virtio_pci_common.c
> index 4df77eeb4d16..6155ea4e7e4b 100644
> --- a/drivers/virtio/virtio_pci_common.c
> +++ b/drivers/virtio/virtio_pci_common.c
> @@ -400,6 +400,7 @@ int vp_find_vqs(struct virtio_device *vdev, unsigned int 
> nvqs,
>   struct irq_affinity *desc)
>  {
>   int err;
> + u8 pin = 0;
>  
>   /* Try MSI-X with one vector per queue. */
>   err = vp_find_vqs_msix(vdev, nvqs, vqs, callbacks, names, true, ctx, 
> desc);
> @@ -409,8 +410,13 @@ int vp_find_vqs(struct virtio_device *vdev, unsigned int 
> nvqs,
>   err = vp_find_vqs_msix(vdev, nvqs, vqs, callbacks, names, false, ctx, 
> desc);
>   if (!err)
>   return 0;
> - /* Is there an interrupt pin? If not give up. */
> - if (!(to_vp_device(vdev)->pci_dev->pin))
> + /*
> +  * Is there an interrupt pin? If not give up.
> +  * NB: It would seem to be better to use pci_dev->pin - unfortunately
> +  * not all platforms populate it.
> +  */
> + pci_read_config_byte(dev, PCI_INTERRUPT_PIN, );
> + if (!pin)
>   return err;
>       /* Finally fall back to regular interrupts. */
>   return vp_find_vqs_intx(vdev, nvqs, vqs, callbacks, names, ctx);

Needs the delta below in order to compile.

But with that it fixes the issue for me.

Tested-by: Michael Ellerman 

cheers


diff --git a/drivers/virtio/virtio_pci_common.c 
b/drivers/virtio/virtio_pci_common.c
index 6155ea4e7e4b..cae134e2573f 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -415,7 +415,7 @@ int vp_find_vqs(struct virtio_device *vdev, unsigned int 
nvqs,
 * NB: It would seem to be better to use pci_dev->pin - unfortunately
 * not all platforms populate it.
 */
-   pci_read_config_byte(dev, PCI_INTERRUPT_PIN, );
+   pci_read_config_byte(to_vp_device(vdev)->pci_dev, PCI_INTERRUPT_PIN, 
);
if (!pin)
return err;
/* Finally fall back to regular interrupts. */
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [GIT PULL] virtio: fixes, features

2022-10-12 Thread Michael Ellerman

[ Cc += Bjorn & linux-pci ]

"Michael S. Tsirkin"  writes:
> On Wed, Oct 12, 2022 at 05:21:24PM +1100, Michael Ellerman wrote:
>> "Michael S. Tsirkin"  writes:
...
>> > 
>> > virtio: fixes, features
>> >
>> > 9k mtu perf improvements
>> > vdpa feature provisioning
>> > virtio blk SECURE ERASE support
>> >
>> > Fixes, cleanups all over the place.
>> >
>> > Signed-off-by: Michael S. Tsirkin 
>> >
>> > 
>> > Alvaro Karsz (1):
>> >   virtio_blk: add SECURE ERASE command support
>> >
>> > Angus Chen (1):
>> >   virtio_pci: don't try to use intxif pin is zero
>> 
>> This commit breaks virtio_pci for me on powerpc, when running as a qemu
>> guest.
>> 
>> vp_find_vqs() bails out because pci_dev->pin == 0.
>> 
>> But pci_dev->irq is populated correctly, so vp_find_vqs_intx() would
>> succeed if we called it - which is what the code used to do.
>> 
>> I think this happens because pci_dev->pin is not populated in
>> pci_assign_irq().
>> 
>> I would absolutely believe this is bug in our PCI code, but I think it
>> may also affect other platforms that use of_irq_parse_and_map_pci().
>
> How about fixing this in of_irq_parse_and_map_pci then?
> Something like the below maybe?
> 
> diff --git a/drivers/pci/of.c b/drivers/pci/of.c
> index 196834ed44fe..504c4d75c83f 100644
> --- a/drivers/pci/of.c
> +++ b/drivers/pci/of.c
> @@ -446,6 +446,8 @@ static int of_irq_parse_pci(const struct pci_dev *pdev, 
> struct of_phandle_args *
>   if (pin == 0)
>   return -ENODEV;
>  
> + pdev->pin = pin;
> +
>   /* Local interrupt-map in the device node? Use it! */
>   if (of_get_property(dn, "interrupt-map", NULL)) {
>   pin = pci_swizzle_interrupt_pin(pdev, pin);

That doesn't fix it in all cases, because there's an early return if
there's a struct device_node associated with the pci_dev, before we even
read the pin.

Also the pci_dev is const, and removing the const would propagate to a
few other places.

The other obvious place to fix it would be in pci_assign_irq(), as
below. That fixes this bug for me, but is otherwise very lightly tested.

cheers


diff --git a/drivers/pci/setup-irq.c b/drivers/pci/setup-irq.c
index cc7d26b015f3..0135413b33af 100644
--- a/drivers/pci/setup-irq.c
+++ b/drivers/pci/setup-irq.c
@@ -22,6 +22,15 @@ void pci_assign_irq(struct pci_dev *dev)
int irq = 0;
struct pci_host_bridge *hbrg = pci_find_host_bridge(dev->bus);
 
+   /* Make sure dev->pin is populated */
+   pci_read_config_byte(dev, PCI_INTERRUPT_PIN, );
+
+   /* Cope with illegal. */
+   if (pin > 4)
+   pin = 1;
+
+   dev->pin = pin;
+
if (!(hbrg->map_irq)) {
pci_dbg(dev, "runtime IRQ mapping not provided by arch\n");
return;
@@ -34,11 +43,6 @@ void pci_assign_irq(struct pci_dev *dev)
 * time the interrupt line passes through a PCI-PCI bridge we must
 * apply the swizzle function.
 */
-   pci_read_config_byte(dev, PCI_INTERRUPT_PIN, );
-   /* Cope with illegal. */
-   if (pin > 4)
-   pin = 1;
-
if (pin) {
/* Follow the chain of bridges, swizzling as we go. */
if (hbrg->swizzle_irq)
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [GIT PULL] virtio: fixes, features

2022-10-12 Thread Michael Ellerman

"Michael S. Tsirkin"  writes:
> The following changes since commit 4fe89d07dcc2804c8b562f6c7896a45643d34b2f:
>
>   Linux 6.0 (2022-10-02 14:09:07 -0700)
>
> are available in the Git repository at:
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus
>
> for you to fetch changes up to 71491c54eafa318fdd24a1f26a1c82b28e1ac21d:
>
>   virtio_pci: don't try to use intxif pin is zero (2022-10-07 20:00:44 -0400)
>
> 
> virtio: fixes, features
>
> 9k mtu perf improvements
> vdpa feature provisioning
> virtio blk SECURE ERASE support
>
> Fixes, cleanups all over the place.
>
> Signed-off-by: Michael S. Tsirkin 
>
> 
> Alvaro Karsz (1):
>   virtio_blk: add SECURE ERASE command support
>
> Angus Chen (1):
>   virtio_pci: don't try to use intxif pin is zero

This commit breaks virtio_pci for me on powerpc, when running as a qemu
guest.

vp_find_vqs() bails out because pci_dev->pin == 0.

But pci_dev->irq is populated correctly, so vp_find_vqs_intx() would
succeed if we called it - which is what the code used to do.

I think this happens because pci_dev->pin is not populated in
pci_assign_irq().

I would absolutely believe this is bug in our PCI code, but I think it
may also affect other platforms that use of_irq_parse_and_map_pci().

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [GIT PULL] virtio: last minute fixup

2022-05-12 Thread Michael Ellerman

Linus Torvalds  writes:
> On Wed, May 11, 2022 at 3:12 AM Michael Ellerman  wrote:
>>
>> Which I read as you endorsing Link: tags :)
>
> I absolutely adore "Link:" tags. They've been great.
>
> But they've been great for links that are *usedful*.
>
> They are wonderful when they link to the original problem.
>
> They are *really* wonderful when they link to some long discussion
> about how to solve the problem.
>
> They are completely useless when they link to "this is the patch
> submission of the SAME DAMN PATCH THAT THE COMMIT IS".

Folks wanted to add Change-Id: tags to every commit.

You said we didn't need to, because we have the Link: to the original
patch submission, which includes the Message-Id and therefore is a
de facto change id.

Links to other random places don't serve that function.

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [GIT PULL] virtio: last minute fixup

2022-05-11 Thread Michael Ellerman

Konstantin Ryabitsev  writes:
...
>
> I think we should simply disambiguate the trailer added by tooling like b4.
> Instead of using Link:, it can go back to using Message-Id, which is already
> standard with git -- it's trivial for git.kernel.org to link them to
> lore.kernel.org.

But my mailer, editor and terminal don't know what to do with a Message-Id.

Whereas they can all open an https link.

Making people paste message ids into lore to see the original submission
is not a win. People make enough fun of us already for still using email
to submit patches, let's not make their job any easier :)

> Before:
>
> Signed-off-by: Main Tainer 
> Link: 
> https://lore.kernel.org/r/CAHk-=wgAk3NEJ2PHtb0jXzCUOGytiHLq=rzjkfkfpiuh-sr...@mail.gmail.com
>
> After:
>
> Signed-off-by: Main Tainer 
> Message-Id: 
> 
>
> This would allow people to still use Link: for things like linking to actual
> ML discussions. I know this pollutes commits a bit, but I would argue that
> this is a worthwhile trade-off that allows us to improve our automation and
> better scale maintainers.

I went back through the history and I'm pretty sure that the original use
for "Link:" was to link to the original submission, done by tip-bot
starting back in 2011:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f994d99cf140dbb637e49882891c89b3fd84becd

Prior to that there were no "Link:" tags, only "BugLink:".

But if people want to reclaim "Link:" for generic links then fine, but
let's still use a https link, just give it a different name.
eg. "PatchLink:", or "Submitted:" etc.

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [GIT PULL] virtio: last minute fixup

2022-05-11 Thread Michael Ellerman

Linus Torvalds  writes:
> On Tue, May 10, 2022 at 5:24 AM Michael S. Tsirkin  wrote:
>>
>> A last minute fixup of the transitional ID numbers.
>> Important to get these right - if users start to depend on the
>> wrong ones they are very hard to fix.
>
> Hmm. I've pulled this, but those numbers aren't exactly "new".
>
> They've been that way since 5.14, so what makes you think people
> haven't already started depending on them?
>
> And - once again - I want to complain about the "Link:" in that commit.
>
> It points to a completely useless patch submission. It doesn't point
> to anything useful at all.
>
> I think it's a disease that likely comes from "b4", and people decided
> that "hey, I can use the -l parameter to add that Link: field", and it
> looks better that way.

Some folks have been doing it since the early 2010s.

But I think it really took off after the Change-Id discussion a few
years back:

https://lore.kernel.org/all/CAHk-=whFbgy4RXG11c_=S7O-248oWmwB_aZOcWzWMVh3w7=r...@mail.gmail.com/

Which I read as you endorsing Link: tags :)

> And then they add it all the time, whether it makes any sense or not.

For me it always makes sense, because it means I can easily go from a
commit back to the original submission. That's useful for automating
various things like replies and patchwork status updates.

It allows me to find the exact patch I applied, even if what I committed
is slightly different (due to fuzz or editing), which would be harder
with a search based approach.

It gives us a way to essentially augment the change log after the fact,
by replying to the original patch with things we didn't know at the time
of commit - eg. this patch was reverted because it caused a bug, etc.

If you follow the Link: and there's nothing useful there explaining
what motivated the change then that's a bug in the patch submission, not
the Link: tag.

Really important information should be in the change log itself, but the
space below the "---" is perfect for added context that would be too
verbose for the committed change log. And anyone can reply to the
original submission to add even more useful information.

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: update_balloon_size_func blocked for more than 120 seconds

2021-11-23 Thread Michael Ellerman

David Hildenbrand  writes:
> On Thu, Nov 11, 2021 at 11:49 PM Luis Chamberlain  wrote:
>>
>> I get the following splats with a kvm guest in idle, after a few seconds
>> it starts:
>>
>> [  242.412806] INFO: task kworker/6:2:271 blockedfor more than 120 seconds.
>> [  242.415790]   Tainted: GE 5.15.0-next-2021 #68
>> [  242.417755] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
>> this message.
>> [  242.418332] task:kworker/6:2 state:D stack:0 pid:  271 ppid: 2 
>> flags:0x4000
>> [  242.418954] Workqueue: events_freezable update_balloon_size_func 
>> [virtio_balloon]
>> [  242.419518] Call Trace:
>> [  242.419709]  
>> [  242.419873]  __schedule+0x2fd/0x990
>> [  242.420142]  schedule+0x4e/0xc0
>> [  242.420382]  tell_host+0xaa/0xf0 [virtio_balloon]
>> [  242.420757]  ? do_wait_intr_irq+0xa0/0xa0
>> [  242.421065]  update_balloon_size_func+0x2c9/0x2e0 [virtio_balloon]
>> [  242.421527]  process_one_work+0x1e5/0x3c0
>> [  242.421833]  worker_thread+0x50/0x3b0
>> [  242.422204]  ? rescuer_thread+0x370/0x370
>> [  242.422507]  kthread+0x169/0x190
>> [  242.422754]  ? set_kthread_struct+0x40/0x40
>> [  242.423073]  ret_from_fork+0x1f/0x30
>> [  242.423347]  
>>
>> And this goes on endlessly. The last one says it blocked for more than 1208
>> seconds. This was not happening until the last few weeks but I see no
>> relevant recent commits for virtio_balloon, so the related change could
>> be elsewhere.
>
> We're stuck somewhere in:
>
> wq: update_balloon_size_func()->fill_balloon()->tell_host()
>
> Most probably in wait_event().
>
>
> I am no waitqueue expert, but my best guess would be that we're
> waiting more than 2 minutes
> on a host reply with TASK_UNINTERRUPTIBLE. At least that's my interpretation,
>
> In case we're really stuck for more than 2 minutes, the hypervisor
> might not be processing our
> requests anymore -- or it's not getting processed for some other reason (or 
> the
> waitqueue is not getting woken up due do some other BUG).
>
> IIUC, we can sleep longer via wait_event_interruptible(), TASK_UNINTERRUPTIBLE
> seems to be the issue that triggers the warning. But by changing that
> might just be hiding the fact that
> we're waiting more than 2 minutes on a reply.
>
>>
>> I could bisect but first I figured I'd check to see if someone already
>> had spotted this.
>
> Bisecting would be awesome, then we might at least know if this is a
> guest or a hypervisor issue.

I see this on ppc64le also.

I bisected it to:

  # first bad commit: [939779f5152d161b34f612af29e7dc1ac4472fcf] virtio_ring: 
validate used buffer length

I also reported it in the thread hanging off that patch:

  https://lore.kernel.org/lkml/87zgpupcga@mpe.ellerman.id.au/


cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH V5 1/4] virtio_ring: validate used buffer length

2021-11-23 Thread Michael Ellerman

"Michael S. Tsirkin"  writes:
> On Tue, Nov 23, 2021 at 10:25:20AM +0800, Jason Wang wrote:
>> On Tue, Nov 23, 2021 at 4:24 AM Halil Pasic  wrote:
>> >
>> > On Mon, 22 Nov 2021 14:25:26 +0800
>> > Jason Wang  wrote:
>> >
>> > > I think the fixes are:
>> > >
>> > > 1) fixing the vhost vsock
>> > > 2) use suppress_used_validation=true to let vsock driver to validate
>> > > the in buffer length
>> > > 3) probably a new feature so the driver can only enable the validation
>> > > when the feature is enabled.
>> >
>> > I'm not sure, I would consider a F_DEV_Y_FIXED_BUG_X a perfectly good
>> > feature. Frankly the set of such bugs is device implementation
>> > specific and it makes little sense to specify a feature bit
>> > that says the device implementation claims to adhere to some
>> > aspect of the specification. Also what would be the semantic
>> > of not negotiating F_DEV_Y_FIXED_BUG_X?
>> 
>> Yes, I agree. Rethink of the feature bit, it seems unnecessary,
>> especially considering the driver should not care about the used
>> length for tx.
>> 
>> >
>> > On the other hand I see no other way to keep the validation
>> > permanently enabled for fixed implementations, and get around the problem
>> > with broken implementations. So we could have something like
>> > VHOST_USED_LEN_STRICT.
>> 
>> It's more about a choice of the driver's knowledge. For vsock TX it
>> should be fine. If we introduce a parameter and disable it by default,
>> it won't be very useful.
>> 
>> >
>> > Maybe, we can also think of 'warn and don't alter behavior' instead of
>> > 'warn' and alter behavior. Or maybe even not having such checks on in
>> > production, but only when testing.
>> 
>> I think there's an agreement that virtio drivers need more hardening,
>> that's why a lot of patches were merged. Especially considering the
>> new requirements came from confidential computing, smart NIC and
>> VDUSE. For virtio drivers, enabling the validation may help to
>> 
>> 1) protect the driver from the buggy and malicious device
>> 2) uncover the bugs of the devices (as vsock did, and probably rpmsg)
>> 3) force the have a smart driver that can do the validation itself
>> then we can finally remove the validation in the core
>> 
>> So I'd like to keep it enabled.
>> 
>> Thanks
>
> Let's see how far we can get. But yes, maybe we were too aggressive in
> breaking things by default, a warning might be a better choice for a
> couple of cycles.

This series appears to break the virtio_balloon driver as well.

The symptom is soft lockup warnings, eg:

  INFO: task kworker/1:1:109 blocked for more than 614 seconds.
Not tainted 5.16.0-rc2-gcc-10.3.0 #21
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  task:kworker/1:1 state:D stack:12496 pid:  109 ppid: 2 
flags:0x0800
  Workqueue: events_freezable update_balloon_size_func
  Call Trace:
  [c3cef7c0] [c3cef820] 0xc3cef820 (unreliable)
  [c3cef9b0] [c001e238] __switch_to+0x1e8/0x2f0
  [c3cefa10] [c0f0a00c] __schedule+0x2cc/0xb50
  [c3cefae0] [c0f0a8fc] schedule+0x6c/0x140
  [c3cefb10] [c095b6c4] tell_host+0xe4/0x130
  [c3cefba0] [c095d234] update_balloon_size_func+0x394/0x3f0
  [c3cefc70] [c0178064] process_one_work+0x2c4/0x5b0
  [c3cefd10] [c01783f8] worker_thread+0xa8/0x640
  [c3cefda0] [c0185444] kthread+0x1b4/0x1c0
  [c3cefe10] [c000cee4] ret_from_kernel_thread+0x5c/0x64

Similar backtrace reported here by Luis:

  https://lore.kernel.org/lkml/yy2duti0wayak...@bombadil.infradead.org/

Bisect points to:

  # first bad commit: [939779f5152d161b34f612af29e7dc1ac4472fcf] virtio_ring: 
validate used buffer length

Adding suppress used validation to the virtio balloon driver "fixes" it, eg.

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index c22ff0117b46..a14b82ceebb2 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -1150,6 +1150,7 @@ static unsigned int features[] = {
 };
 
 static struct virtio_driver virtio_balloon_driver = {
+   .suppress_used_validation = true,
.feature_table = features,
.feature_table_size = ARRAY_SIZE(features),
.driver.name =  KBUILD_MODNAME,


cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v8 2/3] tty: hvc: pass DMA capable memory to put_chars()

2021-08-20 Thread Michael Ellerman

Daniel Axtens  writes:
> Xianting Tian  writes:
>
>> As well known, hvc backend driver(eg, virtio-console) can register its
>> operations to hvc framework. The operations can contain put_chars(),
>> get_chars() and so on.
>>
>> Some hvc backend may do dma in its operations. eg, put_chars() of
>> virtio-console. But in the code of hvc framework, it may pass DMA
>> incapable memory to put_chars() under a specific configuration, which
>> is explained in commit c4baad5029(virtio-console: avoid DMA from stack):
>
> We could also run into issues on powerpc where Andrew is working on
> adding vmap-stack but the opal hvc driver assumes that it is passed a
> buffer which is not in vmalloc space but in the linear mapping.

The right fix for that is our code that calls opal has to be careful
that it's not passing vmalloc addresses.

We have many cases where we pass stack variables to opal, they'll all
have to be fixed to pass the underlying phyiscal/linear map address. The
opal hvc code will just be one more case of that.

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v1 05/12] mm/memory_hotplug: remove nid parameter from remove_memory() and friends

2021-06-08 Thread Michael Ellerman

David Hildenbrand  writes:
> There is only a single user remaining. We can simply try to offline all
> online nodes - which is fast, because we usually span pages and can skip
> such nodes right away.

That makes me slightly nervous, because our big powerpc boxes tend to
trip on these scaling issues before others.

But the spanned pages check is just:

void try_offline_node(int nid)
{
pg_data_t *pgdat = NODE_DATA(nid);
...
if (pgdat->node_spanned_pages)
return;

So I guess that's pretty cheap, and it's only O(nodes), which should
never get that big.

> Cc: Michael Ellerman 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: "Rafael J. Wysocki" 
> Cc: Len Brown 
> Cc: Dan Williams 
> Cc: Vishal Verma 
> Cc: Dave Jiang 
> Cc: "Michael S. Tsirkin" 
> Cc: Jason Wang 
> Cc: Andrew Morton 
> Cc: Nathan Lynch 
> Cc: Laurent Dufour 
> Cc: "Aneesh Kumar K.V" 
> Cc: Scott Cheloha 
> Cc: Anton Blanchard 
> Cc: linuxppc-...@lists.ozlabs.org
> Cc: linux-a...@vger.kernel.org
> Cc: nvd...@lists.linux.dev
> Signed-off-by: David Hildenbrand 
> ---
>  .../platforms/pseries/hotplug-memory.c|  9 -
>  drivers/acpi/acpi_memhotplug.c|  7 +--
>  drivers/dax/kmem.c|  3 +--
>  drivers/virtio/virtio_mem.c   |  4 ++--
>  include/linux/memory_hotplug.h| 10 +-
>  mm/memory_hotplug.c   | 20 +--
>  6 files changed, 23 insertions(+), 30 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
> b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index 8377f1f7c78e..4a9232ddbefe 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -286,7 +286,7 @@ static int pseries_remove_memblock(unsigned long base, 
> unsigned long memblock_si
>  {
>   unsigned long block_sz, start_pfn;
>   int sections_per_block;
> - int i, nid;
> + int i;
>  
>   start_pfn = base >> PAGE_SHIFT;
>  
> @@ -297,10 +297,9 @@ static int pseries_remove_memblock(unsigned long base, 
> unsigned long memblock_si
>  
>   block_sz = pseries_memory_block_size();
>   sections_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
> - nid = memory_add_physaddr_to_nid(base);
>  
>   for (i = 0; i < sections_per_block; i++) {
> - __remove_memory(nid, base, MIN_MEMORY_BLOCK_SIZE);
> + __remove_memory(base, MIN_MEMORY_BLOCK_SIZE);
>   base += MIN_MEMORY_BLOCK_SIZE;
>   }
>  
> @@ -386,7 +385,7 @@ static int dlpar_remove_lmb(struct drmem_lmb *lmb)
>  
>   block_sz = pseries_memory_block_size();
>  
> - __remove_memory(mem_block->nid, lmb->base_addr, block_sz);
> + __remove_memory(lmb->base_addr, block_sz);
>   put_device(_block->dev);
>  
>   /* Update memory regions for memory remove */
> @@ -638,7 +637,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
>  
>   rc = dlpar_online_lmb(lmb);
>   if (rc) {
> - __remove_memory(nid, lmb->base_addr, block_sz);
> + __remove_memory(lmb->base_addr, block_sz);
>   invalidate_lmb_associativity_index(lmb);
>   } else {
>   lmb->flags |= DRCONF_MEM_ASSIGNED;


Acked-by: Michael Ellerman  (powerpc)

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v1 04/12] mm/memory_hotplug: remove nid parameter from arch_remove_memory()

2021-06-08 Thread Michael Ellerman

David Hildenbrand  writes:

> The parameter is unused, let's remove it.
>
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Michael Ellerman 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Heiko Carstens 
> Cc: Vasily Gorbik 
> Cc: Christian Borntraeger 
> Cc: Yoshinori Sato 
> Cc: Rich Felker 
> Cc: Dave Hansen 
> Cc: Andy Lutomirski 
> Cc: Peter Zijlstra 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: Borislav Petkov 
> Cc: x...@kernel.org
> Cc: "H. Peter Anvin" 
> Cc: Andrew Morton 
> Cc: Anshuman Khandual 
> Cc: Ard Biesheuvel 
> Cc: Mike Rapoport 
> Cc: Nicholas Piggin 
> Cc: Pavel Tatashin 
> Cc: Baoquan He 
> Cc: Laurent Dufour 
> Cc: Sergei Trofimovich 
> Cc: Kefeng Wang 
> Cc: Michel Lespinasse 
> Cc: Christophe Leroy 
> Cc: "Aneesh Kumar K.V" 
> Cc: Thiago Jung Bauermann 
> Cc: Joe Perches 
> Cc: Pierre Morel 
> Cc: Jia He 
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-i...@vger.kernel.org
> Cc: linuxppc-...@lists.ozlabs.org
> Cc: linux-s...@vger.kernel.org
> Cc: linux...@vger.kernel.org
> Signed-off-by: David Hildenbrand 
> ---
>  arch/arm64/mm/mmu.c| 3 +--
>  arch/ia64/mm/init.c| 3 +--
>  arch/powerpc/mm/mem.c  | 3 +--
>  arch/s390/mm/init.c| 3 +--
>  arch/sh/mm/init.c  | 3 +--
>  arch/x86/mm/init_32.c  | 3 +--
>  arch/x86/mm/init_64.c  | 3 +--
>  include/linux/memory_hotplug.h | 3 +--
>  mm/memory_hotplug.c| 4 ++--
>  mm/memremap.c  | 5 +
>  10 files changed, 11 insertions(+), 22 deletions(-)
>
...
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index 043bbeaf407c..fc5c36189c26 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -115,8 +115,7 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
>   return rc;
>  }
>  
> -void __ref arch_remove_memory(int nid, u64 start, u64 size,
> -   struct vmem_altmap *altmap)
> +void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap 
> *altmap)
>  {
>   unsigned long start_pfn = start >> PAGE_SHIFT;
>   unsigned long nr_pages = size >> PAGE_SHIFT;

Acked-by: Michael Ellerman  (powerpc)

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v2 3/7] mm/memory_hotplug: prepare passing flags to add_memory() and friends

2020-09-09 Thread Michael Ellerman

David Hildenbrand  writes:
> On 09.09.20 09:17, Greg Kroah-Hartman wrote:
>> On Tue, Sep 08, 2020 at 10:10:08PM +0200, David Hildenbrand wrote:
>>> We soon want to pass flags, e.g., to mark added System RAM resources.
>>> mergeable. Prepare for that.
>> 
>> What are these random "flags", and how do we know what should be passed
>> to them?
>> 
>> Why not make this an enumerated type so that we know it all works
>> properly, like the GPF_* flags are?  Passing around a random unsigned
>> long feels very odd/broken...
>
> Agreed, an enum (mhp_flags) seems to give a better hint what can
> actually be passed. Thanks!

You probably know this but ...

Just using a C enum doesn't get you any type safety.

You can get some checking via sparse by using __bitwise, which is what
gfp_t does. You don't actually have to use an enum for that, it works
with #defines also.

Or you can wrap the flag in a struct, the way atomic_t does, and then
the compiler will prevent passing plain integers in place of your custom
type.

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v3 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR

2020-07-09 Thread Michael Ellerman

Nicholas Piggin  writes:

> Signed-off-by: Nicholas Piggin 
> ---
>  arch/powerpc/include/asm/paravirt.h   | 28 
>  arch/powerpc/include/asm/qspinlock.h  | 66 +++
>  arch/powerpc/include/asm/qspinlock_paravirt.h |  7 ++
>  arch/powerpc/platforms/pseries/Kconfig|  5 ++
>  arch/powerpc/platforms/pseries/setup.c|  6 +-
>  include/asm-generic/qspinlock.h   |  2 +

Another ack?

> diff --git a/arch/powerpc/include/asm/paravirt.h 
> b/arch/powerpc/include/asm/paravirt.h
> index 7a8546660a63..f2d51f929cf5 100644
> --- a/arch/powerpc/include/asm/paravirt.h
> +++ b/arch/powerpc/include/asm/paravirt.h
> @@ -45,6 +55,19 @@ static inline void yield_to_preempted(int cpu, u32 
> yield_count)
>  {
>   ___bad_yield_to_preempted(); /* This would be a bug */
>  }
> +
> +extern void ___bad_yield_to_any(void);
> +static inline void yield_to_any(void)
> +{
> + ___bad_yield_to_any(); /* This would be a bug */
> +}

Why do we do that rather than just not defining yield_to_any() at all
and letting the build fail on that?

There's a condition somewhere that we know will false at compile time
and drop the call before linking?

> diff --git a/arch/powerpc/include/asm/qspinlock_paravirt.h 
> b/arch/powerpc/include/asm/qspinlock_paravirt.h
> new file mode 100644
> index ..750d1b5e0202
> --- /dev/null
> +++ b/arch/powerpc/include/asm/qspinlock_paravirt.h
> @@ -0,0 +1,7 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +#ifndef __ASM_QSPINLOCK_PARAVIRT_H
> +#define __ASM_QSPINLOCK_PARAVIRT_H

_ASM_POWERPC_QSPINLOCK_PARAVIRT_H please.

> +
> +EXPORT_SYMBOL(__pv_queued_spin_unlock);

Why's that in a header? Should that (eventually) go with the generic 
implementation?

> diff --git a/arch/powerpc/platforms/pseries/Kconfig 
> b/arch/powerpc/platforms/pseries/Kconfig
> index 24c18362e5ea..756e727b383f 100644
> --- a/arch/powerpc/platforms/pseries/Kconfig
> +++ b/arch/powerpc/platforms/pseries/Kconfig
> @@ -25,9 +25,14 @@ config PPC_PSERIES
>   select SWIOTLB
>   default y
>  
> +config PARAVIRT_SPINLOCKS
> + bool
> + default n

default n is the default.

> diff --git a/arch/powerpc/platforms/pseries/setup.c 
> b/arch/powerpc/platforms/pseries/setup.c
> index 2db8469e475f..747a203d9453 100644
> --- a/arch/powerpc/platforms/pseries/setup.c
> +++ b/arch/powerpc/platforms/pseries/setup.c
> @@ -771,8 +771,12 @@ static void __init pSeries_setup_arch(void)
>   if (firmware_has_feature(FW_FEATURE_LPAR)) {
>   vpa_init(boot_cpuid);
>  
> - if (lppaca_shared_proc(get_lppaca()))
> + if (lppaca_shared_proc(get_lppaca())) {
>   static_branch_enable(_processor);
> +#ifdef CONFIG_PARAVIRT_SPINLOCKS
> + pv_spinlocks_init();
> +#endif
> + }

We could avoid the ifdef with this I think?

diff --git a/arch/powerpc/include/asm/spinlock.h 
b/arch/powerpc/include/asm/spinlock.h
index 434615f1d761..6ec72282888d 100644
--- a/arch/powerpc/include/asm/spinlock.h
+++ b/arch/powerpc/include/asm/spinlock.h
@@ -10,5 +10,9 @@
 #include 
 #endif

+#ifndef CONFIG_PARAVIRT_SPINLOCKS
+static inline void pv_spinlocks_init(void) { }
+#endif
+
 #endif /* __KERNEL__ */
 #endif /* __ASM_SPINLOCK_H */


cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v3 4/6] powerpc/64s: implement queued spinlocks and rwlocks

2020-07-09 Thread Michael Ellerman

Nicholas Piggin  writes:
> These have shown significantly improved performance and fairness when
> spinlock contention is moderate to high on very large systems.
>
>  [ Numbers hopefully forthcoming after more testing, but initial
>results look good ]

Would be good to have something here, even if it's preliminary.

> Thanks to the fast path, single threaded performance is not noticably
> hurt.
>
> Signed-off-by: Nicholas Piggin 
> ---
>  arch/powerpc/Kconfig  | 13 
>  arch/powerpc/include/asm/Kbuild   |  2 ++
>  arch/powerpc/include/asm/qspinlock.h  | 25 +++
>  arch/powerpc/include/asm/spinlock.h   |  5 +
>  arch/powerpc/include/asm/spinlock_types.h |  5 +
>  arch/powerpc/lib/Makefile |  3 +++

>  include/asm-generic/qspinlock.h   |  2 ++

Who's ack do we need for that part?

> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 24ac85c868db..17663ea57697 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -492,6 +494,17 @@ config HOTPLUG_CPU
>  
> Say N if you are unsure.
>  
> +config PPC_QUEUED_SPINLOCKS
> + bool "Queued spinlocks"
> + depends on SMP
> + default "y" if PPC_BOOK3S_64

Not sure about default y? At least until we've got a better idea of the
perf impact on a range of small/big new/old systems.

> + help
> +   Say Y here to use to use queued spinlocks which are more complex
> +   but give better salability and fairness on large SMP and NUMA
> +   systems.
> +
> +   If unsure, say "Y" if you have lots of cores, otherwise "N".

Would be nice if we could give a range for "lots".

> diff --git a/arch/powerpc/include/asm/Kbuild b/arch/powerpc/include/asm/Kbuild
> index dadbcf3a0b1e..1dd8b6adff5e 100644
> --- a/arch/powerpc/include/asm/Kbuild
> +++ b/arch/powerpc/include/asm/Kbuild
> @@ -6,5 +6,7 @@ generated-y += syscall_table_spu.h
>  generic-y += export.h
>  generic-y += local64.h
>  generic-y += mcs_spinlock.h
> +generic-y += qrwlock.h
> +generic-y += qspinlock.h

The 2nd line spits a warning about a redundant entry. I think you want
to just drop it.


cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v3 3/6] powerpc: move spinlock implementation to simple_spinlock

2020-07-09 Thread Michael Ellerman

Nicholas Piggin  writes:
> To prepare for queued spinlocks. This is a simple rename except to update
> preprocessor guard name and a file reference.
>
> Signed-off-by: Nicholas Piggin 
> ---
>  arch/powerpc/include/asm/simple_spinlock.h| 292 ++
>  .../include/asm/simple_spinlock_types.h   |  21 ++
>  arch/powerpc/include/asm/spinlock.h   | 285 +
>  arch/powerpc/include/asm/spinlock_types.h |  12 +-
>  4 files changed, 315 insertions(+), 295 deletions(-)
>  create mode 100644 arch/powerpc/include/asm/simple_spinlock.h
>  create mode 100644 arch/powerpc/include/asm/simple_spinlock_types.h
>
> diff --git a/arch/powerpc/include/asm/simple_spinlock.h 
> b/arch/powerpc/include/asm/simple_spinlock.h
> new file mode 100644
> index ..e048c041c4a9
> --- /dev/null
> +++ b/arch/powerpc/include/asm/simple_spinlock.h
> @@ -0,0 +1,292 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +#ifndef __ASM_SIMPLE_SPINLOCK_H
> +#define __ASM_SIMPLE_SPINLOCK_H

_ASM_POWERPC_SIMPLE_SPINLOCK_H

> +#ifdef __KERNEL__

Shouldn't be necessary.

> +/*
> + * Simple spin lock operations.  
> + *
> + * Copyright (C) 2001-2004 Paul Mackerras , IBM
> + * Copyright (C) 2001 Anton Blanchard , IBM
> + * Copyright (C) 2002 Dave Engebretsen , IBM
> + *   Rework to support virtual processors
> + *
> + * Type of int is used as a full 64b word is not necessary.
> + *
> + * (the type definitions are in asm/simple_spinlock_types.h)
> + */
> +#include 
> +#include 
> +#ifdef CONFIG_PPC64
> +#include 
> +#endif

I don't think paca.h needs a CONFIG_PPC64 guard, it contains one. I know
you're just moving the code, but still nice to cleanup slightly along
the way.

cheers

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v3 2/6] powerpc/pseries: move some PAPR paravirt functions to their own file

2020-07-09 Thread Michael Ellerman

Nicholas Piggin  writes:
>

Little bit of changelog would be nice :D

> Signed-off-by: Nicholas Piggin 
> ---
>  arch/powerpc/include/asm/paravirt.h | 61 +
>  arch/powerpc/include/asm/spinlock.h | 24 +---
>  arch/powerpc/lib/locks.c| 12 +++---
>  3 files changed, 68 insertions(+), 29 deletions(-)
>  create mode 100644 arch/powerpc/include/asm/paravirt.h
>
> diff --git a/arch/powerpc/include/asm/paravirt.h 
> b/arch/powerpc/include/asm/paravirt.h
> new file mode 100644
> index ..7a8546660a63
> --- /dev/null
> +++ b/arch/powerpc/include/asm/paravirt.h
> @@ -0,0 +1,61 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +#ifndef __ASM_PARAVIRT_H
> +#define __ASM_PARAVIRT_H

Should be _ASM_POWERPC_PARAVIRT_H

> +#ifdef __KERNEL__

We shouldn't need __KERNEL__ in here, it's not a uapi header.

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v3 1/6] powerpc/powernv: must include hvcall.h to get PAPR defines

2020-07-09 Thread Michael Ellerman

Nicholas Piggin  writes:
> An include goes away in future patches which breaks compilation
> without this.
>
> Signed-off-by: Nicholas Piggin 
> ---
>  arch/powerpc/platforms/powernv/pci-ioda-tce.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda-tce.c 
> b/arch/powerpc/platforms/powernv/pci-ioda-tce.c
> index f923359d8afc..8eba6ece7808 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda-tce.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda-tce.c
> @@ -15,6 +15,7 @@
>  
>  #include 
>  #include 
> +#include  /* share error returns with PAPR */
>  #include "pci.h"
>  
>  unsigned long pnv_ioda_parse_tce_sizes(struct pnv_phb *phb)
> -- 
> 2.23.0

This isn't needed anymore AFAICS, since:

5f202c1a1d42 ("powerpc/powernv/ioda: Return correct error if TCE level 
allocation failed")

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 5/8] powerpc/64s: implement queued spinlocks and rwlocks

2020-07-03 Thread Michael Ellerman

Nicholas Piggin  writes:
> Excerpts from Will Deacon's message of July 2, 2020 8:35 pm:
>> On Thu, Jul 02, 2020 at 08:25:43PM +1000, Nicholas Piggin wrote:
>>> Excerpts from Will Deacon's message of July 2, 2020 6:02 pm:
>>> > On Thu, Jul 02, 2020 at 05:48:36PM +1000, Nicholas Piggin wrote:
>>> >> diff --git a/arch/powerpc/include/asm/qspinlock.h 
>>> >> b/arch/powerpc/include/asm/qspinlock.h
>>> >> new file mode 100644
>>> >> index ..f84da77b6bb7
>>> >> --- /dev/null
>>> >> +++ b/arch/powerpc/include/asm/qspinlock.h
>>> >> @@ -0,0 +1,20 @@
>>> >> +/* SPDX-License-Identifier: GPL-2.0 */
>>> >> +#ifndef _ASM_POWERPC_QSPINLOCK_H
>>> >> +#define _ASM_POWERPC_QSPINLOCK_H
>>> >> +
>>> >> +#include 
>>> >> +
>>> >> +#define _Q_PENDING_LOOPS(1 << 9) /* not tuned */
>>> >> +
>>> >> +#define smp_mb__after_spinlock()   smp_mb()
>>> >> +
>>> >> +static __always_inline int queued_spin_is_locked(struct qspinlock *lock)
>>> >> +{
>>> >> +smp_mb();
>>> >> +return atomic_read(>val);
>>> >> +}
>>> > 
>>> > Why do you need the smp_mb() here?
>>> 
>>> A long and sad tale that ends here 51d7d5205d338
>>> 
>>> Should probably at least refer to that commit from here, since this one 
>>> is not going to git blame back there. I'll add something.
>> 
>> Is this still an issue, though?
>> 
>> See 38b850a73034 (where we added a similar barrier on arm64) and then
>> c6f5d02b6a0f (where we removed it).
>> 
>
> Oh nice, I didn't know that went away. Thanks for the heads up.

Argh! I spent so much time chasing that damn bug in the ipc code.

> I'm going to say I'm too scared to remove it while changing the
> spinlock algorithm, but I'll open an issue and we should look at 
> removing it.

Sounds good.

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] iommu: spapr_tce: Disable compile testing to fix build on book3s_32 config

2020-04-19 Thread Michael Ellerman

Christophe Leroy  writes:
> On 04/14/2020 02:26 PM, Krzysztof Kozlowski wrote:
>> Although SPAPR_TCE_IOMMU itself can be compile tested on certain PowerPC
>> configurations, its presence makes arch/powerpc/kvm/Makefile to select
>> modules which do not build in such configuration.
>> 
>> The arch/powerpc/kvm/ modules use kvm_arch.spapr_tce_tables which exists
>> only with CONFIG_PPC_BOOK3S_64.  However these modules are selected when
>> COMPILE_TEST and SPAPR_TCE_IOMMU are chosen leading to build failures:
>> 
>>  In file included from 
>> arch/powerpc/include/asm/book3s/64/mmu-hash.h:20:0,
>>   from arch/powerpc/kvm/book3s_64_vio_hv.c:22:
>>  arch/powerpc/include/asm/book3s/64/pgtable.h:17:0: error: "_PAGE_EXEC" 
>> redefined [-Werror]
>>   #define _PAGE_EXEC  0x1 /* execute permission */
>> 
>>  In file included from arch/powerpc/include/asm/book3s/32/pgtable.h:8:0,
>>   from arch/powerpc/include/asm/book3s/pgtable.h:8,
>>   from arch/powerpc/include/asm/pgtable.h:18,
>>   from include/linux/mm.h:95,
>>   from arch/powerpc/include/asm/io.h:29,
>>   from include/linux/io.h:13,
>>   from include/linux/irq.h:20,
>>   from arch/powerpc/include/asm/hardirq.h:6,
>>   from include/linux/hardirq.h:9,
>>   from include/linux/kvm_host.h:7,
>>   from arch/powerpc/kvm/book3s_64_vio_hv.c:12:
>>  arch/powerpc/include/asm/book3s/32/hash.h:29:0: note: this is the 
>> location of the previous definition
>>   #define _PAGE_EXEC 0x200 /* software: exec allowed */
>> 
>> Reported-by: Geert Uytterhoeven 
>> Fixes: e93a1695d7fb ("iommu: Enable compile testing for some of drivers")
>> Signed-off-by: Krzysztof Kozlowski 
>> ---
>>   drivers/iommu/Kconfig | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
>> index 58b4a4dbfc78..3532b1ead19d 100644
>> --- a/drivers/iommu/Kconfig
>> +++ b/drivers/iommu/Kconfig
>> @@ -362,7 +362,7 @@ config IPMMU_VMSA
>>   
>>   config SPAPR_TCE_IOMMU
>>  bool "sPAPR TCE IOMMU Support"
>> -depends on PPC_POWERNV || PPC_PSERIES || (PPC && COMPILE_TEST)
>> +depends on PPC_POWERNV || PPC_PSERIES
>>  select IOMMU_API
>>  help
>>Enables bits of IOMMU API required by VFIO. The iommu_ops
>> 
>
> Should it be fixed the other way round, something like:

That doesn't actually fix this specific issue, the code will build but
then not link:

  ld: arch/powerpc/../../virt/kvm/vfio.o: in function 
`.kvm_spapr_tce_release_vfio_group':
  vfio.c:(.text.kvm_spapr_tce_release_vfio_group+0xb0): undefined reference to 
`.kvm_spapr_tce_release_iommu_group'
  ld: arch/powerpc/../../virt/kvm/vfio.o: in function `.kvm_vfio_set_group':
  vfio.c:(.text.kvm_vfio_set_group+0x7f4): undefined reference to 
`.kvm_spapr_tce_attach_iommu_group'
  ld: arch/powerpc/kvm/powerpc.o: in function `.kvm_arch_vm_ioctl':
  (.text.kvm_arch_vm_ioctl+0x1a4): undefined reference to 
`.kvm_vm_ioctl_create_spapr_tce'
  ld: (.text.kvm_arch_vm_ioctl+0x230): undefined reference to 
`.kvm_vm_ioctl_create_spapr_tce'
  make[1]: *** [/home/michael/linux/Makefile:1106: vmlinux] Error 1

> diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
> index 2bfeaa13befb..906707d15810 100644
> --- a/arch/powerpc/kvm/Makefile
> +++ b/arch/powerpc/kvm/Makefile
> @@ -135,4 +135,4 @@ obj-$(CONFIG_KVM_BOOK3S_32) += kvm.o
>   obj-$(CONFIG_KVM_BOOK3S_64_PR) += kvm-pr.o
>   obj-$(CONFIG_KVM_BOOK3S_64_HV) += kvm-hv.o
>
> -obj-y += $(kvm-book3s_64-builtin-objs-y)
> +obj-$(CONFIG_KVM_BOOK3S_64) += $(kvm-book3s_64-builtin-objs-y)

But this is probably still a good thing to do, as it would have made the
error messages clearer in this case I think.

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: Build regressions/improvements in v5.7-rc1

2020-04-14 Thread Michael Ellerman

Geert Uytterhoeven  writes:
> On Mon, 13 Apr 2020, Geert Uytterhoeven wrote:
>> Below is the list of build error/warning regressions/improvements in
>> v5.7-rc1[1] compared to v5.6[2].
>>
>> Summarized:
>>  - build errors: +132/-3
>>  - build warnings: +257/-79
>>
>> Happy fixing! ;-)
>>
>> Thanks to the linux-next team for providing the build service.
>>
>> [1] 
>> http://kisskb.ellerman.id.au/kisskb/branch/linus/head/8f3d9f354286745c751374f5f1fcafee6b3f3136/
>>  (all 239 configs)
>> [2] 
>> http://kisskb.ellerman.id.au/kisskb/branch/linus/head/7111951b8d4973bda27ff663f2cf18b663d15b48/
>>  (all 239 configs)
>>
>>
>> *** ERRORS ***
>>
>> 132 error regressions:
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/hash.h: error: implicit 
>> declaration of function 'pte_raw' [-Werror=implicit-function-declaration]:  
>> => 192:2
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/hash.h: error: implicit 
>> declaration of function 'pte_raw'; did you mean 'pte_read'? 
>> [-Werror=implicit-function-declaration]:  => 192:8
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/mmu-hash.h: error: 
>> 'SLICE_NUM_HIGH' undeclared here (not in a function):  => 698:2, 698:30
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/mmu-hash.h: error: 'struct 
>> mmu_psize_def' has no member named 'ap':  => 207:28
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/mmu-hash.h: error: 'struct 
>> mmu_psize_def' has no member named 'avpnm':  => 337:57
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/mmu-hash.h: error: 'struct 
>> mmu_psize_def' has no member named 'penc':  => 411:49
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/mmu-hash.h: error: 'struct 
>> mmu_psize_def' has no member named 'penc'; did you mean 'enc'?:  => 411:50
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/mmu-hash.h: error: 'struct 
>> mmu_psize_def' has no member named 'sllp':  => 218:32, 219:26
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/mmu-hash.h: error: 
>> redefinition of 'mmu_psize_to_shift':  => 195:28
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/mmu-hash.h: error: 
>> redefinition of 'shift_to_mmu_psize':  => 185:19
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/pgtable-4k.h: error: 
>> implicit declaration of function 'pgd_raw' 
>> [-Werror=implicit-function-declaration]:  => 35:3
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/pgtable-4k.h: error: 
>> implicit declaration of function 'pgd_raw'; did you mean 'pgd_val'? 
>> [-Werror=implicit-function-declaration]:  => 35:13
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/pgtable-4k.h: error: 
>> implicit declaration of function 'pud_raw' 
>> [-Werror=implicit-function-declaration]:  => 25:3
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/pgtable-4k.h: error: 
>> implicit declaration of function 'pud_raw'; did you mean 'pud_val'? 
>> [-Werror=implicit-function-declaration]:  => 25:13
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/pgtable-4k.h: error: 
>> redefinition of 'hugepd_ok':  => 44:19
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/pgtable-4k.h: error: 
>> redefinition of 'pgd_huge':  => 29:19
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/pgtable-4k.h: error: 
>> redefinition of 'pmd_huge':  => 9:19
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/pgtable-4k.h: error: 
>> redefinition of 'pud_huge':  => 19:19
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/pgtable.h: error: expected 
>> ')' before '!=' token:  => 964:19, 921:19
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/pgtable.h: error: expected 
>> ')' before '==' token:  => 979:19, 801:19
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/pgtable.h: error: expected 
>> ')' before '^' token:  => 801:19
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/pgtable.h: error: expected 
>> ')' before '__vmalloc_end':  => 291:21, 274:21
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/pgtable.h: error: expected 
>> identifier or '(' before '!' token:  => 916:19, 904:19, 939:19, 959:19, 
>> 868:19, 873:19
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/pgtable.h: error: expected 
>> identifier or '(' before 'struct':  => 291:21
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/pgtable.h: error: implicit 
>> declaration of function '__pgd_raw' [-Werror=implicit-function-declaration]: 
>>  => 976:2
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/pgtable.h: error: implicit 
>> declaration of function '__pgd_raw'; did you mean '__fdget_raw'? 
>> [-Werror=implicit-function-declaration]:  => 976:9
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/pgtable.h: error: implicit 
>> declaration of function '__pmd_raw' [-Werror=implicit-function-declaration]: 
>>  => 1075:9, 1075:2
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/pgtable.h: error: implicit 
>> declaration of function '__pte_raw' [-Werror=implicit-function-declaration]: 
>>  => 552:2, 552:9
>>  + /kisskb/src/arch/powerpc/include/asm/book3s/64/pgtable.h: error: implicit

Re: [PATCH] vhost: do not enable VHOST_MENU by default

2020-04-13 Thread Michael Ellerman

Jason Wang  writes:
> We try to keep the defconfig untouched after decoupling CONFIG_VHOST
> out of CONFIG_VIRTUALIZATION in commit 20c384f1ea1a
> ("vhost: refine vhost and vringh kconfig") by enabling VHOST_MENU by
> default. Then the defconfigs can keep enabling CONFIG_VHOST_NET
> without the caring of CONFIG_VHOST.
>
> But this will leave a "CONFIG_VHOST_MENU=y" in all defconfigs and even
> for the ones that doesn't want vhost. So it actually shifts the
> burdens to the maintainers of all other to add "CONFIG_VHOST_MENU is
> not set". So this patch tries to enable CONFIG_VHOST explicitly in
> defconfigs that enables CONFIG_VHOST_NET and CONFIG_VHOST_VSOCK.
>
> Cc: Thomas Bogendoerfer 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Michael Ellerman 
> Cc: Heiko Carstens 
> Cc: Vasily Gorbik 
> Cc: Christian Borntraeger 
> Reported-by: Geert Uytterhoeven 
> Signed-off-by: Jason Wang 
> ---
>  arch/mips/configs/malta_kvm_defconfig  |  1 +
>  arch/powerpc/configs/powernv_defconfig |  1 +
>  arch/powerpc/configs/ppc64_defconfig   |  1 +
>  arch/powerpc/configs/pseries_defconfig |  1 +

Fine by me.

Acked-by: Michael Ellerman  (powerpc)

cheers

>  arch/s390/configs/debug_defconfig  |  1 +
>  arch/s390/configs/defconfig|  1 +
>  drivers/vhost/Kconfig  | 18 +-
>  7 files changed, 11 insertions(+), 13 deletions(-)
>
> diff --git a/arch/mips/configs/malta_kvm_defconfig 
> b/arch/mips/configs/malta_kvm_defconfig
> index 8ef612552a19..06f0c7a0ca87 100644
> --- a/arch/mips/configs/malta_kvm_defconfig
> +++ b/arch/mips/configs/malta_kvm_defconfig
> @@ -18,6 +18,7 @@ CONFIG_PCI=y
>  CONFIG_VIRTUALIZATION=y
>  CONFIG_KVM=m
>  CONFIG_KVM_MIPS_DEBUG_COP0_COUNTERS=y
> +CONFIG_VHOST=m
>  CONFIG_VHOST_NET=m
>  CONFIG_MODULES=y
>  CONFIG_MODULE_UNLOAD=y
> diff --git a/arch/powerpc/configs/powernv_defconfig 
> b/arch/powerpc/configs/powernv_defconfig
> index 71749377d164..404245b4594d 100644
> --- a/arch/powerpc/configs/powernv_defconfig
> +++ b/arch/powerpc/configs/powernv_defconfig
> @@ -346,5 +346,6 @@ CONFIG_CRYPTO_DEV_VMX=y
>  CONFIG_VIRTUALIZATION=y
>  CONFIG_KVM_BOOK3S_64=m
>  CONFIG_KVM_BOOK3S_64_HV=m
> +CONFIG_VHOST=m
>  CONFIG_VHOST_NET=m
>  CONFIG_PRINTK_TIME=y
> diff --git a/arch/powerpc/configs/ppc64_defconfig 
> b/arch/powerpc/configs/ppc64_defconfig
> index 7e68cb222c7b..4599fc7be285 100644
> --- a/arch/powerpc/configs/ppc64_defconfig
> +++ b/arch/powerpc/configs/ppc64_defconfig
> @@ -61,6 +61,7 @@ CONFIG_ELECTRA_CF=y
>  CONFIG_VIRTUALIZATION=y
>  CONFIG_KVM_BOOK3S_64=m
>  CONFIG_KVM_BOOK3S_64_HV=m
> +CONFIG_VHOST=m
>  CONFIG_VHOST_NET=m
>  CONFIG_OPROFILE=m
>  CONFIG_KPROBES=y
> diff --git a/arch/powerpc/configs/pseries_defconfig 
> b/arch/powerpc/configs/pseries_defconfig
> index 6b68109e248f..4cad3901b5de 100644
> --- a/arch/powerpc/configs/pseries_defconfig
> +++ b/arch/powerpc/configs/pseries_defconfig
> @@ -321,5 +321,6 @@ CONFIG_CRYPTO_DEV_VMX=y
>  CONFIG_VIRTUALIZATION=y
>  CONFIG_KVM_BOOK3S_64=m
>  CONFIG_KVM_BOOK3S_64_HV=m
> +CONFIG_VHOST=m
>  CONFIG_VHOST_NET=m
>  CONFIG_PRINTK_TIME=y
> diff --git a/arch/s390/configs/debug_defconfig 
> b/arch/s390/configs/debug_defconfig
> index 0c86ba19fa2b..6ec6e69630d1 100644
> --- a/arch/s390/configs/debug_defconfig
> +++ b/arch/s390/configs/debug_defconfig
> @@ -57,6 +57,7 @@ CONFIG_PROTECTED_VIRTUALIZATION_GUEST=y
>  CONFIG_CMM=m
>  CONFIG_APPLDATA_BASE=y
>  CONFIG_KVM=m
> +CONFIG_VHOST=m
>  CONFIG_VHOST_NET=m
>  CONFIG_VHOST_VSOCK=m
>  CONFIG_OPROFILE=m
> diff --git a/arch/s390/configs/defconfig b/arch/s390/configs/defconfig
> index 6b27d861a9a3..d1b3bf83d687 100644
> --- a/arch/s390/configs/defconfig
> +++ b/arch/s390/configs/defconfig
> @@ -57,6 +57,7 @@ CONFIG_PROTECTED_VIRTUALIZATION_GUEST=y
>  CONFIG_CMM=m
>  CONFIG_APPLDATA_BASE=y
>  CONFIG_KVM=m
> +CONFIG_VHOST=m
>  CONFIG_VHOST_NET=m
>  CONFIG_VHOST_VSOCK=m
>  CONFIG_OPROFILE=m
> diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
> index e79cbbdfea45..14d296dc18cd 100644
> --- a/drivers/vhost/Kconfig
> +++ b/drivers/vhost/Kconfig
> @@ -12,23 +12,18 @@ config VHOST_RING
> This option is selected by any driver which needs to access
> the host side of a virtio ring.
>  
> -config VHOST
> - tristate
> +menuconfig VHOST
> + tristate "Vhost Devices"
>   select VHOST_IOTLB
>   help
> -   This option is selected by any driver which needs to access
> -   the core of vhost.
> -
> -menuconfig VHOST_MENU
> - bool "VHOST drivers"
> - default y
> +   Enable option to

Re: [RESEND PATCH v2 1/9] iomap: Constify ioreadX() iomem argument (as in generic implementation)

2020-03-12 Thread Michael Ellerman

Krzysztof Kozlowski  writes:
> diff --git a/arch/powerpc/kernel/iomap.c b/arch/powerpc/kernel/iomap.c
> index 5ac84efc6ede..9fe4fb3b08aa 100644
> --- a/arch/powerpc/kernel/iomap.c
> +++ b/arch/powerpc/kernel/iomap.c
> @@ -15,23 +15,23 @@
>   * Here comes the ppc64 implementation of the IOMAP 
>   * interfaces.
>   */
> -unsigned int ioread8(void __iomem *addr)
> +unsigned int ioread8(const void __iomem *addr)
>  {
>   return readb(addr);
>  }
> -unsigned int ioread16(void __iomem *addr)
> +unsigned int ioread16(const void __iomem *addr)
>  {
>   return readw(addr);
>  }
> -unsigned int ioread16be(void __iomem *addr)
> +unsigned int ioread16be(const void __iomem *addr)
>  {
>   return readw_be(addr);
>  }
> -unsigned int ioread32(void __iomem *addr)
> +unsigned int ioread32(const void __iomem *addr)
>  {
>   return readl(addr);
>  }
> -unsigned int ioread32be(void __iomem *addr)
> +unsigned int ioread32be(const void __iomem *addr)
>  {
>   return readl_be(addr);
>  }
> @@ -41,27 +41,27 @@ EXPORT_SYMBOL(ioread16be);
>  EXPORT_SYMBOL(ioread32);
>  EXPORT_SYMBOL(ioread32be);
>  #ifdef __powerpc64__
> -u64 ioread64(void __iomem *addr)
> +u64 ioread64(const void __iomem *addr)
>  {
>   return readq(addr);
>  }
> -u64 ioread64_lo_hi(void __iomem *addr)
> +u64 ioread64_lo_hi(const void __iomem *addr)
>  {
>   return readq(addr);
>  }
> -u64 ioread64_hi_lo(void __iomem *addr)
> +u64 ioread64_hi_lo(const void __iomem *addr)
>  {
>   return readq(addr);
>  }
> -u64 ioread64be(void __iomem *addr)
> +u64 ioread64be(const void __iomem *addr)
>  {
>   return readq_be(addr);
>  }
> -u64 ioread64be_lo_hi(void __iomem *addr)
> +u64 ioread64be_lo_hi(const void __iomem *addr)
>  {
>   return readq_be(addr);
>  }
> -u64 ioread64be_hi_lo(void __iomem *addr)
> +u64 ioread64be_hi_lo(const void __iomem *addr)
>  {
>   return readq_be(addr);
>  }
> @@ -139,15 +139,15 @@ EXPORT_SYMBOL(iowrite64be_hi_lo);
>   * FIXME! We could make these do EEH handling if we really
>   * wanted. Not clear if we do.
>   */
> -void ioread8_rep(void __iomem *addr, void *dst, unsigned long count)
> +void ioread8_rep(const void __iomem *addr, void *dst, unsigned long count)
>  {
>   readsb(addr, dst, count);
>  }
> -void ioread16_rep(void __iomem *addr, void *dst, unsigned long count)
> +void ioread16_rep(const void __iomem *addr, void *dst, unsigned long count)
>  {
>   readsw(addr, dst, count);
>  }
> -void ioread32_rep(void __iomem *addr, void *dst, unsigned long count)
> +void ioread32_rep(const void __iomem *addr, void *dst, unsigned long count)
>  {
>   readsl(addr, dst, count);
>  }

This looks OK to me.

Acked-by: Michael Ellerman  (powerpc)

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] headers: untangle kmemleak.h from mm.h

2018-02-13 Thread Michael Ellerman

Randy Dunlap <rdun...@infradead.org> writes:

> On 02/12/2018 04:28 AM, Michael Ellerman wrote:
>> Randy Dunlap <rdun...@infradead.org> writes:
>> 
>>> From: Randy Dunlap <rdun...@infradead.org>
>>>
>>> Currently  #includes  for no obvious
>>> reason. It looks like it's only a convenience, so remove kmemleak.h
>>> from slab.h and add  to any users of kmemleak_*
>>> that don't already #include it.
>>> Also remove  from source files that do not use it.
>>>
>>> This is tested on i386 allmodconfig and x86_64 allmodconfig. It
>>> would be good to run it through the 0day bot for other $ARCHes.
>>> I have neither the horsepower nor the storage space for the other
>>> $ARCHes.
>>>
>>> [slab.h is the second most used header file after module.h; kernel.h
>>> is right there with slab.h. There could be some minor error in the
>>> counting due to some #includes having comments after them and I
>>> didn't combine all of those.]
>>>
>>> This is Lingchi patch #1 (death by a thousand cuts, applied to kernel
>>> header files).
>>>
>>> Signed-off-by: Randy Dunlap <rdun...@infradead.org>
>> 
>> I threw it at a random selection of configs and so far the only failures
>> I'm seeing are:
>> 
>>   lib/test_firmware.c:134:2: error: implicit declaration of function 'vfree' 
>> [-Werror=implicit-function-declaration]  
>> 
>>   lib/test_firmware.c:620:25: error: implicit declaration of function 
>> 'vzalloc' [-Werror=implicit-function-declaration]
>>   lib/test_firmware.c:620:2: error: implicit declaration of function 
>> 'vzalloc' [-Werror=implicit-function-declaration]
>>   security/integrity/digsig.c:146:2: error: implicit declaration of function 
>> 'vfree' [-Werror=implicit-function-declaration]
>
> Both of those source files need to #include .

Yep, I added those and rebuilt. I don't see any more failures that look
related to your patch.

  http://kisskb.ellerman.id.au/kisskb/head/13399/


I haven't gone through the defconfigs I have enabled for a while, so
it's possible I have some missing but it's still a reasonable cross
section.

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] headers: untangle kmemleak.h from mm.h

2018-02-12 Thread Michael Ellerman

Randy Dunlap  writes:

> From: Randy Dunlap 
>
> Currently  #includes  for no obvious
> reason. It looks like it's only a convenience, so remove kmemleak.h
> from slab.h and add  to any users of kmemleak_*
> that don't already #include it.
> Also remove  from source files that do not use it.
>
> This is tested on i386 allmodconfig and x86_64 allmodconfig. It
> would be good to run it through the 0day bot for other $ARCHes.
> I have neither the horsepower nor the storage space for the other
> $ARCHes.
>
> [slab.h is the second most used header file after module.h; kernel.h
> is right there with slab.h. There could be some minor error in the
> counting due to some #includes having comments after them and I
> didn't combine all of those.]
>
> This is Lingchi patch #1 (death by a thousand cuts, applied to kernel
> header files).
>
> Signed-off-by: Randy Dunlap 

I threw it at a random selection of configs and so far the only failures
I'm seeing are:

  lib/test_firmware.c:134:2: error: implicit declaration of function 'vfree' 
[-Werror=implicit-function-declaration] 
 
  lib/test_firmware.c:620:25: error: implicit declaration of function 'vzalloc' 
[-Werror=implicit-function-declaration]
  lib/test_firmware.c:620:2: error: implicit declaration of function 'vzalloc' 
[-Werror=implicit-function-declaration]
  security/integrity/digsig.c:146:2: error: implicit declaration of function 
'vfree' [-Werror=implicit-function-declaration]

Full results trickling in here, not all the failures there are caused by
this patch, ie. some configs are broken in mainline:

  http://kisskb.ellerman.id.au/kisskb/head/13396/

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: Ping Re: [PATCH] virtio: make VIRTIO a menuconfig to ease disabling it all

2018-01-28 Thread Michael Ellerman

Vincent Legoll <vincent.leg...@gmail.com> writes:
> On 1/23/18, Michael Ellerman <m...@ellerman.id.au> wrote:
>> This has been broken in linux-next for ~6 weeks now, can we please merge
>> this and get it fixed.
>
> Added Stephen Rothwell to cc

It should be fixed in the virtio tree, which the other Michael and Jason
maintain AFAIK.

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Ping Re: [PATCH] virtio: make VIRTIO a menuconfig to ease disabling it all

2018-01-22 Thread Michael Ellerman

Vincent Legoll  writes:

> No need to get into the submenu to disable all VIRTIO-related
> config entries.
>
> This makes it easier to disable all VIRTIO config options
> without entering the submenu. It will also enable one
> to see that en/dis-abled state from the outside menu.
>
> This is only intended to change menuconfig UI, not change
> the config dependencies.
>
> v2: Added "default y" to avoid breaking existing configs
> v3: Fixed wrong indentation, added *-by from Randy
>
> Signed-off-by: Vincent Legoll 
> Reviewed-by: Randy Dunlap 
> Tested-by: Randy Dunlap  # works for me
> ---
>  drivers/virtio/Kconfig | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)

This has been broken in linux-next for ~6 weeks now, can we please merge
this and get it fixed.

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] virtio: make VIRTIO a menuconfig to ease disabling it all

2018-01-02 Thread Michael Ellerman

Michael Ellerman <m...@ellerman.id.au> writes:

> Vincent Legoll <vincent.leg...@gmail.com> writes:
>
>> Hello,
>>
>> thanks for the help, and sorry for the poor patch,
>>
>> On Thu, Dec 21, 2017 at 11:53 AM, Michael Ellerman <m...@ellerman.id.au> 
>> wrote:
>>> This breaks all existing .configs *and* defconfigs that use VIRTIO.
>>>
>>> Please don't do that.
>>>
>>> If you make it default y then everything will continue to work.
>>
>> Do you want me to respin it with the added default ?
>
> Yes please, either that or revert it.

This still seems to be broken.

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] virtio: make VIRTIO a menuconfig to ease disabling it all

2017-12-21 Thread Michael Ellerman

Vincent Legoll <vincent.leg...@gmail.com> writes:

> Hello,
>
> thanks for the help, and sorry for the poor patch,
>
> On Thu, Dec 21, 2017 at 11:53 AM, Michael Ellerman <m...@ellerman.id.au> 
> wrote:
>> This breaks all existing .configs *and* defconfigs that use VIRTIO.
>>
>> Please don't do that.
>>
>> If you make it default y then everything will continue to work.
>
> Do you want me to respin it with the added default ?

Yes please, either that or revert it.

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] virtio: make VIRTIO a menuconfig to ease disabling it all

2017-12-21 Thread Michael Ellerman

Vincent Legoll  writes:

> No need to get into the submenu to disable all VIRTIO-related
> config entries.
>
> This makes it easier to disable all VIRTIO config options
> without entering the submenu. It will also enable one
> to see that en/dis-abled state from the outside menu.
>
> This is only intended to change menuconfig UI, not change
> the config dependencies.
>
> Signed-off-by: Vincent Legoll 
> ---
>  drivers/virtio/Kconfig | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
> index cff773f15b7e..d485a63a8233 100644
> --- a/drivers/virtio/Kconfig
> +++ b/drivers/virtio/Kconfig
> @@ -5,7 +5,10 @@ config VIRTIO
> bus, such as CONFIG_VIRTIO_PCI, CONFIG_VIRTIO_MMIO, CONFIG_RPMSG
> or CONFIG_S390_GUEST.
>  
> -menu "Virtio drivers"
> +menuconfig VIRTIO_MENU
> + bool "Virtio drivers"
> +
> +if VIRTIO_MENU

This breaks all existing .configs *and* defconfigs that use VIRTIO.

Please don't do that.

If you make it default y then everything will continue to work.

cheers


diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index d485a63a8233..35897649c24f 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -7,6 +7,7 @@ config VIRTIO
 
 menuconfig VIRTIO_MENU
bool "Virtio drivers"
+   default y
 
 if VIRTIO_MENU
 
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: Memory corruption in powerpc guests with virtio_balloon (was Re: [PATCH v3] virtio_balloon: fix deadlock on OOM)

2017-12-03 Thread Michael Ellerman

"Michael S. Tsirkin" <m...@redhat.com> writes:
> On Fri, Dec 01, 2017 at 11:31:08PM +1100, Michael Ellerman wrote:
>> "Michael S. Tsirkin" <m...@redhat.com> writes:
>> 
>> > fill_balloon doing memory allocations under balloon_lock
>> > can cause a deadlock when leak_balloon is called from
>> > virtballoon_oom_notify and tries to take same lock.
...
>> 
>> 
>> Somehow this commit seems to be killing powerpc guests.
...
>
> Thanks for the report!
> A fix was just posted:
> virtio_balloon: fix increment of vb->num_pfns in fill_balloon()
>
> Would appreciate testing.

Yep that fixes it. Thanks.

Tested-by: Michael Ellerman <m...@ellerman.id.au>

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Memory corruption in powerpc guests with virtio_balloon (was Re: [PATCH v3] virtio_balloon: fix deadlock on OOM)

2017-12-01 Thread Michael Ellerman

"Michael S. Tsirkin"  writes:

> fill_balloon doing memory allocations under balloon_lock
> can cause a deadlock when leak_balloon is called from
> virtballoon_oom_notify and tries to take same lock.
>
> To fix, split page allocation and enqueue and do allocations outside the lock.
>
> Here's a detailed analysis of the deadlock by Tetsuo Handa:
>
> In leak_balloon(), mutex_lock(>balloon_lock) is called in order to
> serialize against fill_balloon(). But in fill_balloon(),
> alloc_page(GFP_HIGHUSER[_MOVABLE] | __GFP_NOMEMALLOC | __GFP_NORETRY) is
> called with vb->balloon_lock mutex held. Since GFP_HIGHUSER[_MOVABLE]
> implies __GFP_DIRECT_RECLAIM | __GFP_IO | __GFP_FS, despite __GFP_NORETRY
> is specified, this allocation attempt might indirectly depend on somebody
> else's __GFP_DIRECT_RECLAIM memory allocation. And such indirect
> __GFP_DIRECT_RECLAIM memory allocation might call leak_balloon() via
> virtballoon_oom_notify() via blocking_notifier_call_chain() callback via
> out_of_memory() when it reached __alloc_pages_may_oom() and held oom_lock
> mutex. Since vb->balloon_lock mutex is already held by fill_balloon(), it
> will cause OOM lockup.
>
>   Thread1   Thread2
> fill_balloon()
>   takes a balloon_lock
>   balloon_page_enqueue()
> alloc_page(GFP_HIGHUSER_MOVABLE)
>   direct reclaim (__GFP_FS context)   takes a fs lock
> waits for that fs lock  alloc_page(GFP_NOFS)
>   __alloc_pages_may_oom()
> takes the oom_lock
> out_of_memory()
>   
> blocking_notifier_call_chain()
> leak_balloon()
>   tries to take 
> that balloon_lock and deadlocks
>
> Reported-by: Tetsuo Handa 
> Cc: Michal Hocko 
> Cc: Wei Wang 
> Signed-off-by: Michael S. Tsirkin 
>
>  drivers/virtio/virtio_balloon.c| 24 +++-
>  include/linux/balloon_compaction.h | 35 ++-
>  mm/balloon_compaction.c| 28 +---
>  3 files changed, 74 insertions(+), 13 deletions(-)

Somehow this commit seems to be killing powerpc guests.

The symptom is that the first page (64K) of the guests memory gets over
written with zeroes, which is where our interrupt handlers are, so the
system rapidly locks up due to illegal instructions in the illegal
instruction handler.

There seems to be some element of a race, because it doesn't always
crash. Sometimes I can boot to a shell, but not often. When it does
happen it's fairly late in boot, but before I get to a shell.

I had a few bisects go off into the weeds due to the intermittent
nature. But once I realised that I changed my script to boot 5 times
before declaring a kernel good, and that bisected straight here.

I can also revert this commit on v4.15-rc1 and everything's fine again -
I got through ~250 boots with that kernel.

So I'm pretty sure this commit is triggering/exposing/causing the bug.

The other data point is that the page that's overwritten is mapped read
only in the guest kernel. So either the guest kernel is writing to it in
real mode (MMU off), or the hypervisor/DMA is doing it.

I haven't isolated if it's host kernel/qemu version dependent, at the
moment I'm just using distro packaged versions of both.

Anyway I'll try and dig further into it on Monday, but I thought I'd let
you know in case this is a known bug with a fix in the pipeline, or
rings any bells or whatever.

cheers

> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index f0b3a0b..7960746 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -143,16 +143,17 @@ static void set_page_pfns(struct virtio_balloon *vb,
>  
>  static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
>  {
> - struct balloon_dev_info *vb_dev_info = >vb_dev_info;
>   unsigned num_allocated_pages;
> + unsigned num_pfns;
> + struct page *page;
> + LIST_HEAD(pages);
>  
>   /* We can only do one array worth at a time. */
>   num = min(num, ARRAY_SIZE(vb->pfns));
>  
> - mutex_lock(>balloon_lock);
> - for (vb->num_pfns = 0; vb->num_pfns < num;
> -  vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
> - struct page *page = balloon_page_enqueue(vb_dev_info);
> + for (num_pfns = 0; num_pfns < num;
> +  num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
> + struct page *page = balloon_page_alloc();
>  
>   if (!page) {
>   dev_info_ratelimited(>vdev->dev,
> @@

Re: [PATCH v7 6/6] powerpc: pSeries: Add pv-qspinlock build config/make

2016-09-25 Thread Michael Ellerman

xinhui  writes:

> hi, all
>   ok, this patch set depends on
> https://patchwork.kernel.org/patch/8953981/ [V4] powerpc: Implement {cmp}xchg 
> for u8 and u16

AKA: https://patchwork.ozlabs.org/patch/615480/

Sorry I saw the discussion on that and thought there'd be a new version.
But now I read the whole thread it looks like it was OK in the end.

I'll try and get it merged.

cheers
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v3 3/3] Use separate struct console structure for each hvc_console.

2011-11-09 Thread Michael Ellerman

On Tue, 2011-11-08 at 13:45 -0800, Miche Baker-Harvey wrote:
 It is possible to make any virtio_console port be a console
 by sending VIRITO_CONSOLE_CONSOLE_PORT.  But hvc_alloc was
 using a single struct console hvc_console, which contains
 both an index and flags which are per-port.
 
 This adds a separate struct console for each virtio_console
 that is CONSOLE_PORT.

Hi Miche,

I'm testing this on powerpc and unfortunately it's working a little _too
well_. I end up with two struct consoles registered and so I get every
line of output twice :)

The problem is that we're registering two struct consoles. The first
obviously is hvc_console, either in hvc_console_init(), or in my case
from hvc_instantiate().

Then we register the allocated one in hvc_alloc(). But because they both
point back to the same hardware you get duplicate output.

We _do_ want to register a console early, in either/both
hvc_console_init() and hvc_instantiate(), because we want to have
console during boot prior to when hvc_alloc() gets called.

I think maybe we should be checking in hvc_alloc() whether we already
have hvc_console associated with the vtermno and if so we use
hvc_console instead of allocating a new one.

Patch below to do that, and works for me, but it's a bit of a hack,
there must be a better solution.

Finally I'm not sure how your patch affects the code in hvc_poll() which
checks hvc_console.index to do the SYSRQ hack.

cheers

diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index fff35da..b249195 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -815,13 +815,15 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
kref_init(hp-kref);
 
INIT_WORK(hp-tty_resize, hvc_set_winsz);
-   /*
-* Make each console its own struct console.
-*/
-   cp = kmemdup(hvc_console, sizeof(*cp), GFP_KERNEL);
-   if (!cp) {
-   kfree(hp);
-   return ERR_PTR(-ENOMEM);
+
+   if (hvc_console.index = 0  vtermnos[hvc_console.index] == 
hp-vtermno)
+   cp = hvc_console;
+   else {
+   cp = kmemdup(hvc_console, sizeof(*cp), GFP_KERNEL);
+   if (!cp) {
+   kfree(hp);
+   return ERR_PTR(-ENOMEM);
+   }
}
 
hp-hvc_console = cp;
@@ -850,7 +852,9 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
 
list_add_tail((hp-next), hvc_structs);
spin_unlock(hvc_structs_lock);
-   register_console(cp);
+
+   if (cp != hvc_console)
+   register_console(cp);
 
return hp;
 }



signature.asc
Description: This is a digitally signed message part
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: Extending virtio_console to support multiple ports

2009-08-26 Thread Michael Ellerman

On Wed, 2009-08-26 at 21:15 +0530, Amit Shah wrote:
[cc'ing some people who have made some commits in hvc_console.c]

On (Wed) Aug 26 2009 [16:57:18], Amit Shah wrote:
On (Tue) Aug 25 2009 [11:47:20], Amit Shah wrote:

Hello all,

Here is a new iteration of the patch series that implements a
transport for guest and host communications.

The code has been updated to reuse the virtio-console device instead
of creating a new virtio-serial device.

And the problem now is that hvc calls the put_chars function with
spinlocks held and we now allocate pages in send_buf(), called from
put_chars.

A few solutions:

[snip]

- Convert hvc's usage of spinlocks to mutexes. I've no idea how this
will play out; I'm no expert here. But I did try doing this and so far
it all looks OK. No lockups, lockdep warnings, nothing. I have full
debugging enabled. But this doesn't mean it's right.

So just to test this further I added the capability to have more than
one hvc console spawn from virtio_console, created two consoles and did
a 'cat' of a file in each of the virtio-consoles. It's been running for
half an hour now without any badness. No spew in debug logs too.

I also checked the code in hvc_console.c that takes the spin_locks.
Nothing there that runs from (or needs to run from) interrupt context.
So the change to mutexes does seem reasonable. Also, the spinlock code
was added really long back -- git blame shows Linus' first git commit
introduced them in the git history, so it's pure legacy baggage.

I won't tell Ryan you called his code pure legacy baggage if you
don't ;)

http://git.kernel.org/?p=linux/kernel/git/torvalds/old-2.6-bkcvs.git;a=commitdiff;h=d450b4ae023fb4be175389c18f4f87677da03020

cheers

signature.asc
Description: This is a digitally signed message part
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization

Re: [PATCH] msi-x: let drivers retry when not enough vectors

2009-05-07 Thread Michael Ellerman

On Fri, 2009-05-08 at 09:25 +0930, Rusty Russell wrote:
 On Thu, 7 May 2009 07:49:53 pm Sheng Yang wrote:
  On Thursday 07 May 2009 17:53:02 Matthew Wilcox wrote:
   Here's a good example.  Let's suppose you have a driver which supports
   two different models of cards, one has 16 MSI-X interrupts, the other
   has 10.  You can call pci_enable_msix() asking for 16 vectors.  If your
   card is model A, you get 16 interrupts.  If your card is model B, it says
   you can have 10.
 
 Sheng is absolutely right, that's a horrid API.

It's not horrid, though it is tricky - but only for drivers that care,
you can still do:

if (pci_enable_msix(..))
bail;

 If it actually enabled that number and returned it, it might make sense (cf. 
 write() returning less bytes than you give it).  

It could do that, but I think that would be worse. The driver, on
finding out it can only get a certain number of MSIs might need to make
a decision about how it allocates those - eg. in a network driver,
sharing them between TX/RX/management.

And in practice most of the drivers just bail if they can't get what
they asked for, so enabling less than they wanted would just mean they
have to go and disable them.

 But overloading the return 
 value to save an explicit call is just ugly; it's not worth saving a few 
 lines 
 of code at cost of making all the drivers subtle and tricksy.

Looking at just this patch, I would agree, but unfortunately it's not
that simple. The first limitation on the number of MSIs the driver can
have is the number the device supports, that's what this patch does. But
there are others, and they come out of the arch code, or even the
firmware. So to implement pci_how_many_msis(), we'd need a parallel API
all the way down to the arch code, or a flag to all the existing
routines saying don't really allocate, just find out. That would be
horrid IMHO.

cheers


signature.asc
Description: This is a digitally signed message part
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization

40 matches

Mail list logo