Re: [PATCH v2] virtio_pci: Switch away from deprecated irq_set_affinity_hint

2023-10-27 Thread Jakub Sitnicki via Virtualization
On Thu, Oct 26, 2023 at 01:20 PM -04, Michael S. Tsirkin wrote:
> On Thu, Oct 26, 2023 at 06:25:08PM +0200, Jakub Sitnicki wrote:
>> On Wed, Oct 25, 2023 at 04:53 PM +02, Jakub Sitnicki wrote:
>> > Since commit 65c7cdedeb30 ("genirq: Provide new interfaces for affinity
>> > hints") irq_set_affinity_hint is being phased out.
>> >
>> > Switch to new interfaces for setting and applying irq affinity hints.
>> >
>> > Signed-off-by: Jakub Sitnicki 
>> > ---
>> > v2:
>> >  - Leave cpumask_copy as is. We can't pass pointer to stack memory as hint.
>> >Proposed a change to IRQ affinity interface to address this limitation:
>> >https://lore.kernel.org/r/20231025141517.375378-1-ja...@cloudflare.com
>> 
>> Just a note to the ^ - if we wanted to get rid of msix_affinity_masks,
>> we could call irq_set_affinity directly, instead of calling it through
>> irq_set_affinity[_and]_hint.
>> 
>> The hint wouldn't be available any more in /proc/irq/N/affinity_hint,
>> but the same information can be gathered from /proc/irq/N/smp_affinity.
>> 
>> [...]
>
>
> So we are potentially breaking some userspace - what's the value we
> gain?  Is there some way we can make disable_irq/enable_irq work?
> That would have a lot of value.
> There is an actual need for that in virtio for coco but we can't use
> these APIs with affinity managed IRQs.

Sorry, that is beyond my ken today.

I saw the comment in vp_modern_disable_vq_and_reset:

/* For the case where vq has an exclusive irq, call synchronize_irq() to
 * wait for completion.
 *
 * note: We can't use disable_irq() since it conflicts with the affinity
 * managed IRQ that is used by some drivers.
 */

... but I fail to follow how the two conflict.

Perhaps Xuan could shed some light here.


Re: [PATCH v2] virtio_pci: Switch away from deprecated irq_set_affinity_hint

2023-10-26 Thread Jakub Sitnicki via Virtualization
On Wed, Oct 25, 2023 at 04:53 PM +02, Jakub Sitnicki wrote:
> Since commit 65c7cdedeb30 ("genirq: Provide new interfaces for affinity
> hints") irq_set_affinity_hint is being phased out.
>
> Switch to new interfaces for setting and applying irq affinity hints.
>
> Signed-off-by: Jakub Sitnicki 
> ---
> v2:
>  - Leave cpumask_copy as is. We can't pass pointer to stack memory as hint.
>Proposed a change to IRQ affinity interface to address this limitation:
>https://lore.kernel.org/r/20231025141517.375378-1-ja...@cloudflare.com

Just a note to the ^ - if we wanted to get rid of msix_affinity_masks,
we could call irq_set_affinity directly, instead of calling it through
irq_set_affinity[_and]_hint.

The hint wouldn't be available any more in /proc/irq/N/affinity_hint,
but the same information can be gathered from /proc/irq/N/smp_affinity.
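
For illustration, a minimal sketch of that variant - assuming
msix_affinity_masks is gone; this is not a posted patch, just the shape
vp_set_vq_affinity() could take with a direct irq_set_affinity() call:

	int vp_set_vq_affinity(struct virtqueue *vq, const struct cpumask *cpu_mask)
	{
		struct virtio_pci_device *vp_dev = to_vp_device(vq->vdev);
		struct virtio_pci_vq_info *info = vp_dev->vqs[vq->index];
		int irq;

		if (!vq->callback)
			return -EINVAL;

		if (vp_dev->msix_enabled) {
			irq = pci_irq_vector(vp_dev->pci_dev, info->msix_vector);
			/* irq_set_affinity() copies the mask into irq_desc, so
			 * no long-lived copy is needed on our side. Fall back
			 * to all online CPUs when the caller passes NULL. */
			return irq_set_affinity(irq, cpu_mask ?: cpu_online_mask);
		}
		return 0;
	}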

[...]


[PATCH v2] virtio_pci: Switch away from deprecated irq_set_affinity_hint

2023-10-25 Thread Jakub Sitnicki via Virtualization
Since commit 65c7cdedeb30 ("genirq: Provide new interfaces for affinity
hints") irq_set_affinity_hint is being phased out.

Switch to new interfaces for setting and applying irq affinity hints.

Signed-off-by: Jakub Sitnicki 
---
v2:
 - Leave cpumask_copy as is. We can't pass pointer to stack memory as hint.
   Proposed a change to IRQ affinity interface to address this limitation:
   https://lore.kernel.org/r/20231025141517.375378-1-ja...@cloudflare.com

v1: https://lore.kernel.org/r/20231019101625.412936-2-ja...@cloudflare.com
---
 drivers/virtio/virtio_pci_common.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index c2524a7207cf..7a5593997e0e 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -242,7 +242,7 @@ void vp_del_vqs(struct virtio_device *vdev)
if (v != VIRTIO_MSI_NO_VECTOR) {
int irq = pci_irq_vector(vp_dev->pci_dev, v);
 
-   irq_set_affinity_hint(irq, NULL);
+   irq_update_affinity_hint(irq, NULL);
free_irq(irq, vq);
}
}
@@ -443,10 +443,10 @@ int vp_set_vq_affinity(struct virtqueue *vq, const struct cpumask *cpu_mask)
mask = vp_dev->msix_affinity_masks[info->msix_vector];
irq = pci_irq_vector(vp_dev->pci_dev, info->msix_vector);
if (!cpu_mask)
-   irq_set_affinity_hint(irq, NULL);
+   irq_update_affinity_hint(irq, NULL);
else {
cpumask_copy(mask, cpu_mask);
-   irq_set_affinity_hint(irq, mask);
+   irq_set_affinity_and_hint(irq, mask);
}
}
return 0;
-- 
2.41.0
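
For reference, the semantics of the two replacement interfaces, as
introduced by commit 65c7cdedeb30 and declared in <linux/interrupt.h>:

	/* Updates only the stored affinity hint; does not touch the affinity. */
	int irq_update_affinity_hint(unsigned int irq, const struct cpumask *m);

	/* Stores the hint and also applies it as the interrupt's affinity. */
	int irq_set_affinity_and_hint(unsigned int irq, const struct cpumask *m);

Hence the patch clears with irq_update_affinity_hint(irq, NULL) but sets
with irq_set_affinity_and_hint(irq, mask).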



Re: [PATCH 1/2] virtio_pci: Don't make an extra copy of cpu affinity mask

2023-10-24 Thread Jakub Sitnicki via Virtualization
On Tue, Oct 24, 2023 at 07:46 PM +08, Xuan Zhuo wrote:
> On Tue, 24 Oct 2023 13:26:49 +0200, Jakub Sitnicki wrote:
>> On Tue, Oct 24, 2023 at 06:53 PM +08, Xuan Zhuo wrote:
>> > On Tue, 24 Oct 2023 10:17:19 +0200, Jakub Sitnicki wrote:
>> >> On Tue, Oct 24, 2023 at 10:31 AM +08, Xuan Zhuo wrote:
>> >> > On Mon, 23 Oct 2023 18:52:45 +0200, Jakub Sitnicki wrote:
>> >> >> On Thu, Oct 19, 2023 at 08:55 PM +08, Xuan Zhuo wrote:
>> >> >> > On Thu, 19 Oct 2023 12:16:24 +0200, Jakub Sitnicki wrote:
>> >> >> >> Since commit 19e226e8cc5d ("virtio: Make vp_set_vq_affinity() take a
>> >> >> >> mask.") it is actually not needed to have a local copy of the cpu 
>> >> >> >> mask.
>> >> >> >
>> >> >> >
>> >> >> > Could you give more info to prove this?
>> >> >
>> >> >
>> >> > Actually, my question is: can we pass a value on the stack (or a
>> >> > temporary value) to irq_set_affinity_hint()?
>> >> >
>> >> > For example, virtio-net uses zalloc_cpumask_var() to allocate a
>> >> > cpu_mask, which will later be released.
>> >> >
>> >> >
>> >> >
>> >> > int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
>> >> >                               bool setaffinity)
>> >> > {
>> >> >         unsigned long flags;
>> >> >         struct irq_desc *desc = irq_get_desc_lock(irq, &flags, IRQ_GET_DESC_CHECK_GLOBAL);
>> >> >
>> >> >         if (!desc)
>> >> >                 return -EINVAL;
>> >> > ->      desc->affinity_hint = m;
>> >> >         irq_put_desc_unlock(desc, flags);
>> >> >         if (m && setaffinity)
>> >> >                 __irq_set_affinity(irq, m, false);
>> >> >         return 0;
>> >> > }
>> >> > EXPORT_SYMBOL_GPL(__irq_apply_affinity_hint);
>> >> >
>> >> > The above code stores the mask pointer directly. If the mask is a
>> >> > temporary value, I think that is a bug.
>> >>
>> >> You are completely right. irq_set_affinity_hint stores the mask pointer.
>> >> irq_affinity_hint_proc_show() later dereferences it when userspace reads
>> >> /proc/irq/*/affinity_hint.
>> >>
>> >> I had failed to notice that. That's why cpumask_copy needs to stay.
>> >>
>> >> My patch is buggy. Please disregard.
>> >>
>> >> I will send a v2 to only migrate from deprecated irq_set_affinity_hint.
>> >>
>> >> > And I notice that many places directly pass a temporary value to this API.
>> >> > I am a little confused. ^_^ Or maybe I missed something.
>> >>
>> >> There seem to be two groups of callers:
>> >>
>> >> 1. Those that use get_cpu_mask/cpumask_of/cpumask_of_node to produce a
>> >>cpumask pointer which is a preallocated constant.
>> >>
>> >>$ weggli 'irq_set_affinity_hint(_, $func(_));' ~/src/linux
>> >>
>> >> 2. Those that pass a pointer to memory somewhere.
>> >>
>> >>$ weggli 'irq_set_affinity_hint(_, $mask);' ~/src/linux
>> >>
>> >> (weggli tool can be found at https://github.com/weggli-rs/weggli)
>> >>
>> >> I've looked over the callers from group #2, but I couldn't find any
>> >> passing a pointer to memory on the stack :-)
>> >
>> > Please check stmmac_request_irq_multi_msi().
>>
>> Good catch. That one looks buggy.
>>
>> I should have also checked for callers that take the address of a var/field:
>>
>>   $ weggli 'irq_set_affinity_hint(_, &$mask);' ~/src/linux
>
> Did you find any more?

No, just the one you pointed out. Unless I missed something.

I ran an improved query. It shows everything except the non-interesting cases:

$ weggli '{
NOT: irq_set_affinity_hint(_, NULL);
NOT: irq_set_affinity_hint(_, get_cpu_mask(_));
NOT: irq_set_affinity_hint(_, cpumask_of(_));
irq_set_affinity_hint(_, _);
}' ~/src/linux

And repeated it for irq_set_affinity_and_hint and irq_update_affinity_hint.

The calls where it was not obvious at first sight that we are passing a
pointer to heap memory turned out to use a temporary variable holding
either the address of heap memory or the return value of cpumask_of*().



Re: [PATCH 1/2] virtio_pci: Don't make an extra copy of cpu affinity mask

2023-10-24 Thread Jakub Sitnicki via Virtualization
On Tue, Oct 24, 2023 at 06:53 PM +08, Xuan Zhuo wrote:
> On Tue, 24 Oct 2023 10:17:19 +0200, Jakub Sitnicki wrote:
>> On Tue, Oct 24, 2023 at 10:31 AM +08, Xuan Zhuo wrote:
>> > On Mon, 23 Oct 2023 18:52:45 +0200, Jakub Sitnicki wrote:
>> >> On Thu, Oct 19, 2023 at 08:55 PM +08, Xuan Zhuo wrote:
>> >> > On Thu, 19 Oct 2023 12:16:24 +0200, Jakub Sitnicki wrote:
>> >> >> Since commit 19e226e8cc5d ("virtio: Make vp_set_vq_affinity() take a
>> >> >> mask.") it is actually not needed to have a local copy of the cpu mask.
>> >> >
>> >> >
>> >> > Could you give more info to prove this?
>> >
>> >
>> > Actually, my question is: can we pass a value on the stack (or a
>> > temporary value) to irq_set_affinity_hint()?
>> >
>> > For example, virtio-net uses zalloc_cpumask_var() to allocate a
>> > cpu_mask, which will later be released.
>> >
>> >
>> >
>> >    int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
>> >                                  bool setaffinity)
>> >    {
>> >            unsigned long flags;
>> >            struct irq_desc *desc = irq_get_desc_lock(irq, &flags, IRQ_GET_DESC_CHECK_GLOBAL);
>> >
>> >            if (!desc)
>> >                    return -EINVAL;
>> > ->         desc->affinity_hint = m;
>> >            irq_put_desc_unlock(desc, flags);
>> >            if (m && setaffinity)
>> >                    __irq_set_affinity(irq, m, false);
>> >            return 0;
>> >    }
>> >    EXPORT_SYMBOL_GPL(__irq_apply_affinity_hint);
>> >
>> > The above code stores the mask pointer directly. If the mask is a
>> > temporary value, I think that is a bug.
>>
>> You are completely right. irq_set_affinity_hint stores the mask pointer.
>> irq_affinity_hint_proc_show() later dereferences it when userspace reads
>> /proc/irq/*/affinity_hint.
>>
>> I had failed to notice that. That's why cpumask_copy needs to stay.
>>
>> My patch is buggy. Please disregard.
>>
>> I will send a v2 to only migrate from deprecated irq_set_affinity_hint.
>>
>> > And I notice that many places directly pass a temporary value to this API.
>> > I am a little confused. ^_^ Or maybe I missed something.
>>
>> There seem to be two groups of callers:
>>
>> 1. Those that use get_cpu_mask/cpumask_of/cpumask_of_node to produce a
>>cpumask pointer which is a preallocated constant.
>>
>>$ weggli 'irq_set_affinity_hint(_, $func(_));' ~/src/linux
>>
>> 2. Those that pass a pointer to memory somewhere.
>>
>>$ weggli 'irq_set_affinity_hint(_, $mask);' ~/src/linux
>>
>> (weggli tool can be found at https://github.com/weggli-rs/weggli)
>>
>> I've looked over the callers from group #2, but I couldn't find any
>> passing a pointer to memory on the stack :-)
>
> Please check stmmac_request_irq_multi_msi().

Good catch. That one looks buggy.

I should have also checked for callers that take the address of a var/field:

  $ weggli 'irq_set_affinity_hint(_, &$mask);' ~/src/linux


Re: [PATCH 1/2] virtio_pci: Don't make an extra copy of cpu affinity mask

2023-10-24 Thread Jakub Sitnicki via Virtualization
On Tue, Oct 24, 2023 at 10:31 AM +08, Xuan Zhuo wrote:
> On Mon, 23 Oct 2023 18:52:45 +0200, Jakub Sitnicki wrote:
>> On Thu, Oct 19, 2023 at 08:55 PM +08, Xuan Zhuo wrote:
>> > On Thu, 19 Oct 2023 12:16:24 +0200, Jakub Sitnicki wrote:
>> >> Since commit 19e226e8cc5d ("virtio: Make vp_set_vq_affinity() take a
>> >> mask.") it is actually not needed to have a local copy of the cpu mask.
>> >
>> >
>> > Could you give more info to prove this?
>
>
> Actually, my question is: can we pass a value on the stack (or a
> temporary value) to irq_set_affinity_hint()?
>
> For example, virtio-net uses zalloc_cpumask_var() to allocate a
> cpu_mask, which will later be released.
>
>
>
>   int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
>                                 bool setaffinity)
>   {
>           unsigned long flags;
>           struct irq_desc *desc = irq_get_desc_lock(irq, &flags, IRQ_GET_DESC_CHECK_GLOBAL);
>
>           if (!desc)
>                   return -EINVAL;
> ->        desc->affinity_hint = m;
>           irq_put_desc_unlock(desc, flags);
>           if (m && setaffinity)
>                   __irq_set_affinity(irq, m, false);
>           return 0;
>   }
>   EXPORT_SYMBOL_GPL(__irq_apply_affinity_hint);
>
> The above code stores the mask pointer directly. If the mask is a
> temporary value, I think that is a bug.

You are completely right. irq_set_affinity_hint stores the mask pointer.
irq_affinity_hint_proc_show() later dereferences it when userspace reads
/proc/irq/*/affinity_hint.

I had failed to notice that. That's why cpumask_copy needs to stay.

My patch is buggy. Please disregard.

I will send a v2 to only migrate from deprecated irq_set_affinity_hint.

> And I notice that many places directly pass a temporary value to this API.
> I am a little confused. ^_^ Or maybe I missed something.

There seem to be two groups of callers:

1. Those that use get_cpu_mask/cpumask_of/cpumask_of_node to produce a
   cpumask pointer which is a preallocated constant.

   $ weggli 'irq_set_affinity_hint(_, $func(_));' ~/src/linux

2. Those that pass a pointer to memory somewhere.

   $ weggli 'irq_set_affinity_hint(_, $mask);' ~/src/linux

(weggli tool can be found at https://github.com/weggli-rs/weggli)

I've looked over the callers from group #2, but I couldn't find any
passing a pointer to memory on the stack :-)

Thanks for pointing this out.
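
To make the failure mode concrete, here is a hypothetical illustration
(struct and helper names are made up, not from any posted patch) of why
the hint must not point at stack memory - irq_set_affinity_hint() stores
the pointer, it does not copy the mask:

	struct my_vec {				/* assumed per-vector driver state */
		struct cpumask hint_mask;	/* must live as long as the IRQ */
	};

	static void set_hint_buggy(unsigned int irq, unsigned int cpu)
	{
		struct cpumask mask;		/* stack storage */

		cpumask_clear(&mask);
		cpumask_set_cpu(cpu, &mask);
		irq_set_affinity_hint(irq, &mask);
		/* After return, desc->affinity_hint points into a dead stack
		 * frame; reading /proc/irq/N/affinity_hint dereferences it. */
	}

	static void set_hint_ok(struct my_vec *vec, unsigned int irq, unsigned int cpu)
	{
		cpumask_clear(&vec->hint_mask);
		cpumask_set_cpu(cpu, &vec->hint_mask);
		irq_set_affinity_hint(irq, &vec->hint_mask);
		/* Clear with irq_set_affinity_hint(irq, NULL) before vec is freed. */
	}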

[...]


Re: [PATCH 1/2] virtio_pci: Don't make an extra copy of cpu affinity mask

2023-10-23 Thread Jakub Sitnicki via Virtualization
On Thu, Oct 19, 2023 at 08:55 PM +08, Xuan Zhuo wrote:
> On Thu, 19 Oct 2023 12:16:24 +0200, Jakub Sitnicki wrote:
>> Since commit 19e226e8cc5d ("virtio: Make vp_set_vq_affinity() take a
>> mask.") it is actually not needed to have a local copy of the cpu mask.
>
>
> Could you give more info to prove this?
>
> If you are right, I think you should delete all the code around
> msix_affinity_masks?

Sorry for the late reply. I've been away.

It looks like msix_affinity_masks became unused - intentionally - in
2015, after commit 210d150e1f5d ("virtio_pci: Clear stale cpumask when
setting irq affinity") [1].

Originally introduced in 2012 in commit 75a0a52be3c2 ("virtio: introduce
an API to set affinity for a virtqueue") [2]. As I understand it, it was
meant to make it possible to set VQ affinity to more than one CPU.

Now that we can pass a CPU mask listing any set of CPUs to
set_vq_affinity, msix_affinity_masks no longer seems to have a purpose.
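
For example (a hedged sketch; assumes a virtqueue vq is in scope), a
driver can now build a multi-CPU mask itself and hand it down, with no
transport-side copy required to outlive the call:

	cpumask_var_t mask;

	if (zalloc_cpumask_var(&mask, GFP_KERNEL)) {
		int cpu;

		for (cpu = 0; cpu < 4; cpu++)	/* spread over CPUs 0-3 */
			cpumask_set_cpu(cpu, mask);
		virtqueue_set_affinity(vq, mask);
		free_cpumask_var(mask);		/* safe: the callee copies */
	}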

So, IMO, you're right. We can remove it.

Happy to do that in a follow up series.

That is - if you're okay with these two patches in the current form.

Thanks for reviewing.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=210d150e1f5da506875e376422ba31ead2d49621
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=75a0a52be3c27b58654fbed2c8f2ff401482b9a4


[PATCH 2/2] virtio_pci: Switch away from deprecated irq_set_affinity_hint

2023-10-19 Thread Jakub Sitnicki via Virtualization
Since commit 65c7cdedeb30 ("genirq: Provide new interfaces for affinity
hints") irq_set_affinity_hint is being phased out.

Switch to new interfaces for setting and applying irq affinity hints.

Signed-off-by: Jakub Sitnicki 
---
 drivers/virtio/virtio_pci_common.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index 8927bc338f06..9fab99b540f1 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -242,7 +242,7 @@ void vp_del_vqs(struct virtio_device *vdev)
if (v != VIRTIO_MSI_NO_VECTOR) {
int irq = pci_irq_vector(vp_dev->pci_dev, v);
 
-   irq_set_affinity_hint(irq, NULL);
+   irq_update_affinity_hint(irq, NULL);
free_irq(irq, vq);
}
}
@@ -440,7 +440,7 @@ int vp_set_vq_affinity(struct virtqueue *vq, const struct cpumask *cpu_mask)
 
if (vp_dev->msix_enabled) {
irq = pci_irq_vector(vp_dev->pci_dev, info->msix_vector);
-   irq_set_affinity_hint(irq, cpu_mask);
+   irq_set_affinity_and_hint(irq, cpu_mask);
}
return 0;
 }
-- 
2.41.0



[PATCH 1/2] virtio_pci: Don't make an extra copy of cpu affinity mask

2023-10-19 Thread Jakub Sitnicki via Virtualization
Since commit 19e226e8cc5d ("virtio: Make vp_set_vq_affinity() take a
mask.") it is actually not needed to have a local copy of the cpu mask.

Pass the cpu mask we got as argument to set the irq affinity hint.

Cc: Caleb Raitto 
Signed-off-by: Jakub Sitnicki 
---
 drivers/virtio/virtio_pci_common.c | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index c2524a7207cf..8927bc338f06 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -433,21 +433,14 @@ int vp_set_vq_affinity(struct virtqueue *vq, const struct cpumask *cpu_mask)
struct virtio_device *vdev = vq->vdev;
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
struct virtio_pci_vq_info *info = vp_dev->vqs[vq->index];
-   struct cpumask *mask;
unsigned int irq;
 
if (!vq->callback)
return -EINVAL;
 
if (vp_dev->msix_enabled) {
-   mask = vp_dev->msix_affinity_masks[info->msix_vector];
irq = pci_irq_vector(vp_dev->pci_dev, info->msix_vector);
-   if (!cpu_mask)
-   irq_set_affinity_hint(irq, NULL);
-   else {
-   cpumask_copy(mask, cpu_mask);
-   irq_set_affinity_hint(irq, mask);
-   }
+   irq_set_affinity_hint(irq, cpu_mask);
}
return 0;
 }
-- 
2.41.0



Re: [PATCH 2/2] virtio-mmio: Support multiple interrupts per device

2023-10-14 Thread Jakub Sitnicki via Virtualization
Sorry for the delay in my response. I've been away at a conference.

On Tue, Oct 10, 2023 at 02:52 PM +08, Jason Wang wrote:
> On Sat, Sep 30, 2023 at 4:46 AM Jakub Sitnicki  wrote:
>>
>> Some virtual devices, such as the virtio network device, can use multiple
>> virtqueues (or multiple pairs of virtqueues in the case of a vNIC). In such
>> a case, when there are multiple vCPUs present, it is possible to process
>> virtqueue events in parallel. Each vCPU can service a subset of all
>> virtqueues when notified that there is work to carry out.
>>
>> However, the current virtio-mmio transport implementation poses a
>> limitation. Only one vCPU can service notifications from any of the
>> virtqueues of a single virtio device. This is because a virtio-mmio device
>> driver supports registering just one interrupt per device. With such setup
>> we are not able to scale virtqueue event processing among vCPUs.
>>
>> Now, with more than one IRQ resource registered for a virtio-mmio platform
>> device, we can address this limitation.
>>
>> First, we request multiple IRQs when creating virtqueues for a device.
>>
>> Then, map each virtqueue to one of the IRQs assigned to the device. The
>> mapping is done in a device type specific manner. For instance, a network
>> device will want each RX/TX virtqueue pair mapped to a different IRQ
>> line. Other device types might require a different mapping scheme. We
>> currently provide a mapping for virtio-net device type.
>>
>> Finally, when handling an interrupt, we service only the virtqueues
>> associated with the IRQ line that triggered the event.
>>
>> Signed-off-by: Jakub Sitnicki 
>> ---
>>  drivers/virtio/virtio_mmio.c | 102 +++
>>  1 file changed, 83 insertions(+), 19 deletions(-)
>>
>> diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
>> index 06a587b23542..180c51c27704 100644
>> --- a/drivers/virtio/virtio_mmio.c
>> +++ b/drivers/virtio/virtio_mmio.c

[...]

>> @@ -488,6 +511,18 @@ static struct virtqueue *vm_setup_vq(struct virtio_device *vdev, unsigned int in
>> return ERR_PTR(err);
>>  }
>>
>> +/* Map virtqueue to zero-based interrupt number */
>> +static unsigned int vq2irq(const struct virtqueue *vq)
>> +{
>> +   switch (vq->vdev->id.device) {
>> +   case VIRTIO_ID_NET:
>> +   /* interrupt shared by rx/tx virtqueue pair */
>> +   return vq->index / 2;
>> +   default:
>> +   return 0;
>> +   }
>
> Transport drivers should have no knowledge of a specific type of device.
>

Makes sense. This breaks layering. I will see how to pull this into the
device driver. Perhaps this can be communicated through the
set_vq_affinity op.
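
For instance, a hypothetical sketch (names assumed, not from the patch)
of the net driver expressing the rx/tx pairing through the existing
set_vq_affinity op, keeping the transport device-agnostic:

	/* Steer both vqs of an rx/tx pair to the same CPU; the transport
	 * then only needs a generic CPU-to-IRQ mapping. */
	static void map_vq_pairs(struct virtqueue *vqs[], int nvqs)
	{
		int i;

		for (i = 0; i < nvqs; i++) {
			/* rx (even index) and tx (odd index) share pair i/2 */
			unsigned int cpu = (i / 2) % num_online_cpus();

			virtqueue_set_affinity(vqs[i], cpumask_of(cpu));
		}
	}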

>> +}
>> +
>>  static int vm_find_vqs(struct virtio_device *vdev, unsigned int nvqs,
>>struct virtqueue *vqs[],
>>vq_callback_t *callbacks[],

[...]

>> @@ -519,12 +544,51 @@ static int vm_find_vqs(struct virtio_device *vdev, unsigned int nvqs,
>> vqs[i] = vm_setup_vq(vdev, queue_idx++, callbacks[i], names[i],
>>  ctx ? ctx[i] : false);
>> if (IS_ERR(vqs[i])) {
>> -   vm_del_vqs(vdev);
>> -   return PTR_ERR(vqs[i]);
>> +   err = PTR_ERR(vqs[i]);
>> +   goto fail_vq;
>> }
>> }
>>
>> +   nirqs = platform_irq_count(vm_dev->pdev);
>> +   if (nirqs < 0) {
>> +   err = nirqs;
>> +   goto fail_vq;
>> +   }
>> +
>> +   for (i = 0; i < nirqs; i++) {
>> +   irq = platform_get_irq(vm_dev->pdev, i);
>> +   if (irq < 0)
>> +   goto fail_irq;
>> +   if (irq < irq_base)
>> +   irq_base = irq;
>> +
>> +   err = devm_request_irq(&vm_dev->pdev->dev, irq, vm_interrupt,
>> +  IRQF_SHARED, NULL, vm_dev);
>> +   if (err)
>> +   goto fail_irq;
>> +
>> +   if (of_property_read_bool(vm_dev->pdev->dev.of_node, "wakeup-source"))
>> +   enable_irq_wake(irq);
>
> Could we simply use the same policy as PCI (vp_find_vqs_msix())?

Reading that routine, the PCI policy is:

1) Best option: one for change interrupt, one per vq.
2) Second best: one for change, shared for all vqs.

It would be great to be able to go with option (1), but we have only a
limited number of legacy IRQs to spread among MMIO devices: 48 IRQs at
most in a 2 x IOAPIC setup.

Having one IRQ per VQ would mean fewer Rx/Tx queue pairs for a vNIC -
less than 24 - while, from our target workload's PoV, we would ideally
like to support at least 32 queue pairs.

Hence the idea to have one IRQ per Rx/Tx VQ pair. Not as ideal as (1),
but a lot better than (2).
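
For illustration, with the 48-IRQ budget above: one IRQ per VQ plus one
for config changes caps a vNIC at (48 - 1) / 2 = 23 Rx/Tx queue pairs,
while one IRQ per Rx/Tx pair stretches the same budget to up to 47
pairs - comfortably above the 32-pair target.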

Comparing this to PCI - virtio-net, with one interrupt per VQ, will map
each Rx/Tx VQ pair