Re: [RFC PATCH v3] genirq/affinity: Create and transfer more irq desc info by a new structure

2018-11-29 Thread Dou Liyang

Hi Thomas,

On 2018/11/29 6:03, Thomas Gleixner wrote:
>> +   affi_desc = kcalloc(nvec, sizeof(*affi_desc), GFP_KERNEL);
>
> Why do you want to do that separate allocation here? Just let

I thought irq_create_affinity_desc() could also be called by other
functions that convert cpumasks to irq_affinity_desc, such as
__devm_irq_alloc_descs().

Now I know I was wrong; I will modify it.

> irq_create_affinity_masks() allocate an array of affinity descriptors and
> use that. There is no point in copying that stuff over and over. Setting
> the flag field can be done in the existing function as well.
>
> Can you please change the function signature and fixup the callers, if
> there are any of them? Copying this over and over is horrible.

I have searched; no one calls __devm_irq_alloc_descs(). It may be called
by some users' own modules or drivers.

Yes, I will change it.

> struct irq_affinity_desc {
> 	struct cpumask	masks;
> 	unsigned int	managed : 1;
> };

Yes. BTW, would the following names be a better fit for irq_affinity_desc?

s/masks/mask/
s/managed/is_managed/

> You can spare a lot of pointless churn by just keeping the 'affinity' name
> and only changing the struct type. The compiler will catch all places which
> need to be fixed and 'affinity' is generic enough to be used with the new
> struct type as well. As Bjorn said, even 'masks' is fine.

Yes, I see.

Thanks,
dou


Re: [RFC PATCH v3] genirq/affinity: Create and transfer more irq desc info by a new structure

2018-11-29 Thread Dou Liyang

Hi Bjorn,

On 2018/11/29 4:00, Bjorn Helgaas wrote:
> [+cc linux-pci]
>
> Since you mention reports, are there URLs to mailing list archives you
> can include?

OK, I will add it:

https://marc.info/?l=linux-kernel&m=153543887027997&w=2

>> -   entry = alloc_msi_entry(&dev->dev, nvec, masks);
>> +   entry = alloc_msi_entry(&dev->dev, nvec, affi_desc);
>
> Can you split this into two or more patches?  Most of these changes

Yes, the next non-RFC version will do that.

> are trivial and not very interesting, and the fact that they're all in
> one patch makes it hard to find and review the interesting bits.  For
> example,
>
>   1) Rename all the local variables while keeping the type the same
> (or just leave the name the same; I think "affinity" would be a fine
> name, and I would be OK if we ended up with "struct irq_affinity_desc
> *masks" or "struct irq_affinity_desc *affinity").  This patch would
> obviously have no functional impact and would remove a lot of the
> trivial changes.

Oops, how stupid I am!

>   2) Add "struct irq_affinity_desc" containing only "struct cpumask
> masks" and irq_create_affinity_desc() (or leave the name as
> irq_create_affinity_masks() and adjust the signature).  This would
> also have no functional impact and would be a fairly trivial patch.
>
>   3) Add "flags" to struct irq_affinity_desc and the related code.
> This is the real meat of your patch, and with the above out of the
> way, it will be much smaller and it'll be obvious what the important
> changes are.

I see, :)

Thanks,
dou


Re: [RFC PATCH v3] genirq/affinity: Create and transfer more irq desc info by a new structure

2018-11-28 Thread Thomas Gleixner
Dou,

On Thu, 29 Nov 2018, Dou Liyang wrote:
> +/**
> + * struct irq_affinity_desc - Description for kinds of irq assignements
> + *   which will be transferred to irqdesc core

Please align this properly

 * struct irq_affinity_desc - Description for kinds of irq assignements
 *				which will be transferred to irqdesc core

Aside of that, it's not interesting where these structs are going to be
transferred today because that might change tomorrow. So something like
this:

 * struct irq_affinity_desc - Interrupt affinity descriptor


> + * @masks:   cpumask of automatic irq affinity assignments

@mask: please. It's one cpumask per descriptor.

> + * @flags:   flags to differentiate between managed and
> + *   unmanaged interrupts

Again, that's the purpose today.

 * @flags:  Flags to convey complementary information

But see further down.

> + */
> +struct irq_affinity_desc {
> +	struct cpumask masks;
> +	unsigned int	flags;
> +};

Please align the member names vertically with tabs

struct irq_affinity_desc {
	struct cpumask	masks;
	unsigned int	flags;
};

> +/**
> + * irq_create_affinity_desc - Create affinity desc for multiqueue spreading
> + * @nvec:	The total number of vectors
> + * @affd:	Description of the affinity requirements
> + *
> + * Returns the irq_affinity_desc pointer or NULL if allocation failed.
> + */
> +struct irq_affinity_desc *
> +irq_create_affinity_desc(int nvec, const struct irq_affinity *affd)
> +{
> +	struct irq_affinity_desc *cur_affi_desc, *affi_desc = NULL;
> +	struct cpumask *curmask, *masks = NULL;
> +	int i;
> +
> +	masks = irq_create_affinity_masks(nvec, affd);
> +	if (masks) {
> +		affi_desc = kcalloc(nvec, sizeof(*affi_desc), GFP_KERNEL);

Why do you want to do that separate allocation here? Just let
irq_create_affinity_masks() allocate an array of affinity descriptors and
use that. There is no point in copying that stuff over and over. Setting
the flag field can be done in the existing function as well.
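
[Editor's illustration] The single-allocation idea can be sketched in plain userspace C. Everything below is a stand-in made up for illustration: the *_sketch types, create_affinity_descs() and the "vector i goes to CPU i % ncpus" spreading rule are assumptions, not the kernel's API or its real spreading algorithm. The point is only that the creator allocates the descriptor array once and fills mask and flag in the same pass, so no caller needs a second array and a copy loop.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the kernel's struct cpumask: one bit per CPU, up to 64 CPUs. */
struct cpumask_sketch {
	unsigned long long bits;
};

/* Descriptor as discussed in this thread: one mask plus a managed bit. */
struct irq_affinity_desc_sketch {
	struct cpumask_sketch	mask;
	unsigned int		is_managed : 1;
};

/*
 * Hypothetical creator: allocate the descriptor array once and fill in
 * mask and flag in the same loop. Vector i is pinned to CPU (i % ncpus)
 * purely for illustration.
 */
struct irq_affinity_desc_sketch *
create_affinity_descs(unsigned int nvec, unsigned int ncpus)
{
	struct irq_affinity_desc_sketch *descs;
	unsigned int i;

	if (!nvec || !ncpus || ncpus > 64)
		return NULL;

	descs = calloc(nvec, sizeof(*descs));
	if (!descs)
		return NULL;

	for (i = 0; i < nvec; i++) {
		descs[i].mask.bits = 1ULL << (i % ncpus);
		descs[i].is_managed = 1;
	}
	return descs;
}
```

A caller would pass the returned array straight down and free() it once, which mirrors the single kfree() at the end of msi_setup_entry() in the patch.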

>  /**
>   * irq_calc_affinity_vectors - Calculate the optimal number of vectors
>   * @minvec:  The minimum number of vectors available
> diff --git a/kernel/irq/devres.c b/kernel/irq/devres.c
> index 6a682c229e10..2335b89d9bde 100644
> --- a/kernel/irq/devres.c
> +++ b/kernel/irq/devres.c
> @@ -181,14 +181,33 @@ int __devm_irq_alloc_descs(struct device *dev, int irq, unsigned int from,
>  			   unsigned int cnt, int node, struct module *owner,
>  			   const struct cpumask *affinity)
>  {
> +	struct irq_affinity_desc *cur_affi_desc, *affi_desc = NULL;
> +	const struct cpumask *curmask;
>  	struct irq_desc_devres *dr;
> -	int base;
> +	int base, i;
>  
>  	dr = devres_alloc(devm_irq_desc_release, sizeof(*dr), GFP_KERNEL);
>  	if (!dr)
>  		return -ENOMEM;
>  
> -	base = __irq_alloc_descs(irq, from, cnt, node, owner, affinity);
> +	if (affinity) {
> +		affi_desc = kcalloc(cnt, sizeof(*affi_desc), GFP_KERNEL);
> +		if (!affi_desc)
> +			return -ENOMEM;
> +
> +		curmask = affinity;
> +		cur_affi_desc = affi_desc;
> +		for (i = 0; i < cnt; i++) {
> +			cpumask_copy(&cur_affi_desc->masks, curmask);
> +			cur_affi_desc->flags = 1;

No magic constants please for a flags field. You really want proper
constants for that, but I'd rather avoid the whole flags fiddling
completely. See below.

> +			curmask++;
> +			cur_affi_desc++;
> +		}

Can you please change the function signature and fixup the callers, if
there are any of them? Copying this over and over is horrible.

>  	for (i = 0; i < cnt; i++) {
> -		if (affinity) {
> -			node = cpu_to_node(cpumask_first(affinity));
> -			mask = affinity;
> -			affinity++;
> +		if (affi_desc && affi_desc->flags ) {

Please don't check flags for being !=0. Once we add other information than
the managed/non-managed decision to the flag field, then this will fall
apart.

To avoid issues like that please use explicit bits in the structure instead
of the opaque flags field:

struct irq_affinity_desc {
	struct cpumask	masks;
	unsigned int	managed : 1;
};

So if we add new things later they will have an explicit name and there is
absolutely no conflict with existing code.
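
[Editor's illustration] A small userspace sketch of why a "flags != 0" test is fragile while a named bit is not. The SKETCH_FLAG_* constants, the struct names and the "other" bit are hypothetical and exist only to model a later addition to the structure; none of them come from the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Opaque flags variant: meaning depends on which bits happen to be set. */
#define SKETCH_FLAG_MANAGED	(1U << 0)	/* hypothetical constant */
#define SKETCH_FLAG_OTHER	(1U << 1)	/* hypothetical later addition */

struct desc_with_flags {
	unsigned int	flags;
};

/* Named-bit variant, following the layout suggested in this review. */
struct desc_with_bits {
	unsigned int	managed : 1;
	unsigned int	other : 1;	/* hypothetical later addition */
};

/* Fragile: treats any non-zero flags value as "managed". */
bool flags_nonzero_is_managed(const struct desc_with_flags *d)
{
	return d->flags != 0;
}

/* Robust: only the named bit decides. */
bool bit_is_managed(const struct desc_with_bits *d)
{
	return d->managed;
}
```

With only the hypothetical "other" property set, flags_nonzero_is_managed() wrongly reports a managed interrupt, whereas bit_is_managed() keeps returning false until the managed bit itself is set.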

> +			flags =  IRQD_AFFINITY_MANAGED |
> +				IRQD_MANAGED_SHUTDOWN;
> +		} else
> +			flags = 0;

If you need braces for the if then the else path wants them as well.
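
[Editor's illustration] The brace rule in compilable form, as a minimal sketch. The SKETCH_* values are stand-ins chosen for this example and are not the kernel's IRQD_* constants; desc_flags() is a hypothetical helper.

```c
#include <assert.h>

/* Stand-in bit values for illustration; not the kernel's IRQD_* constants. */
#define SKETCH_AFFINITY_MANAGED	(1U << 0)
#define SKETCH_MANAGED_SHUTDOWN	(1U << 1)

unsigned int desc_flags(int managed)
{
	unsigned int flags;

	/* The if branch spans two lines and needs braces, so else gets them too. */
	if (managed) {
		flags = SKETCH_AFFINITY_MANAGED |
			SKETCH_MANAGED_SHUTDOWN;
	} else {
		flags = 0;
	}

	return flags;
}
```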

> +
> +		if (affi_desc) {
> +			mask = &affi_desc->masks;
> +			node = cpu_to_node(cpumask_first(mask));
> +			affi_desc++;
>  		}

>  

Re: [RFC PATCH v3] genirq/affinity: Create and transfer more irq desc info by a new structure

2018-11-28 Thread Thomas Gleixner
On Wed, 28 Nov 2018, Bjorn Helgaas wrote:
> On Wed, Nov 28, 2018 at 10:45 AM Dou Liyang wrote:
> >         if (affd)
> > -               masks = irq_create_affinity_masks(nvec, affd);
> > -
> > +               affi_desc = irq_create_affinity_desc(nvec, affd);
> >
> >         /* MSI Entry Initialization */
> > -       entry = alloc_msi_entry(&dev->dev, nvec, masks);
> > +       entry = alloc_msi_entry(&dev->dev, nvec, affi_desc);
> 
> Can you split this into two or more patches?  Most of these changes
> are trivial and not very interesting, and the fact that they're all in
> one patch makes it hard to find and review the interesting bits.  For
> example,
> 
>   1) Rename all the local variables while keeping the type the same
> (or just leave the name the same; I think "affinity" would be a fine
> name, and I would be OK if we ended up with "struct irq_affinity_desc
> *masks" or "struct irq_affinity_desc *affinity").  This patch would
> obviously have no functional impact and would remove a lot of the
> trivial changes.
> 
>   2) Add "struct irq_affinity_desc" containing only "struct cpumask
> masks" and irq_create_affinity_desc() (or leave the name as
> irq_create_affinity_masks() and adjust the signature).  This would
> also have no functional impact and would be a fairly trivial patch.
> 
>   3) Add "flags" to struct irq_affinity_desc and the related code.
> This is the real meat of your patch, and with the above out of the
> way, it will be much smaller and it'll be obvious what the important
> changes are.

You beat me to it. I was going to ask the same thing. Albeit I'll go over
the conceptual stuff right now as I roughly know what it tries to achieve.

Thanks,

tglx


Re: [RFC PATCH v3] genirq/affinity: Create and transfer more irq desc info by a new structure

2018-11-28 Thread Bjorn Helgaas
[+cc linux-pci]

On Wed, Nov 28, 2018 at 10:45 AM Dou Liyang wrote:
>
> Currently, Linux passes the irq affinity to the irqdesc core only as a
> cpumask pointer. If a vector's affinity is not NULL, it is marked as managed.
>
> But, as Kashyap and Sumit reported, in the MSI/MSI-X subsystem the pre/post
> vectors may be used for some extra reply queues for performance. Their
> affinities are not NULL, but they should be mapped as unmanaged interrupts.
> So transferring only the irq affinity assignments is not enough.
>
> Create a new structure named irq_affinity_desc, which includes both the irq
> affinity masks and flags. Replace the cpumask pointer with an
> irq_affinity_desc pointer, which allows expanding this in the future without
> touching all the functions ever again; only the irq_affinity_desc structure
> needs to be modified.
>
> Reported-by: Kashyap Desai 
> Reported-by: Sumit Saxena 

Since you mention reports, are there URLs to mailing list archives you
can include?

> Suggested-by: Thomas Gleixner 
> Signed-off-by: Dou Liyang 
> ---
> Changelog:
>   v2 --> v3
>   - Create a new irq_affinity_desc pointer to transfer the info
> suggested by tglx
>   - rebase to the tip irq/core branch
>
>   v1 --> v2
>   - Add a bitmap for marking if an interrupt is managed or not.
>   the size of bitmap is runtime allocation.
>
>   - Need more tests, Just test this patch in QEmu.
>
>   - v1: https://lkml.org/lkml/2018/9/13/366
>
>  drivers/pci/msi.c | 29 ++--
>  include/linux/interrupt.h | 19 ---
>  include/linux/irq.h   |  3 ++-
>  include/linux/irqdomain.h |  7 ---
>  include/linux/msi.h   |  4 ++--
>  kernel/irq/affinity.c | 40 +--
>  kernel/irq/devres.c   | 23 --
>  kernel/irq/irqdesc.c  | 32 +++
>  kernel/irq/irqdomain.c| 14 +++---
>  kernel/irq/msi.c  | 21 ++--
>  10 files changed, 135 insertions(+), 57 deletions(-)
>
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index 265ed3e4c920..431449163316 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -534,16 +534,15 @@ static int populate_msi_sysfs(struct pci_dev *pdev)
>  static struct msi_desc *
>  msi_setup_entry(struct pci_dev *dev, int nvec, const struct irq_affinity *affd)
>  {
> -       struct cpumask *masks = NULL;
> +       struct irq_affinity_desc *affi_desc = NULL;
>         struct msi_desc *entry;
>         u16 control;
>
>         if (affd)
> -               masks = irq_create_affinity_masks(nvec, affd);
> -
> +               affi_desc = irq_create_affinity_desc(nvec, affd);
>
>         /* MSI Entry Initialization */
> -       entry = alloc_msi_entry(&dev->dev, nvec, masks);
> +       entry = alloc_msi_entry(&dev->dev, nvec, affi_desc);

Can you split this into two or more patches?  Most of these changes
are trivial and not very interesting, and the fact that they're all in
one patch makes it hard to find and review the interesting bits.  For
example,

  1) Rename all the local variables while keeping the type the same
(or just leave the name the same; I think "affinity" would be a fine
name, and I would be OK if we ended up with "struct irq_affinity_desc
*masks" or "struct irq_affinity_desc *affinity").  This patch would
obviously have no functional impact and would remove a lot of the
trivial changes.

  2) Add "struct irq_affinity_desc" containing only "struct cpumask
masks" and irq_create_affinity_desc() (or leave the name as
irq_create_affinity_masks() and adjust the signature).  This would
also have no functional impact and would be a fairly trivial patch.

  3) Add "flags" to struct irq_affinity_desc and the related code.
This is the real meat of your patch, and with the above out of the
way, it will be much smaller and it'll be obvious what the important
changes are.

>         if (!entry)
>                 goto out;
>
> @@ -567,7 +566,7 @@ msi_setup_entry(struct pci_dev *dev, int nvec, const struct irq_affinity *affd)
>                 pci_read_config_dword(dev, entry->mask_pos, &entry->masked);
>
>  out:
> -       kfree(masks);
> +       kfree(affi_desc);
>         return entry;
>  }
>
> @@ -672,15 +671,15 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
>                               struct msix_entry *entries, int nvec,
>                               const struct irq_affinity *affd)
>  {
> -       struct cpumask *curmsk, *masks = NULL;
> +       struct irq_affinity_desc *cur_affi_desc, *affi_desc = NULL;
>         struct msi_desc *entry;
>         int ret, i;
>
>         if (affd)
> -               masks = irq_create_affinity_masks(nvec, affd);
> +               affi_desc = irq_create_affinity_desc(nvec, affd);
>
> -       for (i = 0, curmsk = masks; i < nvec; i++) {
> -               entry = alloc_msi_entry(&dev->dev, 1, curmsk);
> +       for (i = 0, cur_affi_desc = affi_desc; i < nvec; i++) {
> +               entry = 

[RFC PATCH v3] genirq/affinity: Create and transfer more irq desc info by a new structure

2018-11-28 Thread Dou Liyang
Currently, Linux passes the irq affinity to the irqdesc core only as a
cpumask pointer. If a vector's affinity is not NULL, it is marked as managed.

But, as Kashyap and Sumit reported, in the MSI/MSI-X subsystem the pre/post
vectors may be used for some extra reply queues for performance. Their
affinities are not NULL, but they should be mapped as unmanaged interrupts.
So transferring only the irq affinity assignments is not enough.

Create a new structure named irq_affinity_desc, which includes both the irq
affinity masks and flags. Replace the cpumask pointer with an
irq_affinity_desc pointer, which allows expanding this in the future without
touching all the functions ever again; only the irq_affinity_desc structure
needs to be modified.
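
[Editor's illustration] The problem statement above can be modelled in a few lines of userspace C. All names here (affinity_req_sketch, affinity_desc_sketch, build_descs) are hypothetical stand-ins, and the all-ones mask is a placeholder for a real cpumask. The sketch shows the intended outcome only: every vector carries a non-NULL mask, yet the pre and post vectors stay unmanaged.

```c
#include <assert.h>
#include <stdlib.h>

/* Subset of an affinity request: vectors excluded from spreading. */
struct affinity_req_sketch {
	unsigned int pre_vectors;
	unsigned int post_vectors;
};

struct affinity_desc_sketch {
	unsigned long long	mask_bits;	/* stand-in for struct cpumask */
	unsigned int		is_managed : 1;
};

struct affinity_desc_sketch *
build_descs(unsigned int nvec, const struct affinity_req_sketch *req)
{
	struct affinity_desc_sketch *descs;
	unsigned int i;

	/* Reject empty requests and pre/post counts that exceed nvec. */
	if (!nvec || req->pre_vectors > nvec ||
	    req->post_vectors > nvec - req->pre_vectors)
		return NULL;

	descs = calloc(nvec, sizeof(*descs));
	if (!descs)
		return NULL;

	for (i = 0; i < nvec; i++) {
		/* Every vector has a mask, so "mask present" cannot mean managed. */
		descs[i].mask_bits = ~0ULL;
		/* Only the vectors between the pre and post ranges are managed. */
		descs[i].is_managed = i >= req->pre_vectors &&
				      i < nvec - req->post_vectors;
	}
	return descs;
}
```

For example, with six vectors, one pre vector and two post vectors, only vectors 1 to 3 end up managed.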

Reported-by: Kashyap Desai 
Reported-by: Sumit Saxena 
Suggested-by: Thomas Gleixner 
Signed-off-by: Dou Liyang 
---
Changelog:
  v2 --> v3
  - Create a new irq_affinity_desc pointer to transfer the info
suggested by tglx
  - rebase to the tip irq/core branch

  v1 --> v2
  - Add a bitmap for marking if an interrupt is managed or not.
  the size of bitmap is runtime allocation.

  - Need more tests, Just test this patch in QEmu.

  - v1: https://lkml.org/lkml/2018/9/13/366

 drivers/pci/msi.c | 29 ++--
 include/linux/interrupt.h | 19 ---
 include/linux/irq.h   |  3 ++-
 include/linux/irqdomain.h |  7 ---
 include/linux/msi.h   |  4 ++--
 kernel/irq/affinity.c | 40 +--
 kernel/irq/devres.c   | 23 --
 kernel/irq/irqdesc.c  | 32 +++
 kernel/irq/irqdomain.c| 14 +++---
 kernel/irq/msi.c  | 21 ++--
 10 files changed, 135 insertions(+), 57 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 265ed3e4c920..431449163316 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -534,16 +534,15 @@ static int populate_msi_sysfs(struct pci_dev *pdev)
 static struct msi_desc *
 msi_setup_entry(struct pci_dev *dev, int nvec, const struct irq_affinity *affd)
 {
-	struct cpumask *masks = NULL;
+	struct irq_affinity_desc *affi_desc = NULL;
 	struct msi_desc *entry;
 	u16 control;
 
 	if (affd)
-		masks = irq_create_affinity_masks(nvec, affd);
-
+		affi_desc = irq_create_affinity_desc(nvec, affd);
 
 	/* MSI Entry Initialization */
-	entry = alloc_msi_entry(&dev->dev, nvec, masks);
+	entry = alloc_msi_entry(&dev->dev, nvec, affi_desc);
 	if (!entry)
 		goto out;
 
@@ -567,7 +566,7 @@ msi_setup_entry(struct pci_dev *dev, int nvec, const struct irq_affinity *affd)
 		pci_read_config_dword(dev, entry->mask_pos, &entry->masked);
 
 out:
-	kfree(masks);
+	kfree(affi_desc);
 	return entry;
 }
 
@@ -672,15 +671,15 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
 			      struct msix_entry *entries, int nvec,
 			      const struct irq_affinity *affd)
 {
-	struct cpumask *curmsk, *masks = NULL;
+	struct irq_affinity_desc *cur_affi_desc, *affi_desc = NULL;
 	struct msi_desc *entry;
 	int ret, i;
 
 	if (affd)
-		masks = irq_create_affinity_masks(nvec, affd);
+		affi_desc = irq_create_affinity_desc(nvec, affd);
 
-	for (i = 0, curmsk = masks; i < nvec; i++) {
-		entry = alloc_msi_entry(&dev->dev, 1, curmsk);
+	for (i = 0, cur_affi_desc = affi_desc; i < nvec; i++) {
+		entry = alloc_msi_entry(&dev->dev, 1, cur_affi_desc);
 		if (!entry) {
 			if (!i)
 				iounmap(base);
@@ -701,12 +700,12 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
 		entry->mask_base		= base;
 
 		list_add_tail(&entry->list, dev_to_msi_list(&dev->dev));
-		if (masks)
-			curmsk++;
+		if (affi_desc)
+			cur_affi_desc++;
 	}
 	ret = 0;
 out:
-	kfree(masks);
+	kfree(affi_desc);
 	return ret;
 }
 
@@ -1264,7 +1263,7 @@ const struct cpumask *pci_irq_get_affinity(struct pci_dev *dev, int nr)
 
 		for_each_pci_msi_entry(entry, dev) {
 			if (i == nr)
-				return entry->affinity;
+				return &entry->affi_desc->masks;
 			i++;
 		}
 		WARN_ON_ONCE(1);
@@ -1272,11 +1271,11 @@ const struct cpumask *pci_irq_get_affinity(struct pci_dev *dev, int nr)
 	} else if (dev->msi_enabled) {
 		struct msi_desc *entry = first_pci_msi_entry(dev);
 
-		if (WARN_ON_ONCE(!entry || !entry->affinity ||
+		if (WARN_ON_ONCE(!entry || !entry->affi_desc ||
 				 nr >= entry->nvec_used))

[RFC PATCH v3] genirq/affinity: Create and transfer more irq desc info by a new structure

2018-11-28 Thread Dou Liyang
Currently, Linux passes the irq affinity to the irqdesc core only through a
cpumask pointer. If a vector's affinity is not NULL, it is marked as managed.

But, as Kashyap and Sumit reported, in the MSI/MSI-X subsystem the pre/post
vectors may be used for some extra reply queues for performance. Their
affinities are not NULL, but they should be mapped as unmanaged interrupts.
So transferring only the irq affinity assignments is not enough.

Create a new structure named irq_affinity_desc, which includes both the irq
affinity masks and flags. Replace the cpumask pointer with an
irq_affinity_desc pointer, which allows this to be expanded in the future
without touching all the functions again: only the irq_affinity_desc
structure needs to be modified.
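
To make the idea concrete, here is a minimal userspace sketch, not the code
from this patch. It assumes a 64-bit stand-in for struct cpumask, a
hypothetical helper named sketch_create_affinity_desc(), an illustrative flag
name is_managed, and a naive round-robin spread. The point it shows is that
pre/post vectors can carry a valid mask while still being marked unmanaged,
and that a single array allocation carries both pieces of information.

```c
#include <stdio.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's struct cpumask (assumes <= 64 CPUs). */
struct cpumask {
	unsigned long long bits;
};

/* A mask plus a flag per vector; field names are illustrative only. */
struct irq_affinity_desc {
	struct cpumask masks;
	unsigned int is_managed : 1;
};

/*
 * Hypothetical helper: allocate one descriptor per vector in a single array.
 * Pre/post vectors get a mask covering all CPUs but stay unmanaged; the
 * vectors in between are spread round-robin over the CPUs and are managed.
 * The caller frees the returned array with free().
 */
static struct irq_affinity_desc *
sketch_create_affinity_desc(int nvec, int pre_vectors, int post_vectors,
			    int ncpus)
{
	struct irq_affinity_desc *desc;
	unsigned long long all_cpus;
	int i, spread;

	if (nvec <= 0 || ncpus <= 0 || ncpus > 64 ||
	    pre_vectors < 0 || post_vectors < 0 ||
	    pre_vectors + post_vectors > nvec)
		return NULL;

	desc = calloc(nvec, sizeof(*desc));
	if (!desc)
		return NULL;

	all_cpus = (ncpus == 64) ? ~0ULL : ((1ULL << ncpus) - 1);
	spread = nvec - pre_vectors - post_vectors;

	for (i = 0; i < nvec; i++) {
		if (i < pre_vectors || i >= nvec - post_vectors) {
			/* Non-empty affinity, but not a managed interrupt. */
			desc[i].masks.bits = all_cpus;
			desc[i].is_managed = 0;
		} else {
			int k = i - pre_vectors;	/* index among spread vectors */
			int cpu;

			for (cpu = 0; cpu < ncpus; cpu++)
				if (cpu % spread == k)
					desc[i].masks.bits |= 1ULL << cpu;
			/* More vectors than CPUs: fall back to a single CPU. */
			if (!desc[i].masks.bits)
				desc[i].masks.bits = 1ULL << (k % ncpus);
			desc[i].is_managed = 1;
		}
	}
	return desc;
}
```

With nvec = 4, pre_vectors = 1, post_vectors = 1 and 4 CPUs, the first and
last descriptors cover all CPUs and are unmanaged, while the two middle
descriptors split the CPUs between them and are managed. A consumer would test
the flag rather than inferring "managed" from a non-NULL mask.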

Reported-by: Kashyap Desai 
Reported-by: Sumit Saxena 
Suggested-by: Thomas Gleixner 
Signed-off-by: Dou Liyang 
---
Changelog:
  v2 --> v3
  - Create a new irq_affinity_desc pointer to transfer the info,
    as suggested by tglx
  - Rebase to the tip irq/core branch

  v1 --> v2
  - Add a bitmap for marking whether an interrupt is managed or not.
    The bitmap is sized and allocated at runtime.

  - Needs more testing; this patch has only been tested in QEMU.

  - v1: https://lkml.org/lkml/2018/9/13/366

 drivers/pci/msi.c | 29 ++--
 include/linux/interrupt.h | 19 ---
 include/linux/irq.h   |  3 ++-
 include/linux/irqdomain.h |  7 ---
 include/linux/msi.h   |  4 ++--
 kernel/irq/affinity.c | 40 +--
 kernel/irq/devres.c   | 23 --
 kernel/irq/irqdesc.c  | 32 +++
 kernel/irq/irqdomain.c| 14 +++---
 kernel/irq/msi.c  | 21 ++--
 10 files changed, 135 insertions(+), 57 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 265ed3e4c920..431449163316 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -534,16 +534,15 @@ static int populate_msi_sysfs(struct pci_dev *pdev)
 static struct msi_desc *
 msi_setup_entry(struct pci_dev *dev, int nvec, const struct irq_affinity *affd)
 {
-   struct cpumask *masks = NULL;
+   struct irq_affinity_desc *affi_desc = NULL;
struct msi_desc *entry;
u16 control;
 
if (affd)
-   masks = irq_create_affinity_masks(nvec, affd);
-
+   affi_desc = irq_create_affinity_desc(nvec, affd);
 
/* MSI Entry Initialization */
-   entry = alloc_msi_entry(&dev->dev, nvec, masks);
+   entry = alloc_msi_entry(&dev->dev, nvec, affi_desc);
if (!entry)
goto out;
 
@@ -567,7 +566,7 @@ msi_setup_entry(struct pci_dev *dev, int nvec, const struct irq_affinity *affd)
pci_read_config_dword(dev, entry->mask_pos, &entry->masked);
 
 out:
-   kfree(masks);
+   kfree(affi_desc);
return entry;
 }
 
@@ -672,15 +671,15 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
  struct msix_entry *entries, int nvec,
  const struct irq_affinity *affd)
 {
-   struct cpumask *curmsk, *masks = NULL;
+   struct irq_affinity_desc *cur_affi_desc, *affi_desc = NULL;
struct msi_desc *entry;
int ret, i;
 
if (affd)
-   masks = irq_create_affinity_masks(nvec, affd);
+   affi_desc = irq_create_affinity_desc(nvec, affd);
 
-   for (i = 0, curmsk = masks; i < nvec; i++) {
-   entry = alloc_msi_entry(&dev->dev, 1, curmsk);
+   for (i = 0, cur_affi_desc = affi_desc; i < nvec; i++) {
+   entry = alloc_msi_entry(&dev->dev, 1, cur_affi_desc);
if (!entry) {
if (!i)
iounmap(base);
@@ -701,12 +700,12 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
entry->mask_base = base;
 
list_add_tail(&entry->list, dev_to_msi_list(&dev->dev));
-   if (masks)
-   curmsk++;
+   if (affi_desc)
+   cur_affi_desc++;
}
ret = 0;
 out:
-   kfree(masks);
+   kfree(affi_desc);
return ret;
 }
 
@@ -1264,7 +1263,7 @@ const struct cpumask *pci_irq_get_affinity(struct pci_dev *dev, int nr)
 
for_each_pci_msi_entry(entry, dev) {
if (i == nr)
-   return entry->affinity;
+   return &entry->affi_desc->masks;
i++;
}
WARN_ON_ONCE(1);
@@ -1272,11 +1271,11 @@ const struct cpumask *pci_irq_get_affinity(struct pci_dev *dev, int nr)
} else if (dev->msi_enabled) {
struct msi_desc *entry = first_pci_msi_entry(dev);
 
-   if (WARN_ON_ONCE(!entry || !entry->affinity ||
+   if (WARN_ON_ONCE(!entry || !entry->affi_desc ||
 nr >= entry->nvec_used))