Re: [PATCH 1/2] x86/irq: fix calculation of max PV dom0 pIRQs
On 22.11.2024 09:01, Roger Pau Monné wrote: > On Thu, Nov 21, 2024 at 12:39:06PM +0100, Jan Beulich wrote: >> On 21.11.2024 12:04, Roger Pau Monné wrote: >>> On Thu, Nov 21, 2024 at 11:49:44AM +0100, Jan Beulich wrote: On 20.11.2024 12:35, Roger Pau Monne wrote: > The current calculation of PV dom0 pIRQs uses: > > n = min(fls(num_present_cpus()), dom0_max_vcpus()); > > The usage of fls() is wrong, as num_present_cpus() already returns the > number > of present CPUs, not the bitmap mask of CPUs. Hmm. Perhaps that use of fls() should have been accompanied by a comment, but I think it might have been put there intentionally, to avoid linear growth. Which isn't to say that I mind the adjustment, especially now that we don't use any clustered modes anymore for I/O interrupts. I'm merely questioning the Fixes: tag, and with that whether / how far to backport. >>> >>> Hm, sorry I've assumed the fls() was a typo. It seems wrong to cap >>> dom0 vCPUs with the fls of the present CPUs number. For consistency, >>> if the intention was to use fls to limit growth, I would have expected >>> to also be applied to the dom0 number of vCPUs. >> >> FTR: My vague recollection (it has been nearly 10 years) is that I first had >> it there, too. Until I realized that it hardly ever would have any effect, >> because of the min(). And for Dom0-s with extremely few vCPU-s it seemed >> reasonable to not apply that cap there. > > I don't have a strong opinion regarding the fixes tag, but would like > to get this sorted as soon as possible. If you prefer just drop the > fixes tag. I think this wants backporting to all supported releases, > because it's an issue XenServer has hit on real servers when dom0 is > sized to use 16 vCPUs at most. In which case yes, I guess we want to keep the Fixes:. Jan
Re: [PATCH 1/2] x86/irq: fix calculation of max PV dom0 pIRQs
On Thu, Nov 21, 2024 at 12:39:06PM +0100, Jan Beulich wrote: > On 21.11.2024 12:04, Roger Pau Monné wrote: > > On Thu, Nov 21, 2024 at 11:49:44AM +0100, Jan Beulich wrote: > >> On 20.11.2024 12:35, Roger Pau Monne wrote: > >>> The current calculation of PV dom0 pIRQs uses: > >>> > >>> n = min(fls(num_present_cpus()), dom0_max_vcpus()); > >>> > >>> The usage of fls() is wrong, as num_present_cpus() already returns the > >>> number > >>> of present CPUs, not the bitmap mask of CPUs. > >> > >> Hmm. Perhaps that use of fls() should have been accompanied by a comment, > >> but > >> I think it might have been put there intentionally, to avoid linear growth. > >> Which isn't to say that I mind the adjustment, especially now that we don't > >> use any clustered modes anymore for I/O interrupts. I'm merely questioning > >> the Fixes: tag, and with that whether / how far to backport. > > > > Hm, sorry I've assumed the fls() was a typo. It seems wrong to cap > > dom0 vCPUs with the fls of the present CPUs number. For consistency, > > if the intention was to use fls to limit growth, I would have expected > > to also be applied to the dom0 number of vCPUs. > > FTR: My vague recollection (it has been nearly 10 years) is that I first had > it there, too. Until I realized that it hardly ever would have any effect, > because of the min(). And for Dom0-s with extremely few vCPU-s it seemed > reasonable to not apply that cap there. I don't have a strong opinion regarding the fixes tag, but would like to get this sorted as soon as possible. If you prefer just drop the fixes tag. I think this wants backporting to all supported releases, because it's an issue XenServer has hit on real servers when dom0 is sized to use 16 vCPUs at most. Thanks, Roger.
Re: [PATCH 1/2] x86/irq: fix calculation of max PV dom0 pIRQs
On 21.11.2024 12:04, Roger Pau Monné wrote: > On Thu, Nov 21, 2024 at 11:49:44AM +0100, Jan Beulich wrote: >> On 20.11.2024 12:35, Roger Pau Monne wrote: >>> The current calculation of PV dom0 pIRQs uses: >>> >>> n = min(fls(num_present_cpus()), dom0_max_vcpus()); >>> >>> The usage of fls() is wrong, as num_present_cpus() already returns the >>> number >>> of present CPUs, not the bitmap mask of CPUs. >> >> Hmm. Perhaps that use of fls() should have been accompanied by a comment, but >> I think it might have been put there intentionally, to avoid linear growth. >> Which isn't to say that I mind the adjustment, especially now that we don't >> use any clustered modes anymore for I/O interrupts. I'm merely questioning >> the Fixes: tag, and with that whether / how far to backport. > > Hm, sorry I've assumed the fls() was a typo. It seems wrong to cap > dom0 vCPUs with the fls of the present CPUs number. For consistency, > if the intention was to use fls to limit growth, I would have expected > to also be applied to the dom0 number of vCPUs. FTR: My vague recollection (it has been nearly 10 years) is that I first had it there, too. Until I realized that it hardly ever would have any effect, because of the min(). And for Dom0-s with extremely few vCPU-s it seemed reasonable to not apply that cap there. Jan
Re: [PATCH 1/2] x86/irq: fix calculation of max PV dom0 pIRQs
On Thu, Nov 21, 2024 at 11:49:44AM +0100, Jan Beulich wrote: > On 20.11.2024 12:35, Roger Pau Monne wrote: > > The current calculation of PV dom0 pIRQs uses: > > > > n = min(fls(num_present_cpus()), dom0_max_vcpus()); > > > > The usage of fls() is wrong, as num_present_cpus() already returns the > > number > > of present CPUs, not the bitmap mask of CPUs. > > Hmm. Perhaps that use of fls() should have been accompanied by a comment, but > I think it might have been put there intentionally, to avoid linear growth. > Which isn't to say that I mind the adjustment, especially now that we don't > use any clustered modes anymore for I/O interrupts. I'm merely questioning > the Fixes: tag, and with that whether / how far to backport. Hm, sorry I've assumed the fls() was a typo. It seems wrong to cap dom0 vCPUs with the fls of the present CPUs number. For consistency, if the intention was to use fls to limit growth, I would have expected to also be applied to the dom0 number of vCPUs. And a comment would have been nice indeed :). In any case this is hurting XenServer now: we got reports of pIRQ exhaustion on some systems. Thanks, Roger.
Re: [PATCH 1/2] x86/irq: fix calculation of max PV dom0 pIRQs
On 20.11.2024 12:35, Roger Pau Monne wrote: > The current calculation of PV dom0 pIRQs uses: > > n = min(fls(num_present_cpus()), dom0_max_vcpus()); > > The usage of fls() is wrong, as num_present_cpus() already returns the number > of present CPUs, not the bitmap mask of CPUs. Hmm. Perhaps that use of fls() should have been accompanied by a comment, but I think it might have been put there intentionally, to avoid linear growth. Which isn't to say that I mind the adjustment, especially now that we don't use any clustered modes anymore for I/O interrupts. I'm merely questioning the Fixes: tag, and with that whether / how far to backport. Jan
Re: [PATCH 1/2] x86/irq: fix calculation of max PV dom0 pIRQs
On 20/11/2024 11:35 am, Roger Pau Monne wrote: > The current calculation of PV dom0 pIRQs uses: > > n = min(fls(num_present_cpus()), dom0_max_vcpus()); > > The usage of fls() is wrong, as num_present_cpus() already returns the number > of present CPUs, not the bitmap mask of CPUs. > > Fix by removing the usage of fls(). > > Fixes: 7e73a6e7f12a ('have architectures specify the number of PIRQs a > hardware domain gets') > Signed-off-by: Roger Pau Monné Yeah, that fls() fails the dimensional analysis sniff test. Acked-by: Andrew Cooper Is there any hint as to what the reasoning was? ~Andrew