Re: [Xen-devel] [PATCH v2 07/10] xen/arm: Release maintenance interrupt when CPU is hot-unplugged
On 09/05/18 12:14, Mirela Simonovic wrote: Hi Julien, On Wed, May 9, 2018 at 1:01 PM, Julien Grallwrote: On 09/05/18 11:10, Mirela Simonovic wrote: On Fri, Apr 27, 2018 at 5:12 PM, Julien Grall wrote: On 27/04/18 15:38, Mirela Simonovic wrote: On Fri, Apr 27, 2018 at 4:15 PM, Tim Deegan wrote: At 10:28 +0100 on 27 Apr (1524824906), Julien Grall wrote: On 26/04/18 15:23, Tim Deegan wrote: At 11:08 +0100 on 26 Apr (1524740921), Julien Grall wrote: On 20/04/18 13:25, Mirela Simonovic wrote: While I guess this code makes no harm, it does not do what is expected (i.e draining the interrupt). I can't see any reason to keep wrong code, we should really aim to have code that match the architecture. And better to fix it when we discover the problem rather than waiting until we rediscovered it later. So at least a patch to update the code/comment should be done. I don't feel comfortable removing these 3 lines because I have no way to test and guarantee that the change will not introduce any issues. The work you are doing (suspend/resume, hotplug) is not easy to test and very subtle to get it right. Testing can only uncover obvious bug on your platform. IMHO, we can only rely on the specifications (ARM ARM, PSCI...) and extensive review of the series (and the code around). However, if despite all you really want me to remove these lines within this series I don't have a problem doing that in a separate patch. Please just confirm the plan. I am quite confident that this code should not be there or at least not in its current form. Do you want me to include the removal of that code in this series or not? It is not necessary to be in that series. Cheers, Thanks, Mirela Cheers, -- Julien Grall -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 07/10] xen/arm: Release maintenance interrupt when CPU is hot-unplugged
Hi Julien, On Wed, May 9, 2018 at 1:01 PM, Julien Grallwrote: > > > On 09/05/18 11:10, Mirela Simonovic wrote: >> >> On Fri, Apr 27, 2018 at 5:12 PM, Julien Grall >> wrote: >>> >>> On 27/04/18 15:38, Mirela Simonovic wrote: On Fri, Apr 27, 2018 at 4:15 PM, Tim Deegan wrote: > > At 10:28 +0100 on 27 Apr (1524824906), Julien Grall wrote: >> >> On 26/04/18 15:23, Tim Deegan wrote: >>> >>> At 11:08 +0100 on 26 Apr (1524740921), Julien Grall wrote: On 20/04/18 13:25, Mirela Simonovic wrote: >>> >>> While I guess this code makes no harm, it does not do what is expected >>> (i.e >>> draining the interrupt). I can't see any reason to keep wrong code, we >>> should really aim to have code that match the architecture. And better to >>> fix it when we discover the problem rather than waiting until we >>> rediscovered it later. >>> >>> So at least a patch to update the code/comment should be done. >>> >> >> I don't feel comfortable removing these 3 lines because I have no way >> to test and guarantee that the change will not introduce any issues. > > > The work you are doing (suspend/resume, hotplug) is not easy to test and > very subtle to get it right. Testing can only uncover obvious bug on your > platform. IMHO, we can only rely on the specifications (ARM ARM, PSCI...) > and extensive review of the series (and the code around). > >> However, if despite all you really want me to remove these lines >> within this series I don't have a problem doing that in a separate >> patch. Please just confirm the plan. > > > I am quite confident that this code should not be there or at least not in > its current form. > Do you want me to include the removal of that code in this series or not? Thanks, Mirela > Cheers, > > -- > Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 07/10] xen/arm: Release maintenance interrupt when CPU is hot-unplugged
On 09/05/18 11:10, Mirela Simonovic wrote: On Fri, Apr 27, 2018 at 5:12 PM, Julien Grallwrote: On 27/04/18 15:38, Mirela Simonovic wrote: On Fri, Apr 27, 2018 at 4:15 PM, Tim Deegan wrote: At 10:28 +0100 on 27 Apr (1524824906), Julien Grall wrote: On 26/04/18 15:23, Tim Deegan wrote: At 11:08 +0100 on 26 Apr (1524740921), Julien Grall wrote: On 20/04/18 13:25, Mirela Simonovic wrote: While I guess this code makes no harm, it does not do what is expected (i.e draining the interrupt). I can't see any reason to keep wrong code, we should really aim to have code that match the architecture. And better to fix it when we discover the problem rather than waiting until we rediscovered it later. So at least a patch to update the code/comment should be done. I don't feel comfortable removing these 3 lines because I have no way to test and guarantee that the change will not introduce any issues. The work you are doing (suspend/resume, hotplug) is not easy to test and very subtle to get it right. Testing can only uncover obvious bug on your platform. IMHO, we can only rely on the specifications (ARM ARM, PSCI...) and extensive review of the series (and the code around). However, if despite all you really want me to remove these lines within this series I don't have a problem doing that in a separate patch. Please just confirm the plan. I am quite confident that this code should not be there or at least not in its current form. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 07/10] xen/arm: Release maintenance interrupt when CPU is hot-unplugged
Hi Julien, On Fri, Apr 27, 2018 at 5:12 PM, Julien Grallwrote: > > > On 27/04/18 15:38, Mirela Simonovic wrote: >> >> Hi, >> >> On Fri, Apr 27, 2018 at 4:15 PM, Tim Deegan wrote: >>> >>> Hi, >>> >>> At 10:28 +0100 on 27 Apr (1524824906), Julien Grall wrote: On 26/04/18 15:23, Tim Deegan wrote: > > At 11:08 +0100 on 26 Apr (1524740921), Julien Grall wrote: >> >> On 20/04/18 13:25, Mirela Simonovic wrote: This looks a bit weird. AFAIU, if you disable the CPU interface, then you should never receive interrupt after. So why would you re-enable them? I realize the code in __cpu_disbale do that, but this looks quite wrong to me. There are no way to receive queued timer interrupt afterwards. >>> >>> That is what I took as a reference, but I asked myself the same. >>> There is (extremely small, but it exists) time window between >>> disabling irq locally and disabling CPU interface. An interrupt >>> received in that time window would propagate to the CPU but I'm not >>> sure would happen after the GIC CPU interface is disabled and >>> interrupts are locally enabled. That is the only explanation I can >>> come up with, although I believe the answer is nothing. Since you're >>> at ARM you could check this internally. >> >> >> Speaking with Andre (in CC), the GIC CPU interface may have forwarded >> an >> interrupt to the processor before it gets disabled. So when the >> interrupt will be re-enabled, the processor will jump to the interrupt >> exception entry. >> >> However, looking at the spec (4-78 in ARM IHI 0048B.b), Xen will read >> a >> spurious interrupt ID from GICC_IAR. So I am not sure what the point >> of >> that code. It looks like it has been taken from x86, but some bits are >> missing. >> >> AFAIU, x86 will only suspend the timer afterwards (see time_suspend). >> I >> am not fully sure why this code is there on Arm. Whether we expect a >> timer interrupt to come up. Stefano, Tim, do you have any insight on >> that code? > > > Sorry, no. I pretty clearly copied this logic from x86, which copied > it directly from Linux at some point in the past. I don't know why > x86 does it this way, and I haven't dived into linux to find out. :) > But draining the outstanding IRQs seems like a polite thing to do if > you're ever going to re-enable this CPU (at least without resetting > it first). I am not entirely sure what you mean by draining, do you mean they will serviced by Xen? If so, what kind of interrupts do you expect to be serviced (e.g PPI, SPIs) ? >>> >>> >>> All I mean is, when you disable the GICC (or APIC, or whatever), you >>> know that it won't send any more interrupts to the CPU. But you would >>> like to also be certain that any interrupts it already sent to the CPU >>> get processed now. Otherwise, if you bring the CPU up again later >>> that interrupt could still be there. Better to get it out of the way >>> now, right? >>> >>> AIUI that's what x86 is doing by re-enabling interrupts and waiting a >>> bit, which seems a bit crude but OK. ARM could maybe do the same >>> thing by disabling GICC, dsb(), then disable interrupts. But I don't >>> understand the interface between GICD, GICC and CPU well enough to >>> reason about it properly. >>> >>> It's also possible that there's some subtlety of the timer interrupt >>> handling that I don't know about -- I _think_ that the reason timer >>> interrupts are relevant is that they're generated inside the APIC, >>> so that even when no interrupts are routed to the core, the APIC could >>> still generate one as it's being shut down. >>> Clearly, this code does not seem to be doing what we are expecting. Speaking the Marc Z. (GIC maintainers in Linux), there are no need to disable the GIC CPU interface in the hypervisor/OS. You are going to shutdown the CPU and it will be reset when you are coming back. >>> >>> >> >> I don't think this assumption is guaranteed to be correct. In current >> implementation of ATF and if no additional security software runs the >> assumption would likely be correct, but it wouldn't be correct in >> general. Linux does disable GICC in such a scenario. > > Can you please give a pointer to code in Linux? The only place I see the > GICC disabled is when using versatile TC2. It looks like it is just because > they don't have PSCI support so Linux has to do the power management. That's > not a platform we are ever going to fully support in Xen. > You already found it. Good to clarify, thanks. >> So changing this could cause problems in some scenarios, while keeping >> it makes no harm. > > > While I guess this code makes no harm, it does not do what is expected (i.e >
Re: [Xen-devel] [PATCH v2 07/10] xen/arm: Release maintenance interrupt when CPU is hot-unplugged
On 27/04/18 15:38, Mirela Simonovic wrote: Hi, On Fri, Apr 27, 2018 at 4:15 PM, Tim Deeganwrote: Hi, At 10:28 +0100 on 27 Apr (1524824906), Julien Grall wrote: On 26/04/18 15:23, Tim Deegan wrote: At 11:08 +0100 on 26 Apr (1524740921), Julien Grall wrote: On 20/04/18 13:25, Mirela Simonovic wrote: This looks a bit weird. AFAIU, if you disable the CPU interface, then you should never receive interrupt after. So why would you re-enable them? I realize the code in __cpu_disbale do that, but this looks quite wrong to me. There are no way to receive queued timer interrupt afterwards. That is what I took as a reference, but I asked myself the same. There is (extremely small, but it exists) time window between disabling irq locally and disabling CPU interface. An interrupt received in that time window would propagate to the CPU but I'm not sure would happen after the GIC CPU interface is disabled and interrupts are locally enabled. That is the only explanation I can come up with, although I believe the answer is nothing. Since you're at ARM you could check this internally. Speaking with Andre (in CC), the GIC CPU interface may have forwarded an interrupt to the processor before it gets disabled. So when the interrupt will be re-enabled, the processor will jump to the interrupt exception entry. However, looking at the spec (4-78 in ARM IHI 0048B.b), Xen will read a spurious interrupt ID from GICC_IAR. So I am not sure what the point of that code. It looks like it has been taken from x86, but some bits are missing. AFAIU, x86 will only suspend the timer afterwards (see time_suspend). I am not fully sure why this code is there on Arm. Whether we expect a timer interrupt to come up. Stefano, Tim, do you have any insight on that code? Sorry, no. I pretty clearly copied this logic from x86, which copied it directly from Linux at some point in the past. I don't know why x86 does it this way, and I haven't dived into linux to find out. :) But draining the outstanding IRQs seems like a polite thing to do if you're ever going to re-enable this CPU (at least without resetting it first). I am not entirely sure what you mean by draining, do you mean they will serviced by Xen? If so, what kind of interrupts do you expect to be serviced (e.g PPI, SPIs) ? All I mean is, when you disable the GICC (or APIC, or whatever), you know that it won't send any more interrupts to the CPU. But you would like to also be certain that any interrupts it already sent to the CPU get processed now. Otherwise, if you bring the CPU up again later that interrupt could still be there. Better to get it out of the way now, right? AIUI that's what x86 is doing by re-enabling interrupts and waiting a bit, which seems a bit crude but OK. ARM could maybe do the same thing by disabling GICC, dsb(), then disable interrupts. But I don't understand the interface between GICD, GICC and CPU well enough to reason about it properly. It's also possible that there's some subtlety of the timer interrupt handling that I don't know about -- I _think_ that the reason timer interrupts are relevant is that they're generated inside the APIC, so that even when no interrupts are routed to the core, the APIC could still generate one as it's being shut down. Clearly, this code does not seem to be doing what we are expecting. Speaking the Marc Z. (GIC maintainers in Linux), there are no need to disable the GIC CPU interface in the hypervisor/OS. You are going to shutdown the CPU and it will be reset when you are coming back. I don't think this assumption is guaranteed to be correct. In current implementation of ATF and if no additional security software runs the assumption would likely be correct, but it wouldn't be correct in general. Linux does disable GICC in such a scenario. Can you please give a pointer to code in Linux? The only place I see the GICC disabled is when using versatile TC2. It looks like it is just because they don't have PSCI support so Linux has to do the power management. That's not a platform we are ever going to fully support in Xen. So changing this could cause problems in some scenarios, while keeping it makes no harm. While I guess this code makes no harm, it does not do what is expected (i.e draining the interrupt). I can't see any reason to keep wrong code, we should really aim to have code that match the architecture. And better to fix it when we discover the problem rather than waiting until we rediscovered it later. So at least a patch to update the code/comment should be done. Thanks for the feedback, I really appreciate it. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 07/10] xen/arm: Release maintenance interrupt when CPU is hot-unplugged
Hi, On Fri, Apr 27, 2018 at 4:15 PM, Tim Deeganwrote: > Hi, > > At 10:28 +0100 on 27 Apr (1524824906), Julien Grall wrote: >> On 26/04/18 15:23, Tim Deegan wrote: >> > At 11:08 +0100 on 26 Apr (1524740921), Julien Grall wrote: >> >> On 20/04/18 13:25, Mirela Simonovic wrote: >> This looks a bit weird. AFAIU, if you disable the CPU interface, then >> you >> should never receive interrupt after. So why would you re-enable them? >> >> I realize the code in __cpu_disbale do that, but this looks quite wrong >> to >> me. There are no way to receive queued timer interrupt afterwards. >> >> >>> >> >>> That is what I took as a reference, but I asked myself the same. >> >>> There is (extremely small, but it exists) time window between >> >>> disabling irq locally and disabling CPU interface. An interrupt >> >>> received in that time window would propagate to the CPU but I'm not >> >>> sure would happen after the GIC CPU interface is disabled and >> >>> interrupts are locally enabled. That is the only explanation I can >> >>> come up with, although I believe the answer is nothing. Since you're >> >>> at ARM you could check this internally. >> >> >> >> Speaking with Andre (in CC), the GIC CPU interface may have forwarded an >> >> interrupt to the processor before it gets disabled. So when the >> >> interrupt will be re-enabled, the processor will jump to the interrupt >> >> exception entry. >> >> >> >> However, looking at the spec (4-78 in ARM IHI 0048B.b), Xen will read a >> >> spurious interrupt ID from GICC_IAR. So I am not sure what the point of >> >> that code. It looks like it has been taken from x86, but some bits are >> >> missing. >> >> >> >> AFAIU, x86 will only suspend the timer afterwards (see time_suspend). I >> >> am not fully sure why this code is there on Arm. Whether we expect a >> >> timer interrupt to come up. Stefano, Tim, do you have any insight on >> >> that code? >> > >> > Sorry, no. I pretty clearly copied this logic from x86, which copied >> > it directly from Linux at some point in the past. I don't know why >> > x86 does it this way, and I haven't dived into linux to find out. :) >> > But draining the outstanding IRQs seems like a polite thing to do if >> > you're ever going to re-enable this CPU (at least without resetting >> > it first). >> >> I am not entirely sure what you mean by draining, do you mean they will >> serviced by Xen? If so, what kind of interrupts do you expect to be >> serviced (e.g PPI, SPIs) ? > > All I mean is, when you disable the GICC (or APIC, or whatever), you > know that it won't send any more interrupts to the CPU. But you would > like to also be certain that any interrupts it already sent to the CPU > get processed now. Otherwise, if you bring the CPU up again later > that interrupt could still be there. Better to get it out of the way > now, right? > > AIUI that's what x86 is doing by re-enabling interrupts and waiting a > bit, which seems a bit crude but OK. ARM could maybe do the same > thing by disabling GICC, dsb(), then disable interrupts. But I don't > understand the interface between GICD, GICC and CPU well enough to > reason about it properly. > > It's also possible that there's some subtlety of the timer interrupt > handling that I don't know about -- I _think_ that the reason timer > interrupts are relevant is that they're generated inside the APIC, > so that even when no interrupts are routed to the core, the APIC could > still generate one as it's being shut down. > >> Clearly, this code does not seem to be doing what we are expecting. >> Speaking the Marc Z. (GIC maintainers in Linux), there are no need to >> disable the GIC CPU interface in the hypervisor/OS. You are going to >> shutdown the CPU and it will be reset when you are coming back. > I don't think this assumption is guaranteed to be correct. In current implementation of ATF and if no additional security software runs the assumption would likely be correct, but it wouldn't be correct in general. Linux does disable GICC in such a scenario. So changing this could cause problems in some scenarios, while keeping it makes no harm. Thanks for the feedback, I really appreciate it. > Well that answers my question, then. If you know you're going to > reset the core (and not just put it in a deep sleep and wake it up > again) then I think this is all moot and you can just disable > interrupts once. > > Cheers, > > Tim. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 07/10] xen/arm: Release maintenance interrupt when CPU is hot-unplugged
Hi, At 10:28 +0100 on 27 Apr (1524824906), Julien Grall wrote: > On 26/04/18 15:23, Tim Deegan wrote: > > At 11:08 +0100 on 26 Apr (1524740921), Julien Grall wrote: > >> On 20/04/18 13:25, Mirela Simonovic wrote: > This looks a bit weird. AFAIU, if you disable the CPU interface, then you > should never receive interrupt after. So why would you re-enable them? > > I realize the code in __cpu_disbale do that, but this looks quite wrong > to > me. There are no way to receive queued timer interrupt afterwards. > > >>> > >>> That is what I took as a reference, but I asked myself the same. > >>> There is (extremely small, but it exists) time window between > >>> disabling irq locally and disabling CPU interface. An interrupt > >>> received in that time window would propagate to the CPU but I'm not > >>> sure would happen after the GIC CPU interface is disabled and > >>> interrupts are locally enabled. That is the only explanation I can > >>> come up with, although I believe the answer is nothing. Since you're > >>> at ARM you could check this internally. > >> > >> Speaking with Andre (in CC), the GIC CPU interface may have forwarded an > >> interrupt to the processor before it gets disabled. So when the > >> interrupt will be re-enabled, the processor will jump to the interrupt > >> exception entry. > >> > >> However, looking at the spec (4-78 in ARM IHI 0048B.b), Xen will read a > >> spurious interrupt ID from GICC_IAR. So I am not sure what the point of > >> that code. It looks like it has been taken from x86, but some bits are > >> missing. > >> > >> AFAIU, x86 will only suspend the timer afterwards (see time_suspend). I > >> am not fully sure why this code is there on Arm. Whether we expect a > >> timer interrupt to come up. Stefano, Tim, do you have any insight on > >> that code? > > > > Sorry, no. I pretty clearly copied this logic from x86, which copied > > it directly from Linux at some point in the past. I don't know why > > x86 does it this way, and I haven't dived into linux to find out. :) > > But draining the outstanding IRQs seems like a polite thing to do if > > you're ever going to re-enable this CPU (at least without resetting > > it first). > > I am not entirely sure what you mean by draining, do you mean they will > serviced by Xen? If so, what kind of interrupts do you expect to be > serviced (e.g PPI, SPIs) ? All I mean is, when you disable the GICC (or APIC, or whatever), you know that it won't send any more interrupts to the CPU. But you would like to also be certain that any interrupts it already sent to the CPU get processed now. Otherwise, if you bring the CPU up again later that interrupt could still be there. Better to get it out of the way now, right? AIUI that's what x86 is doing by re-enabling interrupts and waiting a bit, which seems a bit crude but OK. ARM could maybe do the same thing by disabling GICC, dsb(), then disable interrupts. But I don't understand the interface between GICD, GICC and CPU well enough to reason about it properly. It's also possible that there's some subtlety of the timer interrupt handling that I don't know about -- I _think_ that the reason timer interrupts are relevant is that they're generated inside the APIC, so that even when no interrupts are routed to the core, the APIC could still generate one as it's being shut down. > Clearly, this code does not seem to be doing what we are expecting. > Speaking the Marc Z. (GIC maintainers in Linux), there are no need to > disable the GIC CPU interface in the hypervisor/OS. You are going to > shutdown the CPU and it will be reset when you are coming back. Well that answers my question, then. If you know you're going to reset the core (and not just put it in a deep sleep and wake it up again) then I think this is all moot and you can just disable interrupts once. Cheers, Tim. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 07/10] xen/arm: Release maintenance interrupt when CPU is hot-unplugged
(+ Andre and Tim) On 25/04/18 15:28, Mirela Simonovic wrote: Hi Julien, Hi, On Wed, Apr 25, 2018 at 3:23 PM, Julien Grallwrote: On 25/04/18 14:09, Mirela Simonovic wrote: On Mon, Apr 23, 2018 at 1:33 PM, Julien Grall wrote: On 20/04/18 13:25, Mirela Simonovic wrote: This looks a bit weird. AFAIU, if you disable the CPU interface, then you should never receive interrupt after. So why would you re-enable them? I realize the code in __cpu_disbale do that, but this looks quite wrong to me. There are no way to receive queued timer interrupt afterwards. That is what I took as a reference, but I asked myself the same. There is (extremely small, but it exists) time window between disabling irq locally and disabling CPU interface. An interrupt received in that time window would propagate to the CPU but I'm not sure would happen after the GIC CPU interface is disabled and interrupts are locally enabled. That is the only explanation I can come up with, although I believe the answer is nothing. Since you're at ARM you could check this internally. Speaking with Andre (in CC), the GIC CPU interface may have forwarded an interrupt to the processor before it gets disabled. So when the interrupt will be re-enabled, the processor will jump to the interrupt exception entry. However, looking at the spec (4-78 in ARM IHI 0048B.b), Xen will read a spurious interrupt ID from GICC_IAR. So I am not sure what the point of that code. It looks like it has been taken from x86, but some bits are missing. AFAIU, x86 will only suspend the timer afterwards (see time_suspend). I am not fully sure why this code is there on Arm. Whether we expect a timer interrupt to come up. Stefano, Tim, do you have any insight on that code? Anyway, since we're taking the notifier approach the timer interrupt would be disabled before the GIC CPU interface, so I believe the mdelay and the surrounding local_irq_enable/disable will not be needed. Please lets do such a cleanup out of this series. Depending on the outcome of the discussions, do you plan to fix the code? then below these lines in the callback I would add release_irq(gic_hw_ops->info->maintenance_irq, NULL); This would have to be done because CPU_DYING notifiers execute before __cpu_disable(). How that sounds? If it's ok, should these changes be split into 2 patches (1) notifier based call to gic_disable_cpu + 2) release maintenance irq, I believe this is better) or should I merge them? I am not sure this is right to do. We want to disable the CPU interface very late (imagine we need to service interrupt). This doesn't mean that the gic_disable_cpu can't be done using notifiers, we would just first disable maintenance irq and then gic cpu interface. However, moving gic_disable_cpu in notifier would mean that interrupts should be disabled using notifiers (whose priority is higher than gic notifier's priority) as well. I would prefer to keep the gic_disable_cpu where it is at the moment. Please lets finalize the discussion and make an agreement on what should be done, I would like to get v3 ASAP. Thanks, Mirela Cheers, -- Julien Grall -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 07/10] xen/arm: Release maintenance interrupt when CPU is hot-unplugged
Hi Julien, On Wed, Apr 25, 2018 at 3:23 PM, Julien Grallwrote: > Hi, > > > On 25/04/18 14:09, Mirela Simonovic wrote: >> >> On Mon, Apr 23, 2018 at 1:33 PM, Julien Grall >> wrote: >>> >>> Hi Mirela, >>> >>> >>> On 20/04/18 13:25, Mirela Simonovic wrote: When a CPU is hot-unplugged the maintenance interrupt has to be released in order to free the memory that was allocated when the CPU was hotplugged and interrupt requested. The interrupt was requested using request_irq() which is called from start_secondary-> init_maintenance_interrupt. Signed-off-by: Mirela Simonovic --- CC: Stefano Stabellini CC: Julien Grall --- xen/arch/arm/gic.c| 5 + xen/arch/arm/smpboot.c| 7 +++ xen/include/asm-arm/gic.h | 1 + 3 files changed, 13 insertions(+) diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c index 653a815127..e536b99e84 100644 --- a/xen/arch/arm/gic.c +++ b/xen/arch/arm/gic.c @@ -431,6 +431,11 @@ void init_maintenance_interrupt(void) "irq-maintenance", NULL); } +void deinit_maintenance_interrupt(void) +{ +release_irq(gic_hw_ops->info->maintenance_irq, NULL); +} + int gic_make_hwdom_dt_node(const struct domain *d, const struct dt_device_node *gic, void *fdt) diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c index abc642804f..449fefc77d 100644 --- a/xen/arch/arm/smpboot.c +++ b/xen/arch/arm/smpboot.c @@ -375,11 +375,18 @@ void __cpu_disable(void) local_irq_disable(); gic_disable_cpu(); + >>> >>> >>> >>> Spurious change. >>> /* Allow any queued timer interrupts to get serviced */ local_irq_enable(); mdelay(1); local_irq_disable(); +/* + * Deinitialize interrupts (this will free the memory that was allocated + * in respective init interrupt functions called from start_secondary) + */ +deinit_maintenance_interrupt(); >>> >>> >>> >>> Can you have a look at using a notifier (see CPU_DIYING)? This would >>> avoid >>> exporting too much new function. >> >> >> I believe releasing of maintenance irq should happen after the dying >> CPU's GIC interface is disabled. > > > Why? The maintenance interrupt will only be fired when running in guest > context. Furthermore, it is initialized after the GIC has been initialized, > so it makes sense to disable before hand. > >> To make such ordering using notifiers I would need to move these lines >> from __cpu_disable into the notifier callback under the CPU_DYING >> case: >> local_irq_disable(); >> gic_disable_cpu(); >> local_irq_enable(); > > > This looks a bit weird. AFAIU, if you disable the CPU interface, then you > should never receive interrupt after. So why would you re-enable them? > > I realize the code in __cpu_disbale do that, but this looks quite wrong to > me. There are no way to receive queued timer interrupt afterwards. > That is what I took as a reference, but I asked myself the same. There is (extremely small, but it exists) time window between disabling irq locally and disabling CPU interface. An interrupt received in that time window would propagate to the CPU but I'm not sure would happen after the GIC CPU interface is disabled and interrupts are locally enabled. That is the only explanation I can come up with, although I believe the answer is nothing. Since you're at ARM you could check this internally. Anyway, since we're taking the notifier approach the timer interrupt would be disabled before the GIC CPU interface, so I believe the mdelay and the surrounding local_irq_enable/disable will not be needed. Please lets do such a cleanup out of this series. >> then below these lines in the callback I would add >> release_irq(gic_hw_ops->info->maintenance_irq, NULL); >> >> This would have to be done because CPU_DYING notifiers execute before >> __cpu_disable(). >> How that sounds? If it's ok, should these changes be split into 2 >> patches (1) notifier based call to gic_disable_cpu + 2) release >> maintenance irq, I believe this is better) or should I merge them? > > I am not sure this is right to do. We want to disable the CPU interface very > late (imagine we need to service interrupt). > This doesn't mean that the gic_disable_cpu can't be done using notifiers, we would just first disable maintenance irq and then gic cpu interface. However, moving gic_disable_cpu in notifier would mean that interrupts should be disabled using notifiers (whose priority is higher than gic notifier's priority) as well. Please lets finalize the discussion and make
Re: [Xen-devel] [PATCH v2 07/10] xen/arm: Release maintenance interrupt when CPU is hot-unplugged
Hi, On 25/04/18 14:09, Mirela Simonovic wrote: On Mon, Apr 23, 2018 at 1:33 PM, Julien Grallwrote: Hi Mirela, On 20/04/18 13:25, Mirela Simonovic wrote: When a CPU is hot-unplugged the maintenance interrupt has to be released in order to free the memory that was allocated when the CPU was hotplugged and interrupt requested. The interrupt was requested using request_irq() which is called from start_secondary-> init_maintenance_interrupt. Signed-off-by: Mirela Simonovic --- CC: Stefano Stabellini CC: Julien Grall --- xen/arch/arm/gic.c| 5 + xen/arch/arm/smpboot.c| 7 +++ xen/include/asm-arm/gic.h | 1 + 3 files changed, 13 insertions(+) diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c index 653a815127..e536b99e84 100644 --- a/xen/arch/arm/gic.c +++ b/xen/arch/arm/gic.c @@ -431,6 +431,11 @@ void init_maintenance_interrupt(void) "irq-maintenance", NULL); } +void deinit_maintenance_interrupt(void) +{ +release_irq(gic_hw_ops->info->maintenance_irq, NULL); +} + int gic_make_hwdom_dt_node(const struct domain *d, const struct dt_device_node *gic, void *fdt) diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c index abc642804f..449fefc77d 100644 --- a/xen/arch/arm/smpboot.c +++ b/xen/arch/arm/smpboot.c @@ -375,11 +375,18 @@ void __cpu_disable(void) local_irq_disable(); gic_disable_cpu(); + Spurious change. /* Allow any queued timer interrupts to get serviced */ local_irq_enable(); mdelay(1); local_irq_disable(); +/* + * Deinitialize interrupts (this will free the memory that was allocated + * in respective init interrupt functions called from start_secondary) + */ +deinit_maintenance_interrupt(); Can you have a look at using a notifier (see CPU_DIYING)? This would avoid exporting too much new function. I believe releasing of maintenance irq should happen after the dying CPU's GIC interface is disabled. Why? The maintenance interrupt will only be fired when running in guest context. Furthermore, it is initialized after the GIC has been initialized, so it makes sense to disable before hand. To make such ordering using notifiers I would need to move these lines from __cpu_disable into the notifier callback under the CPU_DYING case: local_irq_disable(); gic_disable_cpu(); local_irq_enable(); This looks a bit weird. AFAIU, if you disable the CPU interface, then you should never receive interrupt after. So why would you re-enable them? I realize the code in __cpu_disbale do that, but this looks quite wrong to me. There are no way to receive queued timer interrupt afterwards. then below these lines in the callback I would add release_irq(gic_hw_ops->info->maintenance_irq, NULL); This would have to be done because CPU_DYING notifiers execute before __cpu_disable(). How that sounds? If it's ok, should these changes be split into 2 patches (1) notifier based call to gic_disable_cpu + 2) release maintenance irq, I believe this is better) or should I merge them? I am not sure this is right to do. We want to disable the CPU interface very late (imagine we need to service interrupt). Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 07/10] xen/arm: Release maintenance interrupt when CPU is hot-unplugged
Hi Julien, On Mon, Apr 23, 2018 at 1:33 PM, Julien Grallwrote: > Hi Mirela, > > > On 20/04/18 13:25, Mirela Simonovic wrote: >> >> When a CPU is hot-unplugged the maintenance interrupt has to be >> released in order to free the memory that was allocated when the CPU >> was hotplugged and interrupt requested. The interrupt was requested >> using request_irq() which is called from start_secondary-> >> init_maintenance_interrupt. >> >> Signed-off-by: Mirela Simonovic >> >> --- >> CC: Stefano Stabellini >> CC: Julien Grall >> --- >> xen/arch/arm/gic.c| 5 + >> xen/arch/arm/smpboot.c| 7 +++ >> xen/include/asm-arm/gic.h | 1 + >> 3 files changed, 13 insertions(+) >> >> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c >> index 653a815127..e536b99e84 100644 >> --- a/xen/arch/arm/gic.c >> +++ b/xen/arch/arm/gic.c >> @@ -431,6 +431,11 @@ void init_maintenance_interrupt(void) >> "irq-maintenance", NULL); >> } >> +void deinit_maintenance_interrupt(void) >> +{ >> +release_irq(gic_hw_ops->info->maintenance_irq, NULL); >> +} >> + >> int gic_make_hwdom_dt_node(const struct domain *d, >> const struct dt_device_node *gic, >> void *fdt) >> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c >> index abc642804f..449fefc77d 100644 >> --- a/xen/arch/arm/smpboot.c >> +++ b/xen/arch/arm/smpboot.c >> @@ -375,11 +375,18 @@ void __cpu_disable(void) >> local_irq_disable(); >> gic_disable_cpu(); >> + > > > Spurious change. > >> /* Allow any queued timer interrupts to get serviced */ >> local_irq_enable(); >> mdelay(1); >> local_irq_disable(); >> +/* >> + * Deinitialize interrupts (this will free the memory that was >> allocated >> + * in respective init interrupt functions called from >> start_secondary) >> + */ >> +deinit_maintenance_interrupt(); > > > Can you have a look at using a notifier (see CPU_DIYING)? This would avoid > exporting too much new function. I believe releasing of maintenance irq should happen after the dying CPU's GIC interface is disabled. To make such ordering using notifiers I would need to move these lines from __cpu_disable into the notifier callback under the CPU_DYING case: local_irq_disable(); gic_disable_cpu(); local_irq_enable(); then below these lines in the callback I would add release_irq(gic_hw_ops->info->maintenance_irq, NULL); This would have to be done because CPU_DYING notifiers execute before __cpu_disable(). How that sounds? If it's ok, should these changes be split into 2 patches (1) notifier based call to gic_disable_cpu + 2) release maintenance irq, I believe this is better) or should I merge them? Thanks, Mirela > >> + >> /* It's now safe to remove this processor from the online map */ >> cpumask_clear_cpu(cpu, _online_map); >> diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h >> index 58b910fe6a..0db42e6cce 100644 >> --- a/xen/include/asm-arm/gic.h >> +++ b/xen/include/asm-arm/gic.h >> @@ -254,6 +254,7 @@ extern void gic_clear_pending_irqs(struct vcpu *v); >> extern int vgic_vcpu_pending_irq(struct vcpu *v); >> extern void init_maintenance_interrupt(void); >> +extern void deinit_maintenance_interrupt(void); >> extern void gic_raise_guest_irq(struct vcpu *v, unsigned int irq, >> unsigned int priority); >> extern void gic_raise_inflight_irq(struct vcpu *v, unsigned int >> virtual_irq); >> > > Cheers, > > -- > Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 07/10] xen/arm: Release maintenance interrupt when CPU is hot-unplugged
Hi Mirela, On 20/04/18 13:25, Mirela Simonovic wrote: When a CPU is hot-unplugged the maintenance interrupt has to be released in order to free the memory that was allocated when the CPU was hotplugged and interrupt requested. The interrupt was requested using request_irq() which is called from start_secondary-> init_maintenance_interrupt. Signed-off-by: Mirela Simonovic--- CC: Stefano Stabellini CC: Julien Grall --- xen/arch/arm/gic.c| 5 + xen/arch/arm/smpboot.c| 7 +++ xen/include/asm-arm/gic.h | 1 + 3 files changed, 13 insertions(+) diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c index 653a815127..e536b99e84 100644 --- a/xen/arch/arm/gic.c +++ b/xen/arch/arm/gic.c @@ -431,6 +431,11 @@ void init_maintenance_interrupt(void) "irq-maintenance", NULL); } +void deinit_maintenance_interrupt(void) +{ +release_irq(gic_hw_ops->info->maintenance_irq, NULL); +} + int gic_make_hwdom_dt_node(const struct domain *d, const struct dt_device_node *gic, void *fdt) diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c index abc642804f..449fefc77d 100644 --- a/xen/arch/arm/smpboot.c +++ b/xen/arch/arm/smpboot.c @@ -375,11 +375,18 @@ void __cpu_disable(void) local_irq_disable(); gic_disable_cpu(); + Spurious change. /* Allow any queued timer interrupts to get serviced */ local_irq_enable(); mdelay(1); local_irq_disable(); +/* + * Deinitialize interrupts (this will free the memory that was allocated + * in respective init interrupt functions called from start_secondary) + */ +deinit_maintenance_interrupt(); Can you have a look at using a notifier (see CPU_DIYING)? This would avoid exporting too much new function. + /* It's now safe to remove this processor from the online map */ cpumask_clear_cpu(cpu, _online_map); diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h index 58b910fe6a..0db42e6cce 100644 --- a/xen/include/asm-arm/gic.h +++ b/xen/include/asm-arm/gic.h @@ -254,6 +254,7 @@ extern void gic_clear_pending_irqs(struct vcpu *v); extern int vgic_vcpu_pending_irq(struct vcpu *v); extern void init_maintenance_interrupt(void); +extern void deinit_maintenance_interrupt(void); extern void gic_raise_guest_irq(struct vcpu *v, unsigned int irq, unsigned int priority); extern void gic_raise_inflight_irq(struct vcpu *v, unsigned int virtual_irq); Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v2 07/10] xen/arm: Release maintenance interrupt when CPU is hot-unplugged
When a CPU is hot-unplugged the maintenance interrupt has to be released in order to free the memory that was allocated when the CPU was hotplugged and interrupt requested. The interrupt was requested using request_irq() which is called from start_secondary-> init_maintenance_interrupt. Signed-off-by: Mirela Simonovic--- CC: Stefano Stabellini CC: Julien Grall --- xen/arch/arm/gic.c| 5 + xen/arch/arm/smpboot.c| 7 +++ xen/include/asm-arm/gic.h | 1 + 3 files changed, 13 insertions(+) diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c index 653a815127..e536b99e84 100644 --- a/xen/arch/arm/gic.c +++ b/xen/arch/arm/gic.c @@ -431,6 +431,11 @@ void init_maintenance_interrupt(void) "irq-maintenance", NULL); } +void deinit_maintenance_interrupt(void) +{ +release_irq(gic_hw_ops->info->maintenance_irq, NULL); +} + int gic_make_hwdom_dt_node(const struct domain *d, const struct dt_device_node *gic, void *fdt) diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c index abc642804f..449fefc77d 100644 --- a/xen/arch/arm/smpboot.c +++ b/xen/arch/arm/smpboot.c @@ -375,11 +375,18 @@ void __cpu_disable(void) local_irq_disable(); gic_disable_cpu(); + /* Allow any queued timer interrupts to get serviced */ local_irq_enable(); mdelay(1); local_irq_disable(); +/* + * Deinitialize interrupts (this will free the memory that was allocated + * in respective init interrupt functions called from start_secondary) + */ +deinit_maintenance_interrupt(); + /* It's now safe to remove this processor from the online map */ cpumask_clear_cpu(cpu, _online_map); diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h index 58b910fe6a..0db42e6cce 100644 --- a/xen/include/asm-arm/gic.h +++ b/xen/include/asm-arm/gic.h @@ -254,6 +254,7 @@ extern void gic_clear_pending_irqs(struct vcpu *v); extern int vgic_vcpu_pending_irq(struct vcpu *v); extern void init_maintenance_interrupt(void); +extern void deinit_maintenance_interrupt(void); extern void gic_raise_guest_irq(struct vcpu *v, unsigned int irq, unsigned int priority); extern void gic_raise_inflight_irq(struct vcpu *v, unsigned int virtual_irq); -- 2.13.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel