Re: SERIOUS BUG: Zero Latency Interrupts are broken!
On Mon, Dec 9, 2024 at 9:46 PM Xiang Xiao wrote: > > On Tue, Dec 10, 2024 at 5:53 AM Nathan Hartman > wrote: > > > Hi all, > > > > Unfortunately I missed the PR before it was merged, but PR-15073 has > > broken High Priority, Zero Latency Interrupts! Fortunately I caught it > > now. It was merged 17 hours ago. > > > > > Zero Latency Interrupts doesn't remove from PR-15073. > > > > This PR removed a very important Kconfig: config ARMV7M_USEBASEPRI, > > and much of the logic related to it. This config is *required* when > > using CONFIG_ARCH_HIPRI_INTERRUPT. > > > > > PR-15073 doesn't remove ARMV7M_USEBASEPRI, but makes it enabled > unconditionally, that's why this patch removes this config. Correct. I am writing to confirm that the PRs are correct and did not break the Zero Latency Interrupts. The PRs remove support for PRIMASK, not BASEPRI, on architectures that support BASEPRI. There is no need to support PRIMASK on these architectures because BASEPRI is a superset of the functionality. PRIMASK is not needed for Zero Latency Interrupts, so there is no need to maintain two different implementations. Thanks for your patience! Nathan
Re: SERIOUS BUG: Zero Latency Interrupts are broken!
On Tue, Dec 10, 2024 at 9:06 AM Gregory Nutt wrote: > On 12/10/2024 7:59 AM, Nathan Hartman wrote: > > On Tue, Dec 10, 2024 at 8:11 AM spudaneco wrote: > > > >> I wonder if we should also disable the prioritization APIs. In the > >> normal, default case, any reprioritization of an IRQ introduces a fatal, > >> mysterious bug. That is because nested interrupts will occur and may > not > >> be supported.Sent from my Galaxy > > > > Aren't the prioritization APIs needed for setting up the Zero Latency > > Interrupts? They exist "outside" the OS so they don't touch any OS > > structures, so no mysterious bug should occur. > > Yes, I tried to distinguish that case with some weasel words (normal, > default). Even if high priority interrupts are used (maybe enabled), > the same fatal error would occur if any other interrupts that share the > same handler were reprioritized True. With shared interrupt handlers, ALL the interrupts that share the same handler must have the same priority or else they might corrupt the shared data structures.
Re: SERIOUS BUG: Zero Latency Interrupts are broken!
On 12/10/2024 7:59 AM, Nathan Hartman wrote: On Tue, Dec 10, 2024 at 8:11 AM spudaneco wrote: I wonder if we should also disable the prioritization APIs. In the normal, default case, any reprioritization of an IRQ introduces a fatal, mysterious bug. That is because nested interrupts will occur and may not be supported.Sent from my Galaxy Aren't the prioritization APIs needed for setting up the Zero Latency Interrupts? They exist "outside" the OS so they don't touch any OS structures, so no mysterious bug should occur. Yes, I tried to distinguish that case with some weasel words (normal, default). Even if high priority interrupts are used (maybe enabled), the same fatal error would occur if any other interrupts that share the same handler were reprioritized
Re: SERIOUS BUG: Zero Latency Interrupts are broken!
On Tue, Dec 10, 2024 at 8:11 AM spudaneco wrote: > I wonder if we should also disable the prioritization APIs. In the > normal, default case, any reprioritization of an IRQ introduces a fatal, > mysterious bug. That is because nested interrupts will occur and may not > be supported.Sent from my Galaxy Aren't the prioritization APIs needed for setting up the Zero Latency Interrupts? They exist "outside" the OS so they don't touch any OS structures, so no mysterious bug should occur. Maybe the prioritization APIs should be enabled with CONFIG_ARCH_HIPRI_INTERRUPT? And the docstrings can explain not to use them for interrupts handled by the OS. Cheers, Nathan
Re: SERIOUS BUG: Zero Latency Interrupts are broken!
I wonder if we should also disable the prioritization APIs. In the normal, default case, any reprioritization of an IRQ introduces a fatal, mysterious bug. That is because nested interrupts will occur and may not be supported.Sent from my Galaxy Original message From: Nathan Hartman Date: 12/9/24 10:11 PM (GMT-06:00) To: dev@nuttx.apache.org Subject: Re: SERIOUS BUG: Zero Latency Interrupts are broken! On Mon, Dec 9, 2024 at 9:46 PM Xiang Xiao wrote:>> On Tue, Dec 10, 2024 at 5:53 AM Nathan Hartman > wrote:>> > Hi all,> >> > Unfortunately I missed the PR before it was merged, but PR-15073 has> > broken High Priority, Zero Latency Interrupts! Fortunately I caught it> > now. It was merged 17 hours ago.> >> >> Zero Latency Interrupts doesn't remove from PR-15073.>>> > This PR removed a very important Kconfig: config ARMV7M_USEBASEPRI,> > and much of the logic related to it. This config is *required* when> > using CONFIG_ARCH_HIPRI_INTERRUPT.> >> >> PR-15073 doesn't remove ARMV7M_USEBASEPRI, but makes it enabled> unconditionally, that's why this patch removes this config.Thank you.I studied the changes of #15073 and #15086 and they seem correct, toenable BASEPRI always in armv7m and armv8m. So, the Zero LatencyInterrupts should work. I'll test it in the morning to be sure.tl;dr: Please don't revert those PRs.I am still studying the #14881 changes. This one is more complexbecause it changes how the recordkeeping is done.Unfortunately it is very late now so I need to continue in the morning.> https://github.com/apache/nuttx/pull/15102 removes ARMV7M_USEBASEPRI from> documentation to match the new code base.Thanks for the PR. I will review soon.Cheers,Nathan
Re: SERIOUS BUG: Zero Latency Interrupts are broken!
On Mon, Dec 9, 2024 at 9:46 PM Xiang Xiao wrote: > > On Tue, Dec 10, 2024 at 5:53 AM Nathan Hartman > wrote: > > > Hi all, > > > > Unfortunately I missed the PR before it was merged, but PR-15073 has > > broken High Priority, Zero Latency Interrupts! Fortunately I caught it > > now. It was merged 17 hours ago. > > > > > Zero Latency Interrupts doesn't remove from PR-15073. > > > > This PR removed a very important Kconfig: config ARMV7M_USEBASEPRI, > > and much of the logic related to it. This config is *required* when > > using CONFIG_ARCH_HIPRI_INTERRUPT. > > > > > PR-15073 doesn't remove ARMV7M_USEBASEPRI, but makes it enabled > unconditionally, that's why this patch removes this config. Thank you. I studied the changes of #15073 and #15086 and they seem correct, to enable BASEPRI always in armv7m and armv8m. So, the Zero Latency Interrupts should work. I'll test it in the morning to be sure. tl;dr: Please don't revert those PRs. I am still studying the #14881 changes. This one is more complex because it changes how the recordkeeping is done. Unfortunately it is very late now so I need to continue in the morning. > https://github.com/apache/nuttx/pull/15102 removes ARMV7M_USEBASEPRI from > documentation to match the new code base. Thanks for the PR. I will review soon. Cheers, Nathan
Re: SERIOUS BUG: Zero Latency Interrupts are broken!
On Tue, Dec 10, 2024 at 5:53 AM Nathan Hartman wrote: > Hi all, > > Unfortunately I missed the PR before it was merged, but PR-15073 has > broken High Priority, Zero Latency Interrupts! Fortunately I caught it > now. It was merged 17 hours ago. > > Zero Latency Interrupts doesn't remove from PR-15073. > This PR removed a very important Kconfig: config ARMV7M_USEBASEPRI, > and much of the logic related to it. This config is *required* when > using CONFIG_ARCH_HIPRI_INTERRUPT. > > PR-15073 doesn't remove ARMV7M_USEBASEPRI, but makes it enabled unconditionally, that's why this patch removes this config. > High Priority, Zero Latency Interrupts are documented here: > https://nuttx.apache.org/docs/latest/guides/zerolatencyinterrupts.html > > https://github.com/apache/nuttx/pull/15102 removes ARMV7M_USEBASEPRI from documentation to match the new code base. > I depend strongly on Zero Latency Interrupts. It is *required* for > critical handling in my firmware. I cannot use NuttX at all if NuttX > does not have Zero Latency Interrupts anymore. > > PR-15073 was meant to fix a regression in PR-14881. PR-14881 attempted > to optimize handling of g_current_regs. The commit log of PR-15073 > explains this. [1] > > We have to find another way to fix it, without destroying the Zero > Latency Interrupts feature. Otherwise, it is a very serious problem. > > I suggest to revert PR-15073, and possibly revert PR-14881, and find > another way to implement PR-14881 that doesn't cause this problem. > > References: > > [1] PR-15073: > https://github.com/apache/nuttx/commit/0e1b432dd068880d353ae91c01e610f85b4a16ea#diff-24754055e0d9f2d702b8503e7f5b520394cb59affb5d24457fe0641c1d7678ceL27 > > [2] PR-14881: https://github.com/apache/nuttx/pull/14881 > > [3] High Priority Zero Latency Interrupts: > https://nuttx.apache.org/docs/latest/guides/zerolatencyinterrupts.html > > Thanks, > Nathan >
Re: SERIOUS BUG: Zero Latency Interrupts are broken!
On 12/9/2024 3:52 PM, Nathan Hartman wrote: Hi all, Unfortunately I missed the PR before it was merged, but PR-15073 has broken High Priority, Zero Latency Interrupts! Fortunately I caught it now. It was merged 17 hours ago. This PR removed a very important Kconfig: config ARMV7M_USEBASEPRI, and much of the logic related to it. This config is *required* when using CONFIG_ARCH_HIPRI_INTERRUPT. High Priority, Zero Latency Interrupts are documented here: https://nuttx.apache.org/docs/latest/guides/zerolatencyinterrupts.html I depend strongly on Zero Latency Interrupts. It is *required* for critical handling in my firmware. I cannot use NuttX at all if NuttX does not have Zero Latency Interrupts anymore. PR-15073 was meant to fix a regression in PR-14881. PR-14881 attempted to optimize handling of g_current_regs. The commit log of PR-15073 explains this. [1] We have to find another way to fix it, without destroying the Zero Latency Interrupts feature. Otherwise, it is a very serious problem. I suggest to revert PR-15073, and possibly revert PR-14881, and find another way to implement PR-14881 that doesn't cause this problem. References: [1] PR-15073: https://github.com/apache/nuttx/commit/0e1b432dd068880d353ae91c01e610f85b4a16ea#diff-24754055e0d9f2d702b8503e7f5b520394cb59affb5d24457fe0641c1d7678ceL27 [2] PR-14881: https://github.com/apache/nuttx/pull/14881 [3] High Priority Zero Latency Interrupts: https://nuttx.apache.org/docs/latest/guides/zerolatencyinterrupts.html Thanks, Nathan Using the basepri register would also be important if you ever want to implement nested interrupts. I purist would say that nested, priority interrupts is necessary for true real-time performance. More info: https://cwiki.apache.org/confluence/display/NUTTX/Nested+Interrupts Samsung has a working implementation of this in TizenRT: The are about 3 or 4 commits in May of 2019 that provides the full implementation. It should drop into those architectures pretty simply.
Re: SERIOUS BUG: Zero Latency Interrupts are broken!
Whoops, good catch Nathan! +1 to revert all breaking commits and thing for a better solution :-) I can see you already reported here: https://github.com/apache/nuttx/issues/15100 Tomek On Mon, Dec 9, 2024 at 10:53 PM Nathan Hartman wrote: > > Hi all, > > Unfortunately I missed the PR before it was merged, but PR-15073 has > broken High Priority, Zero Latency Interrupts! Fortunately I caught it > now. It was merged 17 hours ago. > > This PR removed a very important Kconfig: config ARMV7M_USEBASEPRI, > and much of the logic related to it. This config is *required* when > using CONFIG_ARCH_HIPRI_INTERRUPT. > > High Priority, Zero Latency Interrupts are documented here: > https://nuttx.apache.org/docs/latest/guides/zerolatencyinterrupts.html > > I depend strongly on Zero Latency Interrupts. It is *required* for > critical handling in my firmware. I cannot use NuttX at all if NuttX > does not have Zero Latency Interrupts anymore. > > PR-15073 was meant to fix a regression in PR-14881. PR-14881 attempted > to optimize handling of g_current_regs. The commit log of PR-15073 > explains this. [1] > > We have to find another way to fix it, without destroying the Zero > Latency Interrupts feature. Otherwise, it is a very serious problem. > > I suggest to revert PR-15073, and possibly revert PR-14881, and find > another way to implement PR-14881 that doesn't cause this problem. > > References: > > [1] PR-15073: > https://github.com/apache/nuttx/commit/0e1b432dd068880d353ae91c01e610f85b4a16ea#diff-24754055e0d9f2d702b8503e7f5b520394cb59affb5d24457fe0641c1d7678ceL27 > > [2] PR-14881: https://github.com/apache/nuttx/pull/14881 > > [3] High Priority Zero Latency Interrupts: > https://nuttx.apache.org/docs/latest/guides/zerolatencyinterrupts.html > > Thanks, > Nathan -- CeDeROM, SQ7MHZ, http://www.tomek.cedro.info