Re: [Xenomai-core] Re: [Xenomai-help] General Xenomai / RTAI Skin Usage Questions
Romain Lenglet wrote:
>> Actually, Dmitry and I are discussing IRQ sharing between real-time drivers, not across the RT/non-RT border.
>
> Sorry, I misunderstood.
>
>> The latter case is almost always a no-go and should rather be solved at hardware level by rearranging the IRQ usage (where possible...). The problem is that the non-RT IRQ handler has to be called just after the RT handler to make the non-RT hardware release the IRQ line. But this cannot be guaranteed due to other RT activity and creates an ugly priority inversion. That your system just crashes is likely due to the RT driver not being prepared to share IRQs with non-RT. What driver are you using?
>
> RTnet's 8169 gigabit ethernet driver, and RTnet's tulip driver.

Could you try whether returning RTDM_IRQ_PROPAGATE from the involved RT driver improves the situation? Only enable the devices sharing IRQs in this case, as I'm afraid returning this value in a non-shared case will cause trouble as well.

Note that this patch may solve the crashes but will not solve the priority inversion. As I indicated, the problem is more complex. I once started a discussion about this with Philippe; my original mail is attached (I didn't find any RTAI-dev mailing list archive). We didn't really solve this issue, especially as a clean solution would require patched non-RT drivers.

> The problem, on common x86 hardware, is that with only one PCI bus and several devices (e.g. I use 4 PCI network cards driven by RTnet, in one PC), sharing IRQs seems unavoidable. The only possible workaround on my test machine is to disable as much as I can (USB, sound, serial ports, non-RT ethernet cards, etc.) to avoid sharing.

Yea, I do understand. Wolfgang Grandegger recently reported a similar issue to me on a PPC board: 4 NICs built-in, but all sharing the same IRQ. If you want to create some RTnet gateway with such a really nice system, you are in the same trouble.

I personally preferred to stop tormenting my brain with this after we solved our issue with some jumper (PC104+ board where NOT the jumper named IRQ but some other has to be changed...).

Jan

---BeginMessage---
Hello,

last week I already mentioned that I was thinking about a concept to allow real-time safe IRQ sharing between Linux and RTAI drivers (or in other words: between different ADEOS domains). The problem many (PC-)users have is that it is not always possible to separate the IRQ lines of extension cards and especially on-board components cleanly.

I now found some time to write a proof-of-concept which you find attached to this mail. It consists of a rt-module which provides extended versions of the Linux functions request_irq/free_irq. Besides calling the old request_irq, request_shared_rtirq also registers a so-called IRQ suspend handler. In contrast to the normal interrupt handler, the suspend routine runs in the context of the RTAI IRQ routine. Its job is to disable any further IRQs from the registered device and release the IRQ line (i.e. acknowledge the IRQ sources in the device hardware). free_shared_irq unregisters such handlers again.

As an example of an ADEOS/RTAI-aware Linux driver, I attached a patch for the eepro100 NIC driver. As you can see, it is not that complicated to create such special drivers.

The rt-module also contains a demo real-time IRQ handler (see the comments on how it may be integrated into arti.c) which simply triggers a real-time load task on every call. Together with the patched eepro100 you can run a test which will effectively delay every NIC IRQ by 50 ms, spent in the RTAI context.

My suggestion is that this extension should become part of the ADEOS layer. It only causes very slight additional latency in case no suspend handler is registered, and of course a bit more when there is actually a bit of hardware to suspend. I measured about 5 us additional worst-case latency between the beginning of rt_irq_handler and rt_enable_irq (i.e. the later call of the real-time handler) on a Pentium MMX 266 MHz running the patched eepro100.

One to-do remains for RTAI drivers (RTnet included...): many of them start up/shut down or disable/enable IRQs in an inappropriate way. For instance, spdrv first disables its IRQ in init_module, then requests it, and does not re-enable it until the user program opens the COM port. We need some kind of driver-writing-howto...

Question: Does an RTAI driver require rt_startup_irq/rt_shutdown_irq at all? Or is this done by Linux and/or ADEOS/RTAI during bootup? We use it in RTnet, as far as I remember, because we discovered some issues on PPC when rt_startup_irq was missing. Is this out-of-date?

Jan

/*** rt_test_module.c --- begin: Thr Dec 04 2003 copyright: (C) 2003 by Jan Kiszka email: [EMAIL PROTECTED]
Re: [Xenomai-core] Re: [Xenomai-help] General Xenomai / RTAI Skin Usage Questions
Romain Lenglet wrote:
>> - A kernel option that causes Xenomai (or Adeos) to blatantly malfunction or even crash is a freaking BUG, and should be reported asap to the Xenomai-core list or the Adeos-main list. IOW, there is no such thing as options allowed to crash your box with Adeos/Xenomai because of some don't-care attitude; should such a bug happen, it must and will be fixed. All the people involved in contributing to both projects try to make sure that any option could be enabled without risking terminal damage to anyone's setup. The worst thing that should be allowed to happen is high latency spots, because some options might cause some hardware to interact badly with critical resources Adeos/Xenomai also happen to manage.
>
> Here is a kernel-option-related bug. I am using a stock Debian-patched kernel with the standard Debian kernel config, on Pentium M and Pentium 4 machines, + the latest Adeos patch. The Debian kernel configuration file, which has every option enabled and everything as modules, works fine with Xenomai except for one single option which must be disabled: CONFIG_PCI_MSI. This option messes with the oneshot timer (timer freezes). (Thanks to Gilles for finding this out.)

Could you confirm that this issue still happens with adeos-ipipe-2.6.13-1.0-05 or higher?

> Otherwise, my biggest source of problems is IRQ sharing between realtime and non-realtime drivers: this predictably provokes kernel panics. But Jan seems to be working on it.

Yes, this is another issue, more of a shortcoming of the current IRQ handling scheme than a bug.

--
Philippe.
Re: [Xenomai-core] Re: [Xenomai-help] General Xenomai / RTAI Skin Usage Questions
Romain Lenglet wrote:
> ...
> Otherwise, my biggest source of problems is IRQ sharing between realtime and non-realtime drivers: this predictably provokes kernel panics. But Jan seems to be working on it.

Actually, Dmitry and I are discussing IRQ sharing between real-time drivers, not across the RT/non-RT border.

The latter case is almost always a no-go and should rather be solved at hardware level by rearranging the IRQ usage (where possible...). The problem is that the non-RT IRQ handler has to be called just after the RT handler to make the non-RT hardware release the IRQ line. But this cannot be guaranteed due to other RT activity and creates an ugly priority inversion.

That your system just crashes is likely due to the RT driver not being prepared to share IRQs with non-RT. What driver are you using?

Jan
Re: [Xenomai-core] Re: [Xenomai-help] General Xenomai / RTAI Skin Usage Questions
> Actually, Dmitry and I are discussing IRQ sharing between real-time drivers, not across the RT/non-RT border.

Sorry, I misunderstood.

> The latter case is almost always a no-go and should rather be solved at hardware level by rearranging the IRQ usage (where possible...). The problem is that the non-RT IRQ handler has to be called just after the RT handler to make the non-RT hardware release the IRQ line. But this cannot be guaranteed due to other RT activity and creates an ugly priority inversion. That your system just crashes is likely due to the RT driver not being prepared to share IRQs with non-RT. What driver are you using?

RTnet's 8169 gigabit ethernet driver, and RTnet's tulip driver.

The problem, on common x86 hardware, is that with only one PCI bus and several devices (e.g. I use 4 PCI network cards driven by RTnet, in one PC), sharing IRQs seems unavoidable. The only possible workaround on my test machine is to disable as much as I can (USB, sound, serial ports, non-RT ethernet cards, etc.) to avoid sharing.

--
Romain Lenglet
Re: [Xenomai-core] Re: [Xenomai-help] General Xenomai / RTAI Skin Usage Questions
>> CONFIG_PCI_MSI
>> This option messes with the oneshot timer (timer freezes). (Thanks to Gilles for finding this out.)
>
> Could you confirm that this issue still happens with adeos-ipipe-2.6.13-1.0-05 or higher?

Ok. I have just checked adeos-ipipe-2.6.14-1.0-09 with CONFIG_PCI_MSI enabled, and the oneshot timer works fine. The issue appears to be solved, thanks!

--
Romain Lenglet
___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core