Re: [Xenomai-core] Re: [Xenomai-help] General Xenomai / RTAI Skin Usage Questions

2005-11-01 Thread Jan Kiszka
Romain Lenglet wrote:
>> Actually, Dmitry and I are discussing IRQ sharing between
>> real-time drivers, not across the RT/non-RT border.
>
> Sorry, I misunderstood.
>
>> The latter case is almost always a no-go and should rather be
>> solved at hardware level by rearranging the IRQ usage (where
>> possible...). The problem is that the non-RT IRQ handler has
>> to be called just after the RT handler to make the non-RT
>> hardware release the IRQ line. But this cannot be guaranteed
>> due to other RT activity and creates an ugly priority
>> inversion.
>>
>> That your system just crashes is likely due to the RT driver
>> not being prepared to share IRQs with non-RT. What driver are
>> you using?
>
> RTnet's 8169 gigabit ethernet driver, and RTnet's tulip driver.

Could you check whether returning RTDM_IRQ_PROPAGATE from the involved
RT drivers improves the situation? Only enable the devices sharing IRQs
in this case, as I'm afraid returning this value in a non-shared case
will cause trouble as well. Note that this change may solve the
crashes but will not solve the priority inversion.
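
A minimal, self-contained sketch of the decision described above: the RT handler serves its own device and, when the line is shared with a non-RT device, asks for the IRQ to be passed on to the Linux domain as well. This is a model, not the RTDM API. The enum values, `struct nic_model` and `rt_nic_interrupt` are hypothetical names; in a real driver the return value would be the RTDM one named above (RTDM_IRQ_PROPAGATE) and the pending flag would come from the NIC's interrupt status register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the handler's return codes; NOT the RTDM values. */
enum irq_verdict {
    VERDICT_HANDLED,   /* IRQ fully served in the RT domain */
    VERDICT_PROPAGATE  /* also pass the IRQ on to the Linux domain */
};

/* Hypothetical model of one RT-driven NIC. */
struct nic_model {
    bool irq_pending;     /* device currently asserts the line */
    bool line_is_shared;  /* a non-RT device sits on the same line */
    unsigned served;      /* number of events served so far */
};

/* Model of the RT interrupt handler for the NIC. */
static enum irq_verdict rt_nic_interrupt(struct nic_model *nic)
{
    assert(nic != NULL);

    /* Serve our own device first, if it raised the IRQ. */
    if (nic->irq_pending) {
        nic->irq_pending = false;
        nic->served++;
    }

    /* On a shared line the non-RT handler must get a chance to run too;
     * on an exclusive line nothing is left for Linux to do. */
    return nic->line_is_shared ? VERDICT_PROPAGATE : VERDICT_HANDLED;
}
```

The model only decides whether the Linux handler is asked to run. It says nothing about when it runs, which is why propagating the IRQ cannot remove the priority inversion.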

As I indicated, the problem is more complex. I once started a discussion
about this with Philippe; my original mail is attached (I didn't find
any RTAI-dev mailing list archive). We didn't really solve this issue,
especially as a clean solution would require patched non-RT drivers.

 
> The problem, on common x86 hardware, is that with only one PCI
> bus and several devices (e.g. I use 4 PCI network cards driven
> by RTnet, in one PC), sharing IRQs seems unavoidable.
> The only possible workaround on my test machine is to disable as
> much as I can (USB, sound, serial ports, non-RT ethernet cards,
> etc.) to avoid sharing.
 

Yeah, I do understand. Wolfgang Grandegger recently reported a similar
issue to me on a PPC board: 4 NICs built in, but all sharing the same
IRQ. If you want to build an RTnet gateway with such an otherwise really
nice system, you are in the same trouble. I personally preferred to stop
tormenting my brain with this after we solved our issue with a jumper
(a PC104+ board where NOT the jumper named IRQ but some other one has to
be changed...).

Jan

---BeginMessage---

Hello,

last week I already mentioned that I was thinking about a concept to 
allow real-time safe IRQ sharing between Linux and RTAI drivers (or in 
other words: between different ADEOS domains). The problem many 
(PC-)users have is that it is not always possible to separate the IRQ 
lines of extension cards and especially on-board components cleanly.


I have now found some time to write a proof-of-concept, which you will
find attached to this mail. It consists of an RT module which provides
extended versions of the Linux functions request_irq/free_irq. Besides
calling the old request_irq, request_shared_rtirq also registers a
so-called IRQ suspend handler. In contrast to the normal interrupt
handler, the suspend routine runs in the context of the RTAI IRQ
routine. Its job is to disable any further IRQs from the registered
device and release the IRQ line (i.e. acknowledge the IRQ sources in the
device hardware). free_shared_irq unregisters such handlers again.
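
The suspend-handler idea can be sketched as a small registry that the real-time IRQ path walks before doing its own work. This is a self-contained model under stated assumptions, not the attached module: the signatures of request_shared_rtirq and free_shared_irq are not given in this mail, so `register_suspend_handler`, `unregister_suspend_handler`, `run_suspend_handlers`, `struct shared_irq_line` and `struct fake_nic` are all hypothetical names, and the fixed table size is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SUSPEND_HANDLERS 4  /* arbitrary limit for this sketch */

/* A suspend handler silences one device: it masks the device's IRQ
 * sources and acknowledges them so that the shared line is released. */
typedef void (*suspend_fn)(void *dev);

struct suspend_entry {
    suspend_fn fn;
    void *dev;
};

struct shared_irq_line {
    struct suspend_entry entries[MAX_SUSPEND_HANDLERS];
    size_t count;
};

/* Models the extra step of request_shared_rtirq (signature assumed). */
static int register_suspend_handler(struct shared_irq_line *line,
                                    suspend_fn fn, void *dev)
{
    assert(line != NULL && fn != NULL);
    if (line->count == MAX_SUSPEND_HANDLERS)
        return -1;
    line->entries[line->count].fn = fn;
    line->entries[line->count].dev = dev;
    line->count++;
    return 0;
}

/* Models the extra step of free_shared_irq (signature assumed). */
static int unregister_suspend_handler(struct shared_irq_line *line, void *dev)
{
    assert(line != NULL);
    for (size_t i = 0; i < line->count; i++) {
        if (line->entries[i].dev == dev) {
            /* Fill the hole with the last entry; order does not matter. */
            line->entries[i] = line->entries[line->count - 1];
            line->count--;
            return 0;
        }
    }
    return -1;
}

/* Called from the real-time IRQ path: quiet every registered non-RT
 * device. Returns the number of suspend handlers that were run. */
static size_t run_suspend_handlers(const struct shared_irq_line *line)
{
    assert(line != NULL);
    for (size_t i = 0; i < line->count; i++)
        line->entries[i].fn(line->entries[i].dev);
    return line->count;
}

/* Hypothetical non-RT device used to exercise the registry. */
struct fake_nic {
    int irq_enabled;   /* device may raise further IRQs */
    int irq_asserted;  /* device currently holds the line */
};

static void fake_nic_suspend(void *dev)
{
    struct fake_nic *nic = dev;
    nic->irq_enabled = 0;   /* disable any further IRQs from the device */
    nic->irq_asserted = 0;  /* acknowledge, which releases the line */
}
```

The design point is that the line is released in the real-time context, so the regular Linux handler can run later without holding up real-time interrupts on the same line.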


As an example of an ADEOS/RTAI-aware Linux driver, I attached a patch
for the eepro100 NIC driver. As you can see, it is not that complicated
to create such special drivers.


The RT module also contains a demo real-time IRQ handler (see the
comments on how it may be integrated into arti.c) which simply triggers
a real-time load task on every call. Together with the patched eepro100
you can run a test which will effectively delay every NIC IRQ by 50 ms,
spent in the RTAI context.


My suggestion is that this extension should become part of the ADEOS 
layer. It only causes very slight additional latency in case no suspend 
handler is registered, and of course a bit more when there is actually a 
bit of hardware to suspend. I measured about 5 us additional worst-case 
latency between the beginning of rt_irq_handler and rt_enable_irq (i.e. 
the later call of the real-time handler) on a Pentium MMX 266 MHz 
running the patched eepro100.


One to-do remains for RTAI drivers (RTnet included...): many of them
start up/shut down or disable/enable IRQs in an inappropriate way. For
instance, spdrv first disables its IRQ in init_module, then requests it,
and does not re-enable it until the user program opens the COM port.
We need some kind of driver-writing howto...


Question: Does an RTAI driver require rt_startup_irq/rt_shutdown_irq at
all? Or is this done by Linux and/or ADEOS/RTAI during bootup? We use it
in RTnet, if I remember correctly, because we discovered some issues on
PPC when rt_startup_irq was missing. Is this out of date?


Jan

/***
  rt_test_module.c
 ---
begin: Thr Dec 04 2003
copyright: (C) 2003 by Jan Kiszka
email: [EMAIL PROTECTED]
 

Re: [Xenomai-core] Re: [Xenomai-help] General Xenomai / RTAI Skin Usage Questions

2005-11-01 Thread Philippe Gerum

Romain Lenglet wrote:
>> - A kernel option that causes Xenomai (or Adeos) to blatantly
>> malfunction or even crash is a freaking BUG, and should be
>> reported asap to the Xenomai-core list or the Adeos-main list.
>> IOW, there is no such thing as options allowed to crash your
>> box with Adeos/Xenomai because of some don't care attitude;
>> should such a bug happen, it must and will be fixed. All the
>> people involved in contributing to both projects try to make
>> sure that any option could be enabled without risking terminal
>> damage to anyone's setup. The worst thing that should be
>> allowed to happen is high latency spots, because some options
>> might cause some hardware to interact badly with critical
>> resources Adeos/Xenomai also happen to manage.
>
> Here is a kernel-option-related bug.
> I am using a stock Debian-patched kernel with the standard Debian
> kernel config, on Pentium M and Pentium 4 machines, + the latest
> Adeos patch.
> The Debian kernel configuration file, which has every option
> enabled and everything as modules, works fine with Xenomai
> except for one single option which must be disabled:
>
> CONFIG_PCI_MSI
> This option messes with the oneshot timer (timer freezes).
> (thanks to Gilles for finding this out)

Could you confirm that this issue still happens with adeos-ipipe-2.6.13-1.0-05
or higher?

> Otherwise, my biggest source of problems is IRQ sharing between
> realtime and non-realtime drivers: this predictably provokes
> kernel panics. But Jan seems to be working on it.

Yes, this is another issue, more of a shortcoming of the current IRQ handling
scheme than a bug.


--

Philippe.



Re: [Xenomai-core] Re: [Xenomai-help] General Xenomai / RTAI Skin Usage Questions

2005-11-01 Thread Jan Kiszka
Romain Lenglet wrote:
> ...
> Otherwise, my biggest source of problems is IRQ sharing between
> realtime and non-realtime drivers: this predictably provokes
> kernel panics. But Jan seems to be working on it.
 

Actually, Dmitry and I are discussing IRQ sharing between real-time
drivers, not across the RT/non-RT border.

The latter case is almost always a no-go and should rather be solved at
hardware level by rearranging the IRQ usage (where possible...). The
problem is that the non-RT IRQ handler has to be called just after the
RT handler to make the non-RT hardware release the IRQ line. But this
cannot be guaranteed due to other RT activity and creates an ugly
priority inversion.
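
The inversion can be made concrete with a small timing model. It is a simplification under explicit assumptions: the shared line stays unusable from the moment the non-RT device asserts it until the Linux handler has quieted that device, and the Linux handler can only start once real-time activity has finished. The struct, its fields and both functions are hypothetical names for illustration, and any numbers fed into it are made up, not measurements.

```c
#include <assert.h>
#include <stddef.h>

/* All times are microseconds on one arbitrary timeline. */
struct shared_line_scenario {
    long nonrt_irq_at;        /* non-RT device asserts the shared line */
    long rt_busy_until;       /* RT domain keeps the CPU until then */
    long linux_handler_cost;  /* time Linux needs to quiet its device */
    long rt_irq_at;           /* RT device raises its IRQ on that line */
};

/* The line is released only after the Linux handler has run, and Linux
 * only gets the CPU once the real-time domain has nothing left to do. */
static long line_released_at(const struct shared_line_scenario *s)
{
    assert(s != NULL);
    long linux_starts = s->nonrt_irq_at > s->rt_busy_until
                            ? s->nonrt_irq_at
                            : s->rt_busy_until;
    return linux_starts + s->linux_handler_cost;
}

/* Extra delay seen by the RT device's IRQ while the line is still held.
 * An IRQ raised before the line was taken, or after it was released,
 * is not delayed in this model. */
static long rt_irq_delay(const struct shared_line_scenario *s)
{
    long released = line_released_at(s);
    if (s->rt_irq_at >= s->nonrt_irq_at && s->rt_irq_at < released)
        return released - s->rt_irq_at;
    return 0;
}
```

In this model the delay of a real-time IRQ grows with `rt_busy_until`, i.e. with unrelated real-time load that may have lower priority than the interrupt itself. That dependency on lower-priority work is the priority inversion.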

That your system just crashes is likely due to the RT driver not being
prepared to share IRQs with non-RT. What driver are you using?

Jan




Re: [Xenomai-core] Re: [Xenomai-help] General Xenomai / RTAI Skin Usage Questions

2005-11-01 Thread Romain Lenglet
> Actually, Dmitry and I are discussing IRQ sharing between
> real-time drivers, not across the RT/non-RT border.

Sorry, I misunderstood.

> The latter case is almost always a no-go and should rather be
> solved at hardware level by rearranging the IRQ usage (where
> possible...). The problem is that the non-RT IRQ handler has
> to be called just after the RT handler to make the non-RT
> hardware release the IRQ line. But this cannot be guaranteed
> due to other RT activity and creates an ugly priority
> inversion.
>
> That your system just crashes is likely due to the RT driver
> not being prepared to share IRQs with non-RT. What driver are
> you using?

RTnet's 8169 gigabit ethernet driver, and RTnet's tulip driver.

The problem, on common x86 hardware, is that with only one PCI
bus and several devices (e.g. I use 4 PCI network cards driven
by RTnet, in one PC), sharing IRQs seems unavoidable.
The only possible workaround on my test machine is to disable as
much as I can (USB, sound, serial ports, non-RT ethernet cards,
etc.) to avoid sharing.

-- 
Romain Lenglet



Re: [Xenomai-core] Re: [Xenomai-help] General Xenomai / RTAI Skin Usage Questions

2005-11-01 Thread Romain Lenglet
>> CONFIG_PCI_MSI
>> This option messes with the oneshot timer (timer freezes).
>> (thanks to Gilles for finding this out)
>
> Could you confirm that this issue still happens with
> adeos-ipipe-2.6.13-1.0-05 or higher?

Ok. I have just checked adeos-ipipe-2.6.14-1.0-09 with 
CONFIG_PCI_MSI enabled, and the oneshot timer works fine.
The issue appears to be solved, thanks!

-- 
Romain Lenglet



___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core