RE: [Xenomai-core] Handling PCI MSI interrupts

2006-02-17 Thread Russell Johnson

> > I tested (and intended) the patch for MSI (w/o maskbits), not MSI-X.
> > What e1000 chip are you using exactly? Easiest way to tell is by using 
> > '/sbin/lspci'. I may be able to help you out with MSI-X as well, but in 
> > that case, I have no hardware platform to test on.

> > You can check whether or not MSI is actually being used by running 
> > '/sbin/lspci -v' and looking for the Capability: Message Signalled 
> > Interrupt. When the driver is running in MSI mode, it should read 
> > 'Enable+' instead of 'Enable-'.

This e1000 chip actually doesn't have MSI support.  Since the e1000 driver
triggered the hang and disabling MSI in the kernel made the hang go away, I
had assumed that the problem was MSI in the e1000.  However, the e1000 driver
only enables MSI on chips newer than the ones in the Dell 28xx machines.

> > As it's a Dell, I assume there are two Intel Pentium CPUs 
> > inside. Are you running with SMP enabled?

SMP is enabled.

> > The local (internal) CPU APIC hasn't been informed that the interrupt 
> > has been dealt with and it will therefore allow no other interrupts 
> > anymore to arrive in the CPU (including your keyboard's). 
> > In fact, your CPU is idle.

In the past I have used a PCI analyzer to observe infinite loops on this
machine during similar kernel issues, and I assumed this would be the same
because the symptoms matched.

> > When I build a kernel with Adeos but disable MSI then the 
> > system works fine for the most part.  There is one scenario 
> > where the system will still hang doing disk and network 
> > accesses under a moderate load of I/O. 
> > 
> > Hm. That may indicate another issue.
> 
> Indeed. This behaviour has not been reported yet with patches 
> from the Adeos I-pipe series. Does it also happen with SMP 
> disabled, or Hyperthreading disabled?

It did happen with SMP disabled, and I have always left hyperthreading
disabled because it is my understanding that hyperthreading is not supported
by the Adeos patch.

> > Try upgrading the kernel. The kernel usually comes with updated drivers 
> > as well. Currently I'm running 2.6.16-rc2, which I had to patch manually
> > for Adeos (about 3 'hunks' from the 2.6.15-i386-1.2-00 patch didn't 
> > apply properly). By using 2.6.16-rc2, I got much better Intel 
> > (especially i865 graphics) chipset support than 2.6.15. Note, however, 
> > that I did the bug fixing in this thread on a plain 2.6.15 (and 
> > the msi.c code is nearly identical).
> > 
> > I would recommend upgrading to 2.6.15 with the latest Adeos patch and 
> > trying to get a stable system before enabling MSI.

In short, MSI doesn't seem to have been my issue.  I now have a more stable
kernel.  Apparently this system had some other faults with the specific
configuration options I was using.  I had to update the kernel to 2.6.14.7
(from 2.6.14.4) and change some of the options in my .config.  Specifically,
I had to leave ACPI enabled (I had disabled it as a test a while back).  With
ACPI disabled, the machine would still hang if USB was disabled in the BIOS.

Now that I know how to check for MSI, I can see that no devices in my system
actually seem to be using it.  The code patches you provided were therefore
never executed.  Time will tell if my system is stable.
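
The lspci check quoted above can also be scripted.  What follows is a minimal
Python sketch and not something taken from this thread.  It assumes the
capability line looks either like the "Message Signalled Interrupts: ...
Enable-" form printed by older pciutils or like the "MSI: Enable+ ..." form
printed by newer ones.  The function name msi_status and the device names in
the sample text are made up for illustration.

```python
import re

# Matches the MSI capability line of `lspci -v` output and captures the
# '+' or '-' that follows "Enable". MSI-X lines ("MSI-X: ...") do not match.
MSI_LINE = re.compile(r"(Message Signalled Interrupts|MSI):.*\bEnable([+-])")


def msi_status(lspci_text):
    """Map each device header line to True/False (MSI enabled or not).

    Devices without an MSI capability line are left out of the result.
    """
    status = {}
    device = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            device = line.strip()  # an unindented line starts a new device
            continue
        match = MSI_LINE.search(line)
        if match and device is not None:
            status[device] = match.group(2) == "+"
    return status


# Illustrative sample only; real input would be the text printed by
# '/sbin/lspci -v' on the machine being checked.
SAMPLE = """\
00:01.0 Ethernet controller: Hypothetical NIC A
\tFlags: bus master, fast devsel, latency 32, IRQ 18
\tCapabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-

00:02.0 Ethernet controller: Hypothetical NIC B
\tCapabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+

00:03.0 USB controller: Hypothetical device without MSI
\tCapabilities: [50] Power Management version 2
"""

for dev, enabled in msi_status(SAMPLE).items():
    print(("MSI on : " if enabled else "MSI off: ") + dev)
```

An empty result means no device advertises an MSI capability at all, which
is a different situation from devices that report 'Enable-'.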

Thanks for your help!
Russ





___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


RE: [Xenomai-core] Handling PCI MSI interrupts

2006-02-16 Thread Russell Johnson
> The latest patch was incomplete; you might be luckier with 
> this one. I've merged Jeroen's last observations on this issue and mine.

I tried this patch and it doesn't solve the issue I'm facing. With and
without this patch, my symptoms are the same.

I'm running a Dell 2850, dual CPU machine.  When I build a kernel without
Adeos then things are fine.  When I build with Adeos and MSI enabled the
following occurs:

1) If BIOS has USB disabled then the system will hang without even a
num-lock response (i.e. tapping the num-lock key doesn't toggle the light).
The hang occurs just about the time the E1000 driver would load and enable
an MSI interrupt.

2) If BIOS has USB enabled then the system will run much longer but may hang
during heavy interrupt load on the E1000 driver.

My assumption based on past experience is that no num-lock response means an
infinite interrupt loop.

When I build a kernel with Adeos but disable MSI then the system works fine
for the most part.  There is one scenario where the system will still hang
doing disk and network accesses under a moderate load of I/O.

Both of these tests are just to get a stable kernel before I really start
using Adeos.  So Adeos is in its default configuration and I haven't loaded
Xenomai modules when these hangs occur.

I'm currently running the 2.6.14.4 kernel with the 2.6.14-1.0-12 Adeos
patch, plus your msi.c patch from the previous e-mail.  If you
have any further hints or suggestions I'll try them.  Meanwhile I'm trying
different versions of various drivers (e1000 and scsi) as well as updating
the patch level of the kernel itself.

Russ







RE: [Xenomai-core] Handling PCI MSI interrupts

2006-02-16 Thread Russell Johnson


> It's definitely an Adeos issue and msi.c needs fixing. What about this
> patch, do things improve with it (against 2.6.15-ipipe-1.2-00)?

I'm currently patching my setup, which started with ipipe-2.6.14-i386-1.0-12.
I've been having no luck with any MSI devices in the system, even when MSI
has supposedly been disabled for them.  I'll post my testing results in the
next day or so.

Russ




