Romain Lenglet wrote:
>>Please let me know if you experience any problems, so I can
>>update the driver...
> 
> 
> Ok, thank you. I *do* experience problems, hence here is a report 
> as complete as possible.
> 
> 
> I have tried to use one of my 8169 interfaces as a capture 
> interface on a 100Mbps network, for testing, with rtnet capture.
> Only one of my 8169 interfaces is activated in rtnet.conf 
> (RT_DRIVER_OPTIONS="cards=1,0,0,0").
> 
> 
> After starting rtnet capture, the console periodically prints the 
> following line many times in a row (15 to 30 times):
> r8169: rtl8169_rx_interrupt: Allocate n_skb failed! 
> (priv->rx_buf_size = 1536)
> ...
> followed by:
> r8169: Leaving Interrupt Unhandled
> 

Klaus, any ideas? The IRQ handler seems to run out of buffers, but this
should not happen with a normally loaded network and a mostly idle box.
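
To illustrate what "running out of buffers" means here, a toy model in
Python. This is only a sketch under assumptions: BufferPool and simulate
are hypothetical names, not RTnet or r8169 code. The receive path takes
one buffer from a fixed pool per frame, and a consumer hands buffers back.
Allocation fails only when buffers come back more slowly than frames
arrive.

```python
from collections import deque


class BufferPool:
    """Fixed-size pool of receive buffers (toy model, not the RTnet API)."""

    def __init__(self, size):
        self.free = deque(range(size))

    def alloc(self):
        # Return a free buffer, or None when the pool is exhausted.
        return self.free.popleft() if self.free else None

    def release(self, buf):
        self.free.append(buf)


def simulate(pool_size, frames, consumer_every):
    """Receive `frames` frames; the consumer returns one buffer every
    `consumer_every` frames. Returns the number of failed allocations."""
    pool = BufferPool(pool_size)
    in_flight = deque()  # buffers handed to the consumer, not yet returned
    failures = 0
    for n in range(1, frames + 1):
        buf = pool.alloc()
        if buf is None:
            failures += 1  # the frame is dropped, as in an "alloc failed" log line
        else:
            in_flight.append(buf)
        if n % consumer_every == 0 and in_flight:
            pool.release(in_flight.popleft())
    return failures


if __name__ == "__main__":
    print("consumer keeps up:", simulate(16, 100, 1))
    print("consumer too slow:", simulate(16, 100, 4))
```

If the consumer keeps up, the pool never drains. Once it falls behind,
the failures start as soon as the initial pool is used up.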

> 
> When capturing packets (e.g. starting tethereal -i rteth0), it 
> captures a few packets, then nothing for 30 seconds, then 
> captures a few packets again, then nothing for 30 seconds, etc.
> (although there are constantly many broadcast packets sent on 
> this network)
> 
> I also see sometimes the line:
> rteth0: Too much work at interrupt!
> 
> 
> In addition, the RT driver seems to block the interrupts of 
> another ethernet interface, that is non-realtime and that is 
> managed by the standard Linux 3Com 3c905 driver. I get 
> periodically messages on the console from that driver, such as:
> NETDEV WATCHDOG: eth2: transmit timed out
> eth2: transmit timed out, tx_status 00 status e681.
>   diagnostics: net 0ccc media 8880 dma 0000003a fifo 8000
> eth2: Interrupt posted but not delivered -- IRQ blocked by 
> another device?
>   flags: bus-master 1, dirty 5419(11) current 5419(11)
>   Transmit list 00000000 vs. f719f8e0
>   0: @f719f200 length 8000002a status 0001002a
>   1: @f719f2a0 length 8000002a status 0001002a
> (etc. 16 lines like those)
> 
> (in fact, I have 3 interfaces on that PC: eth2 is a 3Com 3C905, 
> and rteth0 and rteth1 are 8169-based interfaces)

Oops, try to resolve this first. Sharing IRQs between real-time and
non-real-time devices is NOT recommended under any circumstances. You may
even face crashes, as we recently did here with an older RTAI version. So
far, I'm not sure whether the crash is due to RTAI or to a bug in RTnet,
but since IRQ sharing makes no sense from the real-time point of view
anyway, we have not yet looked at this effect more closely.
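
A quick way to spot such a conflict is to compare the two proc files
quoted in the report. The following Python sketch assumes that
/proc/rtai/irq lists the IRQ lines claimed by real-time handlers (inferred
from the output in the report, not verified against the RTAI sources) and
that /proc/interrupts has the usual "number: counts type devices" layout.
The function names are made up for this example.

```python
import re


def parse_irq_table(text):
    """Map IRQ number -> list of remaining fields, for lines like ' 16:  56'."""
    table = {}
    for line in text.splitlines():
        # Tolerate mail quote markers ("> ") in pasted output.
        line = line.lstrip("> ").strip()
        m = re.match(r"(\d+):\s+(.*)", line)
        if m:  # header lines and NMI/LOC/ERR/MIS lines do not match
            table[int(m.group(1))] = m.group(2).split()
    return table


def linux_devices(fields):
    """Skip the per-CPU counters and the controller type, keep device names."""
    i = 0
    while i < len(fields) and fields[i].isdigit():
        i += 1
    return " ".join(fields[i + 1:])


def shared_irqs(rtai_text, linux_text):
    """IRQs that appear in the RTAI table and also have a Linux handler."""
    rt = parse_irq_table(rtai_text)
    lx = parse_irq_table(linux_text)
    return {irq: linux_devices(lx[irq])
            for irq in sorted(rt)
            if irq in lx and linux_devices(lx[irq])}


# Sample data copied from the report (excerpt).
RTAI_IRQ = """IRQ         CPU0
 16:          56
216:           1
"""
LINUX_IRQ = """           CPU0
 14:       6025    IO-APIC-edge  ide0
 16:      18795   IO-APIC-level  eth2
 17:          0   IO-APIC-level  uhci_hcd:usb1
NMI:          0
"""

if __name__ == "__main__":
    for irq, devs in shared_irqs(RTAI_IRQ, LINUX_IRQ).items():
        print(f"IRQ {irq} is used by RTAI and by Linux device(s): {devs}")
```

On a live system the two strings would be read from /proc/rtai/irq and
/proc/interrupts instead of being pasted in.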

> 
> And indeed, the traffic on the non-RT interface stops: an ssh 
> connection that was open becomes unresponsive once I have 
> started rtnet and rtcap.
> If I stop rtnet, i.e. if I put the RT interface down (no need to 
> unload the RT driver module), it depends on the state of RTAI:
> 
> - if RTAI becomes completely broken (i.e. /proc/rtai/sched says: 
> "Cannot allocate memory"), then unloading rtnet is not 
> sufficient: the non-RT network card interrupts are still 
> blocked, and I need to rmmod and insmod the rt_rtdm module again 
> to unblock the non-RT interface, and recover a 
> normal /proc/rtai/sched output.
> 
> - if RTAI is not yet completely broken, it is sufficient to 
> stop rtnet.
> 
> 
> 
> Just after starting rtnet capture, here is the content of 
> my /proc/rtai/sched:
> CPU  PID    PRI  TIMEOUT  STAT       NAME
>   0  0      0    0        R          ROOT
>   0  0      98   0        W          f8b5b9c0
>   0  0      1    0        W          f8b5c120
> 
> But after a while, the content of /proc/rtai/sched becomes:
> /proc/rtai/sched: Cannot allocate memory
> 
> (the only way to solve this is to rmmod the rtai_rtdm module and 
> insmod it again)
> 
> 
> My RT interface and my non-RT interface share the same IRQ 16.
> Cf the content of /proc/rtai/irq:
> IRQ         CPU0
>  16:          56
> 216:           1
> and the content of /proc/interrupts:
>            CPU0
>   0:    2751710    IO-APIC-edge  timer, rthal_broadcast_timer
>   1:       3784    IO-APIC-edge  i8042
>   7:          0    IO-APIC-edge  parport0
>   8:          1    IO-APIC-edge  rtc
>   9:          1   IO-APIC-level  acpi
>  14:       6025    IO-APIC-edge  ide0
>  15:         26    IO-APIC-edge  ide1
>  16:      18795   IO-APIC-level  eth2
>  17:          0   IO-APIC-level  uhci_hcd:usb1
>  18:          0   IO-APIC-level  Intel ICH2
>  19:          0   IO-APIC-level  uhci_hcd:usb2
> NMI:          0
> LOC:    2751621
> ERR:          0
> MIS:          0
> 
> 
> 
> So, as a conclusion: the good news is that the driver is somewhat 
> working (I can capture some packets with it...), but 1) 
> interrupt handling seems to be broken, and 2) there seems to be 
> a memory leak in RTAI/RTDM or in RTnet.
> Probably those two bugs are related.
> 

Resolve the IRQ conflict first and then check whether the problems persist.
If they do, please report again.

Jan
