Re: [Xenomai-core] [RTnet-users] Potential problem with rt_eepro100
Anders Blomdell wrote:
Anders Blomdell wrote:

Hi,

I'm trying to use rt_eepro100 for sending raw Ethernet packets, but I'm occasionally experiencing weird behaviour.

Versions of things:

  linux-2.6.34.5
  xenomai-2.5.5.2
  rtnet-39f7fcf

The test program runs on two computers with Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 08) controllers, where one computer acts as a mirror, sending back packets received from the Ethernet (only those two computers are on the network), and the other sends packets and measures round-trip time. Most packets come back in approximately 100 us, but occasionally the reception times out (once in about 10 packets or more). The packets get received immediately when reception is retried, which might indicate a race between rt_dev_recvmsg and the interrupt, but I might be missing something obvious.

Changing one of the Ethernet cards to an Intel Corporation 82541PI Gigabit Ethernet Controller (rev 05), while keeping everything else constant, changes the behaviour somewhat: after receiving a few tens of packets, reception stops entirely (-EAGAIN is returned), while transmission proceeds as it should (and the mirror returns packets).

Any suggestions on what to try?

Since the problem disappears with 'maxcpus=1', I suspect I have an SMP issue (the machine is a Core2 Quad), so I'll move to xenomai-core.

(The original message can be found at http://sourceforge.net/mailarchive/message.php?msg_name=4CC82C8D.3080808%40control.lth.se )

Xenomai-core gurus: which is the correct way to debug SMP issues? Can I run the I-pipe tracer and expect to be able to save at least 150 us of traces for all CPUs?

Any hints/suggestions/insights are welcome...

Regards

Anders Blomdell

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
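The observation above (a timeout, followed by an immediate success on retry) can be turned into a simple per-round classification in the test program. What follows is only a minimal sketch in plain C, not the actual raw_test program: `rx_fn`, `record_round`, `struct rx_stats` and `scripted_rx` are hypothetical names, and the function pointer merely stands in for the real receive call (rt_dev_recvmsg in the report), so that the logic can be exercised without RTnet.

```c
#include <assert.h>
#include <stddef.h>

/* Outcome of one receive attempt, as seen by the test program. */
enum rx_result { RX_OK, RX_TIMEOUT };

/* Hypothetical receive hook: stands in for the real receive call
 * (rt_dev_recvmsg in the report). It is not part of any RTnet API. */
typedef enum rx_result (*rx_fn)(void *ctx);

struct rx_stats {
    unsigned ok;    /* first attempt succeeded */
    unsigned late;  /* first attempt timed out, retry succeeded */
    unsigned lost;  /* both attempts timed out */
};

/* One round: try to receive, and retry exactly once after a timeout. */
static void record_round(struct rx_stats *st, rx_fn rx, void *ctx)
{
    assert(st != NULL && rx != NULL);

    if (rx(ctx) == RX_OK) {
        st->ok++;
        return;
    }
    if (rx(ctx) == RX_OK)
        st->late++;   /* packet was apparently already queued */
    else
        st->lost++;   /* packet really did not arrive in time */
}

/* Scripted receiver used only to exercise the logic above. */
struct script {
    const enum rx_result *steps;
    size_t len;
    size_t pos;
};

static enum rx_result scripted_rx(void *ctx)
{
    struct script *s = ctx;

    assert(s->pos < s->len);
    return s->steps[s->pos++];
}
```

A high `late` count with a near-zero `lost` count would match the behaviour described above: the packets do arrive, but the first receive call is not woken up in time. The sketch does not measure how quickly the retry returns, which the real test would also need to record.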
Re: [Xenomai-core] [RTnet-users] Potential problem with rt_eepro100
On 28.10.2010 09:34, Anders Blomdell wrote:
> [...]
> Xenomai-core gurus: which is the correct way to debug SMP issues? Can I run the I-pipe tracer and expect to be able to save at least 150 us of traces for all CPUs?
>
> Any hints/suggestions/insights are welcome...

The I-pipe tracer unfortunately only saves traces for the CPU that triggered the freeze. To get the full picture, you may want to try the ftrace port I posted recently for 2.6.35.

Jan
Re: [Xenomai-core] [RTnet-users] Potential problem with rt_eepro100
Jan Kiszka wrote:
> [...]
> The I-pipe tracer unfortunately only saves traces for the CPU that triggered the freeze. To get the full picture, you may want to try the ftrace port I posted recently for 2.6.35.

2.6.35.7 ?

/Anders
Re: [Xenomai-core] [RTnet-users] Potential problem with rt_eepro100
On 28.10.2010 11:34, Anders Blomdell wrote:
> Jan Kiszka wrote:
>> [...]
>> To get the full picture, you may want to try the ftrace port I posted recently for 2.6.35.
>
> 2.6.35.7 ?

Exactly.

Jan
Re: [Xenomai-core] Potential problem with rt_eepro100
Jan Kiszka wrote:
> On 28.10.2010 11:34, Anders Blomdell wrote:
>> Jan Kiszka wrote:
>>> [...]
>>> To get the full picture, you may want to try the ftrace port I posted recently for 2.6.35.
>>
>> 2.6.35.7 ?

Well, 2.6.35.7/xenomai/rtnet without the ftrace patch freezes after approx. 8000 rounds (16000 packets). Time to freshen up my serial port console debugging, I guess (under the assumption that this is the same bug, but easier to reproduce).

/Anders
Re: [Xenomai-core] Potential problem with rt_eepro100
On 28.10.2010 17:05, Anders Blomdell wrote:
> Current results:
>
> 1. 2.6.35.7, maxcpus=1; a few thousand rounds, then a freeze, and this after some time:
>
>   BUG: spinlock lockup on CPU#0, raw_test/2924, c0a1b540
>   Process raw_test (pid: 2924, ti=f18bc000 task=f1bdab00 task.ti=f18bc000)
>   I-pipe domain Xenomai
>   Stack:
>   Call Trace:
>   Code: d0 85 d8 74 0f ba 71 00 00 00 b8 78 a7 8f c0 e8 3b 03 02 00 a1 28 f6 9f c0 89 f2 8b 48 20 89 d8 e8 ad fd ff ff 57 9d 8d 65 f4 5b 5e 5f 5d c3 90 55 89 e5 0f 1f 44 00 00 8b 0d 28 f6 9f c0 ba 00

Please provide the full kernel log, ideally also with the I-pipe tracer (with panic tracing) enabled.

> 2. 2.6.35.7, maxcpus=4; no packets sent, this on console:
>
>   e1000: rteth0: e1000_clean_tx_irq: Detected Tx Unit Hang

Err, is this another NIC used on this box? If so, does it trigger the same issue when used as the RTnet NIC instead?

>   Tx Queue 0
>   TDH 0
>   TDT 10
>   next_to_use 10
>   next_to_clean 0
>   buffer_info[next_to_clean]
>   time_stamp 362d2
>   next_to_watch 0
>   jiffies 368f8
>   next_to_watch.status 0
>
> Anders

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
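The Tx ring dump quoted above can be read with a little arithmetic. The sketch below makes two assumptions that the thread does not confirm: that all values in the dump are hexadecimal (time_stamp and jiffies clearly are, so TDT 10 would mean 0x10), and that the ring holds 256 descriptors. The helper names `tx_pending` and `jiffies_age` are made up for illustration and are not driver functions.

```c
#include <assert.h>
#include <stdint.h>

/* Number of descriptors handed to the NIC but not yet fetched by it:
 * the distance from the hardware head (TDH) to the software tail (TDT),
 * taken modulo the ring size so that wrap-around is handled. */
static uint32_t tx_pending(uint32_t tdh, uint32_t tdt, uint32_t ring_size)
{
    assert(ring_size > 0 && tdh < ring_size && tdt < ring_size);
    return (tdt + ring_size - tdh) % ring_size;
}

/* Age of a buffer in jiffies. Unsigned subtraction gives the right
 * answer even if the jiffies counter wrapped between the two samples. */
static uint32_t jiffies_age(uint32_t now, uint32_t stamp)
{
    return now - stamp;
}
```

Under those assumptions, `tx_pending(0x0, 0x10, 256)` yields 16 and `jiffies_age(0x368f8, 0x362d2)` yields 1574 (0x626): the head never moved off descriptor 0 while 16 descriptors were queued, and the oldest buffer had been waiting for 1574 jiffies. Converting that to seconds requires the kernel's HZ value, which is not given in the thread.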
Re: [Xenomai-core] Potential problem with rt_eepro100
On 28.10.2010 17:18, Anders Blomdell wrote:
> On 2010-10-28 17.09, Jan Kiszka wrote:
>> On 28.10.2010 17:05, Anders Blomdell wrote:
>>> Current results:
>>>
>>> 1. 2.6.35.7, maxcpus=1; a few thousand rounds, then a freeze, and this after some time:
>>>
>>>   BUG: spinlock lockup on CPU#0, raw_test/2924, c0a1b540
>>>   Process raw_test (pid: 2924, ti=f18bc000 task=f1bdab00 task.ti=f18bc000)
>>>   I-pipe domain Xenomai
>>>   Stack:
>>>   Call Trace:
>>>   Code: d0 85 d8 74 0f ba 71 00 00 00 b8 78 a7 8f c0 e8 3b 03 02 00 a1 28 f6 9f c0 89 f2 8b 48 20 89 d8 e8 ad fd ff ff 57 9d 8d 65 f4 5b 5e 5f 5d c3 90 55 89 e5 0f 1f 44 00 00 8b 0d 28 f6 9f c0 ba 00
>>
>> Please provide the full kernel log, ideally also with the I-pipe tracer (with panic tracing) enabled.
>
> Will reconfigure/recompile and do that. By full kernel log, do you mean all the boot-up info?

That's best, to avoid missing some detail or doing Q&A ping-pong.

>>> 2. 2.6.35.7, maxcpus=4; no packets sent, this on console:
>>>
>>>   e1000: rteth0: e1000_clean_tx_irq: Detected Tx Unit Hang
>>
>> Err, is this another NIC used on this box? If so, does it trigger the same issue when used as the RTnet NIC instead?
>
> I switched the eepro100 to an e1000 and got the same user program issues (indicating ipipe/rtdm/rtnet_stack issues rather than a specific driver). I have not switched back yet, but will do so if you would rather have that.

As both NICs are apparently affected, just stick with one.

Jan
[Xenomai-core] arm: Unprotected access to irq_desc field?
Gilles,

I happened to come across rthal_mark_irq_disabled/enabled on arm. At first glance, it looks like these helpers manipulate irq_desc::status non-atomically, i.e. without holding irq_desc::lock. Isn't this fragile?

Jan
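For readers unfamiliar with the concern, the generic hazard behind an unlocked read-modify-write of a shared status word is a lost update. The following is a deterministic single-threaded simulation of that interleaving, not the actual rthal or kernel code: `struct fake_desc`, the two flag values and both functions are invented for illustration. Whether such an interleaving can actually occur for HAL-managed IRQs is exactly what the replies in this thread discuss.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits; they do not correspond to real IRQ_* values. */
#define FLAG_DISABLED 0x1u
#define FLAG_PENDING  0x2u

/* Stand-in for a descriptor with a shared status word. */
struct fake_desc {
    uint32_t status;
};

/* Two updaters interleave their load/modify/store steps without a lock:
 * both load the old value before either one stores its result. */
static uint32_t interleaved_update(struct fake_desc *d)
{
    uint32_t a = d->status;          /* CPU A: load */
    uint32_t b = d->status;          /* CPU B: load, before A stores */

    d->status = a | FLAG_DISABLED;   /* CPU A: store */
    d->status = b | FLAG_PENDING;    /* CPU B: store, overwrites A's bit */
    return d->status;
}

/* The same two updates when they are serialized, as a lock would force. */
static uint32_t serialized_update(struct fake_desc *d)
{
    d->status |= FLAG_DISABLED;      /* CPU A completes first */
    d->status |= FLAG_PENDING;       /* CPU B then sees A's result */
    return d->status;
}
```

Starting from a status of 0, the interleaved version ends with only FLAG_PENDING set, so the "disabled" mark is silently lost, while the serialized version ends with both bits set.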
Re: [Xenomai-core] arm: Unprotected access to irq_desc field?
Jan Kiszka wrote:
> Gilles,
>
> I happened to come across rthal_mark_irq_disabled/enabled on arm. At first glance, it looks like these helpers manipulate irq_desc::status non-atomically, i.e. without holding irq_desc::lock. Isn't this fragile?

I have no idea. How do the other architectures do it? As far as I know, this code has been copied from there.

--
Gilles.
Re: [Xenomai-core] arm: Unprotected access to irq_desc field?
On Thu, 2010-10-28 at 21:15 +0200, Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> [...]
>> At first glance, it looks like these helpers manipulate irq_desc::status non-atomically, i.e. without holding irq_desc::lock. Isn't this fragile?
>
> I have no idea. How do the other architectures do it? As far as I know, this code has been copied from there.

Other archs do the same, simply because once an IRQ is managed by the HAL, it may not be shared in any way with the regular kernel. So locking is pointless.

--
Philippe.
Re: [Xenomai-core] arm: Unprotected access to irq_desc field?
Jan Kiszka wrote:
> Gilles,
>
> I happened to come across rthal_mark_irq_disabled/enabled on arm. At first glance, it looks like these helpers manipulate irq_desc::status non-atomically, i.e. without holding irq_desc::lock. Isn't this fragile?

From my point of view, locking anything would be overkill on ARM: IRQ configurations are completely static for a given board, so ARM platforms can use proper IRQ demuxing instead of the shared-IRQ workaround. In other words, I do not see why we would need any locking.

--
Gilles.