Hi Tom,

When you got this oops, which e1000 driver were you using? Could you also post your Click config, if possible?

Thanks,
Joonwoo

On Mon, Mar 23, 2009 at 10:35 AM, Tom Gibson <[email protected]> wrote:
> Hi Roman,
>
> Thanks for the E1000E patch link. I'll give that a try.
>
> Here's a copy of what shows up through the serial port when I do
> click-install up through the point where the system freezes:
>
> [ 126.137863] click: starting router thread pid 3920 (ffff81021d5827c0)
> [ 126.283549] Unable to handle kernel NULL pointer dereference at 0000000000000008 RIP:
> [ 126.289028] [<ffffffff803ca0d7>] pfifo_fast_dequeue+0x48/0x69
> [ 126.297312] PGD 21d4c6067 PUD 21b94c067 PMD 0
> [ 126.301791] Oops: 0002 [1] SMP
> [ 126.304957] CPU 6
> [ 126.306980] Modules linked in: click proclikefs nls_utf8 nls_cp437 vfat fat nls_base appletalk nfsd auth_rpcgss exportfs n
> [ 126.360046] Pid: 0, comm: swapper Not tainted 2.6.24.7-click-amd64 #1
> [ 126.366471] RIP: 0010:[<ffffffff803ca0d7>] [<ffffffff803ca0d7>] pfifo_fast_dequeue+0x48/0x69
> [ 126.374990] RSP: 0018:ffff81021f207eb8 EFLAGS: 00010246
> [ 126.380283] RAX: 0000000000000000 RBX: ffff81021b9fa000 RCX: ffff81021c927a80
> [ 126.387400] RDX: ffff81021b8b42f0 RSI: ffff81021b9fa9c8 RDI: ffff81021b8b4200
> [ 126.394517] RBP: ffff81021b9fa000 R08: 0000000000000000 R09: ffffffff805a4180
> [ 126.401635] R10: 0000000000000001 R11: ffff81021f1f5278 R12: 0000000000000000
> [ 126.408752] R13: 0000000000000009 R14: ffff81021b9fa300 R15: ffff81021b9fa280
> [ 126.415870] FS: 0000000000000000(0000) GS:ffff81021f1b3f40(0000) knlGS:0000000000000000
> [ 126.423939] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> [ 126.429672] CR2: 0000000000000008 CR3: 000000021d4b8000 CR4: 00000000000006e0
> [ 126.436789] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 126.443906] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 126.451015] Process swapper (pid: 0, threadinfo ffff81021f1fe000, task ffff81021f1faa90)
> [ 126.459074] Stack: ffffffff803ca953 0000000000000040 00000000ffff256e ffff81021b9fa040
> [ 126.467134] ffff81021b9fa000 ffff81021b9fa040 0000000000000000 0000000000000009
> [ 126.474572] 0000000000000006 0000000000000000 ffffffff803b7b11 ffffffff80517120
> [ 126.481820] Call Trace:
> [ 126.484448] <IRQ> [<ffffffff803ca953>] __qdisc_run+0x94/0x1e5
> [ 126.490376] [<ffffffff803b7b11>] net_tx_action+0xbc/0xe4
> [ 126.495763] [<ffffffff8023cedd>] __do_softirq+0x5c/0xc2
> [ 126.501063] [<ffffffff8020a000>] default_idle+0x0/0x3d
> [ 126.506278] [<ffffffff8020d0fc>] call_softirq+0x1c/0x28
> [ 126.511580] [<ffffffff8020e784>] do_softirq+0x2c/0x7d
> [ 126.516706] [<ffffffff8023ccbe>] irq_exit+0x3f/0x84
> [ 126.521653] [<ffffffff8020e9b0>] do_IRQ+0xb7/0xd4
> [ 126.526436] [<ffffffff8020a087>] mwait_idle+0x0/0x45
> [ 126.531478] [<ffffffff8020a087>] mwait_idle+0x0/0x45
> [ 126.536521] [<ffffffff8020c481>] ret_from_intr+0x0/0xa
> [ 126.541733] <EOI> [<ffffffff8020a0c9>] mwait_idle+0x42/0x45
> [ 126.547486] [<ffffffff8020b0e6>] cpu_idle+0x95/0xde
> [ 126.552446]
> [ 126.553936]
> [ 126.553936] Code: 48 89 50 08 48 c7 01 00 00 00 00 48 c7 41 08 00 00 00 00 8b
> [ 126.562981] RIP [<ffffffff803ca0d7>] pfifo_fast_dequeue+0x48/0x69
> [ 126.569166] RSP <ffff81021f207eb8>
> [ 126.572651] CR2: 0000000000000008
> [ 126.575962] ---[ end trace 32f8f92d27157251 ]---
> [ 126.580564] Kernel panic - not syncing: Aiee, killing interrupt handler!
>
> I searched for pfifo_fast_dequeue in the Kernel source and I think it
> wound up being in the networking code. I don't know what else to check
> right now.
>
> -Tom
>
> On Mon, Mar 23, 2009 at 10:27 AM, Roman Chertov <[email protected]> wrote:
>
>> Tom,
>>
>> http://www.mail-archive.com/[email protected]/msg02730.html
>> This is the e1000e driver that Joonwoo released. When you use it, you
>> need to use PollDevice instead of FromDevice Click elements. As far as
>> crashes go, it would help to see the dmesg output when the crash
>> happens. You might be able to see the messages in /var/log/messages
>> even after the reboot. The other way is to use a serial console.
>>
>> Roman
>>
>> Tom Gibson wrote:
>>
>>> Hi All,
>>>
>>> I'm not sure what versions of things I should be using and what
>>> versions others use.
>>>
>>> For my Kernel I'm using the latest Click Kernel patch with 2.6.24.7
>>> 64bit on a dual E5410 Xeon server. My main (only) issue right now is
>>> that I get Kernel lockups (system freezes and keyboard LEDs just
>>> blink) too often. It sometimes happens randomly when the system is
>>> idle. It also happens every time I try to transmit data too close to
>>> line rate (4x 1Gig) using the fast UDP source element. I'm thinking
>>> maybe it's a bug that's fixed in a newer Kernel version. I'm
>>> researching debugging this sort of thing over the serial port, so
>>> I'll probably have more details soon.
>>>
>>> For my E1000 driver I use the version that comes in the 2.6.24.7
>>> Kernel w/ NAPI enabled (no click polling mode patch). I also tried
>>> compiling the latest stable E1000E driver (no click polling mode
>>> patches) and still got the Kernel freeze when transmitting too fast.
>>> I ran into issues trying to compile the patched E1000 driver in my
>>> Click directory. First it complained about the Makefile modifying
>>> CFLAGS, so I updated it to be more like the Makefile of the current
>>> E1000 driver. That fixed that problem, but it still failed to
>>> compile, complaining about unknown fields in some of the main
>>> network structs.
>>>
>>> I saw the latest stable Intel NIC drivers use an updated driver
>>> called E1000E for newer PCIe cards. Would it be a good idea to use
>>> this new driver and migrate the Click polling mode patches to it?
>>> How does NAPI support in the Intel drivers relate to Click's custom
>>> polling mode patches?
>>>
>>> I haven't worked with patches in Linux before besides applying them.
>>> I'm not sure how difficult it would be and what a good way would be
>>> to migrate the Click supplied patches to the newer versions of the
>>> Kernel and Intel NIC drivers. Does anyone have some advice on how to
>>> go about this?
>>>
>>> Thanks,
>>>
>>> Tom
>>> _______________________________________________
>>> click mailing list
>>> [email protected]
>>> https://amsterdam.lcs.mit.edu/mailman/listinfo/click
