Hi Tom.

Which e1000 driver were you using when you got this oops?
Could you also post your Click config, if possible?

Thanks,
Joonwoo

On Mon, Mar 23, 2009 at 10:35 AM, Tom Gibson <[email protected]> wrote:
> Hi Roman,
>
> Thanks for the E1000E patch link.  I'll give that a try.
>
> Here's what shows up on the serial port from the moment I run
> click-install until the system freezes:
>
> [  126.137863] click: starting router thread pid 3920 (ffff81021d5827c0)
> [  126.283549] Unable to handle kernel NULL pointer dereference at 0000000000000008 RIP:
> [  126.289028]  [<ffffffff803ca0d7>] pfifo_fast_dequeue+0x48/0x69
> [  126.297312] PGD 21d4c6067 PUD 21b94c067 PMD 0
> [  126.301791] Oops: 0002 [1] SMP
> [  126.304957] CPU 6
> [  126.306980] Modules linked in: click proclikefs nls_utf8 nls_cp437 vfat fat nls_base appletalk nfsd auth_rpcgss exportfs n
> [  126.360046] Pid: 0, comm: swapper Not tainted 2.6.24.7-click-amd64 #1
> [  126.366471] RIP: 0010:[<ffffffff803ca0d7>]  [<ffffffff803ca0d7>] pfifo_fast_dequeue+0x48/0x69
> [  126.374990] RSP: 0018:ffff81021f207eb8  EFLAGS: 00010246
> [  126.380283] RAX: 0000000000000000 RBX: ffff81021b9fa000 RCX: ffff81021c927a80
> [  126.387400] RDX: ffff81021b8b42f0 RSI: ffff81021b9fa9c8 RDI: ffff81021b8b4200
> [  126.394517] RBP: ffff81021b9fa000 R08: 0000000000000000 R09: ffffffff805a4180
> [  126.401635] R10: 0000000000000001 R11: ffff81021f1f5278 R12: 0000000000000000
> [  126.408752] R13: 0000000000000009 R14: ffff81021b9fa300 R15: ffff81021b9fa280
> [  126.415870] FS:  0000000000000000(0000) GS:ffff81021f1b3f40(0000) knlGS:0000000000000000
> [  126.423939] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> [  126.429672] CR2: 0000000000000008 CR3: 000000021d4b8000 CR4: 00000000000006e0
> [  126.436789] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  126.443906] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [  126.451015] Process swapper (pid: 0, threadinfo ffff81021f1fe000, task ffff81021f1faa90)
> [  126.459074] Stack:  ffffffff803ca953 0000000000000040 00000000ffff256e ffff81021b9fa040
> [  126.467134]  ffff81021b9fa000 ffff81021b9fa040 0000000000000000 0000000000000009
> [  126.474572]  0000000000000006 0000000000000000 ffffffff803b7b11 ffffffff80517120
> [  126.481820] Call Trace:
> [  126.484448]  <IRQ>  [<ffffffff803ca953>] __qdisc_run+0x94/0x1e5
> [  126.490376]  [<ffffffff803b7b11>] net_tx_action+0xbc/0xe4
> [  126.495763]  [<ffffffff8023cedd>] __do_softirq+0x5c/0xc2
> [  126.501063]  [<ffffffff8020a000>] default_idle+0x0/0x3d
> [  126.506278]  [<ffffffff8020d0fc>] call_softirq+0x1c/0x28
> [  126.511580]  [<ffffffff8020e784>] do_softirq+0x2c/0x7d
> [  126.516706]  [<ffffffff8023ccbe>] irq_exit+0x3f/0x84
> [  126.521653]  [<ffffffff8020e9b0>] do_IRQ+0xb7/0xd4
> [  126.526436]  [<ffffffff8020a087>] mwait_idle+0x0/0x45
> [  126.531478]  [<ffffffff8020a087>] mwait_idle+0x0/0x45
> [  126.536521]  [<ffffffff8020c481>] ret_from_intr+0x0/0xa
> [  126.541733]  <EOI>  [<ffffffff8020a0c9>] mwait_idle+0x42/0x45
> [  126.547486]  [<ffffffff8020b0e6>] cpu_idle+0x95/0xde
> [  126.552446]
> [  126.553936]
> [  126.553936] Code: 48 89 50 08 48 c7 01 00 00 00 00 48 c7 41 08 00 00 00 00 8b
> [  126.562981] RIP  [<ffffffff803ca0d7>] pfifo_fast_dequeue+0x48/0x69
> [  126.569166]  RSP <ffff81021f207eb8>
> [  126.572651] CR2: 0000000000000008
> [  126.575962] ---[ end trace 32f8f92d27157251 ]---
> [  126.580564] Kernel panic - not syncing: Aiee, killing interrupt handler!
>
>
> I searched for pfifo_fast_dequeue in the kernel source, and it appears to
> be part of the networking code.  I don't know what else to check right now.
>
> -Tom
>
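
One way to get more out of an oops like the one above is to pull out the
fault address, the faulting frame, the callers, and any registers that are
zero. The sketch below is only an illustration: the function name
summarize_oops and the regular expressions are assumptions made for this
example, not part of Click or of any kernel tool, and the sample text is a
subset of the lines from the log above with wrapped lines rejoined. In this
oops the fault address is 8 and RAX is 0; the first Code bytes, 48 89 50 08,
decode on x86-64 to a store of RDX to 8(%rax), so the trace looks consistent
with a write through a NULL pointer at offset 8, though that reading is an
inference and not a confirmed diagnosis.

```python
import re

# Subset of the oops lines quoted above, with wrapped lines rejoined.
OOPS = """\
[  126.283549] Unable to handle kernel NULL pointer dereference at 0000000000000008 RIP:
[  126.289028]  [<ffffffff803ca0d7>] pfifo_fast_dequeue+0x48/0x69
[  126.380283] RAX: 0000000000000000 RBX: ffff81021b9fa000 RCX: ffff81021c927a80
[  126.484448]  <IRQ>  [<ffffffff803ca953>] __qdisc_run+0x94/0x1e5
[  126.490376]  [<ffffffff803b7b11>] net_tx_action+0xbc/0xe4
[  126.495763]  [<ffffffff8023cedd>] __do_softirq+0x5c/0xc2
[  126.572651] CR2: 0000000000000008
"""

# Matches "[<address>] symbol+0xoffset/0xsize" stack frame entries.
FRAME = re.compile(r"\[<([0-9a-f]{16})>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")
# Matches "NAME: 16-hex-digit value" register dumps such as RAX or CR2.
REG = re.compile(r"\b([A-Z][A-Z0-9]{1,2}): ([0-9a-f]{16})\b")


def summarize_oops(text):
    """Hypothetical helper: collect the key facts from an x86-64 oops dump."""
    frames = [(m.group(2), int(m.group(3), 16)) for m in FRAME.finditer(text)]
    regs = {name: int(value, 16) for name, value in REG.findall(text)}
    return {
        "fault_addr": regs.get("CR2"),
        "faulting_frame": frames[0] if frames else None,
        "callers": [name for name, _ in frames[1:]],
        "null_regs": sorted(n for n, v in regs.items() if v == 0 and n != "CR2"),
    }


if __name__ == "__main__":
    for key, value in summarize_oops(OOPS).items():
        print(key, "=", value)
```

Feeding it the complete log instead of the subset works the same way; the
first frame it finds is the RIP line, and the rest are the call trace.
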
>
> On Mon, Mar 23, 2009 at 10:27 AM, Roman Chertov <[email protected]> wrote:
>
>> Tom,
>>
>> http://www.mail-archive.com/[email protected]/msg02730.html
>> This is the e1000e driver that Joonwoo released.  When you use it, you
>> need to use the PollDevice element instead of FromDevice.  As for the
>> crashes, it would help to see the dmesg output from when the crash
>> happens.  You might be able to find the messages in /var/log/messages
>> even after the reboot.  The other option is to use a serial console.
>>
>> Roman
>>
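
For the /var/log/messages suggestion above, a small script can save some
scrolling after a reboot. This is a minimal sketch, assuming the log is
readable at that path and that the crash output reached the disk before the
freeze, which is not guaranteed. The names find_oops_lines and scan_log are
made up for this example, and the marker strings are simply the ones that
appear in the oops quoted in this thread.

```python
import re

# Marker strings taken from the oops quoted in this thread.
MARKERS = re.compile(r"Unable to handle kernel|Oops:|Kernel panic")


def find_oops_lines(lines):
    """Return (line_number, text) for every line containing an oops marker."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, 1)
        if MARKERS.search(line)
    ]


def scan_log(path="/var/log/messages"):
    """Scan a saved log file; the default path is the one suggested above."""
    with open(path, errors="replace") as log:
        return find_oops_lines(log)


if __name__ == "__main__":
    for number, text in scan_log():
        print(f"{number}: {text}")
```

If nothing turns up, the messages most likely never made it to disk, and the
serial console is the more reliable route.
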
>> Tom Gibson wrote:
>>
>>> Hi All,
>>>
>>> I'm not sure which versions of things I should be using, or which versions
>>> others use.
>>>
>>> For my kernel I'm using the latest Click kernel patch with 2.6.24.7 64-bit
>>> on a dual E5410 Xeon server.  My main (only) issue right now is that I get
>>> kernel lockups (the system freezes and the keyboard LEDs just blink) too
>>> often.  Sometimes it happens randomly when the system is idle.  It also
>>> happens every time I try to transmit data too close to line rate (4x 1Gig)
>>> using the fast UDP source element.  I'm thinking maybe it's a bug that's
>>> fixed in a newer kernel version.  I'm researching how to debug this sort
>>> of thing over the serial port, so I'll probably have more details soon.
>>>
>>> For my E1000 driver I use the version that comes with the 2.6.24.7 kernel,
>>> with NAPI enabled (no Click polling mode patch).  I also tried compiling
>>> the latest stable E1000E driver (no Click polling mode patches) and still
>>> got the kernel freeze when transmitting too fast.  I ran into issues
>>> trying to compile the patched E1000 driver in my Click directory.  First
>>> it complained about the Makefile modifying CFLAGS, so I updated it to be
>>> more like the Makefile of the current E1000 driver.  That fixed that
>>> problem, but the build still failed, complaining about unknown fields in
>>> some of the main network structs.
>>>
>>> I saw that the latest stable Intel NIC drivers include an updated driver
>>> called E1000E for newer PCIe cards.  Would it be a good idea to use this
>>> new driver and migrate the Click polling mode patches to it?  How does
>>> NAPI support in the Intel drivers relate to Click's custom polling mode
>>> patches?
>>>
>>> I haven't worked with patches in Linux before, besides applying them.  I'm
>>> not sure how difficult it would be, or what a good way would be, to
>>> migrate the Click-supplied patches to the newer versions of the kernel and
>>> Intel NIC drivers.  Does anyone have some advice on how to go about this?
>>>
>>> Thanks,
>>>
>>> Tom
>>> _______________________________________________
>>> click mailing list
>>> [email protected]
>>> https://amsterdam.lcs.mit.edu/mailman/listinfo/click
>>>
>>>
>>
>>
>

