Nuutti,

Thanks very much for these dumps and this config.  Pretty informative.

Here are some debugging suggestions.

(0) This looks distinctly like memory corruption, possibly within ToDevice.  I 
will look at Queue itself as well, but it seems an unlikely source of 
problems, since your Click was not configured with --enable-multithread.

(1) Perhaps the problem is with EtherSwitch, whose internal hash table may be 
causing problems in SMP settings.  Can you try again, replacing the 
EtherSwitch element with a Hub element?  A Hub sends each packet out every 
other port, so it will do the same job here, but without a table.  I expect 
this will also fail.
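
A minimal sketch of the Hub variant follows.  It assumes a two-port config 
of the usual FromDevice/Queue/ToDevice shape and borrows the device names 
ethI0 and ethI1 from your topology diagram; both are guesses about the 
attached config, so adjust them to match what etherswitch.click really does.

```
// Hypothetical two-port hub; device names and layout are assumptions.
hub :: Hub;

FromDevice(ethI0) -> [0]hub;
FromDevice(ethI1) -> [1]hub;

hub[0] -> Queue -> ToDevice(ethI0);
hub[1] -> Queue -> ToDevice(ethI1);
```

Hub uses the same port numbering as EtherSwitch, so swapping the element 
declaration should be the only change needed.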

(2) To narrow down the problem, we can try very simple ToDevice and Queue 
configs.  This would involve:

- ia32
- either the manual kernel patch or --enable-fixincludes
- SMP kernel
- The following configs:

InfiniteSource(DATA \<plausible-data-for-an-ethernet-packet>)
-> ToDevice(eth0);

-*- OR

InfiniteSource(DATA \<plausible-data-for-an-ethernet-packet>)
-> Queue
-> ToDevice(eth0);

-*- OR

InfiniteSource(DATA \<plausible-data-for-an-ethernet-packet>)
-> ToDevice(eth0);
InfiniteSource(DATA \<plausible-data-for-an-ethernet-packet>)
-> ToDevice(eth1);

-*- OR

InfiniteSource(DATA \<plausible-data-for-an-ethernet-packet>)
-> Queue
-> ToDevice(eth0);
InfiniteSource(DATA \<plausible-data-for-an-ethernet-packet>)
-> Queue
-> ToDevice(eth1);


------

These configs test ToDevice with and without a Queue, and with one device 
and with two.
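
For concreteness, here is one way the packet-data placeholder could be filled 
in, shown for the single-device Queue variant.  The frame contents are 
assumptions, not taken from your setup: a broadcast destination, a made-up 
locally administered source address (02:00:00:00:00:01), the IEEE 
local-experimental EtherType 0x88B5, and 46 bytes of zero padding, giving a 
60-byte minimum-size frame (the NIC adds the 4-byte CRC).

```
// Illustrative only: the addresses and EtherType are placeholders.
InfiniteSource(DATA \<ff ff ff ff ff ff  02 00 00 00 00 01  88 b5
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00>)
-> Queue
-> ToDevice(eth0);
```

The same DATA argument can be reused in the other three configs.  An 
unassigned EtherType keeps other hosts on the wire from trying to interpret 
the flood as IP traffic.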

We'll look in parallel, but I'm interested in what you see.

Eddie




Nuutti Varis wrote:
> Hey, 
> 
> While trying to run throughput measurements with Click in a kernel, running a 
> simple EtherSwitch configuration (attached as etherswitch.click) in a 
> topology of:
> 
> EndHostA::ethI0 <==> ethI0::EtherSwitch1::ethI1 <==> ethI1::EtherSwitch2::ethI0 <==> ethI0::EndHostB
> 192.168.2.1 ---------------------------------------------------------------------------> 192.168.2.2
> FastUDPSrc w/ 64B packet, 300kpp/s
> 
> I stumbled upon a kernel crash, seemingly when the Queue elements started 
> dropping packets due to overflow. I tried this with two different kernel 
> versions (2.6.31.12 and 2.6.24.7) and with either 2.6.24.7 manual patch, or 
> with --enable-fixincludes. Interestingly, the kernel crash does not happen 
> when I disable SMP from the kernel. Additionally, normal linux bridging does 
> not crash the kernel on overflows. Partial/full crash dumps as attachments 
> from various days of testing.
> 
> Configuration stuff of the EtherSwitch{1,2}:
> - Dumps arch indicated in the filename, either amd64 or ia32
> - MTU of ethI1 is 1540 (tried with 1500 as well, no difference)
> - Click is configured with --enable-linuxmodule --enable-userlevel 
> --enable-etherswitch [--enable-fixincludes]
> - Kernel does not have any pre-empting enabled.
> - Both e1000e poll-patched and vanilla cause the problem
> - e1000e versions 0.4.1.7 and 1.0.2-k2 (comes with 2.6.31.12) cause the 
> problem
> 
> 
> 
> 
> ------------------------------------------------------------------------
> 
> 
> 
> 
> --
> Nuutti Varis ([email protected])
> PhD Student, Aalto University School of Science and Technology
> Department of Communications and Networking
> 
> 
> 
> 
> 
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> click mailing list
> [email protected]
> https://amsterdam.lcs.mit.edu/mailman/listinfo/click