Hey,

FYI, commit

http://read.cs.ucla.edu/gitweb?p=click;a=commit;h=f287e014f89a85276bb39c29d96b08600e2d1a49

probably fixed the issue, at least on an SMP kernel without --enable-multithread. The setup has been running fine for the better part of an hour without crashing, whereas before the crashes (with e1000e -NAPI) came within seconds.

On Feb 10, 2010, at 10:36 PM, Eddie Kohler wrote:

> Hi Nuutti,
>
> There is a small chance this commit may fix your issue:
>
> http://www.read.cs.ucla.edu/gitweb?p=click;a=commit;h=01c8f4e084036338e83a6bff7a8e74dc49caa014
>
> If it does not, I think we need more input from you to narrow it down...
>
> Thanks so much,
> Eddie
>
>
> Eddie Kohler wrote:
>> Nuutti,
>> Thanks very much for these dumps and this config. Pretty informative.
>> Here are some debugging suggestions.
>> (0) This distinctly looks like memory corruption, possibly within ToDevice.
>> I will look at Queue itself, as well, but this seems like an unlikely source
>> of problems, since your Click is not installed with --enable-multithread.
>> (1) Perhaps the problem is with EtherSwitch, whose internal hash table may
>> be causing problems in SMP settings. Can you try again, replacing the
>> EtherSwitch element with a Hub element? This will do the same job, but
>> without a table. My expectation is this will also fail.
>> (2) To narrow down the problem, we can try very simple ToDevice and Queue
>> configs. This would involve:
>> - ia32
>> - either patch or fixincludes
>> - SMP kernel
>> - The following configs:
>> InfiniteSource(DATA \<plausible-data-for-an-ethernet-packet>)
>> -> ToDevice(eth0);
>> -*- OR
>> InfiniteSource(DATA \<plausible-data-for-an-ethernet-packet>)
>> -> Queue
>> -> ToDevice(eth0);
>> -*- OR
>> InfiniteSource(DATA \<plausible-data-for-an-ethernet-packet>)
>> -> ToDevice(eth0);
>> InfiniteSource(DATA \<plausible-data-for-an-ethernet-packet>)
>> -> ToDevice(eth1);
>> -*- OR
>> InfiniteSource(DATA \<plausible-data-for-an-ethernet-packet>)
>> -> Queue
>> -> ToDevice(eth0);
>> InfiniteSource(DATA \<plausible-data-for-an-ethernet-packet>)
>> -> Queue
>> -> ToDevice(eth1);
>> ------
>> These configs test ToDevice with and without Queues, and with and without
>> accessing two devices.
>> We'll look in parallel, but I'm interested in what you see.
>> Eddie
>> Nuutti Varis wrote:
>>> Hey,
>>> While trying to run throughput measurements with Click in a kernel, running
>>> a simple EtherSwitch configuration (attached as etherswitch.click) in a
>>> topology of:
>>>
>>> EndHostA::ethI0 <==> ethI0::EtherSwitch1::ethI1 <==>
>>> ethI1::EtherSwitch2::ethI0 <==> ethI0::EndHostB
>>> 192.168.2.1
>>> --------------------------------------------------------------------------->
>>> 192.168.2.2
>>> FastUDPSrc w/ 64B packet, 300kpp/s
>>>
>>> I stumbled upon a kernel crash, seemingly when the Queue elements started
>>> dropping packets due to overflow. I tried this with two different kernel
>>> versions (2.6.31.12 and 2.6.24.7) and with either 2.6.24.7 manual patch, or
>>> with --enable-fixincludes. Interestingly, the kernel crash does not happen
>>> when I disable SMP from the kernel. Additionally, normal linux bridging
>>> does not crash the kernel on overflows. Partial/full crash dumps as
>>> attachments from various days of testing.
>>>
>>> Configuration stuff of the EtherSwitch{1,2}:
>>> - Dumps arch indicated in the filename, either amd64 or ia32
>>> - MTU of ethI1 is 1540 (tried with 1500 as well, no difference)
>>> - Click is configured with --enable-linuxmodule --enable-userlevel
>>> --enable-etherswitch [--enable-fixincludes]
>>> - Kernel does not have any pre-empting enabled.
>>> - Both e1000e poll-patched and vanilla cause the problem
>>> - e1000e versions 0.4.1.7 and 1.0.2-k2 (comes with 2.6.31.12) cause the
>>> problem
>>>
>>> --
>>> Nuutti Varis ([email protected])
>>> PhD Student, Aalto University School of Science and Technology
>>> Department of Communications and Networking
>>>
>>> _______________________________________________
>>> click mailing list
>>> [email protected]
>>> https://amsterdam.lcs.mit.edu/mailman/listinfo/click

--
Nuutti Varis ([email protected])
PhD Student, Aalto University School of Science and Technology
Department of Communications and Networking
