More cleanup: Ian, thanks for this. It seems reasonable to me now to just set _iters_per_os to 2 at userlevel. If that is a bad choice, someone else will complain!
Eddie

On 3/22/10 8:50 AM, Ian Rose wrote:
> Hi Eddie,
>
> I've been a bit bogged down with other work but plan to look into this a bit
> more and respond to your email in more depth as soon as I can. Superficially,
> at least, it seems to me that the difference between _iters_per_os = 2 and 64
> is that in the '2' case, work is being done on (approx) every other iteration
> through the driver_loop in routerthread. By "work" I mean "kevent() is called,
> then selected() is called on the FromDevice element, which calls
> pcap_dispatch() with cnt=1, etc.". In the case where _iters_per_os=64, this
> work is only being done once every 64 loops.
>
> So the question is whether enough bookkeeping-type stuff occurs in every
> single loop iteration (e.g. checking for signals, pending tasks, timer,
> calculating scheduling, etc.) for it to make sense that these "idle"
> iterations could account for the added CPU usage. I don't yet have a deep
> enough understanding of all of these bookkeeping items to say whether or not
> that explanation seems plausible to me.
>
> - Ian
>
>
> Eddie Kohler wrote:
>> Ian,
>>
>> That is quite weird. Here's how pcap dispatching is supposed to work.
>>
>> - When the fd is selected, pcap_dispatch() is called.
>> - pcap_dispatch() calls a FromDevice callback function, which actually emits
>>   the packets.
>> - pcap_dispatch() returns an integer that says exactly how many packets were
>>   emitted.
>> - If that integer is > 0, FromDevice schedules a task.
>> - The task goes through the same steps, *only rescheduling itself when there
>>   is more work to do*.
>>
>> So I really don't see why the setting of _iters_per_os would matter here. Do
>> you have any idea what is taking up the CPU -- maybe calls to
>> Timestamp::now()? Is pcap_dispatch() returning nonzero even if no packets
>> were emitted?
>>
>> E
>>
>>
>> On 3/19/10 10:53 AM, Ian Rose wrote:
>>> Sorry yeah should have specified that - I forgot that FromDevice is
>>> different on Linux.
>>>
>>> I am on FreeBSD 7.2-STABLE.
>>>
>>> Also, I am using Click v1.7.0rc1, but I compared FromDevice.u and the
>>> relevant portions of routerthread.cc to HEAD and didn't see anything
>>> different.
>>>
>>> - Ian
>>>
>>> Eddie Kohler wrote:
>>>> Ian,
>>>>
>>>> What OS is this running on? Are you using pcap?
>>>>
>>>> Eddie
>>>>
>>>>
>>>> On 3/19/10 10:45 AM, Ian Rose wrote:
>>>>> Hi Eddie,
>>>>>
>>>>> I'd be more than glad to send along my "real" config, but it's really big
>>>>> and uses quite a lot of custom elements that won't mean anything to you
>>>>> without the source code.
>>>>>
>>>>> However, just for testing these changes I used:
>>>>>
>>>>> FromDevice(ath1, ENCAP 802_11_RADIO, PROMISC true, HEADROOM 196) ->
>>>>> Discard;
>>>>>
>>>>> I am seeing about a 2x CPU usage difference when I use _iters_per_os = 2
>>>>> vs 64.
>>>>>
>>>>> - Ian
>>>>>
>>>>>
>>>>> Eddie Kohler wrote:
>>>>>> Hi Ian,
>>>>>>
>>>>>> (1) I would completely appreciate seeing your config, just to see if
>>>>>> there's anything that might cause the extra CPU usage. BUT:
>>>>>>
>>>>>> (2) _iters_per_os is set that way just, I think, as a random guess.
>>>>>> And that guess is at least 5 years old and probably more. I think it
>>>>>> would be OK to set it to 2 for everyone.
>>>>>>
>>>>>> Eddie
>>>>>>
>>>>>>
>>>>>> On 3/18/10 8:06 PM, Ian Rose wrote:
>>>>>>> Hi all -
>>>>>>>
>>>>>>> In lib/routerthread.cc there is the following code:
>>>>>>>
>>>>>>> #if CLICK_USERLEVEL
>>>>>>>     _iters_per_os = 64; /* iterations per select() */
>>>>>>> #else
>>>>>>>     _iters_per_os = 2; /* iterations per OS schedule() */
>>>>>>> #endif
>>>>>>>
>>>>>>> I'm curious if there is a particular rationale behind the value 64 for
>>>>>>> userlevel click. Is it simply the case that this value works pretty
>>>>>>> well for most of the typical click configurations that were tested? In
>>>>>>> my (admittedly brief) testing, it appears that this parameter choice
>>>>>>> imposes a CPU overhead of ~3x for [some?]
>>>>>>> select-heavy applications, by which I mean configs that spend most of
>>>>>>> their time calling selected() on elements, rather than executing tasks
>>>>>>> or timers. For example, my particular app uses around 15-20% CPU with
>>>>>>> the above values, but if I change the 64 to a 2, the CPU usage drops
>>>>>>> to 5-6%.
>>>>>>>
>>>>>>> Obviously this might simply be a case of the default parameters not
>>>>>>> being particularly good for my specific situation, but I thought I'd
>>>>>>> check since the performance difference seemed pretty significant.
>>>>>>>
>>>>>>> cheers,
>>>>>>> - Ian
>>>>>>> _______________________________________________
>>>>>>> click mailing list
>>>>>>> [email protected]
>>>>>>> https://amsterdam.lcs.mit.edu/mailman/listinfo/click
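
To make the hypothesis in this thread concrete, here is a minimal sketch of the kind of loop being discussed. It is not Click's routerthread.cc. The names LoopStats and run_driver_loop are made up for illustration, and the model assumes only what Ian describes above: every iteration pays some bookkeeping cost, and the OS (select()/kevent()) is consulted once per iters_per_os iterations.

```cpp
// Hypothetical, simplified model of a driver loop. This is NOT Click's
// actual code; it only counts how often the OS would be polled.
struct LoopStats {
    long iterations;  // total passes through the loop (bookkeeping each time)
    long os_polls;    // passes that also polled the OS (select()/kevent())
};

// Run the loop for total_iterations passes, polling the OS once every
// iters_per_os passes. Assumes iters_per_os >= 1.
LoopStats run_driver_loop(long total_iterations, int iters_per_os) {
    LoopStats stats = {0, 0};
    int countdown = iters_per_os;
    for (long i = 0; i < total_iterations; ++i) {
        ++stats.iterations;       // stand-in for signals, tasks, timers
        if (--countdown <= 0) {
            ++stats.os_polls;     // stand-in for the select()/kevent() call
            countdown = iters_per_os;
        }
    }
    return stats;
}
```

Under this model, run_driver_loop(6400, 2) polls the OS on half of its passes, while run_driver_loop(6400, 64) polls on one pass in 64 and spends the rest on bookkeeping only. Whether those extra passes explain the measured CPU difference is exactly the open question above; the sketch does not settle it.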
