I was thinking the very same thing. I will check this ASAP.

On Tue, 2008-02-05 at 17:41, Eddie Kohler wrote:
> Hi Robert,
>
> I wonder if your observed weirdness with LinkUnqueue was due to the
> 100%-CPU-on-DelayUnqueue problem recently reported. Maybe if you tried the
> configuration now?
>
> Eddie
>
>
> Robert Ross wrote:
> > I'm not sure what this means, but we have been able to completely avoid
> > this problem by using kernel-level Click with the experimental
> > FromUserDevice, and a user-level Click reading FromDump and pushing
> > packets out on a custom ToRawFile element.
> >
> > I will gladly put together and test a simple configuration. It would be
> > identical to the configuration I had attached except for switching the
> > Socket() to a FromDump(). I will run some more tests and send you the
> > monitor.csv output from our Script elements.
> >
> > BTW, we used the monitor.csv output file in tandem with the Java-based
> > LiveGraph to see real-time statistics on Click performance. You can
> > also use LiveGraph after the fact to open and view our monitor.csv
> > file on your end once I send you the output. It has been a very nice
> > marriage of capabilities for real-time analysis with minimal coding.
> > We've done something similar in kernel-level Click, but had to write a
> > custom Java application to output the monitor.csv, since kernel
> > configurations cannot output directly to files.
> >
> > Robert Ross
> > DSCI Inc.
> > Office: 732.542.3113 x173
> > Home: 609.702.8114
> > Cell: 609.509.5139
> > Fax: 253.550.6198
> >
> > -----Original Message-----
> > From: Eddie Kohler [mailto:[EMAIL PROTECTED]
> > Sent: Tuesday, January 29, 2008 2:39 PM
> > To: Robert Ross
> > Cc: Beyers Cronje; click@amsterdam.lcs.mit.edu
> > Subject: Re: [Click] Userlevel performance issues
> >
> > Hi Robert,
> >
> > The *job* of LinkUnqueue is specifically to throttle performance. It is
> > designed to output packets at the bandwidth specified. This will cause
> > a lower rate, pinned to that bandwidth!
> >
> > The numbers you report are kind of reasonable. Click parses bandwidths
> > as powers of 10, which is the networking standard as far as I can tell.
> > So 512Kbps = 512000bps = 64000Bps; 190 p/s at this rate implies 336B
> > packets. So 1360 p/s, for your highest-bandwidth LinkUnqueue, assuming
> > the same packet length, is roughly half what it "should" be. That's not
> > great, but it's not terrible.
> >
> > I have not run your configuration with Sockets, but I have with
> > InfiniteSources and so forth, and have observed LinkUnqueue outputting
> > packets at the correct rate. In fact I checked in an update to Counter
> > to give it bit_rate and byte_rate handlers, making this easier to see.
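(For reference, a self-contained rate check along the lines Eddie describes might look like the sketch below. It is illustrative only: the 336-byte packet length comes from the arithmetic above, while the latency, queue size, and element arguments are arbitrary placeholders, not anything from this thread.)

    // Push 336-byte packets through a 512Kbps emulated link and report
    // the measured output rate once per second via the Counter handlers
    // mentioned above.
    InfiniteSource(LENGTH 336, LIMIT -1)
        -> Queue(1000)
        -> LinkUnqueue(1ms, 512Kbps)
        -> c :: Counter
        -> Discard;

    // At 512Kbps = 64000Bps, c.byte_rate should settle near 64000,
    // i.e. roughly 190 packets/second at 336 bytes per packet.
    Script(wait 1s, print $(c.byte_rate), loop);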
> > LinkUnqueue should affect the upstream Socket elements only indirectly.
> > LinkUnqueue stops pulling from its input when the emulated link is full.
> > This will cause an upstream Queue to fill up. Some elements might
> > notice that Queue's full state and stop producing packets (since those
> > packets will only be dropped). The InfiniteSource and user-level
> > FromHost elements have this behavior. However, your use of
> > NotifierQueue (instead of Queue) would neutralize this effect, since
> > NotifierQueue doesn't provide full notification.
> >
> > I am unsure in the end whether you are observing a bug or correct
> > behavior. Here are a couple of questions to help us figure it out.
> >
> > - Re: FromDump and ToDevice. Can you reduce the configuration as much
> > as possible, and tell us what rates ToDevice achieves without FromDump,
> > and what it achieves with FromDump? Your mail isn't specific about the
> > configuration or the performance numbers.
> >
> > - Re: LinkUnqueue. Can you send the output of your configuration (cool
> > use of define and Script, btw), as well as the configuration? Again,
> > with InfiniteSource I see expected behavior, and I would not expect
> > LinkUnqueue to throttle Socket.
> >
> > It may be that you are finding an unfortunate interaction between
> > Click's task handlers and its file descriptor handlers -- something we
> > could potentially fix. But without specific numbers it's hard to tell.
> >
> > Eddie
> >
> >
> > Robert Ross wrote:
> >> The only clear item that seems to have a marked difference is the
> >> LinkUnqueue element. The fact that our ToDevice and FromDevice/Socket
> >> performance appears to be related somehow to the configuration of a
> >> LinkUnqueue element sitting in the middle of our configuration is too
> >> obvious to ignore. Does LinkUnqueue perform some kind of
> >> upstream/downstream notification to these elements, causing them to
> >> throttle their behavior based on LinkUnqueue?
> >>
> >> In our tests, with all other elements remaining the same, here are the
> >> maximum rates (packets/second pushed from the Socket element and
> >> pulled by the ToDevice element) we found from two independent read
> >> handler counts:
> >>
> >> LinkUnqueue("512Kbps") = maximum ~190 packets/second
> >> LinkUnqueue("1Mbps")   = maximum ~290 packets/second
> >> LinkUnqueue("2Mbps")   = maximum ~490 packets/second
> >> LinkUnqueue("4Mbps")   = maximum ~780 packets/second
> >> LinkUnqueue("6Mbps")   = maximum ~980 packets/second
> >> LinkUnqueue("8Mbps")   = maximum ~1360 packets/second
> >>
> >> It is also telling that independent handler counters corroborate
> >> exactly the same maximum packets per second in two very different
> >> places in the configuration. Clearly the limitation on processing is
> >> artificial and not an actual performance problem, since increasing the
> >> LinkUnqueue bandwidth increases the throughput in a very controlled
> >> and obvious manner.
> >>
> >> I have attached a simple configuration that examines specific handlers
> >> and outputs values each second to a CSV file for analysis. The
> >> configuration is scaled back to complete simplicity, yet shows the
> >> same performance as our actual, much more complicated configuration.
> >> Nevertheless, the performance is identical and seems to point squarely
> >> at LinkUnqueue.
> >>
> >> What is LinkUnqueue doing that could be causing this type of effect on
> >> FromHost, Socket and ToDevice?
> >>
> >>
> >> ________________________________
> >>
> >> From: Robert Ross
> >> Sent: Friday, January 25, 2008 7:40 PM
> >> To: 'Beyers Cronje'
> >> Cc: [EMAIL PROTECTED]
> >> Subject: RE: [Click] Userlevel performance issues
> >>
> >> Sorry, I wasn't clear that the queues are necessary for our
> >> configuration. The configuration is somewhat complex; I was only
> >> attempting to highlight the important parts.
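(For what it's worth, a stripped-down path of roughly the shape measured above, with a Counter at each of the two measurement points, might look like the sketch below. The socket address, queue capacities, bandwidth, device name, and exact queue placement are guesses for illustration, not the attached configuration.)

    // Count packets as they enter the first queue and again as
    // LinkUnqueue releases them toward ToDevice.
    Socket(UDP, 0.0.0.0, 10000)
        -> in_c :: Counter
        -> NotifierQueue(1000)
        -> LinkUnqueue(1ms, 2Mbps)
        -> out_c :: Counter
        -> NotifierQueue(1000)
        -> ToDevice(eth0);

    // Sample both rate handlers once per second; the measurements quoted
    // above show the two rates tracking each other and scaling with the
    // configured bandwidth.
    Script(wait 1s, print $(in_c.rate), print $(out_c.rate), loop);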
> >>
> >> ________________________________
> >>
> >> From: Beyers Cronje [mailto:[EMAIL PROTECTED]
> >> Sent: Friday, January 25, 2008 7:31 PM
> >> To: Robert Ross
> >> Cc: [EMAIL PROTECTED]
> >> Subject: Re: [Click] Userlevel performance issues
> >>
> >> Hi Robert,
> >>
> >>     * We first found that when UserLevel Click started pulling from a
> >>       PCAP file, the performance of the ToDevice() appeared to drop
> >>       sharply. What I mean by this is that the ToDevice() pull handler
> >>       reported values in the range of 200 packets/second once the PCAP
> >>       file started reading. This resulted in the outbound queue just
> >>       prior to the ToDevice() filling up and eventually overflowing,
> >>       because the packet rate in the PCAP file is far more than 200
> >>       packets/second.
> >>
> >> You don't have to use a queue between FromDump and ToDevice, as
> >> FromDump is an agnostic element. In other words, you can connect
> >> ToDevice directly to FromDump, which should ensure that no packets are
> >> dropped and should give the best ToDevice performance.
> >>
> >> Also, there are a few tuning parameters. Try tuning your NIC TX ring
> >> size: on the e1000 driver the default TX ring size is 256, so
> >> experiment with different values to see if it makes a difference.
> >> ToDevice uses a packet socket to transmit, so it might also be worth
> >> experimenting with /proc/sys/net/core/wmem_default and
> >> /proc/sys/net/core/wmem_max.
> >>
> >> Beyers
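(For illustration, the direct connection Beyers suggests would look something like the following; the trace file and device names are placeholders, and the keyword arguments are optional.)

    // No intermediate Queue: ToDevice pulls straight from FromDump, so
    // nothing can overflow. TIMING false replays the dump as fast as
    // ToDevice will take packets; STOP true stops the driver at EOF.
    FromDump(trace.pcap, STOP true, TIMING false)
        -> ToDevice(eth0);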