Inline comments.

On Mon, Aug 29, 2011 at 7:08 PM, George Porter <[email protected]> wrote:
> Hi Mihai,
>
> I've attached some more latency results from RouteBricks. I was able
> to vary the 5-tuple on the generated packets, and have latency
> measurements for RouteBricks under a few different circumstances
> (figure is attached):
>  - ixgbe configured with a batch size of 16
>  - ixgbe configured with a batch size of 4
>  - Click/minfwdtest.click with a BURST size of 16
>  - Click/minfwdtest.click with a BURST size of 4 (I also tried a BURST
>    size of 1, but the latency was really high, so I didn't include it here)
>
> The latency ranges from about 14us up to 100-120us at the 99th
> percentile, for an ixgbe batch factor of 4. Setting the batch size
> lower than 4 for some reason caused packets to no longer be delivered
> to Click. I'm still looking into that.
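(A quick back-of-envelope on why the batch size shows up in the latency tail: the first packet of a NIC batch sits in the driver until the batch fills before anything is handed up to Click. This is my own rough model, not anything from the ixgbe driver, and the 1 Mpps rate is an assumed example rather than your measured rate:)

```python
# Rough model: with NIC batch size B, the first packet of a batch waits
# for B-1 more arrivals before the whole batch is delivered, so its
# added latency is about (B - 1) * inter-arrival time.

def batching_delay_us(batch_size, pkts_per_sec):
    """Worst-case extra wait (in us) for the first packet in a batch."""
    inter_arrival_us = 1e6 / pkts_per_sec
    return (batch_size - 1) * inter_arrival_us

# At an assumed 1 Mpps (1 us between packets), batch 4 adds ~3 us of
# wait and batch 16 adds ~15 us, on top of the forwarding latency.
for b in (4, 16):
    print(b, batching_delay_us(b, 1_000_000))
```

At lower offered rates the gap widens, which is consistent with smaller batches helping the tail more than the median.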
I would say that the numbers don't look bad.

> I also noticed that when I didn't randomize the 5-tuple (indicated by
> the 'uniform' lines) the latency was much lower. Does anyone on this
> list know why the latency would be higher with a randomized 5-tuple? I
> would assume that it has to do with packets being spread across four
> NIC queues instead of one queue, leading to those packets being
> delivered by ixgbe up to Click 1/4th as often, leading to 4x the
> latency.

You might experience some software/hardware overhead when using
multi-queueing. You can have a look at this PRESTO paper:
http://conferences.sigcomm.org/co-next/2010/Workshops/PRESTO/PRESTO_papers/05-Manesh.pdf

> This represents the lowest latency I've been able to achieve so far.
> If you have any other suggestions let me know. Thanks again for all
> your help--it has been great getting RouteBricks up and running.

Sending at the maximum loss-free rate might increase the queueing, and
thus the latency. Have you tried forwarding at a lower rate? You could
also shrink the queues (e.g., from CPUQueue(1000) to CPUQueue(500)) if
you can generate non-bursty traffic.

Mihai

> On Fri, Aug 26, 2011 at 8:21 AM, George Porter <[email protected]> wrote:
>> Thanks Mihai,
>>
>>>> Looking at the latency, it seems that for small bursts of packets
>>>> (e.g., 8, 16, 32 and thereabouts) the latency through RouteBricks is
>>>> very low--about 14us or so.
>>>
>>> That's good news :)
>>
>> Yep. I believe that these single-packet or small-group latency
>> results show the benefits of a polling 10G driver (since none of the
>> other RouteBricks-specific functionality is being engaged, such as
>> the multi-queue support and/or the distributed overlay forwarding
>> network). Adding NAPI support to the ixgbe driver would be really
>> great to prevent packets from getting "stuck" in the ixgbe driver
>> (which prevents things like ssh from working, since the SYN packet
>> isn't forwarded to the destination).
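A toy sketch of the queue-spreading hypothesis above (Python's built-in hash stands in for the NIC's Toeplitz RSS hash, and the 4-queue count is taken from the thread; this is an illustration, not RouteBricks or driver code):

```python
import random

# Toy model of RSS queue selection: a simple modulo hash standing in
# for the NIC's Toeplitz hash over the 5-tuple.
NUM_QUEUES = 4

def rss_queue(src_ip, dst_ip, src_port, dst_port):
    return hash((src_ip, dst_ip, src_port, dst_port)) % NUM_QUEUES

# A fixed 5-tuple always lands in one queue...
fixed = {rss_queue("10.0.0.1", "10.0.0.2", 1234, 80) for _ in range(1000)}
assert len(fixed) == 1

# ...while randomized tuples spread across all queues, so each queue
# fills (and is polled with useful work) ~1/4 as often -- hence the
# ~4x latency hypothesis for the randomized case.
rng = random.Random(0)
spread = {rss_queue("10.0.0.1", "10.0.0.2", rng.randrange(1024, 65536), 80)
          for _ in range(1000)}
assert len(spread) == NUM_QUEUES
```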
>>
>>>> When I try large packet bursts (e.g., 16,000 packets back to back)
>>>> there is an interesting queueing-type behavior in which the latency
>>>> for the earlier packets is 14us, and then it linearly climbs up to
>>>> ~560us or so, and then jumps back down to 14us and repeats in a
>>>> sawtooth pattern.
>>>
>>> There are 2 forms of batching: NIC batching and Click batching. The
>>> described behavior might be related to Click batching. You could check
>>> the "BURST" parameters in PollDevice and ToDevice to adjust the Click
>>> batching.
>>
>> I will try that today. However, the sawtooth spans 10,000+ packets,
>> so I don't think it is related to a small, constant amount of
>> batching; otherwise I would expect the latency to rise for 8-16
>> packets and then drop back down. What I see now is that the first
>> 10-20 packets have latencies in the 12-14us range, then the latency
>> climbs linearly to ~500-1000us over the next 10,000 packets, then
>> there is a discontinuous jump back to 14us and the process repeats.
>> I think the issue may be that only one kernel thread is handling the
>> packets at this time.
>>
>>> The RSS delivery to a particular queue is done based on the
>>> (SrcIP, DstIP, SrcPort, DstPort) tuple. If you send packets using the
>>> same headers, they will all end up in the same queue.
>>
>> I'm going to vary the 5-tuple and try again, which should spread the
>> packets uniformly across the NIC queues.
>>
>> Thanks,
>> George
>>

_______________________________________________
click mailing list
[email protected]
https://amsterdam.lcs.mit.edu/mailman/listinfo/click
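Postscript: a toy fluid model consistent with the sawtooth numbers discussed above (my own illustration with assumed arrival and service rates, not measured driver behavior). If packets arrive slightly faster than the single kernel thread drains them, queueing delay grows linearly with the packet's position in the burst; when the burst ends the queue drains and the next burst starts again from the ~14us baseline:

```python
# Assumed parameters for illustration only.
BASE_US = 14.0      # per-packet forwarding latency with an empty queue
ARRIVAL_US = 0.90   # inter-arrival time within a burst
SERVICE_US = 1.00   # per-packet service time of the single thread
BURST = 10_000      # packets per burst

def latency_us(i):
    """Latency of the i-th packet within a burst (0-indexed): the
    backlog ahead of it grows by (SERVICE_US - ARRIVAL_US) per packet."""
    backlog = i * (SERVICE_US - ARRIVAL_US)
    return BASE_US + backlog

print(latency_us(0))          # first packet: ~14us baseline
print(latency_us(BURST - 1))  # last packet of the burst: ~1000us
```

A 10% gap between arrival and service rate is enough to reproduce a 14us-to-~1000us ramp over 10,000 packets, followed by the discontinuous drop when the queue empties.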
