Random curiosity: Why would jumbo frames increase replies per sec?

Regards
KK

On 15 December 2010 11:45, Amin Tootoonchian <a...@cs.toronto.edu> wrote:
> I missed that. The single-core throughput is ~250k replies/sec, two
> cores ~450k replies/sec, three cores ~650k replies/sec, and four cores
> ~800k replies/sec. These numbers are higher than what I reported in my
> previous post, most probably because I am now testing with MTU 9000
> (jumbo frames) and with more user-space threads.
>
> Cheers,
> Amin
>
> On Wed, Dec 15, 2010 at 12:36 AM, Martin Casado <cas...@nicira.com> wrote:
>> Also, do you mind posting the single core throughput?
>>
>>> [cross-posting to nox-dev, openflow-discuss, ovs-discuss]
>>>
>>> I have prepared a patch based on NOX Zaku that improves its
>>> performance by a factor of >10. This implies that a single
>>> controller instance can handle a large network with close to a
>>> million flow initiations per second. I am writing to open up a
>>> discussion and get feedback from the community.
>>>
>>> Here are some preliminary results:
>>>
>>> - Benchmark configuration:
>>>   * Benchmark: Throughput test of cbench (the controller
>>> benchmarker) with 64 switches. Cbench is part of the OFlops package
>>> (http://www.openflowswitch.org/wk/index.php/Oflops). In throughput
>>> mode, cbench sends a batch of ofp_packet_in messages to the
>>> controller and counts the number of replies it gets back (an example
>>> invocation is sketched right after this list).
>>>   * Benchmarker machine: HP ProLiant DL320 equipped with a 2.13GHz
>>> quad-core Intel Xeon processor (X3210) and 4GB RAM
>>>   * Controller machine: Dell PowerEdge 1950 equipped with two
>>> 2.00GHz quad-core Intel Xeon processors (E5405) and 4GB RAM
>>>   * Connectivity: 1Gbps
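>>>
>>> For reference, a throughput-mode run against 64 emulated switches is
>>> launched with a command along these lines (the flag names are from
>>> my recollection of the oflops cbench and may differ between
>>> versions, so treat this as a sketch rather than the exact command
>>> used here):
>>>
>>>   cbench -c <controller-ip> -p 6633 -s 64 -l 10 -t
>>>
>>> where -s sets the number of emulated switches, -l the number of test
>>> loops, and -t selects throughput (rather than latency) mode.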
>>>
>>> - Benchmark results:
>>>   * NOX Zaku: ~60k replies/sec (NOX Zaku only utilizes a single core).
>>>   * Patched NOX: ~650k replies/sec (utilizing only 4 of the 8
>>> available cores). The sustained controller->benchmarker throughput
>>> is ~400Mbps.
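>>>
>>> As a back-of-the-envelope sanity check on those two numbers: 400Mbps
>>> divided by ~650k replies/sec comes to roughly 77 bytes per reply on
>>> the wire, which is plausible for small per-flow OpenFlow messages
>>> (a bare OpenFlow 1.0 flow_mod is 72 bytes) once TCP/Ethernet
>>> overhead is amortized by batching; it also shows that the 1Gbps link
>>> is not yet the bottleneck.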
>>>
>>> The patch ports the asynchronous harness of NOX to a standard
>>> library (Boost.Asio, the Boost asynchronous I/O library), which
>>> simplifies the code base. It also reworks the code in several areas,
>>> including but not limited to the following (a simplified sketch of
>>> the resulting event loop appears after the list):
>>>
>>> - Multi-threading: The patch allows any number of worker threads to
>>> run across multiple cores.
>>>
>>> - Batching: Serving requests individually and sending replies one by
>>> one is quite inefficient. The patch batches requests together where
>>> possible, as well as replies, which reduces the number of system
>>> calls significantly.
>>>
>>> - Memory allocation: The standard C++ memory allocator does not
>>> scale well under multi-threaded workloads. Allocators such as
>>> Google's Thread-Caching Malloc (TCMalloc) or Hoard perform much
>>> better for NOX.
>>>
>>> - Fully asynchronous operation: The patched version avoids wasting
>>> CPU cycles polling sockets or event/timer dispatchers when it is not
>>> necessary.
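>>>
>>> To make this concrete, below is a heavily simplified, self-contained
>>> sketch of the kind of event loop the patch moves to: a single
>>> boost::asio io_service driven by a pool of worker threads,
>>> per-connection asynchronous reads, and replies accumulated and
>>> flushed with one async_write per batch. This is illustrative only
>>> (OpenFlow parsing and reply construction are stubbed out); it is not
>>> the actual patch code.
>>>
>>> #include <boost/array.hpp>
>>> #include <boost/asio.hpp>
>>> #include <boost/bind.hpp>
>>> #include <boost/enable_shared_from_this.hpp>
>>> #include <boost/ref.hpp>
>>> #include <boost/shared_ptr.hpp>
>>> #include <boost/thread.hpp>
>>> #include <cstddef>
>>> #include <vector>
>>>
>>> using boost::asio::ip::tcp;
>>>
>>> // One switch connection. Its handlers are wrapped in a strand so
>>> // they never run concurrently even with many worker threads.
>>> class connection : public boost::enable_shared_from_this<connection> {
>>> public:
>>>   explicit connection(boost::asio::io_service& io)
>>>       : socket_(io), strand_(io), writing_(false) {}
>>>   tcp::socket& socket() { return socket_; }
>>>   void start() { read_more(); }
>>>
>>> private:
>>>   void read_more() {
>>>     socket_.async_read_some(boost::asio::buffer(inbuf_),
>>>         strand_.wrap(boost::bind(&connection::on_read,
>>>             shared_from_this(),
>>>             boost::asio::placeholders::error,
>>>             boost::asio::placeholders::bytes_transferred)));
>>>   }
>>>
>>>   void on_read(const boost::system::error_code& ec, std::size_t n) {
>>>     if (ec) return;
>>>     // Stub: a real controller would parse every OpenFlow message in
>>>     // inbuf_[0..n) here and append the corresponding replies to
>>>     // outbuf_. One read usually carries many ofp_packet_in
>>>     // messages, so replies are naturally batched.
>>>     (void)n;
>>>     flush();      // one async_write for the whole batch of replies
>>>     read_more();
>>>   }
>>>
>>>   void flush() {
>>>     if (writing_ || outbuf_.empty()) return;
>>>     writing_ = true;
>>>     boost::shared_ptr<std::vector<char> > batch(new std::vector<char>);
>>>     batch->swap(outbuf_);
>>>     boost::asio::async_write(socket_, boost::asio::buffer(*batch),
>>>         strand_.wrap(boost::bind(&connection::on_write,
>>>             shared_from_this(), batch,
>>>             boost::asio::placeholders::error)));
>>>   }
>>>
>>>   void on_write(boost::shared_ptr<std::vector<char> > /*keeps alive*/,
>>>                 const boost::system::error_code& ec) {
>>>     writing_ = false;
>>>     if (!ec) flush();  // send whatever accumulated while writing
>>>   }
>>>
>>>   tcp::socket socket_;
>>>   boost::asio::io_service::strand strand_;
>>>   boost::array<char, 65536> inbuf_;
>>>   std::vector<char> outbuf_;
>>>   bool writing_;
>>> };
>>>
>>> // Accept loop: give every new switch connection its own object.
>>> void accept_next(tcp::acceptor& acc, boost::asio::io_service& io);
>>>
>>> void on_accept(boost::shared_ptr<connection> c, tcp::acceptor& acc,
>>>                boost::asio::io_service& io,
>>>                const boost::system::error_code& ec) {
>>>   if (!ec) c->start();
>>>   accept_next(acc, io);
>>> }
>>>
>>> void accept_next(tcp::acceptor& acc, boost::asio::io_service& io) {
>>>   boost::shared_ptr<connection> c(new connection(io));
>>>   acc.async_accept(c->socket(),
>>>       boost::bind(&on_accept, c, boost::ref(acc), boost::ref(io),
>>>                   boost::asio::placeholders::error));
>>> }
>>>
>>> int main() {
>>>   boost::asio::io_service io;
>>>   tcp::acceptor acc(io, tcp::endpoint(tcp::v4(), 6633));
>>>   accept_next(acc, io);
>>>
>>>   // Worker pool: every thread calls run() on the same io_service,
>>>   // so completion handlers are dispatched across all the cores.
>>>   boost::thread_group pool;
>>>   for (int i = 0; i < 4; ++i)
>>>     pool.create_thread(boost::bind(&boost::asio::io_service::run, &io));
>>>   pool.join_all();
>>>   return 0;
>>> }
>>>
>>> On the allocator side, the usual way to swap in TCMalloc is to link
>>> the binary with -ltcmalloc or to LD_PRELOAD libtcmalloc when
>>> starting the controller; Hoard can typically be preloaded the same
>>> way.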
>>>
>>> I would like to add that the patched version should perform even
>>> better than what I reported above (the numbers above are from a run
>>> on 4 CPU cores). My guess is that a single NOX instance running on a
>>> machine with 8 CPU cores should handle well above 1 million flow
>>> initiation requests per second, and a more capable machine should be
>>> able to serve even more requests. The code will be made available
>>> soon, and I will post updates as well.
>>>
>>>
>>> Cheers,
>>> Amin
>>
>>
>

_______________________________________________
nox-dev mailing list
nox-dev@noxrepo.org
http://noxrepo.org/mailman/listinfo/nox-dev_noxrepo.org
