On Thu, 2014-11-20 at 14:10 -0500, Michael Goulish wrote:
> I recently finished switching over my proton-c programs psend & precv
> to the new event-based interface, and my first test of them was a
> 5 billion message soak test.
> The programs survived this test with no memory growth, and no gradual
> slowdown.
> This test is meant to find the fastest possible speed of the proton-c
> code itself. (In future, we could make other similar tests designed
> to mimic realistic user scenarios.) In this test, I run both sender
> and receiver on one box, with the loopback interface. I have MTU ==
> 64K, I use a credit scheme of 600 initial credits, and 300 new credits
> whenever credit falls below 300. The messages are small: exactly 100
> bytes long.
> I am using two processors, both Intel Xeon E5420 @ 2.50GHz with 6144
> KB cache. (Letting the OS decide which processors to use for my two
> processes.)
> On that system, with the above credit scheme, the test is sustaining
> throughput of 408,500 messages per second. That's over a single link,
> between two singly-threaded processes.
That is an excellent result. It sets the context for doing performance
work on proton-based systems (which is nearly everything we do at this
point). At that rate, proton certainly doesn't sound like it's the
bottleneck for any of the stuff I've been looking at, but I'd be
interested in seeing results for a range of larger message sizes.
> This is significantly faster than my previous, non-event-based code,
> and I find the code *much* easier to understand.
Yep, I think the signs are all pointing towards focusing on making the
event-based API easier to use, as it seems to be more performant and
flexible than the Messenger API.
> This may still not be the maximum possible speed on my box. It looks
> like the limiting factor will be the receiver, and right now it is
> using only 74% of its CPU -- so if we could get it to use 100% we *might*
> see a performance gain to the neighborhood of 550,000 messages per second.
> But I have not been able to get closer to 100% just by fooling with the
> credit scheme. Hmm.
The C++ messaging perf clients all default to a capacity of 1000, so you
might try bumping the credit up to that (or higher in your small-message
scenario; buffer those little guys up!)
> If you'd like to take a look, the code is here:
First thing I would suggest is adding command line parameters for
connection info, message size, credit etc. etc. Simple send/receive
programs like this, when parameterized flexibly, are *extremely* useful
building blocks for a huge range of performance experiments. Take a look
at qpid-send --help and qpid-receive --help for the kind of features that
are useful. Given such building blocks you can easily build drivers
similar to qpid-cpp-benchmark that can set up all kinds of interesting
tests.
The lesson from two iterations of performance testers on qpid is to build
flexible but relatively simple send/receive units and then script them
into complicated tests for measuring latency, throughput,
multi-sender/receiver, multi-host, cluster, etc. The first generation
(perftest and latencytest) made the mistake of trying to build
everything into single test programs, which is much more limiting and
fragile. The good thing here is you can write the critical
high-performance stuff in C, and script the setup and measurement code
in something easier like python without putting python on the critical
path.
Other useful features:
- (optionally) reading message content from stdin in the sender and
dumping it to stdout in the receiver.
- (optionally) adding timestamp/sequence number headers to messages in
sender, printing in receiver.
I really want these building blocks to build interesting dispatch
performance tests, so I will probably be showing an interest in scripting
these things in the near future :)