Re: Optimising Proton Messenger data transfers & msgr-send/msg-recv oddities

Rafael Schloming Fri, 31 Oct 2014 14:24:04 -0700

Hi,

I'll do my best to answer what I can... comments inline...

On Fri, Oct 31, 2014 at 8:10 AM, Fraser Adams wrote:

> Hey all,
> OK I'll 'fess up I have to admit that although I've been tinkering with
> Messenger for a while now I don't *really* understand some of the terms
> that get used such as credit, disposition, settlement. I think that I was
> OK with qpid::messaging's setCapacity stuff and how to use that to optimise
> prefetch and also using qpid::messaging link controls in the Address string
> to set/disable reliability but the settings on Messenger remain a mystery
> to me.
>
> What really brought that home in my mind was when I started playing with
> the msgr-send and msgr-recv applications in <proton>/tests/tools/apps to
> try to figure what sort of throughput I might get with Messenger.
>

First off, let me just say that I'm not that familiar with msgr-send and
msgr-recv. I think Ken wrote those, and as I understand it they are
command-line sending and receiving programs that happen to be implemented
via Messenger. They aren't intended to be command-line interfaces to
Messenger, though, so their terminology and parameters may not line up 100%
with the Messenger API doc.

> A while back Gordon was involved in a performance conversation and
> mentioned testing with the settings below
> ./msgr-recv -c 1000000
> ./msgr-send -c 1000000 -b 64
>
> So sending/receiving a million 64 octet messages and seeing what the
> performance is - so far so good.
>
> But I then tinkered around and hacked in some code to display the count
> for the sent and received messages and then did:
>
> ./msgr-recv -c 100
> ./msgr-send -c 100 -b 1000000
>
> in other words sending & receiving 100 1MB messages - I actually used the
> large message size as much to slow things down as anything, but what I
> observed was that the 100 messages were all being sent before msgr-recv
> started to display any received count numbers.
>
> When I looked at the usage I noticed the -p option "Send batches of #
> messages" and sure enough if I do
> ./msgr-send -c 100 -p 10 -b 1000000
>
> I see msgr-recv catch up every 10 messages.
>
>
> What I *think* is going on is that when the count in the internal
> Messenger queue (pn_messenger_outgoing(messenger)) exceeds the batch size
> it calls pn_messenger_send(messenger, -1);
>

Yes, from my brief perusal of the code I would concur.

> But that makes me unclear in my mind what the differences between
> pn_messenger_put and pn_messenger_send actually are, I've certainly seen
> pn_messenger_put actually send messages. I realise that there's a comment
> "The message may also be sent if transmission would not cause blocking" but
> I'm not clear at exactly which point blocking would occur, I'm guessing
> that I'm noticing this because of my large messages? The problem of course
> is that if I use tiny little messages I can't actually see if any batching
> actually occurs or whether in the small message case pn_messenger_put
> merrily whizzes out the small messages without really needing
> pn_messenger_send to give them a helping kick.
>

I'm a little confused by this question. The pn_messenger_put operation
hands a new message over to the messenger and, as stated in the API doc, is
guaranteed not to block. The pn_messenger_send operation, on the other hand,
does not take a new message, and exists solely for the purpose of blocking
until previously "put" messages are actually sent. The pn_messenger_send
operation is the equivalent of
pn_messenger_work_until_you_send_N_messages(N).
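
To make the put/send split concrete, here is a toy Python model. It is only
a sketch and not the Proton API: ToyMessenger and send_in_batches are names
made up for illustration, and the toy put() never transmits anything, whereas
the real pn_messenger_put may also send when that would not block. The batch
helper mirrors what the -p option appears to do, assuming it flushes once the
outgoing queue reaches the batch size.

```python
from collections import deque


class ToyMessenger:
    """Toy stand-in for a messenger; NOT the Proton API."""

    def __init__(self):
        self.outgoing = deque()  # messages handed over but not yet sent
        self.wire = []           # messages that have "hit the wire"

    def put(self, msg):
        # Never blocks: only hands the message over to the outgoing queue.
        self.outgoing.append(msg)

    def send(self, n=-1):
        # Work until n queued messages are sent; -1 means "all of them".
        target = len(self.outgoing) if n < 0 else min(n, len(self.outgoing))
        for _ in range(target):
            self.wire.append(self.outgoing.popleft())
        return target


def send_in_batches(messenger, messages, batch):
    """Put every message, flushing whenever the queue reaches `batch`."""
    flushes = 0
    for msg in messages:
        messenger.put(msg)
        if len(messenger.outgoing) >= batch:
            messenger.send(-1)
            flushes += 1
    if messenger.outgoing:
        # Flush whatever is left over after the last full batch.
        messenger.send(-1)
        flushes += 1
    return flushes
```

With 100 messages and a batch size of 10, this toy flushes ten times, which
would match a receiver that catches up every 10 messages.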

> Does that make sense? It'd be useful for someone who knows this stuff to
> explain how the Messenger store works and how the various API calls relate
> to credit, disposition & settlement (I'm pretty sure the latter relates to
> the tracker/window/accept/settle stuff but not so sure about the first
> two). I'd also quite like to know how this stuff relates to the
> capacity/reliability stuff on qpid::messaging.
>

Credit refers to credit based flow control. In a credit based flow control
scheme, the receiver maintains a "credit balance". The credit balance is
just a number that indicates how many messages the receiver is capable of
receiving at any given point in time. The receiver periodically informs the
sender of this number, and the sender guarantees never to send unless this
number is positive. The sender will also decrement its copy of the credit
balance whenever it sends a message. This guarantees that the sender will
never send more messages than the receiver has requested.

So the term credit can mean a bunch of different things in different
contexts, but when used as a "unit", e.g. 10 credits, a credit pretty much
translates into permission to send one message. So when the receiver issues
10 credits, it is issuing permission to send up to 10 messages.
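
As a rough illustration of the bookkeeping described above, here is a toy
Python sketch. It is not Proton code: the Sender and Receiver classes and
their methods are hypothetical, and the "network" is just a direct method
call. The sender holds its own copy of the credit balance, refuses to send
when it is not positive, and decrements it for every message sent.

```python
class Receiver:
    """Toy receiver that grants credit; NOT the Proton API."""

    def __init__(self):
        self.received = []

    def grant(self, sender, credits):
        # Informing the sender of new credit: permission for `credits` messages.
        sender.credit += credits


class Sender:
    """Toy sender that only transmits while it holds credit."""

    def __init__(self, pending):
        self.credit = 0               # sender's copy of the credit balance
        self.pending = list(pending)  # messages waiting to be sent

    def pump(self, receiver):
        # Send as many pending messages as the current credit allows.
        sent = 0
        while self.pending and self.credit > 0:
            receiver.received.append(self.pending.pop(0))
            self.credit -= 1          # one credit pays for one message
            sent += 1
        return sent
```

For example, a sender with five pending messages sends nothing until credit
arrives, and after a grant of 3 it sends exactly three and then stops.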

Now depending on what exact policy you use to issue credits, you can
implement a lot of different semantics, e.g. you can send 1 credit to fetch
exactly one message and be done with it, or you can renew the sender's
credit whenever it falls below a certain threshold, and there are many more
options. The concept of capacity is just a specific policy for managing
credits. This policy assumes the receiver has a fixed buffer for holding
messages, and it issues credit to reflect the number of empty slots in that
buffer.
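
A capacity-style policy could be sketched like this. Again this is a toy
model under my own assumptions, not how qpid::messaging or Messenger actually
implement it; CapacityReceiver and its methods are made-up names. The credit
to issue is the number of empty buffer slots that are not already covered by
credit the sender still holds.

```python
class CapacityReceiver:
    """Toy receiver with a fixed-size buffer; NOT a real client API."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []       # messages received but not yet fetched
        self.outstanding = 0   # credit issued that the sender has not used yet

    def credit_to_issue(self):
        # Empty slots, minus slots already promised to the sender.
        free = self.capacity - len(self.buffer) - self.outstanding
        self.outstanding += free
        return free

    def deliver(self, msg):
        # Called when a message arrives; it consumes one outstanding credit.
        if self.outstanding <= 0:
            raise RuntimeError("sender exceeded its credit")
        self.outstanding -= 1
        self.buffer.append(msg)

    def fetch(self):
        # The application takes a message, which frees one buffer slot.
        return self.buffer.pop(0)
```

With a capacity of 3, the first call issues 3 credits, and further credit is
only issued as the application fetches messages out of the buffer.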

Settlement of a delivery refers to an endpoint being done with and
forgetting everything about a given delivery. Disposition refers to the
state of the delivery at the time of settlement, e.g. was it accepted vs
rejected. Sometimes the disposition is null because the state at the time
of settlement is unknown, e.g. the sender can choose to "pre-settle" the
message as it is sent, i.e. forget about it as soon as it hits the wire. In
this case the state when the receiver settles it will never be known. Just
like credit is a lower level/more general concept than capacity and can be
used to implement a greater variety of semantics, settlement is a lower
level/more general concept than reliability and can be used to implement a
variety of different QoS levels.
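
Here is a small toy model of the sender side of this, purely as a sketch.
ToySender and its methods are hypothetical names, not the Proton API, and the
disposition is just a string. An unsettled delivery is remembered until it is
settled, while a pre-settled one is forgotten as soon as it is sent, so its
disposition stays unknown (None).

```python
class ToySender:
    """Toy sender-side delivery tracking; NOT the Proton API."""

    def __init__(self):
        self.unsettled = {}  # tag -> last known disposition (None = unknown)
        self.next_tag = 0

    def send(self, presettled=False):
        tag = self.next_tag
        self.next_tag += 1
        if not presettled:
            # Remember the delivery until we settle it.
            self.unsettled[tag] = None
        # A pre-settled delivery is forgotten as soon as it hits the wire.
        return tag

    def on_disposition(self, tag, state):
        # The receiver reports a state, e.g. "accepted" or "rejected".
        if tag in self.unsettled:
            self.unsettled[tag] = state

    def settle(self, tag):
        # Forget the delivery; return its disposition at settlement time.
        return self.unsettled.pop(tag, None)
```

In this model, pre-settling every message gives fire-and-forget behaviour,
while waiting for a disposition before settling gives something closer to
acknowledged delivery.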

> Also I *think* that there is a problem with the python version of
> msgr-send.py I'd expect that to run more slowly than the C version, but
> when I did:
>
> ./msgr-recv.py -c 100
> ./msgr-send.py -c 100 -b 1000000
>
> it returned more or less immediately and when I increase the -c value I
> appear to be seeing the same throughput irrespective of the value of the -b
> value. I've not really looked too deeply at the code but I wonder if that
> rings any bells for anyone?
>

Doesn't ring a bell for me. I'd file a JIRA and maybe poke Ken.

> Sorry if these things are obvious to the people who know, but I figured I
> probably wasn't the only one who didn't actually know this stuff and as
> I've got no shame I thought I'd raise my head above the parapet and expose
> my ignorance to the world :-)
>

I hope this helps a bit. Please follow up and let me know if you have
further questions.

--Rafael
