Re: put vs. send
- Original Message -
> From: Rafael Schloming r...@alum.mit.edu
> To: proton@qpid.apache.org
> Sent: Wednesday, March 6, 2013 2:27:19 PM
> Subject: Re: put vs. send
>
> On Wed, Mar 6, 2013 at 7:35 AM, Ted Ross tr...@redhat.com wrote:
>> On 03/06/2013 10:09 AM, Rafael Schloming wrote:
>>> On Wed, Mar 6, 2013 at 6:52 AM, Ted Ross tr...@redhat.com wrote:
>>>> On 03/06/2013 08:30 AM, Rafael Schloming wrote:
>>>>> On Wed, Mar 6, 2013 at 5:15 AM, Ted Ross tr...@redhat.com wrote:
>>>>>> This is exactly right. The API behaves in a surprising way and causes reasonable programmers to write programs that don't work. For the sake of adoption, we should fix this, not merely document it.
>>>>>
>>>>> This seems like a bit of a leap to me. Have we actually seen anyone misusing or abusing the API due to this? Mick didn't come across it till I pointed it out and even then he had to construct an experiment where he's basically observing the over-the-wire behaviour in order to detect it.
>>>>>
>>>>> --Rafael
>>>>
>>>> The following code doesn't work:
>>>>
>>>>     while (True) {
>>>>         wait_for_and_get_next_event(event);
>>>>         pn_messenger_put(event);
>>>>     }
>>>>
>>>> If I add a send after every put, I'm going to limit my maximum message rate. If I amortize my sends over every N puts, I may have arbitrarily/infinitely high latency on messages if the source of events goes quiet.
>>>>
>>>> I guess I'm questioning the mission of the Messenger API. Which is the more important design goal: general-purpose ease of use, or strict single-threaded asynchrony?
>>>
>>> I wouldn't say it's a goal to avoid background threads, more of a really nice thing to avoid if we can, and quite possibly a necessary mode of operation in certain environments.
>>>
>>> I don't think your example code will work though even if there is a background thread. What do you want to happen when things start backing up? Do you want messages to be dropped? Do you want put to start blocking? Do you just want memory to grow indefinitely?
>>
>> Good question.
>> I would want to either block so I can stop consuming events or get an indication that I would-block so I can take other actions. I understand that this is what send is for, but it's not clear when, or how often I should call it.
>
> I think there are really two orthogonal issues being discussed:
>
>   1) how do you get messenger to perform the appropriate amount of outstanding work once enough has built up
>   2) should messenger be obligated to eventually complete any outstanding work even if you never call into it again
>
> The second one basically being the question of whether the API semantics mandate a background thread or not, and I think (1) needs a good solution regardless of how you answer (2).
>
> I think we can address (1) by adding an integer argument N to pn_messenger_send with the following semantics:
>
>   - if N is -1, then pn_messenger_send behaves as it currently does, i.e. block until everything is sent
>   - if N is 0, then pn_messenger_send will send whatever it can/should without blocking
>   - if N is positive, then pn_messenger_send will block until N messages are sent
>
> I'd also suggest we modify pn_messenger_recv(0) to match this on the recv side, so the overall semantics of pn_messenger_recv(N) would be:
>
>   - if N is -1, then pn_messenger_recv behaves as it currently does, i.e. block until something is received
>   - if N is 0, then pn_messenger_recv will receive whatever it can/should without blocking
>   - if N is positive, then pn_messenger_recv behaves as it currently does, i.e. block until something is received but with a limit of N

+1 these proposed API changes. I like the fact that they provide more control over the blocking behavior. With the existing API, I've had to modify pn_messenger_set/get_timeout() if I want to switch between polling and blocking. The proposed approach is much cleaner.
> I think the above change would introduce some symmetry between send and recv and let you implement a fully pipelined asynchronous sender in a more obvious way than setting timeouts to zero, e.g. Ted's example could read:
>
>     while (True) {
>         wait_for_and_get_next_event(event);
>         pn_messenger_put(m, event);
>         if (pn_messenger_outgoing(m) > threshold) {
>             pn_messenger_send(m, pn_messenger_outgoing(m) - threshold);
>         }
>     }
>
> We can also add a pn_messenger_work(timeout) that would block until it's done some kind of work (either sending or receiving). Calling it with a zero argument would be equivalent to simultaneously calling pn_messenger_send(0) and pn_messenger_recv(0). (Note that currently send and recv don't actually factor in any biases towards sending and receiving, they just do whatever outstanding work needs doing, however the API would admit an implementation that did apply some bias in those cases.)
>
> This would go some ways towards addressing issue (2) because
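
The proposed send(N) semantics can be tried out on paper with a toy model. This is only a sketch under stated assumptions: `toy_messenger`, `toy_put`, `toy_work`, `toy_send` and `toy_publish` are made-up names, not Proton API, and "blocking" is modelled as repeating non-blocking I/O passes on a pretend socket that accepts `wire_capacity` messages per pass.

```c
/* Toy model only: none of these names are Proton API. */
typedef struct {
    int outgoing;      /* messages queued by put, not yet written */
    int wire_capacity; /* messages the pretend socket accepts per I/O pass */
} toy_messenger;

/* Queue one message; never transmits. */
static void toy_put(toy_messenger *m) { m->outgoing++; }

/* One non-blocking I/O pass; returns how many messages were written. */
static int toy_work(toy_messenger *m) {
    int n = m->outgoing < m->wire_capacity ? m->outgoing : m->wire_capacity;
    m->outgoing -= n;
    return n;
}

/* Proposed semantics: n == 0 never waits, n == -1 drains the queue,
 * n > 0 waits until at least n messages have been written. */
static int toy_send(toy_messenger *m, int n) {
    int sent = toy_work(m);
    if (n == 0) return sent;
    while (m->outgoing > 0 && (n < 0 || sent < n)) {
        if (m->wire_capacity <= 0) break; /* a real call would wait or time out here */
        sent += toy_work(m);
    }
    return sent;
}

/* The pipelined sender from the example: keep at most `threshold` queued. */
static void toy_publish(toy_messenger *m, int threshold) {
    toy_put(m);
    if (m->outgoing > threshold)
        toy_send(m, m->outgoing - threshold);
}
```

Calling `toy_publish(&m, 3)` in a loop keeps the queue at or below three messages without a send after every put, which is the property the threshold example is after.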
Re: put vs. send -- new doc
I like this. I don't think this is the same thing as high level conceptual intro, but may well be more useful right now, and it will always be very useful as recipe book style documentation.

I think this is actually quite helpful for the ongoing API discussions. One of the tricky things about developing a simple API is that everyone has their own scenario that they want to be simple, and sometimes making one scenario simpler ends up making another one more difficult, so it's actually really useful to have them all laid out so concisely so that we can observe the overall effect of any changes. I'm thinking we should expand this with the API changes I proposed in the other thread and see how they work out.

--Rafael

On Wed, Mar 6, 2013 at 9:43 AM, Michael Goulish mgoul...@redhat.com wrote:
> OK, I'm trying here to express the spirit of Messenger I/O, greatly based on the conversation of the last 24 hrs. This probably needs some elaboration yet, but I want to see if I'm at least generally on the right track.
>
> Oh, please, give me feedback.
>
> Sending and Receiving Messages
> ==============================
>
> The Proton Messenger API provides a mixture of synchronous and asynchronous operations to give you flexibility in deciding when your application should block waiting for I/O, and when it should not.
>
> When sending messages, you can:
>
>   * send a message immediately,
>   * enqueue a message to be sent later,
>   * block until all enqueued messages are sent,
>   * send enqueued messages until a timeout occurs, or
>   * send all messages that can be sent without blocking.
>
> When receiving messages, you can:
>
>   * receive messages that can be received without blocking,
>   * block until at least one message is received,
>   * receive no more than a fixed number of messages.
>
> Examples
> --------
>
> 1. send a message immediately
>
>        put ( messenger, msg );
>        send ( messenger );
>
> 2. enqueue a message to be sent later
>
>        put ( messenger, msg );
>
>    note: The message will be sent whenever it is not blocked and the Messenger code has other I/O work to be done.
>
> 3. block until all enqueued messages are sent
>
>        set_timeout ( messenger, -1 );
>        send ( messenger );
>
>    note: A negative timeout means 'forever'. That is the initial default for a messenger.
>
> 4. send enqueued messages until a timeout occurs
>
>        set_timeout ( messenger, 100 ); /* 100 msec */
>        send ( messenger );
>
> 5. send all messages that can be sent without blocking
>
>        set_timeout ( messenger, 0 );
>        send ( messenger );
>
> 6. receive messages that can be received without blocking
>
>        set_timeout ( messenger, 0 );
>        recv ( messenger, -1 );
>
> 7. block until at least one message is received
>
>        set_timeout ( messenger, -1 );
>        recv ( messenger, -1 );
>
>    note: -1 is initial messenger default. If you didn't change it, you don't need to set it.
>
> 8. receive no more than a fixed number of messages
>
>        recv ( messenger, 10 );
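
The three timeout cases in recipes 3 to 5 (negative, positive, zero) can be compared side by side in a small model. This is a sketch, not Proton code: `toy_send` and its parameters are invented for illustration, and it assumes a pretend link that writes one additional message for every millisecond spent waiting.

```c
/* Toy model only, not Proton code. Returns how many messages are still
 * queued when the pretend send call returns.
 *   queued       - messages waiting to go out
 *   writable_now - messages the socket would accept without waiting
 *   timeout_ms   - negative: wait forever; 0: never wait;
 *                  positive: wait at most this many milliseconds
 * Assumption: the pretend link writes one extra message per millisecond
 * of waiting. */
static int toy_send(int queued, int writable_now, int timeout_ms) {
    int immediate = queued < writable_now ? queued : writable_now;
    queued -= immediate;
    for (int elapsed = 0;
         queued > 0 && (timeout_ms < 0 || elapsed < timeout_ms);
         elapsed++) {
        queued--;
    }
    return queued;
}
```

With ten messages queued and three writable immediately, a timeout of -1 leaves nothing queued, a timeout of 0 leaves seven, and a timeout of 4 leaves three.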
Re: put vs. send -- new doc
- Original Message -
> I like this.

Good! I'm trying to get at the intention, as I understood it from online discussions, and make a doc that makes the intention easy to see.

> I don't think this is the same thing as high level conceptual intro, but may well be more useful right now, and it will always be very useful as recipe book style documentation.

No, this isn't the intro -- I'm re-working that now. I guess I should have said what category of docs this is part of. Except I don't actually know. I guess I have an idea of a set of docs at this level that discuss different topics. i.e. addressing, message management, etc.

> I think this is actually quite helpful for the ongoing API discussions. One of the tricky things about developing a simple API is that everyone has their own scenario that they want to be simple, and sometimes making one scenario simpler ends up making another one more difficult, so it's actually really useful to have them all laid out so concisely so that we can observe the overall effect of any changes. I'm thinking we should expand this with the API changes I proposed in the other thread and see how they work out.

Sure -- like a doc for your recent proposals for changes. That would be fun, and useful I agree. But -- probably shouldn't check that into tree, right? Since it's for hypothetical code. Just post to list ?

> --Rafael
>
> On Wed, Mar 6, 2013 at 9:43 AM, Michael Goulish mgoul...@redhat.com wrote:
>> OK, I'm trying here to express the spirit of Messenger I/O, greatly based on the conversation of the last 24 hrs. This probably needs some elaboration yet, but I want to see if I'm at least generally on the right track.
>>
>> Oh, please, give me feedback.
>>
>> [.. snip ..]
Re: put vs. send -- new doc
On Thu, Mar 7, 2013 at 3:18 PM, Michael Goulish mgoul...@redhat.com wrote:
>> I think this is actually quite helpful for the ongoing API discussions. One of the tricky things about developing a simple API is that everyone has their own scenario that they want to be simple, and sometimes making one scenario simpler ends up making another one more difficult, so it's actually really useful to have them all laid out so concisely so that we can observe the overall effect of any changes. I'm thinking we should expand this with the API changes I proposed in the other thread and see how they work out.
>
> Sure -- like a doc for your recent proposals for changes. That would be fun, and useful I agree. But -- probably shouldn't check that into tree, right? Since it's for hypothetical code. Just post to list ?

Right, I wouldn't put it in the tree, just go for the list right now.

--Rafael
Re: put vs. send
Hah! I think I get it! Your comments about asynchronicity were the key. Rewriting now.

- Original Message -
> On Tue, Mar 5, 2013 at 1:50 PM, Michael Goulish mgoul...@redhat.com wrote:
>> - Original Message -
>>> On Tue, Mar 5, 2013 at 3:20 PM, Rafael Schloming r...@alum.mit.edu wrote:
>>>> On Tue, Mar 5, 2013 at 11:33 AM, Rajith Attapattu rajit...@gmail.com wrote:
>>>>> On Tue, Mar 5, 2013 at 2:24 PM, Ted Ross tr...@redhat.com wrote:
>>>>>> On 03/05/2013 02:14 PM, Rajith Attapattu wrote:
>>>>>>> This is a good explanation that we need to put in the docs, as Application developers certainly need to know how it behaves. If one were to use the current C impl, it certainly gives the impression that put() is meant to write messages into your internal buffer and send() will actually write it to the wire. Unfortunately some applications will depend on this behaviour, even though it's not advisable. If we are to change from say #2 to #1 or even #3 we need to release note it prominently.
>>>>>>>
>>>>>>> I think the best solution is to make this behaviour configurable, and advertise the default very prominently. This way application developers will know exactly what they are getting instead of us making changes underneath.
>>>>>>>
>>>>>>> Rajith
>>>>>>
>>>>>> Making this configurable multiplies the size of the test matrix. Can't we make this simpler?
>>>>>
>>>>> I do understand your concern here, but the Java impl already does both #1 and #2 and Rafi wants to do #3 in the future. The old JMS client does something similar. I agree that if we just do option #2 (as you suggest below), then the application can easily do #1 and #3 on top of that. But I'm sure they will like if the library implements those strategies for them and they have the ability to pick a strategy.
>>>>
>>>> I don't see why we'd make this configurable. All three options actually fit the same general semantics. Even if you're optimistically trying to transmit every single time put is called it's entirely possible for the socket to be blocked every single time you try.
>>>> If this were to happen the implementation of #1 would appear to behave precisely the same as #2 behaves. In other words if you're coding correctly against the API you can't assume that put will or won't have transmitted anything regardless of which strategy is used internally.
>>>
>>> I agree with you. You make a very good point. Perhaps we should explicitly make that clear in our docs to avoid applications written against wrong assumptions.
>>
>> I can certainly do that, but it seems to me that semantics should be simple, obvious, and orthogonal. What seems non-simple and non-obvious to me so far is:
>>
>> put() might send, or not. It doesn't send now, but it might later.
>
> This behaviour is fundamental to an asynchronous API. You're not actually doing things, you're scheduling things to be done asynchronously. This is why put() returns a tracker so you can come back and check on the status of your asynchronous operation.
>
>> recv() can cause messages to be sent.
>> send() can cause messages to be received.
>
> I don't think that's a correct way of describing what is going on. You scheduled an asynchronous operation via put(). That means it can occur at any point later on. The fact that it happens to be triggered by the recv() in the example I gave is simply because recv() is blocking waiting for the reply and so it is inevitably going to end up blocking until the request is sent because the reply won't be triggered until after the request is sent.
>
> As for send(), it's simply inaccurate to say that send causes messages to be received. Messages can be spontaneously sent by remote parties at any time (given sufficient credit has been previously granted). What caused them to be received is the other party actually sending them, and if message data happens to arrive while we're inside a call to send(), we can't simply throw those messages away, so they go onto the incoming queue just as if they had arrived during a call to recv().
>
>> I would think that
>>
>>   1. every verb should only mean one thing
>>   2. there should be a simple mental model, against which every verb performs a predictable action.
>>
>> so for example:
>>
>>     put ( messenger, message );      // enqueue for sending
>>     send ( messenger, BLOCK );       // block till all sent.
>>     send ( messenger, DONT_BLOCK );  // send what you can.
>>     credit ( messenger, 10 );        // limit incoming queue size
>>     recv ( messenger, BLOCK );       // block till I get a message
>>     recv ( messenger, DONT_BLOCK );  // if no messages incoming, return.
>
> I'm not
Re: put vs. send
On Wed, Mar 6, 2013 at 1:44 AM, Rob Godfrey rob.j.godf...@gmail.com wrote:
> On 5 March 2013 21:10, Rafael Schloming r...@alum.mit.edu wrote:
>> On Tue, Mar 5, 2013 at 11:24 AM, Ted Ross tr...@redhat.com wrote:
>> [.. snip ..]
>>
>> It isn't really possible to have put cause messages to be eventually sent without a background thread, something we don't currently have.
>
> I think it's this that is what makes me find the API slightly odd. That put is an asynchronous operation is fine, but the fact that the only way to get work to occur is for a synchronous operation to be called seems a little screwy. If I understand correctly, right now an application programmer cannot actually write an asynchronous publisher, every so often they would have to call some form of synchronous operation.
>
> At the very least it would seem to suggest there might be call for a "do some work but don't block" function in the API. This could either take an aggressive strategy of flushing everything that it can to the wire, or it could attempt to optimize into larger transmission units.

This is exactly what happens when you set the timeout to zero and call send (or recv). Are you saying you want some other way of doing the same thing or you want a background thread?

--Rafael
Re: put vs. send
On Wed, Mar 6, 2013 at 5:15 AM, Ted Ross tr...@redhat.com wrote:
> This is exactly right. The API behaves in a surprising way and causes reasonable programmers to write programs that don't work. For the sake of adoption, we should fix this, not merely document it.

This seems like a bit of a leap to me. Have we actually seen anyone misusing or abusing the API due to this? Mick didn't come across it till I pointed it out and even then he had to construct an experiment where he's basically observing the over-the-wire behaviour in order to detect it.

--Rafael
Re: put vs. send
On 6 March 2013 13:26, Rafael Schloming r...@alum.mit.edu wrote:
> On Wed, Mar 6, 2013 at 1:44 AM, Rob Godfrey rob.j.godf...@gmail.com wrote:
>> On 5 March 2013 21:10, Rafael Schloming r...@alum.mit.edu wrote:
>>> On Tue, Mar 5, 2013 at 11:24 AM, Ted Ross tr...@redhat.com wrote:
>>> [.. snip ..]
>>>
>>> It isn't really possible to have put cause messages to be eventually sent without a background thread, something we don't currently have.
>>
>> I think it's this that is what makes me find the API slightly odd. That put is an asynchronous operation is fine, but the fact that the only way to get work to occur is for a synchronous operation to be called seems a little screwy. If I understand correctly, right now an application programmer cannot actually write an asynchronous publisher, every so often they would have to call some form of synchronous operation.
>>
>> At the very least it would seem to suggest there might be call for a "do some work but don't block" function in the API. This could either take an aggressive strategy of flushing everything that it can to the wire, or it could attempt to optimize into larger transmission units.
>
> This is exactly what happens when you set the timeout to zero and call send (or recv). Are you saying you want some other way of doing the same thing or you want a background thread?

Surely though setting timeout to 0 and calling send results in something that looks like an error (this timed out). On a Java implementation I would expect this to throw an exception. That's not really the semantic I'm expecting. The semantic is "do some work if you can without blocking".

-- Rob

> --Rafael
Re: put vs. send
> Whether that's reported as an error is really a choice of the bindings. In C it's all just return codes. We could add a separate non-blocking flag that causes the blocking operations to return distinct error codes, i.e. the equivalent of EWOULDBLOCK, but I don't think this makes a whole lot of sense in C. I can buy that in the higher level bindings the extra flag would tell the API whether to signal timeout by returning false vs throwing an exception.

With the Java implementation (not just the binding) I would expect an (expensive) exception to be thrown here. I don't think you should be triggering an exception for a non-exceptional condition.

> I do agree that we'll want a work interface at some point, but I've been thinking that would not just do the work, but also tell you what work has been done, so you can, e.g., go check whatever tracker statuses may have been updated.

Yeah - I think what you are currently suggesting is more of a "you can get round the lack of an explicit API because this sort of does the same thing if you squint at it". Calling a blocking method with a zero timeout is a hack to cover the lack of a method for the desired semantic. Moreover if this is a recommended use case for send then I think you'd need to document it, which would really muddy the waters as to what send is.

-- Rob
Re: put vs. send
On Wed, Mar 6, 2013 at 6:37 AM, Rob Godfrey rob.j.godf...@gmail.com wrote:
>> Whether that's reported as an error is really a choice of the bindings. In C it's all just return codes. We could add a separate non-blocking flag that causes the blocking operations to return distinct error codes, i.e. the equivalent of EWOULDBLOCK, but I don't think this makes a whole lot of sense in C. I can buy that in the higher level bindings the extra flag would tell the API whether to signal timeout by returning false vs throwing an exception.
>
> With the Java implementation (not just the binding) I would expect an (expensive) exception to be thrown here. I don't think you should be triggering an exception for a non-exceptional condition.

How do you decide whether it's an exceptional condition or not? It seems like it's really down to how the app is designed as to whether timing out is normal or exceptional.

>> I do agree that we'll want a work interface at some point, but I've been thinking that would not just do the work, but also tell you what work has been done, so you can, e.g., go check whatever tracker statuses may have been updated.
>
> Yeah - I think what you are currently suggesting is more of a "you can get round the lack of an explicit API because this sort of does the same thing if you squint at it". Calling a blocking method with a zero timeout is a hack to cover the lack of a method for the desired semantic. Moreover if this is a recommended use case for send then I think you'd need to document it, which would really muddy the waters as to what send is.

I'm suggesting it as a way to avoid adding a do_work() call because I'm not actually clear on how you would use the latter without busy looping or what scenarios you would document its use for. I'm not saying there aren't any, but it's not obvious to me right now.
If you imagine the split between the non blocking and blocking portions of the API, where all the blocking portions are of the form do_work_and_block_until_condition_X_is_met, we now have two conditions:

  - the outgoing queue is empty
  - the incoming queue is non empty

What you're asking for is to add a third condition that is always true, and I can possibly buy that for logical completeness, but in terms of usefulness I think expanding the set of conditions is more interesting, e.g. adding something like "the outgoing queue is < N", perhaps via an optional parameter to send, would seem to have a direct and obvious use for pipelined publishing in a way that wouldn't require busy looping.

--Rafael
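
The "do work and block until condition X is met" framing can be written down as one loop with a pluggable predicate. What follows is a sketch under assumptions, not Proton code: every name here (`toy_messenger`, `work_until`, the predicate functions) is hypothetical, `max_passes` stands in for a timeout, and "outgoing queue below N" is one reading of the extra condition suggested above.

```c
#include <stdbool.h>

/* Toy model only: none of these names are Proton API. */
typedef struct {
    int outgoing; /* queued for sending */
    int incoming; /* received, waiting to be fetched */
    int arriving; /* pretend messages still in flight towards us */
} toy_messenger;

typedef bool (*condition)(const toy_messenger *m, int arg);

static bool outgoing_empty(const toy_messenger *m, int arg) { (void)arg; return m->outgoing == 0; }
static bool incoming_nonempty(const toy_messenger *m, int arg) { (void)arg; return m->incoming > 0; }
static bool always_true(const toy_messenger *m, int arg) { (void)m; (void)arg; return true; }
static bool outgoing_below(const toy_messenger *m, int n) { return m->outgoing < n; }

/* One I/O pass: write one queued message and read one arriving message. */
static void toy_work(toy_messenger *m) {
    if (m->outgoing > 0) m->outgoing--;
    if (m->arriving > 0) { m->arriving--; m->incoming++; }
}

/* Do work until the condition holds; max_passes stands in for the timeout.
 * Returns the number of passes made. */
static int work_until(toy_messenger *m, condition done, int arg, int max_passes) {
    int passes = 0;
    do {
        toy_work(m);
        passes++;
    } while (!done(m, arg) && passes < max_passes);
    return passes;
}
```

In this model send is `work_until(m, outgoing_empty, ...)`, recv is `work_until(m, incoming_nonempty, ...)`, the "do some work but don't block" call is `work_until(m, always_true, ...)`, and the pipelined publisher uses `outgoing_below`.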
Re: put vs. send
On Wed, Mar 6, 2013 at 10:09 AM, Rafael Schloming r...@alum.mit.edu wrote:
> On Wed, Mar 6, 2013 at 6:52 AM, Ted Ross tr...@redhat.com wrote:
>> On 03/06/2013 08:30 AM, Rafael Schloming wrote:
>>> On Wed, Mar 6, 2013 at 5:15 AM, Ted Ross tr...@redhat.com wrote:
>>>> This is exactly right. The API behaves in a surprising way and causes reasonable programmers to write programs that don't work. For the sake of adoption, we should fix this, not merely document it.
>>>
>>> This seems like a bit of a leap to me. Have we actually seen anyone misusing or abusing the API due to this? Mick didn't come across it till I pointed it out and even then he had to construct an experiment where he's basically observing the over-the-wire behaviour in order to detect it.
>>>
>>> --Rafael
>>
>> The following code doesn't work:
>>
>>     while (True) {
>>         wait_for_and_get_next_event(event);
>>         pn_messenger_put(event);
>>     }
>>
>> If I add a send after every put, I'm going to limit my maximum message rate. If I amortize my sends over every N puts, I may have arbitrarily/infinitely high latency on messages if the source of events goes quiet.

You can employ a timer along with your event count (or based on a byte count) to get around that problem. The timer will ensure you flush events when there isn't enough activity. Isn't that acceptable ?

>> I guess I'm questioning the mission of the Messenger API. Which is the more important design goal: general-purpose ease of use, or strict single-threaded asynchrony?
>
> I wouldn't say it's a goal to avoid background threads, more of a really nice thing to avoid if we can, and quite possibly a necessary mode of operation in certain environments.
>
> I don't think your example code will work though even if there is a background thread.

This is a key point I missed when I thought about the problem along the same lines as Ted. Having a background thread cannot guarantee that your messages will be written on to the wire, as that thread can be blocked due to TCP buffers being full or be suppressed in favour of another higher-priority thread (for longer than you desire), thus increasing your latency beyond acceptable limits. You will invariably have outliers in your latency graph. On the other hand the library code will be much simpler without the background thread.

> What do you want to happen when things start backing up? Do you want messages to be dropped? Do you want put to start blocking? Do you just want memory to grow indefinitely?
>
> --Rafael
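
The timer-plus-count idea can be reduced to a single decision function that the application consults after each put and on each timer tick. This is a minimal sketch with invented names (`flush_policy`, `should_flush`); the caller is assumed to supply the clock readings and to call send when the function returns true.

```c
#include <stdbool.h>

/* Hypothetical application-side policy; not part of any library. */
typedef struct {
    int  max_batch;    /* flush once this many messages are pending */
    long max_delay_ms; /* or once the oldest pending message is this old */
} flush_policy;

/* Decide whether the application should call send now.
 *   pending       - messages put since the last flush
 *   now_ms        - current time in milliseconds
 *   oldest_put_ms - time of the first put since the last flush */
static bool should_flush(const flush_policy *p, int pending,
                         long now_ms, long oldest_put_ms) {
    if (pending <= 0) return false;                   /* nothing to flush */
    if (pending >= p->max_batch) return true;         /* batch is full */
    return now_ms - oldest_put_ms >= p->max_delay_ms; /* source went quiet */
}
```

The count bound protects throughput when events arrive quickly, and the age bound caps latency when the event source goes quiet.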
Re: put vs. send
On Wed, Mar 6, 2013 at 11:37 AM, Rajith Attapattu rajit...@gmail.com wrote:
> On Wed, Mar 6, 2013 at 10:09 AM, Rafael Schloming r...@alum.mit.edu wrote:
>> On Wed, Mar 6, 2013 at 6:52 AM, Ted Ross tr...@redhat.com wrote:
>>> On 03/06/2013 08:30 AM, Rafael Schloming wrote:
>>>> On Wed, Mar 6, 2013 at 5:15 AM, Ted Ross tr...@redhat.com wrote:
>>>>> This is exactly right. The API behaves in a surprising way and causes reasonable programmers to write programs that don't work. For the sake of adoption, we should fix this, not merely document it.
>>>>
>>>> This seems like a bit of a leap to me. Have we actually seen anyone misusing or abusing the API due to this? Mick didn't come across it till I pointed it out and even then he had to construct an experiment where he's basically observing the over-the-wire behaviour in order to detect it.
>>>>
>>>> --Rafael
>>>
>>> The following code doesn't work:
>>>
>>>     while (True) {
>>>         wait_for_and_get_next_event(event);
>>>         pn_messenger_put(event);
>>>     }
>>>
>>> If I add a send after every put, I'm going to limit my maximum message rate. If I amortize my sends over every N puts, I may have arbitrarily/infinitely high latency on messages if the source of events goes quiet.

Having a background thread in the Messenger will only push this problem from your application to the Messenger implementation. Furthermore you will be at the mercy of the particulars of the client library implementation as to how this background thread will take care of the outstanding work. We could provide all kinds of knobs to tweak and tune this behaviour, but I'd be far more comfortable if I as the application developer can be in control of when the flush happens.

Either way you will have arbitrarily/infinitely high latency due to complications at the TCP stack or the OS level. But you can at least help your case a bit by having the application issue the flush rather than letting the messenger do it, because the application is in a better position to determine what the optimal conditions for doing so are, and those conditions could be other than time, msg or byte count.

> You can employ a timer along with your event count (or based on a byte count) to get around that problem. The timer will ensure you flush events when there isn't enough activity. Isn't that acceptable ?
>
>>> I guess I'm questioning the mission of the Messenger API. Which is the more important design goal: general-purpose ease of use, or strict single-threaded asynchrony?
>>
>> I wouldn't say it's a goal to avoid background threads, more of a really nice thing to avoid if we can, and quite possibly a necessary mode of operation in certain environments.
>>
>> I don't think your example code will work though even if there is a background thread.
>
> This is a key point I missed when I thought about the problem along the same lines as Ted. Having a background thread cannot guarantee that your messages will be written on to the wire, as that thread can be blocked due to TCP buffers being full or be suppressed in favour of another higher-priority thread (for longer than you desire), thus increasing your latency beyond acceptable limits. You will invariably have outliers in your latency graph. On the other hand the library code will be much simpler without the background thread.
>
>> What do you want to happen when things start backing up? Do you want messages to be dropped? Do you want put to start blocking? Do you just want memory to grow indefinitely?
>>
>> --Rafael
Re: put vs. send
On Wed, Mar 6, 2013 at 9:01 AM, Michael Goulish mgoul...@redhat.com wrote:
> - Original Message -
>> - Original Message -
>> From: Ted Ross tr...@redhat.com
>> To: proton@qpid.apache.org
>> Sent: Wednesday, March 6, 2013 10:35:47 AM
>> Subject: Re: put vs. send
>>
>>> On 03/06/2013 10:09 AM, Rafael Schloming wrote:
>>>> On Wed, Mar 6, 2013 at 6:52 AM, Ted Ross tr...@redhat.com wrote:
>>>>> On 03/06/2013 08:30 AM, Rafael Schloming wrote:
>>>>>> On Wed, Mar 6, 2013 at 5:15 AM, Ted Ross tr...@redhat.com wrote:
>>>>>>> This is exactly right. The API behaves in a surprising way and causes reasonable programmers to write programs that don't work. For the sake of adoption, we should fix this, not merely document it.
>>>>>>
>>>>>> This seems like a bit of a leap to me. Have we actually seen anyone misusing or abusing the API due to this? Mick didn't come across it till I pointed it out and even then he had to construct an experiment where he's basically observing the over-the-wire behaviour in order to detect it.
>>>>>>
>>>>>> --Rafael
>>>>>
>>>>> The following code doesn't work:
>>>>>
>>>>>     while (True) {
>>>>>         wait_for_and_get_next_event(event);
>>>>>         pn_messenger_put(event);
>>>>>     }
>>>>>
>>>>> If I add a send after every put, I'm going to limit my maximum message rate. If I amortize my sends over every N puts, I may have arbitrarily/infinitely high latency on messages if the source of events goes quiet.
>>>>>
>>>>> I guess I'm questioning the mission of the Messenger API. Which is the more important design goal: general-purpose ease of use, or strict single-threaded asynchrony?
>>>>
>>>> I wouldn't say it's a goal to avoid background threads, more of a really nice thing to avoid if we can, and quite possibly a necessary mode of operation in certain environments.
>>>>
>>>> I don't think your example code will work though even if there is a background thread. What do you want to happen when things start backing up? Do you want messages to be dropped? Do you want put to start blocking? Do you just want memory to grow indefinitely?
>>>
>>> Good question. I would want to either block so I can stop consuming events or get an indication that I would-block so I can take other actions. I understand that this is what send is for, but it's not clear when, or how often I should call it.
>>
>> This begs a question that was asked before (by Mick, I believe) - what happens if a put() message can _never_ be sent? The destination has gone away and will never come back. AFAIK, every further call to send() will block due to that stuck message. How should that be dealt with? Use a TTL?
>
> Well, I just tried it.
>
> Setup
> -----
>
> Set up a receiver to receive, get, and print out messages in a loop. ( receive blocking, i.e. timeout == default )
>
> Then start a sender (default timeout) that will:
>
>   1. put-and-send message 1.
>   2. put message 2.
>   3. countdown 12 seconds before it decides to send message 2.
>   4. send message 2.
>
> While the sender is getting ready to call send() on msg2, I kill the receiver.
>
> Result
> ------
>
> I see the receiver print out message 1. good.
>
> When the sender has put() msg2 but not yet sent(), I kill receiver.
>
> Sender calls send() on message2.
>
> send() returns immediately, return code is 0. (success)
>
> Analysis
> --------
>
> Bummer. Dropped message.

If you want reliability you need to check the status of the tracker you get when you call put(). You would also need to set a nonzero outgoing window so that messenger actually retains that status.

--Rafael
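
Why a nonzero outgoing window matters for checking a tracker can be seen in a toy model. This is a sketch only, not Proton code: `toy_store`, `toy_put`, `toy_settle` and `toy_status_of` are invented names, trackers are modelled as plain sequence numbers, and the window is assumed to be at most `MAX_WINDOW`.

```c
/* Toy model only: none of these names are Proton API. */
typedef enum { ST_UNKNOWN, ST_PENDING, ST_ACCEPTED, ST_REJECTED } toy_status;

enum { MAX_WINDOW = 16 };

typedef struct {
    int        window;            /* how many recent deliveries to remember (<= MAX_WINDOW) */
    long       next;              /* tracker handed out by the next put */
    toy_status slots[MAX_WINDOW];
} toy_store;

/* True while tracker t is still inside the retention window. */
static int toy_retained(const toy_store *s, long t) {
    return t >= 0 && t < s->next && s->next - t <= s->window;
}

/* Queue a message and return its tracker. */
static long toy_put(toy_store *s) {
    long t = s->next++;
    if (s->window > 0) s->slots[t % s->window] = ST_PENDING;
    return t;
}

/* Record the outcome reported for a delivery, if it is still retained. */
static void toy_settle(toy_store *s, long t, toy_status outcome) {
    if (toy_retained(s, t)) s->slots[t % s->window] = outcome;
}

/* Look up a delivery; anything outside the window has been forgotten. */
static toy_status toy_status_of(const toy_store *s, long t) {
    return toy_retained(s, t) ? s->slots[t % s->window] : ST_UNKNOWN;
}
```

With a window of zero every lookup returns `ST_UNKNOWN`, so a sender in the experiment above would have nothing to inspect after send() returns; with a window of at least one it could see that message 2 was never accepted.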
Re: put vs. send
On 03/05/2013 02:01 PM, Rafael Schloming wrote:
On Tue, Mar 5, 2013 at 10:42 AM, Michael Goulish mgoul...@redhat.com wrote:

So, am I understanding correctly? -- I should be able to get messages from my sender to my receiver just by calling put() -- if the receiver is ready to receive?

Not necessarily, the receiver being ready just means you are unblocked on AMQP level flow control. You could also potentially block on the socket write (i.e. TCP level flow control). You need to be unblocked on both for put to succeed.

Certainly there is no TCP flow control happening in Mick's scenario.

What I said was put is *allowed* to send optimistically, not that it is required to. It actually did send optimistically in a previous version of the code, however I commented that line out. I would say the documented semantics of put and send should allow the implementation the flexibility to do any of the following:
1) optimistically transmit whatever it can every time so long as it doesn't block
2) never bother transmitting anything until you force it to by calling send
3) anything in between the first two, e.g. magically transmit once you've put enough messages to reach the optimal batch size

The reason for the behaviour you are observing is that we currently do option 2 in the C impl, however we've done option 1 in the past (and I think we do option 1 still in the Java impl), and we will probably do option 3 in the future.

If this is the case, then Mick's original view is correct. The application must assume that messages will not ever be sent unless send is called. There is no flowing, pipelined, non-blocking producer.

--Rafael
Re: put vs. send
On Tue, Mar 5, 2013 at 2:01 PM, Rafael Schloming r...@alum.mit.edu wrote:
On Tue, Mar 5, 2013 at 10:42 AM, Michael Goulish mgoul...@redhat.com wrote:

quoth Rafi: The semantics of pn_messenger_put allow it to send if it can do so without blocking.

So, am I understanding correctly? -- I should be able to get messages from my sender to my receiver just by calling put() -- if the receiver is ready to receive?

Not necessarily, the receiver being ready just means you are unblocked on AMQP level flow control. You could also potentially block on the socket write (i.e. TCP level flow control). You need to be unblocked on both for put to succeed.

The only transmission difference between put() and send() is that send() will actually block until they're all sent (or timeout hits). put() should get rid of all the messages that aren't blocked, and leave all that are.

. . . Because what I'm seeing is -- with my receiver hanging in recv(), I put 5 messages. Sender sits there for a while. No messages arrive at receiver. Then sender calls send() -- and all 5 messages arrive at the receiver. This is true whether on the receiver side, I use

pn_messenger_recv ( messenger, 100 );
pn_messenger_recv ( messenger, 5 );
pn_messenger_recv ( messenger, 1 );
or
pn_messenger_recv ( messenger, -1 );

That's why it seemed two-stage to me. put() seems to get them staged, send() seems to shove them out the door. No? Or is this a bug?

What I said was put is *allowed* to send optimistically, not that it is required to. It actually did send optimistically in a previous version of the code, however I commented that line out. I would say the documented semantics of put and send should allow the implementation the flexibility to do any of the following:
1) optimistically transmit whatever it can every time so long as it doesn't block
2) never bother transmitting anything until you force it to by calling send
3) anything in between the first two, e.g.
magically transmit once you've put enough messages to reach the optimal batch size

The reason for the behaviour you are observing is that we currently do option 2 in the C impl, however we've done option 1 in the past (and I think we do option 1 still in the Java impl), and we will probably do option 3 in the future.

This is a good explanation that we need to put in the docs, as application developers certainly need to know how it behaves. If one were to use the current C impl, it certainly gives the impression that put() is meant to write messages into your internal buffer and send() will actually write them to the wire. Unfortunately some applications will depend on this behaviour, even though it's not advisable. If we are to change from, say, #2 to #1 or even #3, we need to release-note it prominently. I think the best solution is to make this behaviour configurable, and advertise the default very prominently. This way application developers will know exactly what they are getting instead of us making changes underneath.

Rajith

--Rafael
Re: put vs. send
On 03/05/2013 02:14 PM, Rajith Attapattu wrote:

This is a good explanation that we need to put in the docs, as application developers certainly need to know how it behaves. If one were to use the current C impl, it certainly gives the impression that put() is meant to write messages into your internal buffer and send() will actually write them to the wire. Unfortunately some applications will depend on this behaviour, even though it's not advisable. If we are to change from, say, #2 to #1 or even #3, we need to release-note it prominently. I think the best solution is to make this behaviour configurable, and advertise the default very prominently. This way application developers will know exactly what they are getting instead of us making changes underneath.

Rajith

Making this configurable multiplies the size of the test matrix. Can't we make this simpler?

To me, this sounds like an I/O facility in which your output lines may never get sent if you don't call fflush(). This will be a surprise to most programmers, who rarely use fflush(). I think most programmers would be happier if put caused the messages to be eventually sent and send was used only for blocking until messages were flushed out.

-Ted
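The fflush() analogy can be reproduced with ordinary buffered I/O. The sketch below uses Python's standard io module rather than C stdio or Messenger, so it illustrates the analogy only and says nothing about Proton internals; the function name buffered_write_demo and the mapping of write() to put() and flush() to send() are assumptions made for the comparison.

```python
import io


def buffered_write_demo():
    raw = io.BytesIO()                              # stands in for the wire
    out = io.BufferedWriter(raw, buffer_size=4096)  # stands in for the staging buffer
    out.write(b"event 1\n")                         # like put(): staged in the buffer only
    before_flush = raw.getvalue()                   # nothing has reached the wire yet
    out.flush()                                     # like send(): pushed out explicitly
    after_flush = raw.getvalue()
    return before_flush, after_flush
```

A small write stays in the buffer until flush() is called (or the buffer fills or the stream is closed), which is the surprise being described: a producer that only ever writes may never see its data leave.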
Re: put vs. send
On Tue, Mar 5, 2013 at 2:24 PM, Ted Ross tr...@redhat.com wrote:
On 03/05/2013 02:14 PM, Rajith Attapattu wrote:

This is a good explanation that we need to put in the docs, as application developers certainly need to know how it behaves. If one were to use the current C impl, it certainly gives the impression that put() is meant to write messages into your internal buffer and send() will actually write them to the wire. Unfortunately some applications will depend on this behaviour, even though it's not advisable. If we are to change from, say, #2 to #1 or even #3, we need to release-note it prominently. I think the best solution is to make this behaviour configurable, and advertise the default very prominently. This way application developers will know exactly what they are getting instead of us making changes underneath.

Rajith

Making this configurable multiplies the size of the test matrix. Can't we make this simpler?

I do understand your concern here, but the Java impl already does both #1 and #2 and Rafi wants to do #3 in the future. The old JMS client does something similar. I agree that if we just do option #2 (as you suggest below), then the application can easily do #1 and #3 on top of that. But I'm sure they would like it if the library implemented those strategies for them and they had the ability to pick a strategy.

To me, this sounds like an I/O facility in which your output lines may never get sent if you don't call fflush(). This will be a surprise to most programmers, who rarely use fflush(). I think most programmers would be happier if put caused the messages to be eventually sent and send was used only for blocking until messages were flushed out.

-Ted
Re: put vs. send
On Tue, Mar 5, 2013 at 11:10 AM, Ted Ross tr...@redhat.com wrote:
On 03/05/2013 02:01 PM, Rafael Schloming wrote:
On Tue, Mar 5, 2013 at 10:42 AM, Michael Goulish mgoul...@redhat.com wrote:

So, am I understanding correctly? -- I should be able to get messages from my sender to my receiver just by calling put() -- if the receiver is ready to receive?

Not necessarily, the receiver being ready just means you are unblocked on AMQP level flow control. You could also potentially block on the socket write (i.e. TCP level flow control). You need to be unblocked on both for put to succeed.

Certainly there is no TCP flow control happening in Mick's scenario.

What I said was put is *allowed* to send optimistically, not that it is required to. It actually did send optimistically in a previous version of the code, however I commented that line out. I would say the documented semantics of put and send should allow the implementation the flexibility to do any of the following:
1) optimistically transmit whatever it can every time so long as it doesn't block
2) never bother transmitting anything until you force it to by calling send
3) anything in between the first two, e.g. magically transmit once you've put enough messages to reach the optimal batch size

The reason for the behaviour you are observing is that we currently do option 2 in the C impl, however we've done option 1 in the past (and I think we do option 1 still in the Java impl), and we will probably do option 3 in the future.

If this is the case, then Mick's original view is correct. The application must assume that messages will not ever be sent unless send is called. There is no flowing, pipelined, non-blocking producer.

It's not correct as documentation of the API semantics. It's also not correct to say that messages will never be sent unless send is called, e.g.
the following code will work fine:

Client:
m.put(request);
m.recv(); // wait for reply
m.get(reply);

Server:
while True:
    m.recv(); // wait for request
    m.get(request)
    m.put(reply);

As for there being no flowing, pipelined, non-blocking producer, that seems like an orthogonal issue, and depending on what you mean I wouldn't say that's necessarily true either. You can certainly set the messenger's timeout to zero and then call put followed by send to get the exact same semantics you would get if put were to optimistically send every time.

--Rafael
Re: put vs. send
- Original Message - On Tue, Mar 5, 2013 at 11:10 AM, Ted Ross tr...@redhat.com wrote: On 03/05/2013 02:01 PM, Rafael Schloming wrote: On Tue, Mar 5, 2013 at 10:42 AM, Michael Goulish mgoul...@redhat.com wrote: So, am I understanding correctly? -- I should be able to get messages from my sender to my receiver just by calling put() -- if the receiver is ready to receive? Not necessarily, the receiver being ready just means you are unblocked on AMQP level flow control. You could also potentially block on the socket write (i.e. TCP level flow control). You need to be unblocked on both for put to succeed. Certainly there is no TCP flow control happening in Mick's scenario. What I said was put is *allowed* to send optimistically, not that it is required to. It actually did send optimistically in a previous version of the code, however I commented that line out. I would say the documented semantics of put and send should allow the implementation the flexibility to do any of the following: 1) optimistically transmit whatever it can everytime so long as it doesn't block 2) never bother transmitting anything until you force it to by calling send 3) anything in between the first two, e.g. magically transmit once you've put enough messages to reach the optimal batch size The reason for the behaviour you are observing is that we currently do option 2 in the C impl, however we've done option 1 in the past (and I think we do option 1 still in the Java impl), and we will probably do option 3 in the future. If this is the case, then Mick's original view is correct. The application must assume that messages will not ever be sent unless send is called. There is no flowing, pipelined, non-blocking producer. It's not correct as documentation of the API semantics. It's also not correct to say that messages will never be sent unless send is called, e.g. 
the following code will work fine:

Client:
m.put(request);
m.recv(); // wait for reply
m.get(reply);

That recv() just caused an enqueued message to be sent? I just tried it in a C program and it worked, more or less. At least the receiver got his 5 messages without send() ever having been called. Does a call to recv() also cause all unblocked messages to be sent? Is that symmetric? Does a call to send() also cause messages to be received?

Server:
while True:
    m.recv(); // wait for request
    m.get(request)
    m.put(reply);

As for there being no flowing, pipelined, non-blocking producer, that seems like an orthogonal issue, and depending on what you mean I wouldn't say that's necessarily true either. You can certainly set the messenger's timeout to zero and then call put followed by send to get the exact same semantics you would get if put were to optimistically send every time.

--Rafael
Re: put vs. send
On Tue, Mar 5, 2013 at 3:20 PM, Rafael Schloming r...@alum.mit.edu wrote:
On Tue, Mar 5, 2013 at 11:33 AM, Rajith Attapattu rajit...@gmail.com wrote:
On Tue, Mar 5, 2013 at 2:24 PM, Ted Ross tr...@redhat.com wrote:
On 03/05/2013 02:14 PM, Rajith Attapattu wrote:

This is a good explanation that we need to put in the docs, as application developers certainly need to know how it behaves. If one were to use the current C impl, it certainly gives the impression that put() is meant to write messages into your internal buffer and send() will actually write them to the wire. Unfortunately some applications will depend on this behaviour, even though it's not advisable. If we are to change from, say, #2 to #1 or even #3, we need to release-note it prominently. I think the best solution is to make this behaviour configurable, and advertise the default very prominently. This way application developers will know exactly what they are getting instead of us making changes underneath.

Rajith

Making this configurable multiplies the size of the test matrix. Can't we make this simpler?

I do understand your concern here, but the Java impl already does both #1 and #2 and Rafi wants to do #3 in the future. The old JMS client does something similar. I agree that if we just do option #2 (as you suggest below), then the application can easily do #1 and #3 on top of that. But I'm sure they would like it if the library implemented those strategies for them and they had the ability to pick a strategy.

I don't see why we'd make this configurable. All three options actually fit the same general semantics. Even if you're optimistically trying to transmit every single time put is called, it's entirely possible for the socket to be blocked every single time you try. If this were to happen, the implementation of #1 would appear to behave precisely the same as #2 behaves.
In other words, if you're coding correctly against the API you can't assume that put will or won't have transmitted anything regardless of which strategy is used internally.

I agree with you. You make a very good point. Perhaps we should explicitly make that clear in our docs to avoid applications written against wrong assumptions.

Rajith

--Rafael
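The argument that all three strategies fit one contract can be sketched with a toy model. Everything here is hypothetical: the ToyMessenger class, the policy names "eager", "lazy" and "batch", and the socket_writable flag are illustration only and are not how the C or Java implementations are written. The sketch shows that the amount on the wire after put() varies by policy (and by socket state), while the result after send() does not.

```python
class ToyMessenger:
    """Hypothetical model of the three put() strategies; not the Proton API."""

    def __init__(self, policy, batch_size=3, socket_writable=True):
        self.policy = policy                  # "eager" (#1), "lazy" (#2) or "batch" (#3)
        self.batch_size = batch_size
        self.socket_writable = socket_writable
        self.outbox = []                      # messages staged by put()
        self.wire = []                        # messages actually transmitted

    def _flush(self):
        if self.socket_writable:
            self.wire.extend(self.outbox)
            self.outbox.clear()

    def put(self, message):
        self.outbox.append(message)
        if self.policy == "eager":
            self._flush()
        elif self.policy == "batch" and len(self.outbox) >= self.batch_size:
            self._flush()
        # "lazy" never transmits from put()

    def send(self):
        # A real send() would block until the socket is writable; the toy
        # simply assumes it becomes writable.
        self.socket_writable = True
        self._flush()


def on_wire(policy, socket_writable=True):
    m = ToyMessenger(policy, socket_writable=socket_writable)
    for i in range(4):
        m.put("m%d" % i)
    after_puts = len(m.wire)
    m.send()
    return after_puts, list(m.wire)
```

Note that the eager policy with a blocked socket is indistinguishable from the lazy policy, which is why code written against the API should treat send() as the only point at which transmission is forced.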
Re: put vs. send
- Original Message - On Tue, Mar 5, 2013 at 3:20 PM, Rafael Schloming r...@alum.mit.edu wrote: On Tue, Mar 5, 2013 at 11:33 AM, Rajith Attapattu rajit...@gmail.comwrote: On Tue, Mar 5, 2013 at 2:24 PM, Ted Ross tr...@redhat.com wrote: On 03/05/2013 02:14 PM, Rajith Attapattu wrote: This is a good explanation that we need to put in the docs, as Application developers certainly need to know how it behaves. If one were to use the current C impl, it certainly gives the impression that put() is meant to write messages into your internal buffer and send() will actually write it to the wire. Unfortunately some applications will depend on this behaviour, even though it's not advisable If we are to change from say #2 to #1 or even #3 we need to release note it prominently. I think the best solution is to make this behaviour configurable, and advertise the default very prominently. This way application developers will know exactly what they are getting instead of us making changes underneath. Rajith Making this configurable multiplies the size of the test matrix. Can't we make this simpler? I do understand your concern here, but the Java impl already does both #1 and #2 and Rafi wants to do #3 in the future. The old JMS client does something similar. I agree that if we just do option #2 (as you suggest below), then the application can easily do #1 and #3 on top of that. But I'm sure they will like if the library implements those strategies for them and they have the ability to pick a strategy. I don't see why we'd make this configurable. All three options actually fit the same general semantics. Even if you're optimistically trying to transmit every single time put is called it's entirely possible for the socket to be blocked every single time you try. If this were to happen the implementation of #1 would appear to behave precisely the same as #2 behaves. 
In other words, if you're coding correctly against the API you can't assume that put will or won't have transmitted anything regardless of which strategy is used internally.

I agree with you. You make a very good point. Perhaps we should explicitly make that clear in our docs to avoid applications written against wrong assumptions.

I can certainly do that, but it seems to me that semantics should be simple, obvious, and orthogonal. What seems non-simple and non-obvious to me so far is:

put() might send, or not. It doesn't send now, but it might later.
recv() can cause messages to be sent.
send() can cause messages to be received.

I would think that
1. every verb should only mean one thing
2. there should be a simple mental model, against which every verb performs a predictable action.

so for example:

put ( messenger, message );       // enqueue for sending
send ( messenger, BLOCK );        // block till all sent.
send ( messenger, DONT_BLOCK );   // send what you can.
credit ( messenger, 10 );         // limit incoming queue size
recv ( messenger, BLOCK );        // block till I get a message
recv ( messenger, DONT_BLOCK );   // if no messages incoming, return.

Rajith

--Rafael
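The send side of the one-verb-one-meaning proposal above could look roughly like the following toy model. This is a sketch of the suggestion, not an existing API: BLOCK, DONT_BLOCK, ToyMessenger and writable_per_pass (how many messages the pretend socket accepts per attempt) are all hypothetical, and the recv() and credit() verbs are left out to keep it short.

```python
from collections import deque

BLOCK, DONT_BLOCK = "BLOCK", "DONT_BLOCK"


class ToyMessenger:
    """Hypothetical model of the proposed verbs; not the Proton API."""

    def __init__(self, writable_per_pass=2):
        if writable_per_pass < 1:
            raise ValueError("the toy socket must accept at least one message per pass")
        self.writable_per_pass = writable_per_pass
        self.outbox = deque()
        self.wire = []

    def put(self, message):
        # Enqueue for sending, and nothing else.
        self.outbox.append(message)

    def _write_pass(self):
        for _ in range(min(self.writable_per_pass, len(self.outbox))):
            self.wire.append(self.outbox.popleft())

    def send(self, mode):
        # DONT_BLOCK: one attempt, send what the socket takes right now.
        # BLOCK: keep going until the outbox is empty (a real implementation
        # would wait on the socket between attempts instead of spinning).
        self._write_pass()
        while mode == BLOCK and self.outbox:
            self._write_pass()


def staged_then_sent():
    m = ToyMessenger(writable_per_pass=2)
    for i in range(5):
        m.put(i)
    after_put = len(m.wire)
    m.send(DONT_BLOCK)
    after_nonblocking = len(m.wire)
    m.send(BLOCK)
    after_blocking = len(m.wire)
    return after_put, after_nonblocking, after_blocking
```

In this model put() never touches the wire, so the answer to "has anything been sent yet?" depends only on which send() calls have been made.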
Re: put vs. send
On Tue, Mar 5, 2013 at 12:39 PM, Michael Goulish mgoul...@redhat.comwrote: - Original Message - On Tue, Mar 5, 2013 at 11:10 AM, Ted Ross tr...@redhat.com wrote: On 03/05/2013 02:01 PM, Rafael Schloming wrote: On Tue, Mar 5, 2013 at 10:42 AM, Michael Goulish mgoul...@redhat.com wrote: So, am I understanding correctly? -- I should be able to get messages from my sender to my receiver just by calling put() -- if the receiver is ready to receive? Not necessarily, the receiver being ready just means you are unblocked on AMQP level flow control. You could also potentially block on the socket write (i.e. TCP level flow control). You need to be unblocked on both for put to succeed. Certainly there is no TCP flow control happening in Mick's scenario. What I said was put is *allowed* to send optimistically, not that it is required to. It actually did send optimistically in a previous version of the code, however I commented that line out. I would say the documented semantics of put and send should allow the implementation the flexibility to do any of the following: 1) optimistically transmit whatever it can everytime so long as it doesn't block 2) never bother transmitting anything until you force it to by calling send 3) anything in between the first two, e.g. magically transmit once you've put enough messages to reach the optimal batch size The reason for the behaviour you are observing is that we currently do option 2 in the C impl, however we've done option 1 in the past (and I think we do option 1 still in the Java impl), and we will probably do option 3 in the future. If this is the case, then Mick's original view is correct. The application must assume that messages will not ever be sent unless send is called. There is no flowing, pipelined, non-blocking producer. It's not correct as documentation of the API semantics. It's also not correct to say that messages will never be sent unless send is called, e.g. 
the following code will work fine:

Client:
m.put(request);
m.recv(); // wait for reply
m.get(reply);

That recv() just caused an enqueued message to be sent? I just tried it in a C program and it worked, more or less. At least the receiver got his 5 messages without send() ever having been called. Does a call to recv() also cause all unblocked messages to be sent?

When a messenger is blocked it will do whatever outstanding work it can. This won't necessarily cause all messages to be sent, but it may cause some.

Is that symmetric? Does a call to send() also cause messages to be received?

Yes it is symmetric, messages can arrive while you're blocking for send().

--Rafael
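One way to picture "a blocked messenger does whatever outstanding work it can" is a single internal work step that both send() and recv() run. The sketch below is a single-threaded toy under that assumption; ToyMessenger, socket_in, connect and the helper functions are hypothetical names, and real blocking is replaced by explicit calls so the example can run in one thread.

```python
from collections import deque


class ToyMessenger:
    """Hypothetical model: send() and recv() share one work step."""

    def __init__(self):
        self.peer = None
        self.outbox = deque()     # staged by put()
        self.socket_in = deque()  # arrived from the peer, not yet processed
        self.inbox = deque()      # ready for get()

    def _work(self):
        # Outstanding output and outstanding input are both serviced here.
        while self.outbox:
            self.peer.socket_in.append(self.outbox.popleft())
        while self.socket_in:
            self.inbox.append(self.socket_in.popleft())

    def put(self, message):
        self.outbox.append(message)

    def get(self):
        return self.inbox.popleft()

    def send(self):
        self._work()

    def recv(self):
        self._work()
        return len(self.inbox)


def connect():
    a, b = ToyMessenger(), ToyMessenger()
    a.peer, b.peer = b, a
    return a, b


def request_reply():
    client, server = connect()
    client.put("request")
    client.recv()   # no reply yet, but the request goes out as a side effect
    server.recv()
    request = server.get()
    server.put("reply to " + request)
    server.recv()   # the server's next loop iteration pushes the reply out
    client.recv()
    return client.get()


def send_also_receives():
    a, b = connect()
    b.put("hello")
    b.send()        # "hello" is now waiting on a's socket
    a.put("x")
    a.send()        # a's send() also drains a's incoming data
    return list(a.inbox)
```

Neither side of request_reply() ever calls send(), which matches the client/server example quoted above, and send_also_receives() shows the symmetric case.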