A recent bug (PROTON-222) has exposed an issue with the transport interface. The details of the bug are documented in the JIRA, but the basic issue is that given the current transport interface there is no way for an application to discover via the engine interface when data has been actually written over the wire vs just sitting there waiting to be written over the wire.
Note that by "written over the wire" I mean copied from an in-process buffer that will vanish as soon as the process exits, to a kernel buffer that will continue to (re)-transmit the data for as long as the machine is alive, connected to the network, and the remote endpoint is still listening. To understand the issue, imagine implementing a driver for the current transport interface. It might allocate a buffer of 1K and then call pn_transport_output to fill that buffer. The transport might then put say 750 bytes of data into that buffer. Now imagine what happens if the driver can only write 500 of those bytes to the socket. This will leave the driver buffering 250 bytes. The engine of course has no way of knowing this and can only assume that all 750 bytes will get written out. A somewhat related issue is the buffering ownership/strategy between the transport and the driver. There are really three basic choices here: 1) the driver owns the buffer, and the transport does no buffering 2) the transport owns the buffer, and the driver does no buffering 3) they both own their own buffers and we copy from one to the other Now the division between these isn't always static, there are hybrid strategies and what not, however it's useful to think of these basic cases. The current transport interface (pn_transport_input/output) and initial implementation was designed around option (1), the idea behind the pn_transport_output signature was that the engine could directly encode into a driver-owned buffer. This, however, turned out to introduce some unfriendly coupling. Imagine what happens in our hypothetical scenario above if the driver has a 1K buffer and the engine negotiates a 4K frame size. The engine might end up getting stuck with a frame that is too large to encode directly into the driver's buffer. So to make the interface more friendly, we modified the implementation to do buffering internally if necessary, thus ending up in some ways closer to option (3). Now the reason this buffering issue is related to PROTON-222 is that one way to allow the engine to know whether data is buffered or not is to redefine the interface around option (2), thereby allowing the engine to always have visibility into what is/isn't on the wire. This would also in some cases eliminate some of the extra copying that occurs currently due to our evolution towards option (3). Such an interface would look something like this: // deprecated and internally implemented with capacity, buffer, and push ssize_t pn_transport_input(pn_transport_t *transport, const char *bytes, size_t available); // deprecated and internally implemented with pending, peek, and pop ssize_t pn_transport_output(pn_transport_t *transport, char *bytes, size_t size); /** Report the amount of free space in a transport's input buffer. If * the engine is in an error state or has reached the end of stream * state, a negative value will be returned indication the condition. * * @param[in] transport the transport * @return the free space in the transport */ ssize_t pn_transport_capacity(pn_transport_t *transport); /** Return a pointer to a transport's input buffer. This pointer may * change when calls into the engine are made. The amount of space in * this buffer is reported by ::pn_transport_capacity. Calls to * ::pn_transport_push may change the value of this pointer and the * amount of free space reported by ::pn_transport_capacity. * * @param[in] transport the transport * @return a pointer to the transport's input buffer */ char *pn_transport_buffer(pn_transport_t *transport); /** Push data from a transport's input buffer into the engine and * return how much data was succesfully pushed. * * @param[in] transport the transport * @param[size] the amount of data in the transport's input buffer * @return the amount of data consumed */ size_t pn_transport_push(pn_transport_t *transport, size_t size); /** Return the number of pending output bytes for a transport, or a * negative error code if the engine is in an error state. * * @param[in] the transport * @return the number of pending output bytes, or an error code */ ssize_t pn_transport_pending(pn_transport_t *transport); /** Return a pointer to a transport's output buffer. Any calls into * the engine may change the value of this pointer and it's contents. * * @param[in] the transport * @return a pointer to the transport's output buffer */ const char *pn_transport_peek(pn_transport_t *transport); /** Remove bytes from the head of the output buffer for a transport. * * @param[in] the transport * @param[size] the number of bytes to remove */ void pn_transport_pop(pn_transport_t *transport, size_t size); This new interface basically models the transport as a queue of bytes. On the input side, pn_transport_input is decomposed into three separate parts: capacity, buffer, and push. On the output side, pn_transport_output is decomposed into three separate parts: pending, peek, and pop. These changes would simplify the current driver implementation as it would no longer need to do it's own buffering, and usage of the peek/pop style interface on the output side would be a much safer interface for driver's in general and allow the engine to know what has/hasn't made it onto the wire. To revisit our hypothetical scenario, the driver no longer needs to bother with its 1K buffer, it simply queries the transport for pending bytes, offers them all to the socket write, and then pops only those bytes that have actually been written. This also moves the transport interface firmly into the camp of option (2). It's somewhat unavoidable that the transport needs to do it's own buffering, particularly when you consider things like SSL, and exposing that fact at least allows the option of a driver that avoids the copy. It should be noted though that there are some reasons that a driver may want to own it's own buffers, and in these cases an extra copy will be necessary, resulting again in something that looks like option (3). There are ways we might address this in the future by providing hooks into the allocation that the transport does, however I don't believe that affects the part of the interface presented above. Please have a look over this and let me know what your thoughts are. I'd appreciate some comments from the Java folks on how to address this problem for the Java transport interface. I don't think the buffer management needs to be identical to the C as the whole buffer ownership problem is a bit different with garbage collection (e.g. it's less of a problem to create buffers in one component and pass ownership to another component), however I believe the interface does need to address the same thing that the peek/pop portion of this proposal is addressing, i.e. adding some means of confirming when the bytes are actually "on the wire". --Rafael