Hi Everyone, I've been doing a bit more exploration around some of the codec strategies I posted about a few weeks ago and I'd like to share some results.
There are still some gaps to fill in, but all the complex data types (lists, maps, arrays, described types, etc.) are handled for both encode and decode. Those are important for evaluating performance since they are the most complex to encode/decode, and as such they have the biggest impact on performance. You can look at what I've done here:

 - https://github.com/rhs/qpid-proton/tree/codec/proton-j/src/main/java/org/apache/qpid/proton/codec2

Note the codec2 package name is temporary; it's just so it can live alongside the existing codec in the same codebase.

I've put together a basic benchmark to compare against the existing codec's performance here:

 - https://github.com/rhs/qpid-proton/blob/codec/proton-j/src/main/java/org/apache/qpid/proton/codec2/Benchmark.java

The benchmark encodes and decodes a list of 10 integers and a UUID. My hope is that this is a reasonable approximation of what appears in a common frame, e.g. a transfer or flow frame. So far the results are encouraging. On my system the new codec is roughly 8 to 9 times faster than the existing codec on encode, and about 5 times faster on decode:

  [rhs@venture build]$ java -cp proton-j/proton-j.jar org.apache.qpid.proton.codec2.Benchmark 100000000 all
  new encode: 9270 millis
  new decode: 7764 millis
  existing encode: 78725 millis
  existing decode: 40175 millis

The above Benchmark invocation runs through 100 million encodes/decodes, and the timing results shown are typical.

In addition to the raw performance demonstrated by the Benchmark, there are some interesting and potentially key aspects of the design that would enable higher performance usage patterns. The decoder works by scanning the encoded byte stream and calling into the data handler when types are encountered. The data handler is not actually passed the decoded value; instead it is passed a Decoder (which is just a reference into the stream).
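To make the callback shape concrete, here is a minimal sketch of the pattern in isolation. The real DataHandler and Decoder interfaces live in the codec2 package and differ in detail; the names and the four-byte-int "wire format" below are purely illustrative.

```java
import java.nio.ByteBuffer;

public class CallbackDecodeSketch {

    // Simplified stand-in for the codec2 Decoder: just a cursor into the
    // stream. Extracting a value is the handler's choice, not the scanner's.
    static final class Decoder {
        private final ByteBuffer buf;
        Decoder(ByteBuffer buf) { this.buf = buf; }
        // Absolute read at the current position; allocates nothing.
        int getInt() { return buf.getInt(buf.position()); }
        void skip() { buf.position(buf.position() + 4); }
    }

    // Simplified stand-in for the codec2 DataHandler: the handler is handed
    // the Decoder (a reference into the stream), never a decoded object.
    interface DataHandler {
        void onInt(Decoder decoder);
    }

    // Scans a stream of 4-byte ints, calling the handler for each one.
    static void decode(ByteBuffer buf, DataHandler handler) {
        Decoder decoder = new Decoder(buf);
        while (buf.remaining() >= 4) {
            handler.onInt(decoder); // handler may extract the value, or not
            decoder.skip();
        }
    }

    // Example handler: sums the ints without any intermediate objects.
    static int sum(ByteBuffer buf) {
        int[] total = {0};
        decode(buf, d -> total[0] += d.getInt());
        return total[0];
    }

    // Convenience for demos/tests: encode the given ints, then sum them.
    static int sumInts(int... values) {
        ByteBuffer buf = ByteBuffer.allocate(4 * values.length);
        for (int v : values) buf.putInt(v);
        buf.flip();
        return sum(buf);
    }

    public static void main(String[] args) {
        System.out.println(sumInts(1, 2, 3)); // prints 6
    }
}
```

The point of the shape is that decode() never materializes values itself; whether any garbage is created is entirely up to the handler.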
The decoder can then be used by the handler to extract the desired value from the data stream. This design allows for a couple of nice things.

For one thing, there is zero intermediate garbage created by the decoding process itself; the only garbage produced is at the request of the handler. For example, if the handler wants to extract a string as a full blown String object it is free to do that and will incur the associated overhead, but the handler could also just copy the utf8 bytes directly to some final destination and avoid any conversion overhead. This also provides an added measure of convenience and robustness, since the 'type on the wire' can be converted directly to the desired Java type, e.g. if it's an integral type on the wire, your handler can just call getInt() or getLong() and the decoder will convert/coerce automatically.

Another nice thing about this design is that there is minimal decode overhead if the handler doesn't decode the type. This makes it possible to quite efficiently scan for particular value(s) deep inside an encoded stream. For example, it should be possible to write a handler that extremely efficiently evaluates a predicate against the message properties for things like selectors/content based routing rules. It should also be possible to write a handler that very efficiently copies a data stream while modifying only a few values, e.g. copy a message from an input buffer to an output buffer while updating just the ttl and delivery count and adding some sort of trace header. We could even extend the design to allow extremely efficient in-place modification of fixed width values if we find that to be useful.

In addition to these lower level usage scenarios, it is also quite straightforward to transform an encoded data stream into a full blown object representation if performance is less critical.
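As a rough illustration of the scanning idea, here is a toy sketch of skipping past values you don't care about. The record layout (1-byte field id followed by a 4-byte int) is invented for this example and is much simpler than real AMQP encodings; the names are not part of the codec2 API.

```java
import java.nio.ByteBuffer;

public class SelectorScanSketch {

    // Toy wire format for this sketch: repeated records of
    //   1-byte field id | 4-byte big-endian int value
    // (AMQP's real type encodings are richer; this only shows the skipping.)

    // Returns the value of the first record with the wanted field id, or -1
    // if absent. Records we don't care about are skipped with position
    // arithmetic only -- nothing is decoded and no garbage is produced,
    // which is what would make predicate evaluation for selectors cheap.
    static int scanFor(ByteBuffer buf, byte wantedField) {
        while (buf.remaining() >= 5) {
            byte field = buf.get();
            if (field == wantedField) {
                return buf.getInt();
            }
            buf.position(buf.position() + 4); // skip the value undecoded
        }
        return -1;
    }

    // Convenience wrapper: encode the given field/value pairs, then scan.
    static int lookup(byte wanted, byte[] fields, int[] values) {
        ByteBuffer buf = ByteBuffer.allocate(5 * fields.length);
        for (int i = 0; i < fields.length; i++) {
            buf.put(fields[i]).putInt(values[i]);
        }
        buf.flip();
        return scanFor(buf, wanted);
    }

    public static void main(String[] args) {
        // Find the second field's value without decoding the others.
        System.out.println(lookup((byte) 2,
                new byte[] {1, 2, 3}, new int[] {10, 20, 30})); // prints 20
    }
}
```

The same skip-don't-decode move is what would let a copying handler rewrite just the ttl and delivery count while streaming everything else through untouched.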
The codec includes a POJOBuilder which implements the DataHandler interface and transforms an AMQP byte stream into simple Java objects. I've put together an example usage here:

 - https://github.com/rhs/qpid-proton/blob/codec/proton-j/src/main/java/org/apache/qpid/proton/codec2/Example.java

I'd like to get people's feedback on these ideas. I would like this codec layer to be usable/useful as a first class API in its own right, and not just an implementation detail of the protocol engine. If people are happy with the design and the API, I think it would be a relatively straightforward process to generate some DataHandler implementations from the protocol XML that would effectively replace the existing codec layer in the engine, and hopefully provide a significant performance improvement as a result.

--Rafael