This seems like a pretty thorough reply overall. I'll address some issues inline.

1. The performance figures that thrift-protobuf-compare provides for dynamic serialization systems like JSON are not currently valid since the tests do
not really test them as a fully general serialization/deserialization
framework.

You mean that since they coded their JSON serialization directly against their test data, the performance data is inaccurate? That might be true in the sense that their serialization skips some things Thrift does, but since we generate code, I don't think it's *that* different.

2. Using TCompactProtocol, Thrift serialization speed and serialized size
are basically equivalent to protocol buffers

Cool! I bet that we could find other datasets that exercise both protos better and could turn up slight differences one way or another.

I updated to the trunk of Thrift (rev 773454) and changed the Thrift
serializer to use TCompactProtocol instead of TBinaryProtocol. I also
corrected ThriftSerializer's create() so that the same data was being sent for image2 as in ProtobufSerializer. Finally, I updated the formatting in BenchmarkRunner and commented out all the serializers except Thrift and protocol buffers. Here are the benchmark results from 3 consecutive runs:

        , Create,      Ser,    Deser,    Total, Size
thrift  , 267.37,  8314.00,  8546.00, 17127.36,  220
protobuf, 412.98, 12642.00,  5217.50, 18272.48,  217

        , Create,      Ser,    Deser,    Total, Size
thrift  , 266.87, 10905.50,  8526.50, 19698.86,  220
protobuf, 415.21, 11880.50,  4930.00, 17225.71,  217

        , Create,      Ser,    Deser,    Total, Size
thrift  , 264.95, 11059.50,  8701.50, 20025.95,  220
protobuf, 417.45, 11125.00,  5203.50, 16745.95,  217

It's interesting that we're that much faster at object creation, but in these runs creation is only roughly 1.3% to 2.5% of the total time, so it's probably not that significant. We should definitely address our deserialization time, though.
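
To put numbers on that, here is a small sketch that computes each phase's share of the total, using the figures from the first run above. The class and method names are made up for illustration; only the input numbers come from the benchmark output.

```java
public class PhaseShare {
    // Share of one phase in the total time, as a percentage.
    static double sharePercent(double phase, double total) {
        return 100.0 * phase / total;
    }

    public static void main(String[] args) {
        // First run from the results above: {Create, Ser, Deser, Total}.
        double[] thrift = {267.37, 8314.00, 8546.00, 17127.36};
        double[] protobuf = {412.98, 12642.00, 5217.50, 18272.48};
        String[] phases = {"Create", "Ser", "Deser"};

        for (int i = 0; i < phases.length; i++) {
            System.out.printf("%-6s thrift %5.2f%%  protobuf %5.2f%%%n",
                    phases[i],
                    sharePercent(thrift[i], thrift[3]),
                    sharePercent(protobuf[i], protobuf[3]));
        }
    }
}
```

In that run, deserialization is about half of Thrift's total but under a third of protobuf's, which is why it looks like the place to dig.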

Looking a little further at deserialization since Thrift seemed to be
performing worse than protocol buffers there, the problem may be related to
the fact that protocol buffers provides APIs that support direct
serialization to and deserialization from byte arrays which Thrift does not provide. The test harness is set up such that the output of serialize() and the input of deserialize() is a byte array, so this means that Thrift needs to do more work to match up with the test harness. I am still investigating
this.

Are you saying that you think there is significant overhead to TSerializer? Or rather, how *is* the test going from objects to byte[]?
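
If the overhead is in the byte[] handling, it could look something like the sketch below. This is not Thrift or protobuf code: the varint payload and the `ByteArrayPaths` class are stand-ins I made up. It assumes one serializer that writes through a `ByteArrayOutputStream` and then calls `toByteArray()` (which copies the internal buffer), and another that computes the size first and writes straight into the final array.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

public class ByteArrayPaths {
    // Hypothetical payload encoding: unsigned varint, 7 bits per byte.
    static int writeVarint(int value, byte[] buf, int pos) {
        while ((value & ~0x7F) != 0) {
            buf[pos++] = (byte) ((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        buf[pos++] = (byte) value;
        return pos;
    }

    // Number of bytes writeVarint will emit for this value (1 to 5).
    static int varintSize(int value) {
        int size = 1;
        while ((value & ~0x7F) != 0) {
            size++;
            value >>>= 7;
        }
        return size;
    }

    // Path A: write into a growable stream, then copy out to a byte[].
    static byte[] serializeViaStream(int[] values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] scratch = new byte[5];
        for (int v : values) {
            int n = writeVarint(v, scratch, 0);
            out.write(scratch, 0, n);
        }
        return out.toByteArray(); // extra copy of the internal buffer
    }

    // Path B: compute the exact size, then write directly into the result.
    static byte[] serializeDirect(int[] values) {
        int size = 0;
        for (int v : values) {
            size += varintSize(v);
        }
        byte[] result = new byte[size];
        int pos = 0;
        for (int v : values) {
            pos = writeVarint(v, result, pos);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] values = {0, 1, 127, 128, 300, 1 << 20, -1};
        byte[] a = serializeViaStream(values);
        byte[] b = serializeDirect(values);
        System.out.println("same bytes: " + Arrays.equals(a, b));
    }
}
```

Both paths produce the same bytes; the difference is the buffer growth and the final copy in path A. Whether that is actually what costs us here is exactly what a profiler run should tell us.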

It sounds like it might be interesting to strap a profiler to this test code and see if it shows us anything we can fix.
