Evan Jones wrote:
Problem 2: Using the NIO encoders/decoders can be faster than String.getBytes, but only if it is used >= 4 times. If used only once, it is worse. The same is approximately true about decoding. Lame results: http://evanjones.ca/software/java-string-encoding.html


I'm revisiting this old issue, thanks to being reminded about it by an earlier message. I've tested this with "recent" JVMs, and it still seems to hold true: using the NIO encoders and decoders can be faster than using String.getBytes(). My numbers show that encoding is approximately 40% faster, while decoding shows a smaller improvement. See the following for details on this microbenchmark:

http://evanjones.ca/software/java-string-encoding.html


This surprises me, since it suggests that Sun/Oracle could replace their implementation of String.getBytes() with something similar to my code, and get a performance improvement. In fact, with privileged access to the internals of a String, they should be able to do even better.
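For readers who have not used the NIO API, here is a minimal sketch of the general idea: keep one CharsetEncoder around and reuse it across calls instead of calling String.getBytes() each time. This is not the code from the benchmark page or from the protobuf patch; the class name ReusableUtf8Encoder and the worst-case buffer sizing are assumptions made for illustration only.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not taken from the benchmark or the protobuf patch.
public final class ReusableUtf8Encoder {
    // One encoder instance reused across calls. CharsetEncoder is not
    // thread-safe, so each thread needs its own instance of this class.
    private final CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
            // Replace bad input (e.g. lone surrogates) rather than failing,
            // which is also what String.getBytes() does.
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);

    public byte[] encode(String s) {
        // Assumption: size the output for the worst case so that a single
        // encode() call is always enough.
        int maxBytes = (int) Math.ceil(s.length() * (double) encoder.maxBytesPerChar());
        ByteBuffer out = ByteBuffer.allocate(maxBytes);

        encoder.reset();
        CoderResult result = encoder.encode(CharBuffer.wrap(s), out, true);
        if (result.isOverflow()) {
            throw new IllegalStateException("output buffer too small");
        }
        encoder.flush(out);

        // Copy only the bytes that were actually written.
        return Arrays.copyOf(out.array(), out.position());
    }
}
```

Whether reusing the encoder like this beats String.getBytes() depends on the JVM and the input, so it is worth measuring before adopting it.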

I've integrated this change into protobuf and with the microbenchmark in the protobuf source tree, it shows a performance improvement of ~20% (numbers below). I'll send a code review with the code shortly, once I've cleaned it up a bit, in case anyone wants to look at it.


Pros:
+ Faster protocol buffer encoding.

Cons:
- Far more code to handle encoding (roughly an extra hundred lines), which means more room for bugs.
- Extra memory (~2 kB per thread for encoding)
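To illustrate where a per-thread memory cost like this can come from, here is a rough sketch in which each thread keeps its own encoder plus a small scratch buffer. It is not the actual patch: the class name PerThreadUtf8, the 2048-byte scratch size, and the use of ByteArrayOutputStream to collect the output are all assumptions chosen for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the code from the protobuf patch.
public final class PerThreadUtf8 {
    // Assumption: a 2 kB scratch buffer, picked only to mirror the
    // "~2 kB per thread" figure; the real layout may differ.
    private static final int SCRATCH_BYTES = 2048;

    // Per-thread state: the encoder and the scratch buffer are allocated
    // once per thread and then reused for every string.
    private static final class State {
        final CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        final ByteBuffer scratch = ByteBuffer.allocate(SCRATCH_BYTES);
    }

    private static final ThreadLocal<State> STATE = ThreadLocal.withInitial(State::new);

    private PerThreadUtf8() {}

    public static byte[] encode(String s) {
        State st = STATE.get();
        st.encoder.reset();
        CharBuffer in = CharBuffer.wrap(s);
        ByteArrayOutputStream collected = new ByteArrayOutputStream(s.length());

        // Encode in scratch-sized chunks; OVERFLOW means the scratch buffer
        // filled up and there is more input left to encode.
        CoderResult result;
        do {
            st.scratch.clear();
            result = st.encoder.encode(in, st.scratch, true);
            collected.write(st.scratch.array(), 0, st.scratch.position());
        } while (result.isOverflow());

        // Flush any bytes the encoder is still holding internally.
        do {
            st.scratch.clear();
            result = st.encoder.flush(st.scratch);
            collected.write(st.scratch.array(), 0, st.scratch.position());
        } while (result.isOverflow());

        return collected.toByteArray();
    }
}
```

The trade-off is the one listed above: the fixed scratch buffer avoids a fresh worst-case allocation per string, at the price of memory that stays attached to every thread that has ever encoded a string.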


I'm unsure whether the benefits outweigh the costs, particularly since it is surprising that this is faster at all, and I wouldn't be shocked if future JVM / JDK releases made this "optimization" useless. I'll leave that decision to the protocol buffer maintainers.

Evan



Results: All the serialize results are better. The deserialize results are unchanged (as expected).


SpeedMessage1 (which is small):
Original: Serialize to byte string: 12360424 iterations in 29.83s; 90.097984MB/s
Optimized: Serialize to byte string: 15911623 iterations in 29.997s; 115.337776MB/s

SpeedMessage2 (larger):
Original: Serialize to byte string: 33482 iterations in 29.754s; 90.757484MB/s
Optimized: Serialize to byte string: 40381 iterations in 30.031s; 108.44853MB/s
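
As a quick sanity check on the "~20%" figure, the relative speedup can be computed directly from the MB/s values in the two summaries above. The snippet below is only a worked calculation; the class and method names are made up for illustration. For these inputs it comes out to about 28% for SpeedMessage1 and about 19.5% for SpeedMessage2.

```java
// Hypothetical helper for checking the reported numbers.
public final class SpeedupCheck {
    private SpeedupCheck() {}

    // Relative throughput improvement in percent.
    public static double speedupPercent(double originalMBps, double optimizedMBps) {
        return (optimizedMBps / originalMBps - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // "Serialize to byte string" throughput, copied from the results.
        System.out.printf("SpeedMessage1: %.1f%%%n", speedupPercent(90.097984, 115.337776));
        System.out.printf("SpeedMessage2: %.1f%%%n", speedupPercent(90.757484, 108.44853));
    }
}
```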


Raw results on my Macbook (Core2 Duo CPU):

ORIGINAL
Benchmarking benchmarks.GoogleSpeed$SpeedMessage1 with file google_message1.dat
Serialize to byte string: 12360424 iterations in 29.83s; 90.097984MB/s
Serialize to byte array: 12244951 iterations in 30.303s; 87.86307MB/s
Serialize to memory stream: 8699469 iterations in 23.732s; 79.70642MB/s
Serialize to /dev/null with FileOutputStream: 6075179 iterations in 27.764s; 47.578636MB/s
Serialize to /dev/null reusing FileOutputStream: 6975006 iterations in 29.99s; 50.571175MB/s
Serialize to /dev/null with FileChannel: 10375092 iterations in 29.864s; 75.54034MB/s
Serialize to /dev/null reusing FileChannel: 11166943 iterations in 30.699s; 79.09427MB/s
Deserialize from byte string: 14463117 iterations in 30.06s; 104.618355MB/s
Deserialize from byte array: 14436567 iterations in 30.007s; 104.61074MB/s
Deserialize from memory stream: 6221772 iterations in 28.024s; 48.274624MB/s

Benchmarking benchmarks.GoogleSpeed$SpeedMessage2 with file google_message2.dat
Serialize to byte string: 33482 iterations in 29.754s; 90.757484MB/s
Serialize to byte array: 33103 iterations in 29.517s; 90.45062MB/s
Serialize to memory stream: 28872 iterations in 29.939s; 77.77786MB/s
Serialize to /dev/null with FileOutputStream: 32934 iterations in 29.927s; 88.756MB/s
Serialize to /dev/null reusing FileOutputStream: 32979 iterations in 29.887s; 88.99622MB/s
Serialize to /dev/null with FileChannel: 32447 iterations in 29.921s; 87.46108MB/s
Serialize to /dev/null reusing FileChannel: 32585 iterations in 29.903s; 87.88594MB/s
Deserialize from byte string: 38388 iterations in 29.879s; 103.620544MB/s
Deserialize from byte array: 38677 iterations in 29.866s; 104.446075MB/s
Deserialize from memory stream: 37879 iterations in 29.954s; 101.990585MB/s

OPTIMIZED
Benchmarking benchmarks.GoogleSpeed$SpeedMessage1 with file google_message1.dat
Serialize to byte string: 15911623 iterations in 29.997s; 115.337776MB/s
Serialize to byte array: 16152646 iterations in 30.008s; 117.041954MB/s
Serialize to memory stream: 14859367 iterations in 29.551s; 109.33597MB/s
Serialize to /dev/null with FileOutputStream: 7224915 iterations in 29.954s; 52.446056MB/s
Serialize to /dev/null reusing FileOutputStream: 7479144 iterations in 30.081s; 54.062305MB/s
Serialize to /dev/null with FileChannel: 12730586 iterations in 30.025s; 92.193504MB/s
Serialize to /dev/null reusing FileChannel: 14024645 iterations in 30.399s; 100.31538MB/s
Deserialize from byte string: 14390338 iterations in 29.958s; 104.44631MB/s
Deserialize from byte array: 14496442 iterations in 30.142s; 104.57414MB/s
Deserialize from memory stream: 6653401 iterations in 29.989s; 48.24104MB/s

Benchmarking benchmarks.GoogleSpeed$SpeedMessage2 with file google_message2.dat
Serialize to byte string: 40381 iterations in 30.031s; 108.44853MB/s
Serialize to byte array: 39447 iterations in 29.678s; 107.20024MB/s
Serialize to memory stream: 33647 iterations in 30.029s; 90.36951MB/s
Serialize to /dev/null with FileOutputStream: 39071 iterations in 30.142s; 104.543945MB/s
Serialize to /dev/null reusing FileOutputStream: 39321 iterations in 29.969s; 105.82024MB/s
Serialize to /dev/null with FileChannel: 38460 iterations in 29.955s; 103.55149MB/s
Serialize to /dev/null reusing FileChannel: 38690 iterations in 29.969s; 104.12209MB/s
Deserialize from byte string: 37440 iterations in 29.312s; 103.016495MB/s
Deserialize from byte array: 38208 iterations in 29.862s; 103.193375MB/s
Deserialize from memory stream: 37315 iterations in 29.944s; 100.505554MB/s

--
Evan Jones
http://evanjones.ca/
