My patch to improve string encoding performance is now available for
code review at the URL below. The result: a 13%-27% improvement on the
ProtoBench files included in SVN. It is faster than the JDK because
it significantly reduces memory allocation (JDK best case: 5X string
length; my best case: string length + 64 bytes). It eliminates one
copy but adds a copy of the String data, so the copying cost is
probably about equal.
http://codereview.appspot.com/949044
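To illustrate the general idea of encoding into a caller-supplied
buffer instead of allocating temporaries, here is a minimal sketch. It
is not the code from the patch: the class name Utf8Sketch, the method
encode, and the 3-bytes-per-char buffer sizing are all assumptions made
for this example. Unpaired surrogates are written as '?', which is what
String.getBytes does for UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public final class Utf8Sketch {
    // Hypothetical helper, not part of the patch or of protobuf.
    // Encodes s as UTF-8 into out starting at offset and returns the
    // number of bytes written. The caller must leave at least
    // 3 * s.length() bytes free, which covers the worst case.
    public static int encode(String s, byte[] out, int offset) {
        int p = offset;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                // ASCII: one byte
                out[p++] = (byte) c;
            } else if (c < 0x800) {
                // Two-byte sequence
                out[p++] = (byte) (0xC0 | (c >> 6));
                out[p++] = (byte) (0x80 | (c & 0x3F));
            } else if (Character.isHighSurrogate(c) && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                // Valid surrogate pair: four-byte sequence
                int cp = Character.toCodePoint(c, s.charAt(++i));
                out[p++] = (byte) (0xF0 | (cp >> 18));
                out[p++] = (byte) (0x80 | ((cp >> 12) & 0x3F));
                out[p++] = (byte) (0x80 | ((cp >> 6) & 0x3F));
                out[p++] = (byte) (0x80 | (cp & 0x3F));
            } else if (Character.isSurrogate(c)) {
                // Unpaired surrogate: replace with '?'
                out[p++] = (byte) '?';
            } else {
                // Three-byte sequence
                out[p++] = (byte) (0xE0 | (c >> 12));
                out[p++] = (byte) (0x80 | ((c >> 6) & 0x3F));
                out[p++] = (byte) (0x80 | (c & 0x3F));
            }
        }
        return p - offset;
    }

    public static void main(String[] args) {
        String s = "h\u00e9llo \u20ac \ud83d\ude00";
        byte[] buffer = new byte[3 * s.length()]; // could be reused across calls
        int n = encode(s, buffer, 0);
        byte[] expected = s.getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(Arrays.copyOf(buffer, n), expected));
    }
}
```

The buffer here is sized for the worst case to keep the sketch short;
a real encoder would probably size it more tightly or write straight
into the output stream's own buffer.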
This patch was designed to not change the lite runtime at all, so
there is a weird, hacky class called FastStringEncoder that really
contains methods that should be added to CodedOutputStream.
I think it would be a good idea to include this patch in the protocol
buffer library, although there is a risk that my UTF-8 encoding code
has bugs in it. So I won't be disappointed if it is rejected for the
protocol buffer distribution; either way, I will try to maintain the
patch.
I have more detailed performance results, if anyone cares.
Evan
Detailed results for speed messages:
ORIGINAL
Benchmarking benchmarks.GoogleSpeed$SpeedMessage1 with file google_message1.dat
Serialize to byte string: 21006530 iterations in 32.088s; 142.34642MB/s
Serialize to byte array: 19310791 iterations in 29.529s; 142.19565MB/s
Serialize to memory stream: 19679249 iterations in 32.203s; 132.87619MB/s
Serialize to /dev/null with FileOutputStream: 15728640 iterations in 29.929s; 114.27044MB/s
Serialize to /dev/null reusing FileOutputStream: 14796462 iterations in 27.534s; 116.848595MB/s
Serialize to /dev/null with FileChannel: 18961591 iterations in 31.51s; 130.84625MB/s
Serialize to /dev/null reusing FileChannel: 19157904 iterations in 30.755s; 135.44632MB/s
Benchmarking benchmarks.GoogleSpeed$SpeedMessage2 with file google_message2.dat
Serialize to byte string: 46108 iterations in 26.724s; 139.15257MB/s
Serialize to byte array: 50547 iterations in 28.874s; 141.19029MB/s
Serialize to memory stream: 48282 iterations in 29.776s; 130.77818MB/s
Serialize to /dev/null with FileOutputStream: 50505 iterations in 28.799s; 141.44037MB/s
Serialize to /dev/null reusing FileOutputStream: 51478 iterations in 30.064s; 138.09926MB/s
Serialize to /dev/null with FileChannel: 51328 iterations in 29.668s; 139.53477MB/s
Serialize to /dev/null reusing FileChannel: 48454 iterations in 27.46s; 142.31332MB/s
OPTIMIZED
Benchmarking benchmarks.GoogleSpeed$SpeedMessage1 with file google_message1.dat
Serialize to byte string: 24207218 iterations in 29.098s; 180.89088MB/s
Serialize to byte array: 24480373 iterations in 29.937s; 177.8053MB/s
Serialize to memory stream: 22928046 iterations in 30.515s; 163.37613MB/s
Serialize to /dev/null with FileOutputStream: 20242779 iterations in 29.626s; 148.57033MB/s
Serialize to /dev/null reusing FileOutputStream: 19803135 iterations in 27.7s; 155.44943MB/s
Serialize to /dev/null with FileChannel: 25135661 iterations in 34.242s; 159.61221MB/s
Serialize to /dev/null reusing FileChannel: 22421439 iterations in 29.61s; 164.64934MB/s
Benchmarking benchmarks.GoogleSpeed$SpeedMessage2 with file google_message2.dat
Serialize to byte string: 58071 iterations in 29.694s; 157.72736MB/s
Serialize to byte array: 56888 iterations in 29.112s; 157.60321MB/s
Serialize to memory stream: 53171 iterations in 29.709s; 144.34547MB/s
Serialize to /dev/null with FileOutputStream: 58154 iterations in 29.968s; 156.5086MB/s
Serialize to /dev/null reusing FileOutputStream: 57880 iterations in 29.779s; 156.75984MB/s
Serialize to /dev/null with FileChannel: 55803 iterations in 28.881s; 155.83382MB/s
Serialize to /dev/null reusing FileChannel: 59563 iterations in 30.668s; 156.64175MB/s
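For anyone who wants to turn these throughput numbers into percentage
improvements, here is a small sketch. The class and method names are
made up for this example; the figures are the "Serialize to byte
string" rows for the two speed messages, ORIGINAL vs OPTIMIZED. Those
two rows work out to roughly 27% and 13%.

```java
public final class SpeedupSketch {
    // Relative improvement of "after" over "before", in percent.
    public static double improvementPercent(double before, double after) {
        return (after / before - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // Throughput in MB/s, copied from the byte-string rows above.
        double speed1 = improvementPercent(142.34642, 180.89088);
        double speed2 = improvementPercent(139.15257, 157.72736);
        System.out.printf("SpeedMessage1: %.1f%%%n", speed1);
        System.out.printf("SpeedMessage2: %.1f%%%n", speed2);
    }
}
```

Other rows can be plugged in the same way.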
Size messages:
ORIGINAL
Benchmarking benchmarks.GoogleSize$SizeMessage1 with file google_message1.dat
Serialize to byte string: 2789755 iterations in 29.686s; 20.433807MB/s
Serialize to byte array: 2748801 iterations in 29.597s; 20.194382MB/s
Serialize to memory stream: 2702515 iterations in 28.65s; 20.510603MB/s
Serialize to /dev/null with FileOutputStream: 2716518 iterations in 29.376s; 20.107351MB/s
Serialize to /dev/null reusing FileOutputStream: 2507755 iterations in 28.299s; 19.268545MB/s
Serialize to /dev/null with FileChannel: 2809689 iterations in 31.171s; 19.599386MB/s
Serialize to /dev/null reusing FileChannel: 2764260 iterations in 29.827s; 20.151354MB/s
Benchmarking benchmarks.GoogleSize$SizeMessage2 with file google_message2.dat
Serialize to byte string: 6530 iterations in 27.688s; 19.021206MB/s
Serialize to byte array: 7303 iterations in 30.9s; 19.061596MB/s
Serialize to memory stream: 6918 iterations in 30.389s; 18.360332MB/s
Serialize to /dev/null with FileOutputStream: 7154 iterations in 31.094s; 18.556187MB/s
Serialize to /dev/null reusing FileOutputStream: 6707 iterations in 28.757s; 18.810535MB/s
Serialize to /dev/null with FileChannel: 6887 iterations in 28.743s; 19.324774MB/s
Serialize to /dev/null reusing FileChannel: 7373 iterations in 31.919s; 18.629936MB/s
OPTIMIZED
Benchmarking benchmarks.GoogleSize$SizeMessage1 with file google_message1.dat
Serialize to byte string: 3432701 iterations in 29.986s; 24.891575MB/s
Serialize to byte array: 3455325 iterations in 30.373s; 24.73638MB/s
Serialize to memory stream: 3398582 iterations in 30.742s; 24.038122MB/s
Serialize to /dev/null with FileOutputStream: 2932259 iterations in 28.331s; 22.504812MB/s
Serialize to /dev/null reusing FileOutputStream: 2779893 iterations in 26.785s; 22.566872MB/s
Serialize to /dev/null with FileChannel: 3129454 iterations in 28.526s; 23.854078MB/s
Serialize to /dev/null reusing FileChannel: 3183935 iterations in 28.779s; 24.056MB/s
Benchmarking benchmarks.GoogleSize$SizeMessage2 with file google_message2.dat
Serialize to byte string: 6497 iterations in 26.656s; 19.657772MB/s
Serialize to byte array: 7231 iterations in 29.827s; 19.552631MB/s
Serialize to memory stream: 6643 iterations in 27.582s; 19.424726MB/s
Serialize to /dev/null with FileOutputStream: 7078 iterations in 27.844s; 20.501957MB/s
Serialize to /dev/null reusing FileOutputStream: 7434 iterations in 30.969s; 19.360287MB/s
Serialize to /dev/null with FileChannel: 6988 iterations in 29.144s; 19.338385MB/s
Serialize to /dev/null reusing FileChannel: 7279 iterations in 30.338s; 19.3509MB/s
Deserialize from byte string: 5254 iterations in 29.942s; 14.152257MB/s
Deserialize from byte array: 5429 iterations in 30.481s; 14.3650465MB/s
Deserialize from memory stream: 6156 iterations in 32.337s; 15.353779MB/s
--
Evan Jones
http://evanjones.ca/