Hey, it's great that you're trying things.  I think there's more room for
improvement in the Java implementation than in the C++ one, and finding it
tends to take some trial and error.

You note that small messages seem faster with smaller buffer sizes, but
larger messages are slower.  I am guessing that by "small messages" you mean
ones which are significantly smaller than the buffer size, and "large
messages" means larger than the buffer size.  One thing you might try:  if
the message is smaller than 4096 (or whatever the buffer size constant is),
then allocate a buffer exactly as big as the message to avoid waste.  You
can call getSerializedSize() to find out the message size ahead of time.
Note that calling this doesn't actually waste any time, since the result is
cached, and it would have to be called during serialization anyway.

Once you do that, then increasing the buffer size constant (which is now the
*maximum* buffer size) might make more sense.
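
For what it's worth, here is a rough sketch of the sizing rule I have in
mind.  BufferSizing, chooseBufferSize() and MAX_BUFFER_SIZE are hypothetical
names for illustration, not anything in the library, and the 4096 is just a
stand-in for whatever the real constant is.  The clamp to at least one byte
is my own precaution for empty messages; I haven't checked how a zero-sized
buffer behaves.

```java
/**
 * Hypothetical helper (not part of the protobuf library): picks the buffer
 * size for a temporary CodedOutputStream so that small messages do not pay
 * for a full-sized buffer.
 */
public final class BufferSizing {
    /** Assumed stand-in for the library's buffer size constant, now a maximum. */
    public static final int MAX_BUFFER_SIZE = 4096;

    private BufferSizing() {}

    /**
     * Returns the message size if it fits under the maximum, otherwise the
     * maximum. Never returns less than 1 (a precaution for empty messages).
     */
    public static int chooseBufferSize(int serializedSize, int maxBufferSize) {
        if (maxBufferSize <= 0) {
            throw new IllegalArgumentException("maxBufferSize must be positive");
        }
        return Math.max(1, Math.min(serializedSize, maxBufferSize));
    }

    // Intended use with protobuf on the classpath (untested sketch):
    //
    //   int size = message.getSerializedSize();  // cached by the message
    //   CodedOutputStream out = CodedOutputStream.newInstance(
    //       stream, BufferSizing.chooseBufferSize(size, MAX_BUFFER_SIZE));
    //   message.writeTo(out);
    //   out.flush();
}
```

With this in place, raising MAX_BUFFER_SIZE only affects messages larger
than the old limit, so it would be worth re-running the benchmark at 8192.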

On Thu, Oct 22, 2009 at 6:41 AM, Evan Jones <ev...@mit.edu> wrote:

> I've been wasting my time doing various microbenchmarks when I should
> be doing "real" work. This message describes some "failed" attempted
> optimizations, ideally so others don't waste their time.
> I was looking at the Java CodedOutputStream implementation, and was
> interested that it uses an internal byte[] array buffer, since this is
> what BufferedOutputStream does. Additionally, the JVM internally uses
> 8192 as the "magic" buffer size inside BufferedOutputStream, and the
> native code that actually writes data to/from files and sockets. I
> tried two tweaks that are both worse than the existing code. I'm
> reporting this here so others don't waste their time:
> a) Change the default buffer size from 4096 to 8192 bytes.
> b) Remove the internal buffer and rely on OutputStream.
> System: Intel Xeon E5540 (Core i7/Nehalem) @ 2.53 GHz, Linux 2.6.29
> Java: Both Sun 1.6.0_16-b01 and 1.7.0-ea-b74; 64-bit; always using
> -server
> Benchmark: Using ProtoBench, with my own extensions to write to
> /dev/null using FileOutputStream, and
> BufferedOutputStream(FileOutputStream)
> Summary of results:
> a) Bigger buffer size: small messages are slightly slower, large
> messages are slightly faster. The difference is ~1-2% at most, so this
> could just be "noise." I also tried a 2048 byte buffer, and it also
> makes approximately no difference.
> b) Using OutputStream instead of internal buffer: For the small
> message, serializing to byte[] is slower, but serializing to /dev/null
> is much faster (~ +30%). However, for the large message, it makes
> everything a fair bit slower (at least 10% worse).
> bonus) jdk7 has the same results, except it is generally faster than
> jdk6
> Conclusions:
> * None of these optimizations is a clear win.
> * 8192 is not always the right buffer size for Java (although it
> should be a maximum for anything that might call
> OutputStream.write()). I'm guessing the reason making the buffer
> bigger hurts performance is due to the extra allocation/deallocation
> cost for all the temporary CodedOutputStreams.
> * Hotspot doesn't magically optimize as much as you might like: using
> BufferedOutputStream does the same thing as CodedOutputStream's
> internal byte[] buffer, but hotspot can't optimize the code as well.
> I'm guessing this is because the dynamic dispatch on OutputStream
> prevents aggressive inlining?
> * Results are somewhat variable, and are of course data dependent.
> More benchmarks should be done before making a performance related
> code change.
> Evan
> --
> Evan Jones
> http://evanjones.ca/

