On Nov 30, 2010, at 15:58, Blair Zajac wrote:
"""Added lazy conversion of UTF-8 encoded strings to String objects to improve performance."""

Is the laziness thread-safe?

I haven't looked at the implementation, but if it isn't thread-safe, I would guess the laziness adds little overhead. If it is thread-safe and you know you're going to use all the string fields, does it hurt performance instead?

Interesting! I have looked at this sort of thing a bit, since I have a patch that makes string encoding somewhat faster. It is quite intrusive, though, so it is probably not appropriate for the main source tree.

Guesses based on my knowledge of the Java implementation:

* It will be thread-safe, since that is the guarantee provided by the current protocol buffers implementation.

* I'll guess that it will not be slower if you access all the strings. Currently, the parsing process copies the raw bytes from the input buffer into an individual byte array, then converts that to a String. This is, sadly, the most efficient thing you can do, since you need "special" code to create Strings. Therefore, doing "lazy" conversion isn't going to be slower. The objects already have both byte[] and String fields for each string due to an encoding improvement I contributed, so this should be nearly a pure win.
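
To make the two guesses above concrete, here is a minimal sketch of what thread-safe lazy conversion could look like. This is not the actual protocol buffers code: the class name LazyUtf8Field and the methods get() and isDecoded() are hypothetical, made up for illustration. It assumes the raw UTF-8 bytes are kept after parsing and decoded on first access.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not taken from the protocol buffers source.
final class LazyUtf8Field {
    // Raw UTF-8 bytes copied out of the input buffer at parse time.
    private final byte[] bytes;

    // Decoded value, filled in on first access. volatile makes the write
    // visible to other threads without taking a lock.
    private volatile String cached;

    LazyUtf8Field(byte[] utf8Bytes) {
        // Defensive copy so later changes to the caller's array cannot
        // alter this field.
        this.bytes = utf8Bytes.clone();
    }

    // Returns the decoded String, decoding at most a few times under
    // contention. Two racing threads may both decode, but they produce
    // equal immutable Strings, so the race is harmless.
    String get() {
        String s = cached;
        if (s == null) {
            s = new String(bytes, StandardCharsets.UTF_8);
            cached = s;
        }
        return s;
    }

    // True once some thread has decoded the bytes.
    boolean isDecoded() {
        return cached != null;
    }
}
```

In this sketch, a caller that reads every string field pays for one volatile read and one null check per access on top of the decode it would have done anyway, which fits the guess that lazy conversion should not be noticeably slower. A caller that never reads a field skips the decode entirely.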


Evan Jones
