On Wed, Dec 23, 2009 at 11:14 AM, Kenton Varda <ken...@google.com> wrote:

> On Tue, Dec 22, 2009 at 7:06 PM, David Yu <david.yu....@gmail.com> wrote:
>> There should be a writeByteArray(int fieldNumber, byte[] value) in
>> CodedOutputStream so that the cached bytes of strings would
>> be written directly.  The ByteString would not help, it adds more memory
>> since it creates a copy of the byte array.
> We could cache the bytes as a ByteString.  Converting a String to a
> ByteString does not require a redundant copy, as ByteString has methods for
> this.
> I think it would be better to do it this way because, in the long run, we
> actually want to extend ByteString to allow avoiding copies in some cases.
>  For example, if you are serializing a message to a ByteString (you called
> toByteString()) or parsing from a ByteString, then handling "bytes" fields
> should not require any copy.  Instead, it should be possible to construct a
> ByteString which is a substring of some other ByteString in O(1) time, as
> well as concatenate ByteStrings in O(1) time.
> So this way, if the size-computation step converted the String to a
> ByteString and cached that, no further copy of the bytes would ever be
> needed in many cases.
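
To make the O(1) substring idea concrete, here is a rough sketch of how I
picture it. BytesView and all of its methods are hypothetical names made up
for illustration; this is not the actual ByteString API, just a view that
shares one backing array instead of copying it. O(1) concatenation would need
a rope-like structure on top of this and is not shown.

```java
import java.nio.charset.Charset;

// Hypothetical immutable view over a shared byte array (NOT the real ByteString).
final class BytesView {
    private final byte[] data;  // shared backing array, never modified after construction
    private final int offset;   // start of this view inside data
    private final int length;   // number of bytes visible through this view

    private BytesView(byte[] data, int offset, int length) {
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    // Encodes the string once. The freshly encoded array is owned by the view,
    // so no additional defensive copy is made.
    static BytesView copyFromUtf8(String text) {
        byte[] encoded = text.getBytes(Charset.forName("UTF-8"));
        return new BytesView(encoded, 0, encoded.length);
    }

    int size() {
        return length;
    }

    byte byteAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return data[offset + index];
    }

    // O(1): the new view points into the same backing array.
    BytesView substring(int begin, int end) {
        if (begin < 0 || end > length || begin > end) {
            throw new IndexOutOfBoundsException("begin: " + begin + ", end: " + end);
        }
        return new BytesView(data, offset + begin, end - begin);
    }

    String toStringUtf8() {
        return new String(data, offset, length, Charset.forName("UTF-8"));
    }
}
```

With something like this, the size-computation step could cache the view
returned by copyFromUtf8() and the write step could reuse it, so the string
is encoded only once.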

Btw, the relevant snippet in ByteString is:
 return new ByteString(text.getBytes("UTF-

Another improvement would be to avoid the charset lookup on every call:
cache the Charset.forName("UTF-8") object once and reuse it.
I believe you Google guys have also been evangelizing this :-) (PDF from
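
Roughly what I mean by caching the Charset, as a minimal sketch. The Utf8
class and its encode() method are hypothetical names for illustration only;
the calls it relies on, Charset.forName(String) and String.getBytes(Charset),
are standard (the latter since Java 6). Whether this is actually faster than
getBytes("UTF-8") depends on the JVM, so it would need measuring.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of protobuf.
final class Utf8 {
    // Looked up once at class initialization instead of on every encode call.
    static final Charset UTF_8 = Charset.forName("UTF-8");

    private Utf8() {}

    static byte[] encode(String text) {
        // Passing the Charset object skips the lookup by name and avoids the
        // checked UnsupportedEncodingException of getBytes(String).
        return text.getBytes(UTF_8);
    }
}
```

ByteString could then build itself from Utf8.encode(text) without the
try/catch around the charset name.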

When the cat is away, the mouse is alone.
- David Yu

