On Jan 11, 2011, at 0:45, Nicolae Mihalache wrote:
> But I have noticed in Java that it is impossible to create a message
> containing a "bytes" field without copying some buffers around. For
> example, if I have an encoded message of 1MB with a few regular fields
> and one big bytes field, decoding the message will make a copy of the
> entire buffer instead of keeping a reference to it.
By "decoding" I'm assuming you mean deserializing the message from a
file or something.
The copy is a disadvantage, but it makes things much easier: it means the
buffer used to read data can be recycled for the next message. Without
this copy, the library would need to do complicated tracking of chunks
of memory to determine if they are "in use" or not.
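
To make the recycling point concrete, here is a minimal sketch using only the JDK. The `Snapshot` class is a hypothetical stand-in for an immutable byte container such as ByteString, not the library's actual implementation. Because each message owns a private copy, the single read buffer can be overwritten for the next message without any "in use" tracking.

```java
import java.util.Arrays;

// Hypothetical stand-in for an immutable byte container such as ByteString.
final class Snapshot {
    private final byte[] data;

    private Snapshot(byte[] data) { this.data = data; }

    // Copies `length` bytes out of `buffer`, so the caller may reuse `buffer` afterwards.
    static Snapshot copyFrom(byte[] buffer, int offset, int length) {
        return new Snapshot(Arrays.copyOfRange(buffer, offset, offset + length));
    }

    byte byteAt(int index) { return data[index]; }

    int size() { return data.length; }
}

class RecycleDemo {
    public static void main(String[] args) {
        byte[] readBuffer = new byte[4];       // one buffer shared by all reads

        readBuffer[0] = 1; readBuffer[1] = 2;  // "read" message A into the buffer
        Snapshot a = Snapshot.copyFrom(readBuffer, 0, 2);

        readBuffer[0] = 9; readBuffer[1] = 8;  // recycle the same buffer for message B
        Snapshot b = Snapshot.copyFrom(readBuffer, 0, 2);

        // Message A is unaffected by the reuse: prints "1 9".
        System.out.println(a.byteAt(0) + " " + b.byteAt(0));
    }
}
```

Without the copy, `a` would alias `readBuffer` and silently change when message B was read.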
However, now that you mention it: in the case of big buffers,
CodedInputStream.readBytes() gets called, which currently makes 2
copies of the data (it calls readRawBytes() then calls
ByteString.copyFrom()). This could probably be "fixed" in
CodedInputStream.readBytes(), which might improve performance a fair
bit. I'll put this on my TODO list of things to look at, since I think
my code does this pretty frequently.
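
A rough sketch of the double copy and one possible fix, again with the JDK only. `Bytes`, `Reader`, `wrapOwned`, and the `copies` counter are hypothetical, simplified names invented for illustration; only the method names `readRawBytes` and `copyFrom` echo the ones mentioned above, and none of this is the library's source. The idea is that the array returned by `readRawBytes()` is already private to the call, so wrapping it is safe and saves the second copy.

```java
import java.util.Arrays;

// Hypothetical, simplified stand-ins; not the library's actual source.
final class Bytes {
    static int copies = 0;   // counts array copies, for the demo only
    final byte[] data;

    private Bytes(byte[] data) { this.data = data; }

    // Defensive copy: the caller keeps ownership of `src`.
    static Bytes copyFrom(byte[] src) {
        copies++;
        return new Bytes(src.clone());
    }

    // Takes ownership of an array nobody else references, so no copy is made.
    static Bytes wrapOwned(byte[] owned) {
        return new Bytes(owned);
    }
}

final class Reader {
    private final byte[] buffer;   // internal read buffer that gets recycled
    private int pos = 0;

    Reader(byte[] buffer) { this.buffer = buffer; }

    // Copy 1: moves bytes out of the internal buffer into a fresh array.
    byte[] readRawBytes(int n) {
        Bytes.copies++;
        byte[] out = Arrays.copyOfRange(buffer, pos, pos + n);
        pos += n;
        return out;
    }

    // Two copies: the fresh array is copied a second time.
    Bytes readBytesTwoCopies(int n) { return Bytes.copyFrom(readRawBytes(n)); }

    // One copy: the fresh array is private to this call, so it is wrapped directly.
    Bytes readBytesOneCopy(int n) { return Bytes.wrapOwned(readRawBytes(n)); }
}

class CopyCountDemo {
    public static void main(String[] args) {
        byte[] internal = {1, 2, 3, 4};

        Bytes.copies = 0;
        new Reader(internal).readBytesTwoCopies(4);
        System.out.println("two-step path: " + Bytes.copies + " copies");  // 2

        Bytes.copies = 0;
        new Reader(internal).readBytesOneCopy(4);
        System.out.println("wrapping path: " + Bytes.copies + " copies");  // 1
    }
}
```

Either way the result no longer shares memory with the internal buffer, so the buffer can still be recycled.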
> Even worse when encoding: if I read some data from a file, it does not
> seem possible to put it directly into a ByteString, so I first have to
> make a byte[], then copy it into the ByteString, and when encoding, it
> makes yet another byte[].
The copy cannot be avoided: it is what keeps the API simple (thread
safety, no need to worry about the ByteBuffer being accidentally
changed, etc.). The latest version of Protocol Buffers in Subversion
has ByteString.copyFrom(ByteBuffer), which will do what you want.
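
For the file case, the flow might look like the sketch below, which uses only the JDK. `FileToBytes`, `readFile`, and `snapshot` are hypothetical names; `snapshot` is a stand-in for the single copy that ByteString.copyFrom(ByteBuffer) would perform, so with Protocol Buffers on the classpath that step would be the library call instead. The sketch assumes the file fits in one int-sized buffer, and using `duplicate()` to leave the caller's position untouched is a choice made here, not a statement about the library's behavior.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class FileToBytes {
    // Reads the whole file into a ByteBuffer and flips it so it is ready for reading.
    // Assumes the file is small enough for a single int-sized buffer.
    static ByteBuffer readFile(Path path) throws IOException {
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            while (buf.hasRemaining() && ch.read(buf) != -1) {
                // keep reading until the buffer is full or the file ends
            }
            buf.flip();
            return buf;
        }
    }

    // Stand-in for ByteString.copyFrom(ByteBuffer): copies the remaining bytes once.
    static byte[] snapshot(ByteBuffer buf) {
        byte[] copy = new byte[buf.remaining()];
        buf.duplicate().get(copy);   // duplicate() keeps the caller's position unchanged
        return copy;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("payload", ".bin");
        Files.write(tmp, new byte[] {10, 20, 30});

        ByteBuffer buf = readFile(tmp);
        byte[] copy = snapshot(buf);          // the only copy made after the read

        buf.put(0, (byte) 99);                // later changes to the buffer...
        System.out.println(copy[0]);          // ...do not affect the copy: prints 10

        Files.delete(tmp);
    }
}
```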