Hi,

I have a suggestion for improving the protobuf encoding.
Is proto3 final?

I like the simplicity of the protobuf encoding.
But I think it has one issue when serializing to a stream.
The problem is that length-delimited fields require knowing the length 
ahead of time.
If we have a very long string, we need to encode the entire string before 
we know its length, so we basically duplicate the data in memory.
The same is true for embedded messages, where we need to encode the entire 
embedded message before we can append it to the stream.
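
To make the problem concrete, here is a minimal Python sketch of how a 
length-delimited field is laid out on the wire (tag, then varint length, 
then payload). The helper names encode_varint and encode_length_delimited 
are my own, chosen for illustration; they are not from any protobuf library.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_length_delimited(field_number: int, payload: bytes) -> bytes:
    """Encode a field with wire type 2 (length-delimited)."""
    tag = (field_number << 3) | 2
    # len(payload) is written before the payload itself, so the whole
    # payload must already exist in memory at this point.
    return encode_varint(tag) + encode_varint(len(payload)) + payload
```

Because the length precedes the data, the writer cannot emit the first 
byte of the payload until the last byte has been produced.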

I think there is a simple solution for both issues.

For strings and byte arrays, a simple solution is to use "chunked encoding": 
the byte array is split into chunks, and every chunk starts with the chunk 
length. The end of the array is indicated by a chunk length of zero.
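
A rough sketch of what I mean, in Python. This is only an illustration of 
the proposal, not something protobuf supports today. I am assuming each 
chunk length is written as a varint; write_chunked and read_chunked are 
hypothetical names.

```python
import io
from typing import BinaryIO, Iterable


def write_varint(out: BinaryIO, value: int) -> None:
    """Write a non-negative integer as a base-128 varint."""
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.write(bytes([byte | 0x80]))
        else:
            out.write(bytes([byte]))
            return


def read_varint(inp: BinaryIO) -> int:
    """Read a base-128 varint from the stream."""
    result = 0
    shift = 0
    while True:
        b = inp.read(1)
        if not b:
            raise EOFError("truncated varint")
        result |= (b[0] & 0x7F) << shift
        if not b[0] & 0x80:
            return result
        shift += 7


def write_chunked(out: BinaryIO, chunks: Iterable[bytes]) -> None:
    """Hypothetical chunked encoding: (length, data)* followed by length 0."""
    for chunk in chunks:
        if chunk:  # skip empty chunks; length 0 is reserved as terminator
            write_varint(out, len(chunk))
            out.write(chunk)
    write_varint(out, 0)  # end-of-array marker


def read_chunked(inp: BinaryIO) -> bytes:
    """Read chunks until the zero-length terminator and join them."""
    parts = []
    while True:
        length = read_varint(inp)
        if length == 0:
            return b"".join(parts)
        data = inp.read(length)
        if len(data) != length:
            raise EOFError("truncated chunk")
        parts.append(data)
```

The writer only needs one chunk in memory at a time, at the cost of one 
extra length per chunk plus the terminating zero.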

For embedded messages, the solution is to have a "start embedding" tag and 
an "end embedding" tag.
Everything in between is the embedded message.
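
Again as a rough Python sketch of the idea. The wire-type values 
START_EMBEDDING and END_EMBEDDING below are hypothetical numbers I picked 
for illustration, and the function names are my own.

```python
import io
from typing import BinaryIO

VARINT = 0
START_EMBEDDING = 6  # hypothetical wire type, for illustration only
END_EMBEDDING = 7    # hypothetical wire type, for illustration only


def write_varint(out: BinaryIO, value: int) -> None:
    """Write a non-negative integer as a base-128 varint."""
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.write(bytes([byte | 0x80]))
        else:
            out.write(bytes([byte]))
            return


def write_tag(out: BinaryIO, field_number: int, wire_type: int) -> None:
    """Write a field tag: field number in the high bits, wire type in the low 3."""
    write_varint(out, (field_number << 3) | wire_type)


def write_inner(out: BinaryIO) -> None:
    """Write the embedded message's fields directly to the stream."""
    write_tag(out, 1, VARINT)
    write_varint(out, 150)


def write_outer(out: BinaryIO) -> None:
    """Write an outer message whose field 3 is an embedded message."""
    write_tag(out, 3, START_EMBEDDING)
    write_inner(out)  # streamed straight through, no temporary buffer
    write_tag(out, 3, END_EMBEDDING)
```

The embedded message is never buffered: the writer emits the start tag, 
streams the inner fields, and then emits the end tag.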

By adding these two new features, serialization can be fully streamable and 
there is no need to pre-serialize big chunks in memory before writing them 
to the stream.

Hope you'll find this suggestion useful and incorporate it into the 
protocol.

Thanks,
Yoav.

