> I'm trying to consume data from an app that generates output
> serialized via Protocol Buffers but do not have the original spec for
> the specific structures that have been encoded. Is there a relatively
> straight-forward path to deserializing, or even just decoding, the
> serialized data stream without knowing its structure in advance?
There is no straightforward path.  The wire format is not self-describing.
You get the outermost field numbers, wire types, and data chunks for free,
but the numeric wire types do not tell you how to interpret the values:
signed vs. unsigned, double/float vs. integer, whether the value is zigzag
encoded, the byte size of the field, or whether it is an enum (never mind
which one).
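
To make the ambiguity concrete, here is a small Python sketch (not
anything from the protobuf library; the function names are made up for
illustration).  It takes a value that has already been pulled off the
wire and lists the readings that are all equally valid without the
.proto file.

```python
import struct

def interpretations_of_varint(raw: int) -> dict:
    """raw is the unsigned 64-bit integer decoded from a varint field."""
    return {
        "uint64": raw,
        # Two's-complement reading, as used by int32/int64.
        "int64": raw - (1 << 64) if raw >= (1 << 63) else raw,
        # Zigzag reading, as used by sint32/sint64.
        "sint64 (zigzag)": (raw >> 1) ^ -(raw & 1),
        "bool": raw != 0,
    }

def interpretations_of_fixed64(chunk: bytes) -> dict:
    """chunk is the 8 little-endian bytes of a 64-bit wire type field."""
    return {
        "fixed64": struct.unpack("<Q", chunk)[0],
        "sfixed64": struct.unpack("<q", chunk)[0],
        "double": struct.unpack("<d", chunk)[0],
    }
```

For example, a varint that decodes to 3 could be the unsigned value 3,
the zigzag value -2, the boolean true, or enum member number 3, and
nothing in the bytes says which.
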
The length-delimited fields are slightly better.  If the data chunk parses
as a valid message then it is probably a message.  If it parses as valid
UTF-8 then it is probably a string.  Otherwise it is most likely a byte array.
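
A rough Python sketch of that guessing procedure follows.  It is a
hand-rolled scanner, not the protobuf library, and the function names
are hypothetical.  It assumes the deprecated group wire types (3 and 4)
never appear and treats them as a parse failure.

```python
def read_varint(buf: bytes, pos: int):
    """Decode one varint starting at pos; return (value, new_pos)."""
    result, shift = 0, 0
    while True:
        if pos >= len(buf) or shift > 63:
            raise ValueError("truncated or oversized varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def scan_fields(buf: bytes):
    """Return [(field_number, wire_type, value)] or raise ValueError."""
    fields, pos = [], 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 7
        if field_number == 0:
            raise ValueError("field number 0 is not valid")
        if wire_type == 0:                      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type in (1, 5):               # 64-bit / 32-bit fixed
            size = 8 if wire_type == 1 else 4
            value = buf[pos:pos + size]
            if len(value) != size:
                raise ValueError("truncated fixed-width field")
            pos += size
        elif wire_type == 2:                    # length-delimited
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            if len(value) != length:
                raise ValueError("truncated length-delimited field")
            pos += length
        else:                                   # groups are not handled here
            raise ValueError("unsupported wire type %d" % wire_type)
        fields.append((field_number, wire_type, value))
    return fields

def guess_length_delimited(chunk: bytes) -> str:
    """Apply the message / string / bytes heuristic to one data chunk."""
    if not chunk:
        return "empty"
    try:
        scan_fields(chunk)
        return "message?"
    except ValueError:
        pass
    try:
        chunk.decode("utf-8")
        return "string?"
    except UnicodeDecodeError:
        return "bytes"
```

The question marks are deliberate.  Short ASCII strings often happen to
parse as a valid message too (b"hi" scans as field 13 with varint value
105), so these are guesses to confirm against more samples.
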

If the same set of field numbers + wire types comes up repeatedly then
they may be the same message type.  Many identical fields
in a row are probably a repeated field, and as such you can assume
the contents are the same type.  The multiple values give you a hand in
picking how to decode them.
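
Both hints are easy to automate once each chunk has been reduced to a
list of (field number, wire type) pairs.  The Python sketch below does
that.  The sample records are invented for illustration and the helper
names are hypothetical.

```python
from collections import Counter
from itertools import groupby

def signature(fields):
    """Set of (field_number, wire_type) pairs seen in one chunk."""
    return frozenset(fields)

def repeated_runs(fields, min_run=2):
    """Return [(field_number, wire_type, run_length)] for consecutive
    identical keys, which hint at a repeated field."""
    runs = []
    for key, group in groupby(fields):
        run_length = len(list(group))
        if run_length >= min_run:
            runs.append((key[0], key[1], run_length))
    return runs

# Invented sample: three chunks, already scanned into (number, type) pairs.
records = [
    [(1, 0), (2, 2), (3, 2), (3, 2), (3, 2)],
    [(1, 0), (2, 2), (3, 2)],
    [(1, 0), (5, 1)],
]

# Chunks sharing a signature are candidates for the same message type.
signature_counts = Counter(signature(r) for r in records)
```

Here the first two records share a signature, and field 3 in the first
record shows up three times in a row.  Optional fields will make
signatures of the same message type differ, so treat a match as
evidence rather than proof.
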

For a self-describing binary format you have to look elsewhere (e.g. HDF5).

