[EMAIL PROTECTED] wrote:
> I'm trying to consume data from an app that generates output
> serialized via Protocol Buffers but do not have the original spec for
> the specific structures that have been encoded. Is there a relatively
> straight-forward path to deserializing, or even just decoding, the
> serialized data stream without knowing its structure in advance?
>   
There is no straightforward path.  The wire format is not self-describing.
You get the outermost field numbers, wire types, and data chunks for free.
But the numeric wire types do not tell you how to interpret the values:
signed vs. unsigned, double/float vs. integer, whether a varint is zigzag
encoded, the declared size of a varint field (32 or 64 bits), or whether
it is an enum (never mind which one).
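
To make the ambiguity concrete, here is a minimal Python sketch (not
taken from any protobuf library; the helper names read_varint,
varint_interpretations and fixed64_interpretations are made up for
illustration).  It shows that one varint or one 8-byte chunk has several
equally valid readings, and nothing on the wire says which was intended.

```python
import struct

def read_varint(buf, pos=0):
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:          # high bit clear: last byte of the varint
            return result, pos
        shift += 7
        if shift >= 70:              # more than 10 bytes cannot be a 64-bit varint
            raise ValueError("varint too long")

def varint_interpretations(raw):
    """Plausible readings of one raw varint value (assumed to fit in 64 bits)."""
    as_signed = raw - (1 << 64) if raw >= (1 << 63) else raw   # two's complement
    as_zigzag = (raw >> 1) ^ -(raw & 1)                        # sint32/sint64
    return {"uint64": raw, "int64": as_signed, "sint64": as_zigzag}

def fixed64_interpretations(chunk):
    """Plausible readings of one 8-byte little-endian chunk (wire type 1)."""
    return {
        "fixed64": struct.unpack("<Q", chunk)[0],
        "sfixed64": struct.unpack("<q", chunk)[0],
        "double": struct.unpack("<d", chunk)[0],
    }
```

For example, the raw varint 3 could be the unsigned value 3 or the
zigzag-encoded value -2, and the 8 bytes that encode the double 1.0 are
also a perfectly legal (if very large) fixed64.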
The length-delimited fields are slightly better.  If the data chunk parses 
as a valid message then it is probably a message.  If it parses as valid 
UTF-8 then it is probably a string.  Otherwise treat it as a byte array.
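
A rough Python sketch of that guessing order follows.  It is an
illustration under stated assumptions, not a real decoder: parse_fields
and guess_chunk are hypothetical names, groups (wire types 3 and 4) are
simply rejected, and an empty chunk is reported as an empty string even
though it could equally be an empty message.

```python
def read_varint(buf, pos):
    """Decode one base-128 varint at pos; return (value, next_pos)."""
    result, shift = 0, 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7
        if shift >= 70:
            raise ValueError("varint too long")

def parse_fields(buf):
    """Split buf into (field_number, wire_type, value) tuples.

    Raises ValueError if buf is not a well-formed sequence of fields.
    """
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 7
        if number == 0:
            raise ValueError("field number 0 is invalid")
        if wire_type == 0:                      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:                    # 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:                    # length-delimited
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:                    # 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:                                   # groups: not handled in this sketch
            raise ValueError("unsupported wire type %d" % wire_type)
        if pos > len(buf):
            raise ValueError("truncated field")
        fields.append((number, wire_type, value))
    return fields

def guess_chunk(chunk):
    """Guess what a length-delimited chunk holds: message, string, or bytes."""
    if chunk:
        try:
            return "message", parse_fields(chunk)
        except ValueError:
            pass
    try:
        return "string", chunk.decode("utf-8")
    except UnicodeDecodeError:
        return "bytes", chunk
```

The guess is only a probability: a short string or byte array can happen
to parse as a valid message, so it is worth checking the result against
other occurrences of the same field.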

If the same set of field numbers + wire types comes up repeatedly then 
they may be the same message type.  Many identical fields
in a row are probably a repeated field, and as such you can assume 
the contents are the same type.  The multiple values give you a hand in 
picking how to decode them.
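
As a small Python sketch of that idea (the function names and the
tuple layout (field_number, wire_type, value) are assumptions made for
this example), you can collect the values that share a field number and
wire type, then look at them together for hints.  The one hint shown here
is a heuristic, not a rule: a raw varint of 2**63 or more is far more
often a negative int32/int64 than a genuinely huge unsigned value.

```python
from collections import defaultdict

def group_repeated(fields):
    """Map (field_number, wire_type) -> values, for keys seen more than once."""
    groups = defaultdict(list)
    for number, wire_type, value in fields:
        groups[(number, wire_type)].append(value)
    return {key: values for key, values in groups.items() if len(values) > 1}

def varint_hint(raw_values):
    """Heuristic guess at how a set of raw varints was declared."""
    if any(raw >= (1 << 63) for raw in raw_values):
        # Negative int32/int64 values are sign-extended to 10-byte varints.
        return "probably int32/int64 with negative values"
    # Small values look the same whether unsigned, signed, or zigzag encoded.
    return "ambiguous: unsigned, non-negative signed, or zigzag"
```

With more samples the ambiguous case sometimes resolves itself, for
instance when only one reading gives values in a sensible range for
what the application is known to report.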

For a self-describing binary type you have to look elsewhere (e.g. HDF5).

-- 
Chris


