[protobuf] Re: Regd: Resolving Wire type ambiguities

2009-11-16 Thread Kenton Varda
It sounds like you actually have the message class available, not just a
serialized instance of the message.  In that case, you can derive the
original type by reading the generated code.  If you only have a compiled
copy of the class, you can derive the type from its descriptor -- use
MessageType.getDescriptor() in Java or MessageType::descriptor() in C++ to
get it.  In C++ you can even call descriptor->file()->DebugString() to
generate a .proto-syntax representation of the file.

On Mon, Nov 16, 2009 at 1:32 PM, rahul prasad wrote:

> Hi,
>
> Thanks for the clarification. Because I have no .proto file, I did try one
> dirty method of finding the original types: while iterating through the
> protocol buffers, I called a getter of the wrong type on a field of a known
> wire type and relied on the exceptions that were thrown. I know it sucks, but
> it worked for me. Thanks.
>
> Regards,
> Rahul Prasad
>
>
>
>
> On Mon, Nov 16, 2009 at 1:42 PM, Jason Hsueh wrote:
>
>> You can decode the protocol buffer with just wire type + tag number, but
>> you won't know the original types without a proto definition. Everything
>> would be treated as an unknown field. You could access these by iterating
>> through the UnknownFieldSet, but again, you can't recover the original
>> types.
>>
>> On Sat, Nov 14, 2009 at 1:10 PM, rahul prasad wrote:
>>
>>> Hi Marc,
>>>
>>> Thanks for the clarification. If the actual .proto were available, I would
>>> not have posted that question in the first place. Anyway, to decode a
>>> protocol buffer, is it not enough to have just the wire type + tag number
>>> combination? (Except, of course, that the sub-message-ness and other
>>> ambiguities you mentioned below would have to be handled manually.)
>>>
>>> Regards,
>>> Rahul Prasad
>>>
>>>
>>>
>>> On Sat, Nov 14, 2009 at 3:57 PM, Marc Gravell wrote:
>>>
>>>> If you treat it as a string (UTF8), you are likely to get garbage. If
>>>> you treat it as a byte[], then you just get a BLOB - you don't lose
>>>> anything, but you might not be showing some more detail that you could
>>>> show.
>>>>
>>>> You could, however, check for likely-sub-message-ness - i.e. after
>>>> getting the length, you could try decoding the next few bytes as a varint,
>>>> and do the shift trick; see if it looks likely to be a sub-message etc; you
>>>> could try to validate the entire "string", see if it makes sense. Note that
>>>> you don't have to store any of the data - just follow the rules for each
>>>> wire-format until something doesn't look right or you've checked the
>>>> string.
>>>>
>>>> Easiest, though, is to have the .proto available ;-p
>>>>
>>>> Marc
>>>>
>>>> 2009/11/14 rahul prasad
>>>>
>>>>> Hi,
>>>>>
>>>>> As the wire types table below (from the protobuf documentation) shows,
>>>>> a value of wire type 2 could be a string, a byte array, an embedded
>>>>> message, etc. If I read the value as bytes or as a string on the decoding
>>>>> side when it was actually an embedded message on the encoding side, what
>>>>> would happen? Would I still be able to retrieve the actual value one way
>>>>> or another?
>>>>>
>>>>> The available wire types are as follows:
>>>>>
>>>>> Type  Meaning           Used For
>>>>> 0     Varint            int32, int64, uint32, uint64, sint32, sint64,
>>>>>                         bool, enum
>>>>> 1     64-bit            fixed64, sfixed64, double
>>>>> 2     Length-delimited  string, bytes, embedded messages, packed
>>>>>                         repeated fields
>>>>> 3     Start group       groups (deprecated)
>>>>> 4     End group         groups (deprecated)
>>>>> 5     32-bit            fixed32, sfixed32, float
>>>>>
>>>>> Regards,
>>>>> Rahul Prasad
>>>>
>>>> --
>>>> Regards,
>>>>
>>>> Marc

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To post to this group, send email to protobuf@googlegroups.com
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en
-~--~~~~--~~--~--~---


[protobuf] Re: Regd: Resolving Wire type ambiguities

2009-11-16 Thread Jason Hsueh
You can decode the protocol buffer with just wire type + tag number, but you
won't know the original types without a proto definition. Everything would
be treated as an unknown field. You could access these by iterating through
the UnknownFieldSet, but again, you can't recover the original types.
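
As a rough illustration of decoding with only the wire type and tag number,
the Python sketch below walks a serialized message and yields each field as
(field number, wire type, raw value). The names read_varint and iter_fields
are hypothetical helpers written for this example, not part of any protobuf
library, and groups (wire types 3 and 4) are not handled. The sample bytes
are an assumption chosen for illustration. Note that a length-delimited value
comes back as raw bytes only: nothing here can say whether it was a string,
bytes, or a sub-message, which is the limitation described above.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint at buf[pos]; return (value, next_pos)."""
    result, shift = 0, 0
    while pos < len(buf) and shift < 70:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte < 0x80:          # high bit clear: last byte of the varint
            return result, pos
        shift += 7
    raise ValueError("truncated or overlong varint")


def iter_fields(buf):
    """Yield (field_number, wire_type, raw_value) without any schema."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x7
        if wire_type == 0:       # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:     # 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:     # length-delimited
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:     # 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:                    # groups and invalid types: not handled here
            raise ValueError(f"unsupported wire type {wire_type}")
        if pos > len(buf):
            raise ValueError("truncated field")
        yield field_number, wire_type, value


# Field 1 = varint 150, field 2 = the length-delimited bytes "testing".
sample = bytes.fromhex("089601120774657374696e67")
fields = list(iter_fields(sample))
# fields == [(1, 0, 150), (2, 2, b"testing")]
```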

On Sat, Nov 14, 2009 at 1:10 PM, rahul prasad wrote:

> Hi Marc,
>
> Thanks for the clarification. If the actual .proto were available, I would
> not have posted that question in the first place. Anyway, to decode a
> protocol buffer, is it not enough to have just the wire type + tag number
> combination? (Except, of course, that the sub-message-ness and other
> ambiguities you mentioned below would have to be handled manually.)
>
> Regards,
> Rahul Prasad
>
>
>
> On Sat, Nov 14, 2009 at 3:57 PM, Marc Gravell wrote:
>
>> If you treat it as a string (UTF8), you are likely to get garbage. If you
>> treat it as a byte[], then you just get a BLOB - you don't lose anything,
>> but you might not be showing some more detail that you could show.
>>
>> You could, however, check for likely-sub-message-ness - i.e. after getting
>> the length, you could try decoding the next few bytes as a varint, and do
>> the shift trick; see if it looks likely to be a sub-message etc; you could
>> try to validate the entire "string", see if it makes sense. Note that you
>> don't have to store any of the data - just follow the rules for each
>> wire-format until something doesn't look right or you've checked the string.
>>
>> Easiest, though, is to have the .proto available ;-p
>>
>> Marc
>>
>> 2009/11/14 rahul prasad 
>>
>>> Hi,
>>>
>>> As the wire types table below (from the protobuf documentation) shows, a
>>> value of wire type 2 could be a string, a byte array, an embedded message,
>>> etc. If I read the value as bytes or as a string on the decoding side when
>>> it was actually an embedded message on the encoding side, what would
>>> happen? Would I still be able to retrieve the actual value one way or
>>> another?
>>>
>>> The available wire types are as follows:
>>>
>>> Type  Meaning           Used For
>>> 0     Varint            int32, int64, uint32, uint64, sint32, sint64,
>>>                         bool, enum
>>> 1     64-bit            fixed64, sfixed64, double
>>> 2     Length-delimited  string, bytes, embedded messages, packed
>>>                         repeated fields
>>> 3     Start group       groups (deprecated)
>>> 4     End group         groups (deprecated)
>>> 5     32-bit            fixed32, sfixed32, float
>>>
>>> Regards,
>>> Rahul Prasad
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Regards,
>>
>> Marc
>>


[protobuf] Re: Regd: Resolving Wire type ambiguities

2009-11-14 Thread rahul prasad
Hi Marc,

Thanks for the clarification. If the actual .proto were available, I would
not have posted that question in the first place. Anyway, to decode a
protocol buffer, is it not enough to have just the wire type + tag number
combination? (Except, of course, that the sub-message-ness and other
ambiguities you mentioned below would have to be handled manually.)

Regards,
Rahul Prasad



On Sat, Nov 14, 2009 at 3:57 PM, Marc Gravell wrote:

> If you treat it as a string (UTF8), you are likely to get garbage. If you
> treat it as a byte[], then you just get a BLOB - you don't lose anything,
> but you might not be showing some more detail that you could show.
>
> You could, however, check for likely-sub-message-ness - i.e. after getting
> the length, you could try decoding the next few bytes as a varint, and do
> the shift trick; see if it looks likely to be a sub-message etc; you could
> try to validate the entire "string", see if it makes sense. Note that you
> don't have to store any of the data - just follow the rules for each
> wire-format until something doesn't look right or you've checked the string.
>
> Easiest, though, is to have the .proto available ;-p
>
> Marc
>
> 2009/11/14 rahul prasad 
>
>> Hi,
>>
>> As the wire types table below (from the protobuf documentation) shows, a
>> value of wire type 2 could be a string, a byte array, an embedded message,
>> etc. If I read the value as bytes or as a string on the decoding side when
>> it was actually an embedded message on the encoding side, what would
>> happen? Would I still be able to retrieve the actual value one way or
>> another?
>>
>> The available wire types are as follows:
>>
>> Type  Meaning           Used For
>> 0     Varint            int32, int64, uint32, uint64, sint32, sint64,
>>                         bool, enum
>> 1     64-bit            fixed64, sfixed64, double
>> 2     Length-delimited  string, bytes, embedded messages, packed
>>                         repeated fields
>> 3     Start group       groups (deprecated)
>> 4     End group         groups (deprecated)
>> 5     32-bit            fixed32, sfixed32, float
>>
>> Regards,
>> Rahul Prasad
>>
>
> --
> Regards,
>
> Marc
>


[protobuf] Re: Regd: Resolving Wire type ambiguities

2009-11-14 Thread Marc Gravell
If you treat it as a string (UTF8), you are likely to get garbage. If you
treat it as a byte[], then you just get a BLOB - you don't lose anything,
but you might not be showing some more detail that you could show.
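
A small Python sketch of that difference follows. The three-byte payload is
an assumption chosen for illustration: it is an embedded message whose
field 1 holds the varint 150. Read as bytes it survives untouched; read as
UTF-8 text it either fails outright or is silently altered. Some payloads do
happen to be valid UTF-8, so this shows the likely outcome, not a guaranteed
one.

```python
# Payload of a length-delimited field that was really an embedded message:
# key 0x08 (field 1, varint) followed by the varint 150 (0x96 0x01).
payload = bytes.fromhex("089601")

# Treated as bytes: a lossless BLOB that can still be re-parsed later.
as_bytes = bytes(payload)

# Treated as a string: strict UTF-8 decoding rejects the stray 0x96 byte.
try:
    payload.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Lenient decoding "works" but replaces the bad byte with U+FFFD, so the
# original bytes can no longer be recovered from the text.
as_text = payload.decode("utf-8", errors="replace")
round_trip = as_text.encode("utf-8")

print(strict_ok)              # False
print(as_bytes == payload)    # True
print(round_trip == payload)  # False
```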

You could, however, check for likely-sub-message-ness - i.e. after getting
the length, you could try decoding the next few bytes as a varint, and do
the shift trick; see if it looks likely to be a sub-message etc; you could
try to validate the entire "string", see if it makes sense. Note that you
don't have to store any of the data - just follow the rules for each
wire-format until something doesn't look right or you've checked the string.
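
One possible way to sketch this check in Python is shown below. It assumes
"the shift trick" means splitting each key into a field number (key >> 3) and
a wire type (key & 7), as the wire format defines. looks_like_message and
read_varint are hypothetical names for this example, not library functions,
and treating groups (wire types 3 and 4) as "not a message" is a
simplification. Nothing is stored; the function only follows the rules for
each wire type until something fails or the buffer ends exactly. It is a
heuristic: short strings can pass by accident.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint at buf[pos]; return (value, next_pos)."""
    result, shift = 0, 0
    while pos < len(buf) and shift < 70:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte < 0x80:          # high bit clear: last byte of the varint
            return result, pos
        shift += 7
    raise ValueError("truncated or overlong varint")


def looks_like_message(buf):
    """Return True if buf parses cleanly as a sequence of protobuf fields."""
    pos, end = 0, len(buf)
    if end == 0:
        return False             # empty input gives no evidence either way
    try:
        while pos < end:
            key, pos = read_varint(buf, pos)
            field_number, wire_type = key >> 3, key & 0x7
            if field_number == 0:
                return False     # field numbers start at 1
            if wire_type == 0:   # varint: decode and discard
                _, pos = read_varint(buf, pos)
            elif wire_type == 1: # 64-bit
                pos += 8
            elif wire_type == 2: # length-delimited: skip the payload
                length, pos = read_varint(buf, pos)
                pos += length
            elif wire_type == 5: # 32-bit
                pos += 4
            else:
                return False     # groups or invalid wire types
            if pos > end:
                return False     # a field ran past the end of the buffer
    except ValueError:
        return False
    return True


print(looks_like_message(bytes.fromhex("089601")))  # True
print(looks_like_message(b"testing"))               # False
```

A caller could try this on every wire type 2 value and fall back to showing
raw bytes when it returns False.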

Easiest, though, is to have the .proto available ;-p

Marc

2009/11/14 rahul prasad 

> Hi,
>
> As the wire types table below (from the protobuf documentation) shows, a
> value of wire type 2 could be a string, a byte array, an embedded message,
> etc. If I read the value as bytes or as a string on the decoding side when
> it was actually an embedded message on the encoding side, what would
> happen? Would I still be able to retrieve the actual value one way or
> another?
>
> The available wire types are as follows:
>
> Type  Meaning           Used For
> 0     Varint            int32, int64, uint32, uint64, sint32, sint64,
>                         bool, enum
> 1     64-bit            fixed64, sfixed64, double
> 2     Length-delimited  string, bytes, embedded messages, packed
>                         repeated fields
> 3     Start group       groups (deprecated)
> 4     End group         groups (deprecated)
> 5     32-bit            fixed32, sfixed32, float
>
> Regards,
> Rahul Prasad
>


-- 
Regards,

Marc
