Re: [protobuf] How to tell a regular string or embedded message ?
d'oh! Engage brain, then talk! Yes, you're entirely right; testing it as a sub-message and trying UTF8 as a fallback makes a lot more sense.

Marc

On 16 April 2010 03:23, Kenton Varda wrote:
> Actually, it's very easy for a random proto to meet UTF-8, since it is easy
> for a random proto to have no bytes with the top bit set. Also, this
> technique doesn't help you distinguish between messages and "bytes". The
> opposite heuristic may work better -- assume it is a message if it
> successfully parses as one. Although, this isn't a whole lot better -- the
> encoding is dense enough that random bytes can often parse successfully.

--
You received this message because you are subscribed to the Google Groups "Protocol Buffers" group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/protobuf?hl=en.
Re: [protobuf] How to tell a regular string or embedded message ?
On Thu, Apr 15, 2010 at 2:55 PM, Marc Gravell wrote:
> In the case you don't have the definition (perhaps a network inspector),
> you could always *try* to parse it as UTF8 - if it passes as UTF8 the
> chances are fairly good that it truly is a UTF8 string - and the confidence
> should go up as the string gets longer (since the chance of a random .proto
> meeting the UTF8 spec isn't great).

Actually, it's very easy for a random proto to meet UTF-8, since it is easy for a random proto to have no bytes with the top bit set. Also, this technique doesn't help you distinguish between messages and "bytes". The opposite heuristic may work better -- assume it is a message if it successfully parses as one. Although, this isn't a whole lot better -- the encoding is dense enough that random bytes can often parse successfully.
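The "assume it is a message if it successfully parses as one" heuristic might be sketched as below. This is only an illustration: `read_varint` and `parses_as_message` are hypothetical helper names, not part of any protobuf API, and the sketch simply rejects the group wire types (3 and 4). It walks the payload field by field and succeeds only if the fields end exactly at the end of the payload.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7
        if shift >= 64:
            raise ValueError("varint too long")


def parses_as_message(payload: bytes) -> bool:
    """Return True if payload can be walked as a sequence of protobuf fields."""
    pos = 0
    try:
        while pos < len(payload):
            key, pos = read_varint(payload, pos)
            field_number, wire_type = key >> 3, key & 0x07
            if field_number == 0:
                return False              # field number 0 is not valid
            if wire_type == 0:            # varint
                _, pos = read_varint(payload, pos)
            elif wire_type == 1:          # 64-bit fixed
                pos += 8
            elif wire_type == 2:          # length-delimited
                length, pos = read_varint(payload, pos)
                pos += length
            elif wire_type == 5:          # 32-bit fixed
                pos += 4
            else:                         # groups and unknown types: rejected here
                return False
            if pos > len(payload):
                return False              # field runs past the end of the payload
    except ValueError:
        return False
    return True


small_message = b"\x08\x01\x10\x02"          # field 1 = 1, field 2 = 2
print(all(b < 0x80 for b in small_message))  # True: no top bit set, so valid UTF-8
print(parses_as_message(small_message))      # True
print(parses_as_message(b"Hi"))              # True: reads as field 9 = 105
print(parses_as_message(b"testing"))         # False: first key has wire type 4
```

The examples show both caveats from the message above: a small message can consist entirely of bytes below 0x80, and a short text string such as "Hi" can parse as a message. Note that an empty payload also counts as a valid (empty) message in this sketch.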
Re: [protobuf] How to tell a regular string or embedded message ?
In the case you don't have the definition (perhaps a network inspector), you could always *try* to parse it as UTF8 - if it passes as UTF8 the chances are fairly good that it truly is a UTF8 string - and the confidence should go up as the string gets longer (since the chance of a random .proto meeting the UTF8 spec isn't great).

Marc

On 15 April 2010 21:08, Daniel Wright wrote:
> You can't tell without the .proto file. Protocol buffers are really
> intended to be used in conjunction with the .proto file, and it's only in
> rare cases (usually for debugging) that you'd ever want to try to understand
> them without the .proto file (or better yet, the protocol descriptor).
>
> Note that most users shouldn't write their own decoding code -- ideally you
> should use one of the existing APIs.
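The UTF-8 test suggested above could look like the following sketch. The function name `looks_like_utf8` is made up for illustration; it relies only on Python's strict `bytes.decode`. A pass is weak evidence at best, since any payload made up solely of bytes below 0x80 is valid UTF-8 whatever it really contains.

```python
def looks_like_utf8(payload: bytes) -> bool:
    """Return True if payload is well-formed UTF-8 under strict decoding."""
    try:
        payload.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(looks_like_utf8(b"testing"))       # True: plain ASCII text
print(looks_like_utf8(b"\x08\x96\x01"))  # False: 0x96 is a stray continuation byte
print(looks_like_utf8(b"\x08\x01"))      # True, although this is an encoded message
```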
Re: [protobuf] How to tell a regular string or embedded message ?
You can't tell without the .proto file. Protocol buffers are really intended to be used in conjunction with the .proto file, and it's only in rare cases (usually for debugging) that you'd ever want to try to understand them without the .proto file (or better yet, the protocol descriptor).

Note that most users shouldn't write their own decoding code -- ideally you should use one of the existing APIs.

On Thu, Apr 15, 2010 at 12:58 PM, ssk wrote:
> Hi gurus,
>
> I've just dived into protocol buffers and been reading
> "http://code.google.com/apis/protocolbuffers/docs/encoding.html".
>
> I was trying to decode encoded messages but I'm totally stuck.
> Wondering how I can tell an element is a regular string or an embedded
> message if its element type is 2.
> If I have its proto file, probably I could. But if I don't have
> one ???
>
> Is there any way I can figure it out ? or impossible ?
>
> Thanks in advance.
>
> -ssk
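To illustrate why the wire format alone cannot answer the question, here is a minimal Python sketch that decodes the key and length prefix of a single length-delimited field. The helper names `read_varint` and `read_length_delimited` are hypothetical, and this is debugging-style code, not a replacement for the existing APIs. The two sample byte strings are modeled on the examples in the encoding document linked in the question: a string field holding "testing" and a field holding an embedded message.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7
        if shift >= 64:
            raise ValueError("varint too long")


def read_length_delimited(buf):
    """Return (field_number, wire_type, payload) for one wire type 2 field."""
    key, pos = read_varint(buf, 0)
    field_number, wire_type = key >> 3, key & 0x07
    if wire_type != 2:
        raise ValueError("not a length-delimited field")
    length, pos = read_varint(buf, pos)
    if pos + length > len(buf):
        raise ValueError("truncated payload")
    return field_number, wire_type, buf[pos:pos + length]


string_field = bytes.fromhex("120774657374696e67")  # field 2: string "testing"
message_field = bytes.fromhex("1a03089601")         # field 3: embedded message

print(read_length_delimited(string_field))   # (2, 2, b'testing')
print(read_length_delimited(message_field))  # (3, 2, b'\x08\x96\x01')
```

Both fields report wire type 2. The header carries only a field number, a wire type, and a length, so whether the payload is a string, raw bytes, or a nested message has to come from the .proto file or a descriptor.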