Re: [protobuf] Detecting unknown field in raw message

2017-05-30 Thread Yacov Manevich
Understood.

Many thanks for the answers!


Re: [protobuf] Detecting unknown field in raw message

2017-05-30 Thread 'Adam Cozzette' via Protocol Buffers
Here is a doc with the rough plan for reintroducing unknown field
preservation:
https://docs.google.com/document/d/1KMRX-G91Aa-Y2FkEaHeeviLRRNblgIahbsk4wA14gRk/edit?usp=sharing

I am not sure what the Go API for accessing the unknown fields will look
like, though.

I would again recommend against using unknown fields to figure out which
schema version produced a particular message. The idiomatic way to use
protocol buffers is to evolve the schema in a compatible way, without trying
to guess which schema version was used to produce any particular message.

On Tue, May 30, 2017 at 10:18 AM, Yacov Manevich wrote:

>> Another thing to keep in mind is that we are planning to reintroduce
>> support for preserving unknown fields in proto3, so the behavior around
>> unknown fields is going to change soon.
>
> Can you elaborate on this as well please? Does that mean that given a
> schema S1 that is a sub-schema of S2 (every field in S1 is contained in S2)
> I could know that I have additional fields in a message object of S2 given
> code that was generated with S1?
> Also - any ETA on release?

Re: [protobuf] Detecting unknown field in raw message

2017-05-30 Thread Yacov Manevich

>
> Another thing to keep in mind is that we are planning to reintroduce 
> support for preserving unknown fields in proto3, so the behavior around 
> unknown fields is going to change soon.
>

Can you elaborate on this as well please? Does that mean that given a 
schema S1 that is a sub-schema of S2 (every field in S1 is contained in S2) 
I could know that I have additional fields in a message object of S2 given 
code that was generated with S1?
Also - any ETA on release?



 -- 
 You received this message because you are subscribed to the Google 
 Groups "Protocol Buffers" group.
 To unsubscribe from this group and stop receiving emails 

Re: [protobuf] Detecting unknown field in raw message

2017-05-30 Thread 'Adam Cozzette' via Protocol Buffers
On Tue, May 30, 2017 at 9:06 AM, Yacov Manevich wrote:

>> but there are some corner cases where that would not be true
>
> Can you elaborate please?

Sure, the main thing is that protocol buffers do not have a canonical
format, and there are many ways to create a valid protocol buffer in
serialized form that will not match what you would get from serializing a
message the ordinary way. For example, you can take two serialized protocol
buffers and concatenate them together and get back a valid proto
representing the merge of the original two. In that case, un-marshaling and
re-marshaling that concatenated message would often result in a smaller
size even if there were no unknown fields.
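
The concatenation corner case can be illustrated with a minimal sketch. It uses a hand-rolled varint encoder from the Go standard library rather than the protobuf library, and it only handles varint fields; the helper names (appendVarintField, decodeVarintFields, roundTripSizes) are hypothetical and exist only for this illustration. Two encodings of the same scalar field are concatenated, decoded with last-value-wins merging, and re-encoded.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// appendVarintField appends one varint-typed field (wire type 0) to buf.
func appendVarintField(buf []byte, fieldNum, value uint64) []byte {
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(tmp[:], fieldNum<<3) // tag = field number << 3 | wire type 0
	buf = append(buf, tmp[:n]...)
	n = binary.PutUvarint(tmp[:], value)
	return append(buf, tmp[:n]...)
}

// decodeVarintFields parses a buffer made only of varint fields.
// A scalar field that appears more than once keeps its last value,
// which is how protobuf parsers merge repeated occurrences.
func decodeVarintFields(buf []byte) (map[uint64]uint64, error) {
	fields := make(map[uint64]uint64)
	for len(buf) > 0 {
		tag, n := binary.Uvarint(buf)
		if n <= 0 || tag&7 != 0 {
			return nil, errors.New("malformed tag or unsupported wire type")
		}
		buf = buf[n:]
		value, m := binary.Uvarint(buf)
		if m <= 0 {
			return nil, errors.New("malformed varint value")
		}
		buf = buf[m:]
		fields[tag>>3] = value
	}
	return fields, nil
}

// roundTripSizes returns the size of two concatenated encodings and the
// size after decoding and re-encoding the concatenation.
func roundTripSizes() (before, after int, err error) {
	first := appendVarintField(nil, 1, 7)   // Bar with n = 7
	second := appendVarintField(nil, 1, 42) // Bar with n = 42
	joined := append(append([]byte{}, first...), second...)

	fields, err := decodeVarintFields(joined)
	if err != nil {
		return 0, 0, err
	}
	var reencoded []byte
	for num, val := range fields {
		reencoded = appendVarintField(reencoded, num, val)
	}
	return len(joined), len(reencoded), nil
}

func main() {
	before, after, err := roundTripSizes()
	if err != nil {
		panic(err)
	}
	fmt.Printf("concatenated: %d bytes, re-encoded: %d bytes\n", before, after)
}
```

Every field in the concatenated buffer is known to the decoder, yet the re-encoded buffer is shorter, so a size comparison alone would wrongly report unknown fields here.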

Another thing to keep in mind is that we are planning to reintroduce
support for preserving unknown fields in proto3, so the behavior around
unknown fields is going to change soon.

>> Why do you want to detect which schema was used to serialize the message
>> anyway?
>
> Because I have a project with a distributed system in which one part might
> be upgraded while another part is not (depending on the customer / admin).
> I want to prevent an earlier version of the software from processing
> information that came from a later version; the earlier version should
> process that information only once it has been upgraded.
>

I see; in that case I would recommend adding a field to the message that
explicitly indicates whether the message is in the old form or new form.
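
A minimal sketch of such an explicit marker, under stated assumptions: the Bar struct below is a hand-written stand-in for generated code, and the SchemaVersion field, its field number, the maxSupportedVersion constant and the checkVersion helper are all hypothetical names chosen for illustration. It also assumes the field already exists in the schema the older binary was built with, since otherwise the older binary cannot read it.

```go
package main

import (
	"errors"
	"fmt"
)

// Bar is a hand-written stand-in for the generated message type.
// SchemaVersion is a hypothetical field; in the .proto file it could be
// declared as, for example:  uint32 schema_version = 5;
type Bar struct {
	N             uint64
	B             []byte
	D             []byte
	SchemaVersion uint32 // 0 means the sender never set the field
}

// maxSupportedVersion is the newest message form this build understands.
const maxSupportedVersion = 1

var errTooNew = errors.New("message was produced by a newer schema version")

// checkVersion rejects messages that declare a version newer than this
// build supports, so that they can be deferred until after an upgrade.
func checkVersion(msg *Bar) error {
	if msg.SchemaVersion > maxSupportedVersion {
		return errTooNew
	}
	return nil
}

func main() {
	msgs := []*Bar{{N: 1}, {N: 2, SchemaVersion: 1}, {N: 3, SchemaVersion: 2}}
	for _, msg := range msgs {
		fmt.Printf("version %d: err=%v\n", msg.SchemaVersion, checkVersion(msg))
	}
}
```

The check depends only on a field both sides agree on, so it does not rely on how unknown fields are handled by any particular protobuf runtime.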



Re: [protobuf] Detecting unknown field in raw message

2017-05-30 Thread Yacov Manevich

>
> but there are some corner cases where that would not be true
>

Can you elaborate please?

> Why do you want to detect which schema was used to serialize the message
> anyway?

Because I have a project with a distributed system in which one part might
be upgraded while another part is not (depending on the customer / admin).
I want to prevent an earlier version of the software from processing
information that came from a later version; the earlier version should
process that information only once it has been upgraded.
 



Re: [protobuf] Detecting unknown field in raw message

2017-05-30 Thread 'Adam Cozzette' via Protocol Buffers
I am not that familiar with the Go implementation of protobuf, but my guess
is that this would be somewhat difficult to do reliably. If you un-marshal
and re-marshal the message and get a smaller size, then usually that would
indicate there were some unknown fields, but there are some corner cases
where that would not be true. Why do you want to detect which schema was
used to serialize the message anyway?

On Sun, May 28, 2017 at 10:27 AM, Yacov Manevich wrote:

> Hi all.
>
> I need to know whether a protobuf message I have received (protobuf
> version 3) contains more fields than the schema I currently have.
> Any pointers on how to do so, specifically in golang?
>
>
> *The longer version:*
>
> Let's assume I have a protobuf schema at version v1:
>
> message Foo {
>     Bar bar = 1;
>     Zig z = 2;
>     Zag zz = 3;
> }
>
> message Bar {
>     uint64 n = 1;
>     bytes b = 2;
>     bytes d = 3;
> }
>
> *Zig* and *Zag* are not relevant.
>
> and at some point in the future I will be forced to upgrade it to v2 by
> adding another field to *Bar*:
>
> message Bar {
>     uint64 n = 1;
>     bytes b = 2;
>     bytes d = 3;
>     uint64 nn = 4;
> }
>
> Let's also assume that for some reason I can only make modifications to
> the code but not to the proto structure at version v1.
> What I want is to support both backward and forward compatibility.
>
>    - Backward compatibility is easy: if the message was created at v1
>      and the application (written in golang) is at version v2, the nn
>      field will have the zero value and the application can act accordingly.
>    - Forward compatibility: at runtime, application v1 should be able to
>      tell, given the bytes of a serialized message, whether the message
>      was created with protobuf schema v1 or v2.
>      How can I do this?
>      What I have thought of thus far is un-marshaling the bytes to a
>      message, then marshaling back to bytes and comparing the lengths of
>      the two buffers: if the new buffer is smaller than the first one,
>      the message I got contains additional fields that my protobuf
>      schema (v1) wasn't able to recognize.
>
>
> Is there a more elegant and reliable way to do so?
>
>
>
> Regards and thanks.
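
One way to look for unrecognized fields without comparing buffer sizes is to walk the wire format and collect field numbers that the v1 schema does not define. The following is only a sketch under stated assumptions: it uses the Go standard library rather than the protobuf library, the names unknownFieldNumbers and errMalformed are hypothetical, it inspects top-level fields only (it does not descend into nested messages such as Bar inside Foo), and it does not handle groups. Note also that proto3 does not serialize scalar fields holding their zero value, so a v2 message with nn = 0 would look the same as a v1 message.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

var errMalformed = errors.New("malformed protobuf wire data")

// unknownFieldNumbers walks the top-level fields of a serialized message and
// returns the field numbers that are not in known, in order of appearance.
func unknownFieldNumbers(buf []byte, known map[uint64]bool) ([]uint64, error) {
	var unknown []uint64
	for len(buf) > 0 {
		tag, n := binary.Uvarint(buf)
		if n <= 0 {
			return nil, errMalformed
		}
		buf = buf[n:]
		fieldNum, wireType := tag>>3, tag&7

		var skip uint64 // payload bytes to skip for this field
		switch wireType {
		case 0: // varint
			_, m := binary.Uvarint(buf)
			if m <= 0 {
				return nil, errMalformed
			}
			skip = uint64(m)
		case 1: // 64-bit fixed
			skip = 8
		case 2: // length-delimited: bytes, string, nested message
			length, m := binary.Uvarint(buf)
			if m <= 0 {
				return nil, errMalformed
			}
			buf = buf[m:]
			skip = length
		case 5: // 32-bit fixed
			skip = 4
		default: // groups (3, 4) and reserved values are not handled here
			return nil, errMalformed
		}
		if skip > uint64(len(buf)) {
			return nil, errMalformed
		}
		buf = buf[skip:]
		if !known[fieldNum] {
			unknown = append(unknown, fieldNum)
		}
	}
	return unknown, nil
}

func main() {
	// Bar v2 with n = 7 (field 1) and nn = 9 (field 4), encoded by hand.
	v2 := []byte{0x08, 0x07, 0x20, 0x09}
	knownV1 := map[uint64]bool{1: true, 2: true, 3: true}
	unknown, err := unknownFieldNumbers(v2, knownV1)
	fmt.Println(unknown, err)
}
```

Unlike the size comparison, this scan is not confused by concatenated or otherwise non-canonical encodings of known fields, but it still only detects fields that the sender actually wrote.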
