Re: [protobuf] Re: omitting tag numbers

2010-11-16 Thread Kenton Varda
On Tue, Nov 9, 2010 at 10:42 PM, Christopher Smith cbsm...@gmail.com wrote:

 This aspect could be mostly mitigated by integrating a metadata header
 into files. For systems with this kind of approach, look at Avro and Hessian.


Problems with that:
1) Protobufs are routinely used to encode small messages of just a few
bytes.  Metadata would almost certainly be larger than the actual messages
in such cases.
2) This metadata would add an extra layer of indirection into the parsing
process which would probably make it much slower than it is today.
3) Interpreting the metadata itself to build that table would add additional
time and memory overhead.  Presumably this would have to involve looking up
field names in hash maps -- expensive operations compared to the things the
protobuf parser does today.
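
To put rough numbers on point 1, here is a minimal Python sketch that encodes the same tiny message twice: once with the real protobuf key layout (field number and wire type packed into one varint), and once with a made-up layout that writes the field name in front of every value. The named layout and the helper names (`tagged_field`, `named_field`, `string_payload`) are assumptions for illustration only, not anything protobuf, Avro, or Hessian actually does.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


VARINT, LENGTH_DELIMITED = 0, 2  # protobuf wire types


def tagged_field(number: int, wire_type: int, payload: bytes) -> bytes:
    """Protobuf key layout: (field_number << 3) | wire_type, as a varint."""
    return encode_varint((number << 3) | wire_type) + payload


def named_field(name: str, wire_type: int, payload: bytes) -> bytes:
    """HYPOTHETICAL layout: length-prefixed field name, then a wire-type byte."""
    raw = name.encode("utf-8")
    return encode_varint(len(raw)) + raw + bytes([wire_type]) + payload


def string_payload(text: str) -> bytes:
    """Length-prefixed UTF-8 string, as used for protobuf string fields."""
    raw = text.encode("utf-8")
    return encode_varint(len(raw)) + raw


# Person { name = "Al", id = 7 }
tagged = (tagged_field(1, LENGTH_DELIMITED, string_payload("Al"))
          + tagged_field(2, VARINT, encode_varint(7)))
named = (named_field("name", LENGTH_DELIMITED, string_payload("Al"))
         + named_field("id", VARINT, encode_varint(7)))

print(len(tagged), len(named))  # prints "6 14"
```

Even with short field names, the names outweigh the data in a message this small.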

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] Re: omitting tag numbers

2010-11-16 Thread Christopher Smith
On Tue, Nov 16, 2010 at 7:28 PM, Kenton Varda ken...@google.com wrote:

 On Tue, Nov 9, 2010 at 10:42 PM, Christopher Smith cbsm...@gmail.com wrote:

 This aspect could be mostly mitigated by integrating a metadata header
 into files. For systems with this kind of approach, look at Avro and Hessian.


 Problems with that:
 1) Protobufs are routinely used to encode small messages of just a few
 bytes.  Metadata would almost certainly be larger than the actual messages
 in such cases.
 2) This metadata would add an extra layer of indirection into the parsing
 process which would probably make it much slower than it is today.
 3) Interpreting the metadata itself to build that table would add
 additional time and memory overhead.  Presumably this would have to involve
 looking up field names in hash maps -- expensive operations compared to the
 things the protobuf parser does today.


Sorry, I didn't mean to suggest that changes be made to protobuf. I mostly
meant that if you want that, there are other solutions that are a better
fit. I think Avro in particular has a solution that mitigates drawbacks
1-3, at the expense of some additional complexity.

You can hack this into a protobuf solution, though. You just encode the
FileDescriptorSet into your file header. Then when you start a scan, you
read it in, find the field numbers that correspond to the field names
you want, and then parse the protobufs as before. The key thing is that the
overhead is paid only once per file (which presumably holds tons of small
messages), and that after reading the header you transform the parse/query
into exactly what you'd have had if you had used the field numbers to start
with.
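
The scan described above can be sketched roughly as follows in Python. This is only an illustration under stated assumptions: a plain dict (`HEADER`) stands in for the FileDescriptorSet that would really be stored in the file header, the helper names (`read_varint`, `parse_fields`, `scan`) are invented here, and the parser handles only the varint and length-delimited wire types.

```python
def read_varint(buf: bytes, pos: int):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_fields(buf: bytes):
    """Yield (field_number, value) pairs from one encoded message."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:        # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:      # length-delimited (strings, bytes, ...)
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        yield number, value


# ASSUMPTION: stand-in for the name -> number mapping recovered from the
# FileDescriptorSet in the file header.
HEADER = {"name": 1, "id": 2, "email": 3}


def scan(header, messages, wanted_name):
    """Resolve the name once per file, then match purely on field number."""
    wanted_number = header[wanted_name]  # the only name lookup
    for message in messages:
        for number, value in parse_fields(message):
            if number == wanted_number:
                yield value


messages = [b"\x0a\x02Al\x10\x07", b"\x0a\x03Bob\x10\x09"]
print(list(scan(HEADER, messages, "id")))  # prints "[7, 9]"
```

The per-message loop never touches a field name, so its cost is the same as if the caller had supplied the field number directly.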

Honestly, for me the win with the field numbers tends to be with long term
forward and backward compatibility.

-- 
Chris




Re: [protobuf] Re: omitting tag numbers

2010-11-09 Thread Christopher Smith
On Mon, Oct 25, 2010 at 4:11 PM, Henner Zeller henner.zel...@googlemail.com
 wrote:

 On Mon, Oct 25, 2010 at 16:10, maninder batth batth.manin...@gmail.com
 wrote:
  I disagree. You could encode field name in the binary. Then at de-
  serialization, you can read the field descriptor and reconstruct the
  field. There is absolutely no need for tags. They are indeed
  cumbersome.

 If you include the field name, then you throw part of the
 advantages of protocol buffers out of the window: speed and compact
 binary encoding.


This aspect could be mostly mitigated by integrating a metadata header into
files. For systems with this kind of approach, look at Avro and Hessian.

-- 
Chris




[protobuf] Re: omitting tag numbers

2010-10-25 Thread Paul
ok that makes sense.  thanks!

On Oct 22, 4:02 pm, Henner Zeller henner.zel...@googlemail.com
wrote:
 On Fri, Oct 22, 2010 at 15:01, Paul mjpabl...@gmail.com wrote:
  Hi,

  This may seem like a basic question, but I find having to label
  the .proto file with unique tag numbers for each field a little
  cumbersome, especially if there are a lot of fields.

  message Person {
   required string name = 1;
   required int32 id = 2;
   optional string email = 3;
  }

  Can I define a .proto file without the tag numbers, like so?

  message Person {
   required string name;
   required int32 id;
   optional string email;
  }

 No.

 The reason for this explicit definition is that protocol buffers are
 'future compatible': a field written with a particular tag will always
 be written with that tag. Suppose you want to reorder the fields
 in your message to, say, (id, name, email): they would then get
 different 'automatic' tags and you wouldn't be able to read
 files written with older binaries. If the tags are assigned explicitly,
 then rearranging fields in the file does not matter.

 -h



  Thanks,
  Paul
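
The reordering problem in the quoted answer can be shown with a small Python sketch. The `auto_tags` scheme (numbering fields by declaration order) is hypothetical; it is the kind of 'automatic' assignment the answer warns about, not something protoc offers. Records are modelled as plain dicts keyed by field number rather than real wire bytes.

```python
def auto_tags(field_names):
    """HYPOTHETICAL scheme: number fields 1..n in declaration order."""
    return {name: number for number, name in enumerate(field_names, start=1)}


def read(stored, schema):
    """Map stored field numbers back to names using the reader's schema."""
    by_number = {number: name for name, number in schema.items()}
    return {by_number[number]: value for number, value in stored.items()}


old_schema = auto_tags(["name", "id", "email"])
new_schema = auto_tags(["id", "name", "email"])  # same fields, reordered

# Data written by an old binary, keyed by field number.
stored = {old_schema["name"]: "Paul", old_schema["id"]: 42}

# With automatic numbering, the new binary swaps the two fields.
print(read(stored, new_schema))  # prints "{'id': 'Paul', 'name': 42}"

# With explicit tags the numbers stay attached to the fields,
# whatever order they are declared in.
explicit = {"id": 2, "name": 1, "email": 3}
print(read(stored, explicit))    # prints "{'name': 'Paul', 'id': 42}"
```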




[protobuf] Re: omitting tag numbers

2010-10-25 Thread maninder batth
I disagree. You could encode field name in the binary. Then at de-
serialization, you can read the field descriptor and reconstruct the
field. There is absolutely no need for tags. They are indeed
cumbersome.





Re: [protobuf] Re: omitting tag numbers

2010-10-25 Thread Henner Zeller
On Mon, Oct 25, 2010 at 16:10, maninder batth batth.manin...@gmail.com wrote:
 I disagree. You could encode field name in the binary. Then at de-
 serialization, you can read the field descriptor and reconstruct the
 field. There is absolutely no need for tags. They are indeed
 cumbersome.

If you include the field name, then you throw part of the
advantages of protocol buffers out of the window: speed and compact
binary encoding.






Re: [protobuf] Re: omitting tag numbers

2010-10-25 Thread Henner Zeller
On Mon, Oct 25, 2010 at 16:11, Henner Zeller
henner.zel...@googlemail.com wrote:
 On Mon, Oct 25, 2010 at 16:10, maninder batth batth.manin...@gmail.com 
 wrote:
 I disagree. You could encode field name in the binary. Then at de-
 serialization, you can read the field descriptor and reconstruct the
 field. There is absolutely no need for tags. They are indeed
 cumbersome.

 If you include the field name, then you throw part of the
 advantages of protocol buffers out of the window: speed and compact
 binary encoding.

.. and you would never be able to rename fields either.
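
A minimal Python sketch of the renaming point, under the assumption that records are modelled as plain dicts (keyed by name for the hypothetical name-based encoding, keyed by number for the tag-based one). The field name `email_address` and both reader functions are made up for illustration.

```python
def read_by_name(record, schema):
    """HYPOTHETICAL name-keyed encoding: match on the field name itself."""
    return {name: record[name] for name in schema if name in record}


def read_by_number(record, schema):
    """Tag-keyed encoding: match on the field number, report the current name."""
    return {name: record[number]
            for name, number in schema.items() if number in record}


# The same value as an old writer would have stored it under each scheme.
named_record = {"email": "paul@example.com"}
numbered_record = {3: "paul@example.com"}

# The reader's schema after renaming the field; its number is unchanged.
renamed_schema = {"email_address": 3}

print(read_by_name(named_record, renamed_schema))       # prints "{}"
print(read_by_number(numbered_record, renamed_schema))
# prints "{'email_address': 'paul@example.com'}"
```

With names on the wire, the old data no longer matches anything in the renamed schema; with numbers, the rename is invisible to stored data.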



