Re: Tagged union types and "switch" statements

2009-04-13 Thread Kenton Varda
Thanks for doing this!  Can you send it to me as a code review via
codereview.appspot.com?  Also, you'll need to sign the CLA if you haven't
already:
http://code.google.com/legal/individual-cla-v1.0.html -- If you own
copyright on your work.
http://code.google.com/legal/corporate-cla-v1.0.html -- If your employer
does.

(Make sure to send me an e-mail after setting up the code review; sometimes
it doesn't notify me itself.)

On Wed, Apr 8, 2009 at 7:35 PM, Michael Poole  wrote:

> Kenton,
>
> Here's a first pass of this.  pb-extension-numbers.diff is the
> interesting part, with unit test changes up front to show what the
> user interface looks like.  pb-extension-descriptor.diff is the
> corresponding patch to src/google/protobuf/descriptor.pb.{cc,h}.
>
> Particular areas where my choices might not match the desired style:
>
> I replaced most of the expansions of "$number$" in the C++ and Java
> code with the new "$constant_name$" -- partly to make sure the
> constants work in many use cases, but mostly because the code was
> repeating itself.  (g++ 4.3.2 on x86_64 generated assembly with 26,264
> differences for descriptor.pb.cc, but they did not appear material:
> just debugging information and the new class-static variables.)
>
> For C++ and Java, I used different suffixes for field constants
> (e.g. classname::kBbFieldNumber) than for extensions
> (e.g. kOptionalInt32ExtensionExtensionNumber).
>
> For Python, I left alone what looked like a good-enough approach to
> finding extension numbers, and just made reflection.py add fields
> declaring the field numbers to each message class.  The unit test
> covers both of those interfaces.
>
> Michael Poole
>
>

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To post to this group, send email to protobuf@googlegroups.com
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en
-~--~~~~--~~--~--~---



Re: Tagged union types and "switch" statements

2009-04-07 Thread Michael Poole

Kenton Varda writes:

[snip]
> For a field called foo_bar, I think the corresponding field number constant
> should be kFooBarFieldNumber in C++ and FOO_BAR_FIELD_NUMBER in Java.  Not 
> sure
> what the preferred style is in Python.
>
> If you'd like to make this change, it should be a pretty easy modification to
> google/protobuf/compiler/{cpp/cpp,java/java}_{message,extension}.cc, google/
> protobuf/compiler/python/python_generator.cc, and appropriate unit tests.
>
> What do you think?

That sounds reasonable, and is exactly the kind of feedback I was
hoping for before I wrote too much code.  I'll give it a shot.  Thanks!

Michael Poole

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To post to this group, send email to protobuf@googlegroups.com
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en
-~--~~~~--~~--~--~---



Re: Tagged union types and "switch" statements

2009-04-07 Thread Kenton Varda
If you implemented this in ConsumeInteger(), it would not follow the usual
symbol lookup semantics used everywhere else in the language.  For example,
you would not be able to refer to enums defined later in the file, or
defined in a different file.  You might be able to handle enums defined
inside a message, but it would be hacky.
Put another way, I would expect this to work:

  import "baz.proto";

  message Bar {
enum Ids { BAR = 2; }
  }

  message OuterMessage {
optional int32 foo = FOO;
optional int32 bar = Bar.BAR;
optional int32 baz = BAZ;  // defined in baz.proto
  }

  enum Ids { FOO = 1; }

While we could just say "Sorry, you can only use enums defined in the same
file, *above* the point where they are first used, and not nested in any
other message.", these restrictions are totally different from the lookup
semantics of any other symbol and would thus be confusing and hard to learn.
 And eventually someone will want to fix these problem anyway.

So, the way you'd have to implement this is to extend the
FieldDescriptorProto message to allow the field number to be specified as a
string, then resolve that string in the cross-linking phase of
DescriptorBuilder (in src/google/protobuf/descriptor.cc).  I think this is
kind of hairy.

I think a much better solution would be to generate some sort of symbolic
constants for field numbers that can be accessed at runtime, so that you
don't have to generate a separate enum.  This would only require updating
the code generators.  Then you don't need to define an enum in the .proto
file at all.

For a field called foo_bar, I think the corresponding field number constant
should be kFooBarFieldNumber in C++ and FOO_BAR_FIELD_NUMBER in Java.  Not
sure what the preferred style is in Python.

If you'd like to make this change, it should be a pretty easy modification
to google/protobuf/compiler/{cpp/cpp,java/java}_{message,extension}.cc,
google/protobuf/compiler/python/python_generator.cc, and appropriate unit
tests.

What do you think?

On Tue, Apr 7, 2009 at 5:44 AM, Michael Poole  wrote:

>
> One of the suggested techniques for union types looks (almost) like this:
>
>  message OneMessage {
>required int32 type = 1;
>extensions 100 to max;
>  }
>
>  enum ExtensionType { FOO = 100; BAR = 101; BAZ = 102; }
>  extend OneMessage {
>optional Foo foo_ext = 100;
>optional Bar bar_ext = 101;
>optional Baz baz_ext = 102;
>  }
>
> C++ code that reads OneMessage messages would often like to switch on
> the "type" field; unfortunately, the constants that identify foo_ext,
> bar_ext and baz_ext are only given in the generated .pb.cc file, and
> cannot be evaluated for use in a C++ "case" statement.
>
> As the number of optional elements increases, the duplication of
> values in the enum and their associated extension identifiers grows,
> and managing the mapping between them becomes an egregious violation
> of the "Don't Repeat Yourself" principle.
>
> To help manage this, I am inclined to write a patch to allow
> google::protobuf::compiler::ConsumeInteger() to look up previously
> defined enum values and return their integer values, so that one could
> instead write:
>
>  extend OneMessage {
>optional Foo foo_ext = FOO;
>// et cetera..
>  }
>
> So, three questions:
>
> 1.) Is there a better way to solve the C++ switch/case problem?
> 2.) Is this a reasonable approach within the architecture of the
>Protocol Buffers compiler?  (That is, if the patch is clean,
>might it be rejected for design reasons?)
> 3.) Assuming the previous two answers indicate a patch is in order,
>what is the best way to submit the patch?
>
> Michael Poole
>
> >
>

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To post to this group, send email to protobuf@googlegroups.com
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en
-~--~~~~--~~--~--~---