Re: Major update submitted; 2.0.2 release soon

Kenton Varda Thu, 25 Sep 2008 10:41:48 -0700

I'll be updating the documentation to explain this better soon.  It's a bit
complicated.  Check out unittest_custom_options.proto for an example.  Let
me try to explain briefly...


First of all, note that UninterpretedOption is actually only ever used
within protoc, between the parsing phase and the cross-linking phase.
 During cross-linking, the UninterpretedOptions are converted to regular
extensions.  Users *never* use UninterpretedOption.  So, you could consider
UninterpretedOption to be an implementation detail of protoc and ignore it.
 Since you've written your own parser, you might want to use
UninterpretedOption as part of a similar design, but you might also want to
do something completely different; as long as the values eventually end up
as regular extensions, it's up to you.

OK, on to the syntax...

Let's take the example of a field option.  You see that the FieldOptions
message in descriptor.proto now allows extensions.  You might declare an
extension like:

  extend google.protobuf.FieldOptions {
    optional string foo = 51234;
  }

Now when you declare a field, you can specify that the above extension
should be set in that field's descriptor with syntax like:

  message MyMessage {
    optional int32 bar = 1 [(foo) = "baz"];
  }

The parentheses here say that the name within them is naming an extension,
not a regular field.  The name might even be qualified.  For example, if
"foo" had been declared in another package, e.g.:

  package pkg;
  extend google.protobuf.FieldOptions {
    optional string foo = 51234;
  }

and you now wanted to use "foo" from some other package, you'd have to write
something like:

  package other;
  message MyMessage {
    optional int32 bar = 1 [(pkg.foo) = "baz"];
  }

However, the qualification is not necessary if "foo" is in the same package.
 Just as with type names, extension names are resolved relative to the
current scope.
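
To make "resolved relative to the current scope" a bit more concrete, here
is a rough C++ sketch of the order in which fully-qualified names could be
tried.  CandidateNames is a hypothetical helper written for this
illustration, not part of protobuf, and protoc's real lookup may differ in
details (for instance in how it treats the first component of a qualified
name).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not a protobuf API): lists the fully-qualified names
// to try, innermost scope first, when resolving `name` from inside `scope`.
// `scope` is a dotted path such as "other.MyMessage"; "" means the root.
std::vector<std::string> CandidateNames(std::string scope,
                                        const std::string& name) {
  assert(!name.empty());
  std::vector<std::string> candidates;
  while (!scope.empty()) {
    candidates.push_back(scope + "." + name);
    // Drop the last component of the scope and try the enclosing one.
    std::string::size_type dot = scope.rfind('.');
    scope = (dot == std::string::npos) ? "" : scope.substr(0, dot);
  }
  candidates.push_back(name);  // Finally, try the name at the root.
  return candidates;
}
```

For the field above, CandidateNames("other.MyMessage", "pkg.foo") would try
"other.MyMessage.pkg.foo", then "other.pkg.foo", then "pkg.foo", and only
the last of those names a declared extension.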

I think something that is confusing you is syntax like this:

  message MyMessage {
    optional int32 bar = 1 [(foo).bar = "baz"];
  }

Here, "foo" is naming an extension whose type is a message, and bar is a
field within that message.  E.g., foo might be defined as:

  message FooType {
    optional string bar = 1;
    optional int32 baz = 2;
  }
  extend google.protobuf.FieldOptions {
    optional FooType foo = 51234;
  }

It gets really complicated when FooType itself might have extensions.  Then
we might see something like "(foo).(bar) = 1".  This means that "bar" is an
extension of FooType.  I don't expect this to be used much, but it's
supported for consistency.
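
If you do go the UninterpretedOption route, each option name becomes a list
of (name_part, is_extension) pairs, one per segment.  Here is a small C++
sketch of that split.  The NamePart struct mirrors the message in
descriptor.proto, but SplitOptionName is a hypothetical helper for
illustration only; it assumes well-formed input with no whitespace and is
not how protoc's parser is actually written.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Mirrors UninterpretedOption.NamePart from descriptor.proto.
struct NamePart {
  std::string name_part;
  bool is_extension;
};

// Hypothetical helper: splits an option name such as "(foo).(bar)" into its
// segments.  Assumes well-formed input with no whitespace.
std::vector<NamePart> SplitOptionName(const std::string& text) {
  std::vector<NamePart> parts;
  std::string::size_type pos = 0;
  while (pos < text.size()) {
    NamePart part;
    if (text[pos] == '(') {
      // Parenthesized segment: an extension name, which may contain dots.
      std::string::size_type close = text.find(')', pos);
      assert(close != std::string::npos);
      part.name_part = text.substr(pos + 1, close - pos - 1);
      part.is_extension = true;
      pos = close + 1;
    } else {
      // Plain segment: a regular field name, ending at the next dot.
      std::string::size_type dot = text.find('.', pos);
      if (dot == std::string::npos) dot = text.size();
      part.name_part = text.substr(pos, dot - pos);
      part.is_extension = false;
      pos = dot;
    }
    parts.push_back(part);
    // Skip the '.' separating this segment from the next one.
    if (pos < text.size()) {
      assert(text[pos] == '.');
      ++pos;
    }
  }
  return parts;
}
```

With this, "(foo).bar" yields { ["foo", true], ["bar", false] } and
"(foo).(bar)" yields { ["foo", true], ["bar", true] }.  Dots inside the
parentheses stay within a single segment, so "(pkg.foo)" is one part.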

On Thu, Sep 25, 2008 at 9:18 AM, Chris <[EMAIL PROTECTED]> wrote:

> I can support the new options in Haskell right after I finish the unknown
> field loading/storing/writing.
> But work is busy right now, so I won't estimate how long I will take.
>
> I have more than a few questions about the new options below:
>
> Kenton Varda wrote:
>
>>  * It is now possible to define custom "options", which are basically
>>    annotations which may be placed on definitions in a .proto file.
>>    For example, you might define a field option called "foo" like so:
>>      import "google/protobuf/descriptor.proto";
>>      extend google.protobuf.FieldOptions {
>>        optional string foo = 12345;
>>      }
>>    Then you annotate a field using the "foo" option:
>>      message MyMessage {
>>        optional int32 some_field = 1 [(foo) = "bar"];
>>      }
>>    The value of this option is then visible via the message's
>>    Descriptor:
>>      const FieldDescriptor* field =
>>        MyMessage::descriptor()->FindFieldByName("some_field");
>>      assert(field->options().GetExtension(foo) == "bar");
>>    This feature has been implemented and tested in C++ and Java.
>>    Other languages may or may not need to do extra work to support
>>    custom options, depending on how they construct descriptors.
>>
> I have just done the svn checkout to look at the new descriptor.proto file.
>
> In Haskell I generate code from descriptor.proto and parse the lexical
> tokens into precisely these message structures.  Your new descriptor.proto
> looks just as perfect to parse into as the old one, so I expect no trouble
> adding support for custom options.
>
> I see the meat of the addition is encoded as "UninterpretedOption":
>
>>
>> // A message representing a option the parser does not recognize. This only
>> // appears in options protos created by the compiler::Parser class.
>> // DescriptorPool resolves these when building Descriptor objects. Therefore,
>> // options protos in descriptor objects (e.g. returned by Descriptor::options(),
>> // or produced by Descriptor::CopyTo()) will never have UninterpretedOptions
>> // in them.
>> message UninterpretedOption {
>>  // The name of the uninterpreted option.  Each string represents a segment in
>>  // a dot-separated name.  is_extension is true iff a segment represents an
>>  // extension (denoted with parentheses in options specs in .proto files).
>>  // E.g.,{ ["foo", false], ["bar.baz", true], ["qux", false] } represents
>>  // "foo.(bar.baz).qux".
>>  message NamePart {
>>    required string name_part = 1;
>>    required bool is_extension = 2;
>>  }
>>  repeated NamePart name = 2;
>>
>>  // The value of the uninterpreted option, in whatever type the tokenizer
>>  // identified it as during parsing. Exactly one of these should be set.
>>  optional string identifier_value = 3;
>>  optional uint64 positive_int_value = 4;
>>  optional int64 negative_int_value = 5;
>>  optional double double_value = 6;
>>  optional bytes string_value = 7;
>> }
>>
>
> This is how I thought it would look, but the "NamePart" is a new twist.  My
> lexical tokens are close enough to yours to make this work well.  Two
> questions about such tokens:
>
>  Is the "identifier_value" a name such as ".foo17.bar_Baz.Qux" with periods,
> or are periods disallowed?
>
>  The "string_value" has type "bytes" instead of "string", so I am confused:
> do you require this to be a UTF8 string (to match the field name) or allow
> it to be any sequence of bytes?
>
> My bigger questions are:
>
>  What is the rationale, semantics, and syntax of the name here?
>  Does "foo.(bar.baz).qux" show up in a proto file in practice?
>
> My initial hypothesis is that [ foo = "bar" ] has is_extension False and
> that foo must be a named field in the corresponding option;
> and that [ (foo) = "bar" ] has is_extension True and that
> foo must be from an extend declaration.
> I see how [ foreign.import.(foo) = "bar" ] would be needed if foo were in
> an extend declaration in an imported file.
> I cannot see how (bar.baz) or (baz).qux could be needed.
>
> Of all the "*_value" fields, will there ever be more or fewer than one of
> these set?
>
> As for semantics: If I declared  [ (foo) = "Hello", (foo) = "World"] then I
> presume this will appear as
> two UninterpretedOption messages.  But how will your code react to this?
>  Take the first one? last one?
> Merge them somehow?  Is this option dependent or is there a policy?  How
> hard and fast is this policy?
>
> Cheers,
>  Chris
>
>
