[
https://issues.apache.org/jira/browse/AVRO-519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859897#action_12859897
]
Doug Cutting commented on AVRO-519:
-----------------------------------
> What is the rationale for not permitting a name to be associated with other
> types in a union?
This is discussed in AVRO-248. One rationale is simply that it would be an
incompatible change. Existing implementations should ignore the name, but they
should also generate an error if a union has two "bytes" branches.
A dynamic language needs a way at runtime to distinguish whether branch "a" or
branch "b" is used, so one would need to wrap the bytes in something that
indicates this, such as a record. Records can add a name to any type, with no
serialized overhead.
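
A minimal sketch of the record-wrapping idea, not taken from the issue itself:
the record names "a" and "b", the field name "value", and the helper functions
are all hypothetical. The small hand-written encoder follows the Avro binary
encoding rules (zig-zag variable-length longs, bytes as length plus data, a
union as branch index plus value, a record as the concatenation of its fields)
to show that the wrapper itself contributes no bytes.

```python
# Hypothetical union of two named records, each wrapping a single bytes field.
SCHEMA = [
    {"type": "record", "name": "a",
     "fields": [{"name": "value", "type": "bytes"}]},
    {"type": "record", "name": "b",
     "fields": [{"name": "value", "type": "bytes"}]},
]


def encode_long(n: int) -> bytes:
    """Zig-zag encode a 64-bit integer, then emit 7-bit groups, low group first."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_bytes(data: bytes) -> bytes:
    """Bytes are written as a length followed by the raw data."""
    return encode_long(len(data)) + data


def encode_branch(name: str, payload: bytes) -> bytes:
    """Write the union branch index, then the record (just its one field)."""
    index = [branch["name"] for branch in SCHEMA].index(name)
    return encode_long(index) + encode_bytes(payload)


print(encode_branch("a", b"hi"))
print(encode_branch("b", b"hi"))
```

The two outputs differ only in the leading union index, so a reader in a
dynamic language can tell the branches apart by record name, while the payload
costs the same as a bare bytes value.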
> Efficient sparse optional fields support
> ----------------------------------------
>
> Key: AVRO-519
> URL: https://issues.apache.org/jira/browse/AVRO-519
> Project: Avro
> Issue Type: New Feature
> Components: spec
> Reporter: John Plevyak
>
> One of the nice features of protobuf is efficient support for very sparse
> optional fields, for example a large number of tags potentially associated
> with a document, the vast majority of which are empty.
> Avro does support optional fields as part of differing specifications, but
> not on a per-record level after a protocol has been agreed upon. Avro does
> have support for arrays and maps, however both of these require homogeneous
> types.
> I would suggest adding an additional field attribute:
> * "optional" - with values "true"/"false" (where "false" is assumed)
> For the encoding I would suggest that any record which includes optional
> fields would be prefixed by a presence map, which would be a sequence of
> int8 x* where:
> x > 0 : the lower 7 bits are presence bits for the next 7 optional fields
> (low bit first)
> -128 < x < 0 : the next present field is position x + 135 (as x runs from
> -1 to -127 and the first 7 must be empty, otherwise we would use the x > 0
> encoding)
> x == -128 : no optional fields present in the next 134 optional fields
> x == 0 : end of sequence
> Further, if the map has covered all the options, the end-of-sequence marker
> can be elided. For example, a type with 3 optional fields would require only
> a single byte.
> This will permit encoding at 8/7 of a bit per present entry (worst case) and
> at a cost of 8/134 (0.06) bits/entry per all but last not-present
> (7.5 bytes / 1000 optional fields).
> This encoding is also backward compatible, as schemas which do not contain
> optional elements do not have the presence map and their encoding is
> therefore identical. Backward compatibility can be maintained by simply
> using the default value for not-present fields.
> Language APIs:
> Efficient support could include either an explicit presence test or a
> function which returns the value or the default value (if the field is not
> present).
>
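
The presence map proposed in the quoted description can be sketched roughly as
follows. This is one reading of the proposal, not an agreed Avro encoding: the
function names are hypothetical, positions in the skip form are taken to be
1-based and relative to the current field, and an end-of-sequence byte is
written as soon as no further field is present.

```python
def encode_presence(present):
    """Encode a list of booleans (one per optional field) as a presence map."""
    out = bytearray()
    n = len(present)
    last = max((i for i, p in enumerate(present) if p), default=-1)
    i = 0
    while i < n:
        if i > last:
            out.append(0)  # x == 0: end of sequence, nothing else is present
            break
        chunk = present[i:i + 7]
        if any(chunk):
            # x > 0: presence bits for the next 7 fields, low bit first
            out.append(sum(1 << k for k, p in enumerate(chunk) if p))
            i += 7
            continue
        gap = present.index(True, i) - i + 1  # relative position, always >= 8
        if gap > 134:
            out.append(0x80)  # x == -128: next 134 fields are all absent
            i += 134
        else:
            out.append((gap - 135) & 0xFF)  # -128 < x < 0: position x + 135
            i += gap
    return bytes(out)


def decode_presence(data, n):
    """Return (presence list for n optional fields, number of bytes consumed)."""
    present = [False] * n
    i = pos = 0
    while i < n:  # once all fields are covered, no end marker is expected
        x = data[pos]
        pos += 1
        if x >= 0x80:
            x -= 256  # reinterpret the byte as a signed int8
        if x == 0:
            break
        if x > 0:
            for k in range(7):
                if (x >> k) & 1 and i + k < n:
                    present[i + k] = True
            i += 7
        elif x == -128:
            i += 134
        else:
            i += x + 135
            present[i - 1] = True
    return present, pos


print(encode_presence([True, False, True]))
print(len(encode_presence([False] * 999 + [True])))
```

Under these assumptions a record with 3 optional fields needs one byte, and
1000 optional fields with only the last one present need 8 bytes, which is in
line with the roughly 7.5 bytes per 1000 fields estimated in the description.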