[
https://issues.apache.org/jira/browse/AVRO-656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977859#action_12977859
]
Doug Cutting commented on AVRO-656:
-----------------------------------
> this patch would be a major backwards-incompatible change to the spec. In our
> code, we're using the ["null", "fixed4", "fixed16"] case all the time to
> represent IPv4 or IPv6 addresses.
That would nix the patch, then, since we don't want to introduce such an
incompatibility. If C does correctly implement unions as specified then I was
mistaken to assert above that no language did.
So instead perhaps I should fix Java to correctly implement unions as currently
specified:
- fixing union dispatch among records to consider the namespace (easy, should
be compatible, already in this patch)
- adding a getSchema() method to GenericEnumSymbol and GenericFixed so that we
can check the name (incompatible API change, since it adds a Schema parameter
to the constructors for these)
Unless there are objections, I'll try this approach.
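
To make the second item concrete, here is a minimal sketch of name-based dispatch for fixed values. It is only an illustration: FixedDatum, getSchemaName() and resolveUnion() are hypothetical stand-ins, not the actual GenericFixed API, and the union is modelled as a plain list of type names.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration only; not Avro's actual classes.
class FixedUnionSketch {

    /** A fixed datum that remembers the name of the schema it was built with. */
    static final class FixedDatum {
        private final String schemaName;
        private final byte[] bytes;

        FixedDatum(String schemaName, byte[] bytes) {
            this.schemaName = schemaName;
            this.bytes = bytes;
        }

        String getSchemaName() { return schemaName; }
        byte[] bytes() { return bytes; }
    }

    /** Returns the index of the union branch named like the datum's schema, or -1. */
    static int resolveUnion(List<String> branchNames, FixedDatum datum) {
        return branchNames.indexOf(datum.getSchemaName());
    }

    public static void main(String[] args) {
        List<String> union = Arrays.asList("null", "fixed4", "fixed16");
        FixedDatum ipv6 = new FixedDatum("fixed16", new byte[16]);
        // The datum carries its schema name, so the writer no longer has to
        // assume the first fixed branch.
        System.out.println(resolveUnion(union, ipv6)); // prints 2
    }
}
```

Because the datum must know its schema, it has to receive it at construction time, which is where the API incompatibility comes from.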
> writing unions with multiple records, fixed or enums can choose wrong branch
> -----------------------------------------------------------------------------
>
> Key: AVRO-656
> URL: https://issues.apache.org/jira/browse/AVRO-656
> Project: Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.4.0
> Reporter: Doug Cutting
> Assignee: Doug Cutting
> Fix For: 1.5.0
>
> Attachments: AVRO-656.patch, AVRO-656.patch
>
>
> According to the specification, a union may contain multiple instances of a
> named type, provided they have different names. There are several bugs in
> the Java implementation of this when writing data:
> - for record, only the short-name of the record is checked, so the branch
> for a record of the same name in a different namespace may be used by mistake
> - for enum and fixed, the name of the type is not checked, so the first
> enum or fixed in the union will always be assumed when writing. In many
> cases this may cause the wrong data to be written, potentially corrupting
> output.
> This is not a regression. This has never been implemented correctly by Java.
> Python and Ruby never check names, but rather perform a full, recursive
> validation of content.
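
The record case described above can be reproduced with a small model. This is a sketch under assumptions, not Avro's code: the union is a list of full names, and the type names org.a.Event and org.b.Event are invented for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the record case; not Avro's actual code.
class RecordUnionSketch {

    /** Strips the namespace: "org.a.Event" becomes "Event". */
    static String shortName(String fullName) {
        return fullName.substring(fullName.lastIndexOf('.') + 1);
    }

    /** Buggy variant: compares short names only, so the first match wins. */
    static int resolveByShortName(List<String> branches, String datumFullName) {
        for (int i = 0; i < branches.size(); i++) {
            if (shortName(branches.get(i)).equals(shortName(datumFullName))) {
                return i;
            }
        }
        return -1;
    }

    /** Namespace-aware variant: compares full names. */
    static int resolveByFullName(List<String> branches, String datumFullName) {
        return branches.indexOf(datumFullName);
    }

    public static void main(String[] args) {
        List<String> union = Arrays.asList("org.a.Event", "org.b.Event");
        System.out.println(resolveByShortName(union, "org.b.Event")); // prints 0 (wrong branch)
        System.out.println(resolveByFullName(union, "org.b.Event"));  // prints 1
    }
}
```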