[
https://issues.apache.org/jira/browse/AVRO-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891647#action_12891647
]
Doug Cutting commented on AVRO-600:
-----------------------------------
> This seems like it adds quite a bit of complexity to the base Avro system.
I think this should be easy to implement as a single-pass re-write of the
writer's schema, rewriting any names that are aliases in the reader's schema.
In Java, this will be a single recursive method, plus a single call to this
method in GenericDatumReader just before the ResolvingDecoder is created.
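
To make the rewrite concrete, here is a rough sketch in Python over
JSON-parsed schemas (plain dicts, lists and strings). It is not the Java
method described above. The function names are hypothetical, namespaces are
ignored, and field aliases are keyed by the reader's record name; all of these
are simplifying assumptions for illustration only.

```python
import json

# Avro's named types (a tuple, so non-string "type" values compare safely).
NAMED_TYPES = ("record", "enum", "fixed")


def collect_aliases(schema, types, fields):
    """Walk the reader's schema, filling alias -> name lookup tables."""
    if isinstance(schema, list):  # union: visit each branch
        for branch in schema:
            collect_aliases(branch, types, fields)
    elif isinstance(schema, dict):
        kind = schema.get("type")
        if kind in NAMED_TYPES:
            for alias in schema.get("aliases", []):
                types[alias] = schema["name"]
        if kind == "record":
            for f in schema["fields"]:
                for alias in f.get("aliases", []):
                    fields[(schema["name"], alias)] = f["name"]
                collect_aliases(f["type"], types, fields)
        elif kind == "array":
            collect_aliases(schema["items"], types, fields)
        elif kind == "map":
            collect_aliases(schema["values"], types, fields)


def rewrite(schema, types, fields):
    """Return a copy of the writer's schema with aliased names replaced."""
    if isinstance(schema, str):  # primitive or reference to a named type
        return types.get(schema, schema)
    if isinstance(schema, list):  # union
        return [rewrite(branch, types, fields) for branch in schema]
    out = dict(schema)
    kind = schema.get("type")
    if kind in NAMED_TYPES:
        out["name"] = types.get(schema["name"], schema["name"])
    if kind == "record":
        out["fields"] = [
            {
                **f,
                "name": fields.get((out["name"], f["name"]), f["name"]),
                "type": rewrite(f["type"], types, fields),
            }
            for f in schema["fields"]
        ]
    elif kind == "array":
        out["items"] = rewrite(schema["items"], types, fields)
    elif kind == "map":
        out["values"] = rewrite(schema["values"], types, fields)
    return out


def apply_aliases(writer, reader):
    """Single pass: gather the reader's aliases, then rename in the writer."""
    types, fields = {}, {}
    collect_aliases(reader, types, fields)
    return rewrite(writer, types, fields)


# Hypothetical reader schema that accepts two differently named records.
reader = {
    "type": "record", "name": "Hit", "aliases": ["WebRequest", "LoginEvent"],
    "fields": [
        {"name": "date", "type": "long", "aliases": ["timestamp", "when"]},
        {"name": "ip", "type": "string", "aliases": ["address", "client_ip"]},
    ],
}

# Hypothetical writer schema, as it might be stored with the data.
writer = {
    "type": "record", "name": "WebRequest",
    "fields": [
        {"name": "timestamp", "type": "long"},
        {"name": "address", "type": "string"},
        {"name": "url", "type": "string"},
    ],
}

rewritten = apply_aliases(writer, reader)
print(json.dumps(rewritten, indent=2))
```

The rewritten schema would then be handed to ordinary schema resolution in
place of the original writer's schema, so the resolver itself needs no
changes. The writer's schema stored with the data is left untouched.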
Moreover this can be an optional feature. The schema stored with the data
always fully and accurately describes the data. Applications built using
implementations without this feature would have to manually correlate data
which has different names, as they do today.
Consider an alternate, functionally equivalent implementation that puts such
aliases in a separate data structure that's passed to the reader, i.e., an
aliasing feature of that particular reader implementation. Such a feature
would be useful, and would be completely consistent with the Avro
specification. The only difference between that and the proposal here is that
the aliases are made available via the schema to every implementation in a
standard form should they choose to implement this feature.
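
For comparison, the alternate design could look roughly like the following
Python sketch, in which the aliases live in a side table handed to the reader
rather than in the schema. AliasTable and rename_datum are hypothetical names
invented for this illustration, and the sketch renames an already-decoded
record (a dict) instead of touching any schema.

```python
from dataclasses import dataclass, field


@dataclass
class AliasTable:
    """Hypothetical side table, kept outside the schema."""
    # writer type name -> reader type name
    types: dict = field(default_factory=dict)
    # (reader record name, writer field name) -> reader field name
    fields: dict = field(default_factory=dict)


def rename_datum(writer_type, datum, table):
    """Map a decoded record onto the reader's names using the side table."""
    reader_type = table.types.get(writer_type, writer_type)
    renamed = {
        table.fields.get((reader_type, key), key): value
        for key, value in datum.items()
    }
    return reader_type, renamed


# Example table for a reader that wants records named "Hit".
table = AliasTable(
    types={"LoginEvent": "Hit"},
    fields={("Hit", "when"): "date", ("Hit", "client_ip"): "ip"},
)

result = rename_datum(
    "LoginEvent", {"when": "2010-07-23", "client_ip": "10.0.0.1"}, table
)
print(result)
```

The lookup logic is the same in both designs. What changes is where the table
comes from: here each application must build and pass it, whereas schema-level
aliases would travel with the reader's schema in a standard form.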
> add support for type and field name aliases
> -------------------------------------------
>
> Key: AVRO-600
> URL: https://issues.apache.org/jira/browse/AVRO-600
> Project: Avro
> Issue Type: New Feature
> Components: java, spec
> Reporter: Doug Cutting
> Assignee: Doug Cutting
>
> It would be good if Avro would permit one to still read data if a type or
> field name has been changed. I propose we add a notion of name _aliases_.
> Aliases could be listed for every named type and for record fields. The
> writer's schema would be permitted to contain any of the aliases.
> In general, this permits one to construct schemas that can read different
> types into a single type. One could use this not just to handle renamings,
> but also to join different datasets. For example, if two datasets each
> contain differently named records with a date and an IP address field, this
> could be used to project them both onto a single record with just
> those fields.