[
https://issues.apache.org/jira/browse/AVRO-1340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16421446#comment-16421446
]
Bridger Howell commented on AVRO-1340:
--------------------------------------
Well, looking at the semantics for field defaults and enum defaults, they play
different roles, so you don't necessarily want both of them to apply all of the
time. Allowing an enum field to be optional shouldn't need to have any
connection to allowing an enum's values to not come from a finite set.
Here's an example where you would want a field default, but not an enum default:
* Suppose you have some consumer that reads a record schema
{{UserRegistration}} and fulfills some important requirement off of that
information.
* Now you want to support registering different types of users, so you add an
enum field {{userType}} to indicate which type of user that was registered, and
you want this consumer to understand the old data as having just the old value
of {{UserType.NORMAL}}.
* At this point, you add the enum and field default of {{NORMAL}} and keep
going.
* Now suppose somebody wants to add a value to that enum, one that your
consumer will need to be aware of since it is fulfilling an important
requirement in real time.
* If the field default also counts as a fallback, that update will go through
compatibly, and your consumer will start treating this new {{UserType}} value as
just {{UserType.NORMAL}}, even though the consumer needed to be changed to add
support for that value. If the field default and enum default are inseparable,
as soon as the consumer indicated that the enum field was optional, that also
meant that values passed to it no longer had to come from a known finite set.
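To make the first scenario concrete, here is a rough Python sketch of the two defaults treated as separate knobs. It does not use the Avro library; {{read_user_type}}, {{UnknownSymbolError}}, and the {{PREMIUM}} symbol are names made up for illustration, and the resolution logic is only my reading of the semantics discussed here.

```python
# Sketch only: these helpers are hypothetical and are not part of the Avro API.
_MISSING = object()  # sentinel meaning "no default was declared"


class UnknownSymbolError(Exception):
    """Raised when a written symbol is not in the reader's enum."""


def read_user_type(record, reader_symbols, field_default=_MISSING,
                   enum_default=_MISSING):
    # Field default: applies only when the field is absent from the data.
    if "userType" not in record:
        if field_default is _MISSING:
            raise KeyError("userType")
        return field_default
    symbol = record["userType"]
    if symbol in reader_symbols:
        return symbol
    # Enum default: applies only when the symbol is unknown to the reader.
    if enum_default is _MISSING:
        raise UnknownSymbolError(symbol)
    return enum_default


# The real-time consumer: field default of NORMAL, no enum default.
CONSUMER_SYMBOLS = {"NORMAL"}
old_record = {}                       # written before userType existed
new_record = {"userType": "PREMIUM"}  # PREMIUM is a made-up new symbol

old_value = read_user_type(old_record, CONSUMER_SYMBOLS, field_default="NORMAL")

try:
    read_user_type(new_record, CONSUMER_SYMBOLS, field_default="NORMAL")
    rejected = False
except UnknownSymbolError:
    rejected = True
```

With the defaults kept separate, old data reads as {{NORMAL}} while the unknown symbol is rejected. If {{enum_default="NORMAL"}} were passed as well, the same record would quietly resolve to {{NORMAL}}, which is the failure mode described above.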
Alternatively, here is a situation where you would want an enum default and not
a field default:
* Suppose you have some processor that reads {{UserRegistration}} data that
comes with a {{userType}} field, and builds analytics off of that data for user
tracking.
* It can be easily re-run on a particular set of data, but it's important that
the analytics eventually come out correctly.
* Since it's just computing analytics off of the data, it shouldn't block
development of new values of {{UserType}}.
Hence, this processor wants to require that {{userType}} is present, but to
allow faster development it doesn't necessarily want to limit the values it
receives to its current finite set. If the field default and enum default were
tied together, it would be impossible for the schema to provide this guarantee.
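The analytics case can be sketched the same way in plain Python, with no Avro library involved. The helper {{read_required_user_type}} and the {{UNKNOWN}} and {{PREMIUM}} symbols are assumptions made up for illustration, not anything defined by the spec.

```python
# Sketch only: hypothetical helper, not the Avro API.
def read_required_user_type(record, reader_symbols, enum_default):
    # No field default: a record without userType is an error.
    if "userType" not in record:
        raise KeyError("userType is required")
    symbol = record["userType"]
    # Enum default: unknown symbols fall back instead of failing.
    return symbol if symbol in reader_symbols else enum_default


# UNKNOWN is a made-up fallback symbol; PREMIUM is a made-up new symbol.
ANALYTICS_SYMBOLS = {"NORMAL", "UNKNOWN"}

known = read_required_user_type({"userType": "NORMAL"}, ANALYTICS_SYMBOLS, "UNKNOWN")
fallback = read_required_user_type({"userType": "PREMIUM"}, ANALYTICS_SYMBOLS, "UNKNOWN")

try:
    read_required_user_type({}, ANALYTICS_SYMBOLS, "UNKNOWN")
    missing_rejected = False
except KeyError:
    missing_rejected = True
```

Here a new symbol never blocks the processor, but a record that lacks {{userType}} is still rejected, so the "field is present" guarantee survives.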
> use default to allow old readers to specify default enum value when
> encountering new enum symbols
> -------------------------------------------------------------------------------------------------
>
> Key: AVRO-1340
> URL: https://issues.apache.org/jira/browse/AVRO-1340
> Project: Avro
> Issue Type: Improvement
> Components: spec
> Environment: N/A
> Reporter: Jim Donofrio
> Priority: Minor
>
> The schema resolution page says:
> > if both are enums:
> > if the writer's symbol is not present in the reader's enum, then an
> > error is signalled.
> This makes it difficult to use enums because you can never add an enum value
> and keep old readers compatible. Why not use the default option to refer to
> one of the enum values, so that when an old reader encounters an enum ordinal it
> does not recognize, it can default to the optional schema-provided one? If
> the old schema does not provide a default, then the older reader can continue
> to fail as it does today.