[ 
https://issues.apache.org/jira/browse/AVRO-1340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16421446#comment-16421446
 ] 

Bridger Howell commented on AVRO-1340:
--------------------------------------

Well, looking at the semantics for field defaults and enum defaults, they play 
different roles, so you don't necessarily want both of them to apply all of the 
time. Allowing an enum field to be optional shouldn't need to have any 
connection to allowing an enum's values to not come from a finite set.

Here's an example where you would want a field default, but not an enum default:
 * Suppose you have some consumer that reads a record schema 
{{UserRegistration}} and fulfills some important requirement off of that 
information.
 * Now you want to support registering different types of users, so you add an 
enum field {{userType}} to indicate which type of user that was registered, and 
you want this consumer to understand the old data as having just the old value 
of {{UserType.NORMAL}}.
 * At this point, you add the enum and field default of {{NORMAL}} and keep 
going.
 * Now suppose somebody wants to add a value to that enum, one that your 
consumer will need to be aware of since it fulfills an important 
requirement in real time.
 * If the field default also counts as a fallback, that update will go through 
compatibly, and your consumer will start treating this new {{UserType}} value as 
just {{UserType.NORMAL}}, even though the consumer needed to be changed to add 
support for that value. If the field default and enum default are inseparable, 
as soon as the consumer indicated that the enum field was optional, that also 
meant that values passed to it no longer had to come from a known finite set.
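
To make the distinction concrete, here is a minimal Python sketch of how a reader could resolve the {{userType}} field if the two defaults were kept separate. It is a toy model, not the Avro library's API: the function {{resolve_enum_field}}, the {{UnknownSymbolError}} exception, and the {{ADMIN}} symbol are all hypothetical names made up for illustration.

```python
# Toy model of resolving one enum field; NOT the Avro library API.
# resolve_enum_field, UnknownSymbolError and the ADMIN symbol are hypothetical.

_MISSING = object()  # sentinel meaning "no default was declared"


class UnknownSymbolError(Exception):
    """The writer's symbol is not present in the reader's enum."""


def resolve_enum_field(record, name, symbols,
                       field_default=_MISSING, enum_default=_MISSING):
    # Field default: used only when the writer's record lacks the field.
    if name not in record:
        if field_default is _MISSING:
            raise KeyError(f"field {name!r} is missing and has no field default")
        return field_default

    value = record[name]
    if value in symbols:
        return value

    # Enum default: used only when the symbol is unknown to the reader.
    if enum_default is _MISSING:
        raise UnknownSymbolError(value)
    return enum_default


reader_symbols = {"NORMAL"}

# Old data without userType is read as NORMAL via the field default.
old_record = {"name": "alice"}
print(resolve_enum_field(old_record, "userType", reader_symbols,
                         field_default="NORMAL"))

# New data with an unknown symbol fails loudly, because the consumer
# declared a field default but no enum default.
new_record = {"name": "bob", "userType": "ADMIN"}
try:
    resolve_enum_field(new_record, "userType", reader_symbols,
                       field_default="NORMAL")
except UnknownSymbolError as exc:
    print("rejected unknown symbol:", exc)
```

If the field default were also passed as {{enum_default}}, the second call would quietly return {{NORMAL}}, which is the silent coercion described in the last bullet.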

Alternatively, here is a situation where you would want an enum default and not 
a field default:
 * Suppose you have some processor that reads {{UserRegistration}} data that 
comes with a {{userType}} field, and builds analytics off of that data for user 
tracking.
 * It can be easily re-run on a particular set of data, but it's important that 
the analytics eventually come out correctly.
 * Since it's just computing analytics off of the data, it shouldn't block 
development of new values of {{UserType}}.
Hence, this processor wants to require that {{userType}} is present, but to 
allow faster development it doesn't necessarily want to limit the values it 
receives to its current finite set. If the field default and enum default were 
tied together, it would be impossible for the schema to provide this 
guarantee.
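
As a rough sketch of what the analytics processor's reader schema might look like, the Python snippet below assumes the fallback is declared as a {{default}} attribute on the enum type itself, separate from the field-level {{default}}. The {{UNKNOWN}} and {{ADMIN}} symbols and the {{read_user_type}} helper are hypothetical, and the helper only imitates resolution for this one field rather than calling any Avro library.

```python
import json

# Reader schema for the analytics processor (illustrative only).
# The enum carries its own "default"; the field itself has none,
# so userType stays required.
ANALYTICS_READER_SCHEMA = json.loads("""
{
  "type": "record",
  "name": "UserRegistration",
  "fields": [
    {
      "name": "userType",
      "type": {
        "type": "enum",
        "name": "UserType",
        "symbols": ["NORMAL", "UNKNOWN"],
        "default": "UNKNOWN"
      }
    }
  ]
}
""")


def read_user_type(schema, record):
    """Hypothetical helper imitating resolution of the userType field."""
    field = next(f for f in schema["fields"] if f["name"] == "userType")
    enum = field["type"]

    if "userType" not in record:
        # Only a field-level default can stand in for a missing field.
        if "default" not in field:
            raise ValueError("userType is required: no field default")
        return field["default"]

    symbol = record["userType"]
    if symbol in enum["symbols"]:
        return symbol

    # Only an enum-level default can stand in for an unknown symbol.
    if "default" not in enum:
        raise ValueError(f"unknown symbol {symbol!r}: no enum default")
    return enum["default"]


# A symbol the processor has never seen maps to the enum default.
print(read_user_type(ANALYTICS_READER_SCHEMA, {"userType": "ADMIN"}))

# A record without userType is still rejected.
try:
    read_user_type(ANALYTICS_READER_SCHEMA, {})
except ValueError as exc:
    print("rejected:", exc)
```

With this split, new {{UserType}} values never block the processor, while a record missing the field is still an error.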

 

> use default to allow old readers to specify default enum value when 
> encountering new enum symbols
> -------------------------------------------------------------------------------------------------
>
>                 Key: AVRO-1340
>                 URL: https://issues.apache.org/jira/browse/AVRO-1340
>             Project: Avro
>          Issue Type: Improvement
>          Components: spec
>         Environment: N/A
>            Reporter: Jim Donofrio
>            Priority: Minor
>
> The schema resolution page says:
> > if both are enums:
> > if the writer's symbol is not present in the reader's enum, then an
> error is signalled.
> This makes it difficult to use enum's because you can never add a enum value 
> and keep old reader's compatible. Why not use the default option to refer to 
> one of enum values so that when a old reader encounters a enum ordinal it 
> does not recognize, it can default to the optional schema provided one. If 
> the old schema does not provide a default then the older reader can continue 
> to fail as it does today.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
