Hello!

In my experience, the overwhelming majority of UNION are for
"optional" or "nullable" fields, and I really like the versatility of
the "optional?" feature you've added to IDL.  It's an obviously
desirable and compelling improvement!

Outside of IDL, I've been caught on this before (and so have others,
which is why the Java Schema.Parser class can disable validating
defaults).

That being said, there's another disadvantage with backwards and
forwards compatibility for 2 + 3 that I didn't think of earlier.   If
you aren't using schema evolution, you can currently ignore the
default value for serializing and deserializing data.  If we were to
reorder any or some UNIONs to match the default value type correctly,
it would change the binary serialized data.  I suspect this would
cause more problems than it would solve, especially because rewriting
the order in the UNION would not be very obvious.

Something to think about -- what would happen if we were to say in the
spec that the default value for a UNION must match either the first
non-NULL schema, but can also be null if NULL is present at any
position in the UNION?  This would have similar compatibility impacts
as #2 but without having to rewrite the UNION schema.

All my best -- and it's always worth repeating that I'm appreciating
and learning a lot from your work on the IDL side!

Ryan



On Fri, Dec 10, 2021 at 7:30 PM Oscar Westra van Holthe - Kind
<[email protected]> wrote:
>
> Hi everyone,
>
> In PR #1411 <https://github.com/apache/avro/pull/1411> for issue AVRO-3257
> <https://issues.apache.org/jira/browse/AVRO-3257>, I've added some code to
> change/fix a union based on its default value.
> Currently, the change is in idl.jj, line 186-210
> <https://github.com/apache/avro/pull/1411/files/6b2d0b20981d81adde0059ee76441dc80de93c29#diff-376865a18691674de38817b4eea7e64c2c3848094a4730030c67f57f54f1028aR186>
> .
>
> Poll: please state your preference:
>
>    1. Keep the change in idl.jj (in avro-compiler)
>    Backwards compatibility: perfect
>    Forwards compatibility: perfect, except for IDL files that use the new ?
>    type suffix
>    Advantage: it's only a local change, specific to the ? suffix, so
>    guaranteed not to introduce bugs
>    Disadvantage: people who prefer the old syntax cannot benefit
>    2. Move the change to the Schema class (in avro-core), and adjust for
>    all optional schemas (unions of one schema with null)
>    Backwards compatibility: perfect
>    Forwards compatibility: partial (JSON schemas depending on the new
>    feature break in earlier versions)
>    Advantage: works for all optional schemas, both in IDL and JSON files.
>    Disadvantage: although I'm fairly certain it would work, this does
>    affect schema parsing; code I'm not as familiar with
>    3. Move the change to the Schema class, and adjust for all unions (move
>    the first schema for which the default value validates to the front)
>    Backwards compatibility: perfect
>    Forwards compatibility: partial (JSON schemas depending on the new
>    feature break in earlier versions)
>    Advantage: works for all unions, both in IDL and JSON files.
>    Disadvantage: although I'm fairly certain it would work, this does
>    affect schema parsing; code I'm not as familiar with
>
> Note that all solutions keep the in-memory representation of Schema
> instances intact. As a result, (de)serialisation, the schema fingerprint,
> etc. are not affected.
>
> Kind regards,
> Oscar

Reply via email to