Hi Elliot,
I can give you a simple example:
- Type1: attributes name, middle_name, last_name and age. middle_name is
optional and age has a default value of 35.
- Type2: attributes name, last_name and age. age is optional and has no
default value.
If my schema has a union of both types, what should happen with the datum
`{"name": "Hakuna", "last_name": "Matata"}`? Should age be filled with the
default value?
If middle_name had a default value instead of being optional, would it be
filled in, even if this was a Type2 datum?
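
To make the ambiguity concrete, here is a rough sketch in plain Python. It does not use the avro library at all: the `TYPE1`/`TYPE2` tables and the `matches` and `fill_defaults` helpers are hypothetical names I made up for illustration, and the lenient "a missing field is fine if it is optional or has a default" rule is my assumption about how such a validator could behave, not a statement of what Avro does.

```python
# Hypothetical, simplified model of the two record types (not the avro API).
# Each type maps field name -> (is_optional, default); NO_DEFAULT means "no default".
NO_DEFAULT = object()

TYPE1 = {
    "name": (False, NO_DEFAULT),
    "middle_name": (True, NO_DEFAULT),  # optional, no default
    "last_name": (False, NO_DEFAULT),
    "age": (False, 35),                 # default value of 35
}
TYPE2 = {
    "name": (False, NO_DEFAULT),
    "last_name": (False, NO_DEFAULT),
    "age": (True, NO_DEFAULT),          # optional, no default
}


def matches(record_type, datum):
    """True if the datum has no unknown fields and every missing field
    is either optional or has a default (assumed lenient rule)."""
    if any(field not in record_type for field in datum):
        return False
    for field, (optional, default) in record_type.items():
        if field not in datum and not optional and default is NO_DEFAULT:
            return False
    return True


def fill_defaults(record_type, datum):
    """Return a copy of the datum with defaults added for missing fields."""
    filled = dict(datum)
    for field, (_optional, default) in record_type.items():
        if field not in filled and default is not NO_DEFAULT:
            filled[field] = default
    return filled


datum = {"name": "Hakuna", "last_name": "Matata"}
candidates = [
    type_name
    for type_name, record_type in (("Type1", TYPE1), ("Type2", TYPE2))
    if matches(record_type, datum)
]
print(candidates)                   # ['Type1', 'Type2']: both branches accept the datum
print(fill_defaults(TYPE1, datum))  # age is filled with 35
print(fill_defaults(TYPE2, datum))  # datum is left unchanged
```

Under these assumptions the same datum ends up with or without an age depending on which branch is picked, which is exactly why I want to name the branch explicitly.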
> Is it that the equivalent schemas might evolve in a divergent manner over
> time or perhaps that by targeting a specific schema you are wanting to
> convey some out of band information that may have some meaning to a
> consumer, if not Avro?
I'm not sure what you meant here, but I need no schema evolution in my case.
Every version is a completely independent schema.
Users use Avro schemas to validate their input data; that is what they are
using Avro for.
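
For that validation use case, what I have in mind is roughly the sketch below, using the tagged datum form from my first message. Again this is plain Python and not the avro library: `UNION_BRANCHES` and `resolve_tagged` are hypothetical names, and the branches are reduced to their field names, just to show the behaviour I am after.

```python
# Hypothetical helper, not part of the avro library: the datum names the
# union branch explicitly, so nothing has to be guessed from its fields.
UNION_BRANCHES = {
    "FullName": {"first", "last"},
    "ConcatenatedFullName": {"entireName"},
}


def resolve_tagged(value):
    """Return (branch_name, record) for a value shaped like {"BranchName": {...}}."""
    if len(value) != 1:
        raise ValueError("expected exactly one key naming the union branch")
    branch, record = next(iter(value.items()))
    if branch not in UNION_BRANCHES:
        raise ValueError("unknown union branch: %s" % branch)
    if set(record) != UNION_BRANCHES[branch]:
        raise ValueError("fields do not match branch %s" % branch)
    return branch, record


datum = {"name": {"FullName": {"first": "Hakuna", "last": "Matata"}}}
branch, record = resolve_tagged(datum["name"])
print(branch, record)
```

With this shape, a datum that would be valid for more than one branch is still validated against exactly the branch the user named.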
Thanks,
Marcelo.
On 25 April 2018 at 13:43, Elliot West <[email protected]> wrote:
> A quick question: If the datum is valid in more than one schema, what is
> the scenario where knowing the specific schema is necessary? Is it that the
> equivalent schemas might evolve in a divergent manner over time or perhaps
> that by targeting a specific schema you are wanting to convey some out of
> band information that may have some meaning to a consumer, if not Avro?
>
> Elliot.
>
> On 25 April 2018 at 12:27, Marcelo Valle <[email protected]> wrote:
>
>> I am writing a python program using the official avro library for python,
>> version 1.8.2.
>>
>> This is a simple schema to show my problem:
>>
>> {
>>   "type": "record",
>>   "namespace": "com.example",
>>   "name": "NameUnion",
>>   "fields": [
>>     {
>>       "name": "name",
>>       "type": [
>>         {
>>           "type": "record",
>>           "namespace": "com.example",
>>           "name": "FullName",
>>           "fields": [
>>             {
>>               "name": "first",
>>               "type": "string"
>>             },
>>             {
>>               "name": "last",
>>               "type": "string"
>>             }
>>           ]
>>         },
>>         {
>>           "type": "record",
>>           "namespace": "com.example",
>>           "name": "ConcatenatedFullName",
>>           "fields": [
>>             {
>>               "name": "entireName",
>>               "type": "string"
>>             }
>>           ]
>>         }
>>       ]
>>     }
>>   ]
>> }
>>
>> Possible datums for this schema would be `{"name": {"first": "Hakuna",
>> "last": "Matata"}}` and `{"name": {"entireName": "Hakuna Matata"}}`.
>>
>> However, this leaves room for ambiguity, as Avro will not always be able
>> to detect the right schema specified in the union. In this case, each
>> datum corresponds to one and only one valid schema, but there might be a
>> case where more than one schema in the union would be valid.
>>
>> I wonder whether it would be possible to use a datum like `{"name":
>> {"FullName": {"first": "Hakuna", "last": "Matata"}}}`, where the specific
>> union schema name is specified in the datum.
>>
>> Is it possible? How to do it?
>>
>> --
>> Marcelo Valle
>> http://mvalle.com - @mvallebr
>>
>
>
--
Marcelo Valle
http://mvalle.com - @mvallebr