Hi Nandor,
Thanks a lot for this. What you have said makes logical sense, but I am new
to Avro, so I am just trying to figure out what the schema definition would
look like for the following messages:
{"hello1":{"foo": 5}}
{"hello2":{"bar": 10}}
I have tried the following schema definition to parse the above messages,
but it didn't quite work. What should the schema look like?
{
  "type" : "record",
  "name" : "data",
  "namespace" : "example",
  "fields" : [
    {"type":"record","name":"hello1","fields":[{"name":"foo","type":"int","default":1}]},
    {"type":"record","name":"hello2","fields":[{"name":"bar","type":"int","default":1}]}
  ]
}
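
For reference, here is a small stdlib-only Python sketch of the wrapping rule from the JSON-encoding section of the spec: a non-null union value is written as a single-key object whose key is the branch's type name. It is not Avro's API. `UNION_SCHEMA` and `decode_union` are hypothetical names made up for illustration, and the sketch assumes every field is an "int". If that rule applies here, the top-level schema would likely be the union of the two records itself rather than a wrapper record.

```python
import json

# Hypothetical union of the two record schemas discussed in this thread.
UNION_SCHEMA = [
    {"type": "record", "name": "hello1",
     "fields": [{"name": "foo", "type": "int", "default": 1}]},
    {"type": "record", "name": "hello2",
     "fields": [{"name": "bar", "type": "int", "default": 1}]},
]


def decode_union(union_schema, text):
    """Return (branch_name, record) for a JSON-encoded union value.

    Mimics only the wrapping rule: the value must be a single-key object
    whose key names one of the union's record branches.
    """
    value = json.loads(text)
    if not isinstance(value, dict) or len(value) != 1:
        raise ValueError("expected a single-key object naming the union branch")
    branch_name, body = next(iter(value.items()))

    branches = {schema["name"]: schema for schema in union_schema}
    if branch_name not in branches:
        raise ValueError(f"{branch_name!r} is not a branch of the union")
    if not isinstance(body, dict):
        raise ValueError("record body must be a JSON object")

    record = {}
    for field in branches[branch_name]["fields"]:
        name = field["name"]
        if name not in body:
            raise ValueError(f"missing field {name!r}")
        item = body[name]
        # Assumption: every field in this sketch is an Avro "int".
        if not isinstance(item, int) or isinstance(item, bool):
            raise ValueError(f"field {name!r} must be an int")
        record[name] = item
    return branch_name, record


print(decode_union(UNION_SCHEMA, '{"hello1": {"foo": 5}}'))
# ('hello1', {'foo': 5})
```

With this sketch, the unwrapped message `{"foo": 5}` is rejected, because "foo" is a field name and not the name of a union branch.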
On Tue, Jan 9, 2018 at 3:22 AM, Nandor Kollar <[email protected]> wrote:
> I think the problem is: you created a union of records, but Avro
> doesn't know whether a message is a hello1 record instance or a hello2
> record instance. In this case you should encode the data like this:
> {"hello1":{"foo": 5}}
> {"hello2":{"bar": 10}}
> Here <https://avro.apache.org/docs/1.8.1/spec.html#json_encoding> is the
> relevant part of the specification.
>
> Nandor
>
> On Tue, Jan 9, 2018 at 12:06 PM, kant kodali <[email protected]> wrote:
>
>> Sorry, I had a typo; I am correcting it here.
>>
>> Hi All,
>>
>> I have Avro messages in a Kafka topic, and the requirement is that I
>> should be able to parse messages that can have either schema1 or schema2. I
>> was thinking of creating a union of two records, but I am not sure if I am
>> doing it right, and I am running into various exceptions such as
>> ArrayIndexOutOfBoundsException.
>>
>> So I am going to simplify my problem here. Imagine I have the following
>> as an example:
>>
>> *schema1: *
>>
>> {"type":"record","name":"hello1","fields":[{"name":"foo","type":"int","default":1}]}
>>
>>
>> *schema2: *
>>
>> {"type":"record","name":"hello2","fields":[{"name":"bar","type":"int","default":1}]}
>>
>>
>> If I call Schema.createUnion(Arrays.asList(schema1, schema2)), I get
>> the following:
>>
>> *unionSchema:*
>>
>> [{"type":"record","name":"hello1","fields":[{"name":"foo","type":"int","default":1}]},
>>  {"type":"record","name":"hello2","fields":[{"name":"bar","type":"int","default":1}]}]
>>
>>
>>
>> Now say the messages inside the Kafka topic look something like this:
>>
>> *message1:*
>>
>> {"foo": 5}
>>
>> *message2: *
>>
>> {"bar": 10}
>>
>>
>> If I use unionSchema, I am unable to parse them, and I am not sure why.
>> I can't find any resources online on how to do this. Any suggestions would
>> be great.
>>
>> Thanks!