I agree that it is unlikely to cause any issues, since it is pretty
unlikely that someone uses unions like this, but I think we should at
least track in a separate ticket that modifications are required in C/C++
too.

On Mon, Mar 30, 2020 at 3:28 PM roger peppe <[email protected]> wrote:

> AIUI, longs are encoded exactly the same as ints, so there should only be
> a problem if your union has more than 2147483647 members, which seems
> unlikely to me in practice :)
>
> On Mon, 30 Mar 2020 at 09:00, Andy Le <[email protected]> wrote:
>
>> Hey Nandor,
>>
>> Here's what I see:
>> - Java/Perl/Python use int values to encode position indices
>> - C/C++ use long ones instead
>>
>> So is there any incompatibility when a C/C++ program talks to a Java one?
>> If yes, then we have to modify the spec, right?
>>
>>
>> On 2020/03/30 07:55:10, Nandor Kollar <[email protected]> wrote:
>> > I think we should be cautious when changing the specification; other
>> > language bindings might already use longs as the position index. For
>> > example, it appears that the C++ implementation does what the spec says now:
>> > https://github.com/apache/avro/blob/master/lang/c%2B%2B/impl/BinaryDecoder.cc#L230
>> > If we restrict this to int in the spec, then we make a breaking change
>> > for sure: in the unlikely situation where someone writes a huge union whose
>> > position fits only into a long, that won't be a valid Avro file any
>> > more, according to the new spec.
>> >
>> > On Sun, Mar 29, 2020 at 12:27 PM Driesprong, Fokko <[email protected]> wrote:
>> >
>> > > Hi Anh,
>> > >
>> > > It looks like you've found an inconsistency in the docs there. I
>> > > think we need to update the docs, and state that an int is being written.
>> > >
>> > > Stay strong!
>> > >
>> > > Cheers, Fokko
>> > >
>> > > Op vr 20 mrt. 2020 om 07:58 schreef Anh Le <[email protected]>:
>> > >
>> > >> Hi guys,
>> > >>
>> > >> I'm reading the current Avro Spec. It states that:
>> > >>
>> > >> > A union is encoded by first writing a long value indicating the
>> > >> > zero-based position within the union of the schema of its value. The
>> > >> > value is then encoded per the indicated schema within the union.
>> > >>
>> > >> But as I dive through the code base, for example:
>> > >> https://github.com/rdblue/avro-java/blob/master/avro/src/main/java/org/apache/avro/generic/GenericDatumWriter.java#L123-L125
>> > >> I see there's no long value here. We've got an Int instead.
>> > >>
>> > >> Would you please tell me if there's any misunderstanding here.
>> > >>
>> > >> Thank you (and be strong)!
>> > >>
>> > >
>> >
>>
>
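
For anyone following the thread, here is a minimal sketch of why writing the
union position as an int or as a long gives the same bytes. It assumes the
variable-length zig-zag coding that the Avro spec describes for both int and
long. The function names (zigzag, varint, encode_int, encode_long) are
hypothetical helpers written for this illustration; they are not taken from
any Avro language binding.

```python
def zigzag(n: int, bits: int) -> int:
    """Map a signed integer to an unsigned one so small magnitudes stay small."""
    return ((n << 1) ^ (n >> (bits - 1))) & ((1 << bits) - 1)


def varint(u: int) -> bytes:
    """Emit an unsigned integer as 7-bit groups, least significant group first."""
    out = bytearray()
    while u > 0x7F:
        out.append((u & 0x7F) | 0x80)  # high bit set: more bytes follow
        u >>= 7
    out.append(u)
    return bytes(out)


def encode_int(n: int) -> bytes:
    """Hypothetical stand-in for a binding that writes the position as a 32-bit int."""
    return varint(zigzag(n, 32))


def encode_long(n: int) -> bytes:
    """Hypothetical stand-in for a binding that writes the position as a 64-bit long."""
    return varint(zigzag(n, 64))


if __name__ == "__main__":
    # Union positions are non-negative; compare both encodings across the int range.
    for position in (0, 1, 2, 127, 128, 2**31 - 1):
        same = encode_int(position) == encode_long(position)
        print(position, encode_long(position).hex(), "same bytes" if same else "DIFFERENT")
```

Under these assumptions the two encodings only diverge once the position no
longer fits in a 32-bit int, which matches the point made above that a reader
and a writer disagree only for unions with an implausibly large number of
branches.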
