I agree that it is unlikely to cause any issues, since it is a pretty unlikely that someone uses unions like this, but I think we should at least track in a separate ticket that modification are required in C/C++ too.
On Mon, Mar 30, 2020 at 3:28 PM roger peppe <[email protected]> wrote: > AIUI, longs are encoded exactly the same as ints, so there should only be > a problem if your union has more than 217483647 members, which seems > unlikely to me in practice :) > > On Mon, 30 Mar 2020 at 09:00, Andy Le <[email protected]> wrote: > >> Hey Nandor, >> >> Here what I see: >> - Java/Perl/Python use int values to encode position indices >> - C/C++ use long ones instead >> >> So is there any incompatibility when a C/C++ program talk to a Java one? >> If yes, so we have to modify the spec, right? >> >> >> On 2020/03/30 07:55:10, Nandor Kollar <[email protected]> wrote: >> > I think we should be cautious when changing specification, other >> language >> > bindings might already use longs as position index. For example, it >> appears >> > that C++ implementation does what the spec says now: >> > >> https://github.com/apache/avro/blob/master/lang/c%2B%2B/impl/BinaryDecoder.cc#L230 >> , >> > and if we restrict this to int in the spec, then we make a breaking >> change >> > for sure, in the unlikely situation when one writes a huge union where >> the >> > position fits only into a long, then that won't be a valid Avro file any >> > more - according to the new spec. >> > >> > On Sun, Mar 29, 2020 at 12:27 PM Driesprong, Fokko <[email protected] >> > >> > wrote: >> > >> > > Hi Anh, >> > > >> > > It looks like that you've found an inconsistency in the docs there. I >> > > think we need to update the docs, and state that an int is being >> written. >> > > >> > > Stay strong! >> > > >> > > Cheers, Fokko >> > > >> > > Op vr 20 mrt. 2020 om 07:58 schreef Anh Le <[email protected]>: >> > > >> > >> Hi guys, >> > >> >> > >> I'm reading the current Avro Spec. It states that: >> > >> >> > >> > A union is encoded by first writing a long value indicating the >> > >> zero-based position within the union of the schema of its value. The >> value >> > >> is then encoded per the indicated schema within the union. >> > >> >> > >> But as I dive through the code base, for example: >> > >> >> https://github.com/rdblue/avro-java/blob/master/avro/src/main/java/org/apache/avro/generic/GenericDatumWriter.java#L123-L125 >> , >> > >> I see there's no long value here. We've got an Int instead. >> > >> >> > >> Would you please tell me if there's any misunderstanding here. >> > >> >> > >> Thank you (and be strong)! >> > >> >> > > >> > >> >
