I don't have a specific use case that is problematic; I was just trying to understand how it all works internally. Following your comment about indexes, I looked in GenericDatumWriter and, sure enough, the union is "tagged" so we know which branch of the union was written:

    case UNION:
      int index = data.resolveUnion(schema, datum);
      out.writeIndex(index);
      write(schema.getTypes().get(index), datum, out);
      break;

That's the bit I was missing! Thanks for the input.

Andrew

>________________________________
> From: Harsh J <[email protected]>
>To: [email protected]; Andrew Kenworthy <[email protected]>
>Sent: Monday, January 23, 2012 4:04 PM
>Subject: Re: How does Avro mark (string) field delimition?
>
>The read part is empty as well, when the decoder is asked to read a
>'null' type. For null carrying unions, I believe an index is written
>out so if the index evals to a null, the same logic works yet again.
>
>Does not matter if there are two nulls adjacent to one another,
>therefore. How do you imagine this ends up being a problem? What
>trouble are you running into?
>
>On Mon, Jan 23, 2012 at 8:08 PM, Andrew Kenworthy
><[email protected]> wrote:
>> I have looked at the Avro 1.6.0 code and am not sure how Avro distinguishes
>> between field boundaries when reading null values.
>>
>> The BinaryEncoder class (which is where I land when debugging my code) has
>> an empty method for writeNull: how does the parser then distinguish between
>> adjacent nullable fields when reading that data?
>>
>> Thanks in advance,
>>
>> Andrew
>
>
>
>--
>Harsh J
>Customer Ops. Engineer, Cloudera
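
To illustrate the tagging discussed in this thread, here is a minimal stand-alone sketch. It does not use Avro's classes: UnionTagSketch and all of its methods are hypothetical names, and the union schema ["null", "string"] (branch 0 = null, branch 1 = string) is an assumption made for the example. It only mimics the binary layout described in the Avro specification: a zig-zag varint branch index, followed by the value of that branch, where a null value contributes zero bytes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; these are not Avro's own classes.
final class UnionTagSketch {
    // Branch positions in the assumed union schema ["null", "string"].
    static final int NULL_BRANCH = 0;
    static final int STRING_BRANCH = 1;

    // Zig-zag varint, the encoding the Avro spec uses for int and long.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    static long readLong(ByteBuffer in) {
        long z = 0;
        int shift = 0;
        int b;
        do {
            b = in.get() & 0xFF;
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1);
    }

    // Mirrors the UNION case: write the branch index, then the branch value.
    static void writeNullableString(ByteArrayOutputStream out, String s) {
        if (s == null) {
            writeLong(out, NULL_BRANCH); // the tag is the only thing written
            return;                      // the null value itself adds no bytes
        }
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, STRING_BRANCH);
        writeLong(out, utf8.length);     // strings are length-prefixed
        out.write(utf8, 0, utf8.length);
    }

    // The reader consumes the tag first, so it knows whether a value follows.
    static String readNullableString(ByteBuffer in) {
        long branch = readLong(in);
        if (branch == NULL_BRANCH) {
            return null;                 // nothing further to read
        }
        byte[] utf8 = new byte[(int) readLong(in)];
        in.get(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Encodes a sequence of adjacent nullable string fields.
    static byte[] encode(String... fields) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String f : fields) {
            writeNullableString(out, f);
        }
        return out.toByteArray();
    }

    // Decodes fieldCount adjacent nullable string fields.
    static List<String> decode(byte[] bytes, int fieldCount) {
        ByteBuffer in = ByteBuffer.wrap(bytes);
        List<String> fields = new ArrayList<>();
        for (int i = 0; i < fieldCount; i++) {
            fields.add(readNullableString(in));
        }
        return fields;
    }
}
```

With this sketch, UnionTagSketch.encode(null, null, "hi") yields the six bytes 00 00 02 04 68 69: each null field is a single tag byte, so two adjacent nullable fields stay distinguishable even though the null value itself writes nothing. A field declared as plain "null" (not a union) would write no bytes at all, and the reader would rely on the schema alone to know the field is there.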
