Hi Ivan,

You're right about the GenericDatumReader Javadoc: that constructor sets
both the writer's and the reader's schema to the one you pass in. The writer
schema can be changed after construction, though, and that is what the
DataFileReader does.

So after the DataFileReader is initialised, the underlying
GenericDatumReader uses the schema stored in the file as the writer schema
(to understand the data), and the schema you provided as the reader schema
(to give the data to you via dataFileReader.next(user)).
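
To make the mechanism concrete, here is a minimal sketch in plain Java. It
is a simplified model, not Avro's actual code: SketchDatumReader,
SketchFileReader and SchemaSketch are hypothetical stand-ins, and schemas
are reduced to strings. Only the method names setSchema, getSchema and
getExpected are meant to mirror the ones on GenericDatumReader.

```java
// Simplified stand-in for GenericDatumReader (hypothetical, not Avro's code).
class SketchDatumReader {
    private String writerSchema; // how the data was written
    private String readerSchema; // how the caller wants to see the data

    // Like the one-argument constructor: both schemas start out the same.
    SketchDatumReader(String schema) {
        this.writerSchema = schema;
        this.readerSchema = schema;
    }

    // Replaces the writer schema; the reader schema is only filled in
    // if none was set before.
    void setSchema(String writer) {
        this.writerSchema = writer;
        if (this.readerSchema == null) {
            this.readerSchema = writer;
        }
    }

    String getSchema() { return writerSchema; }
    String getExpected() { return readerSchema; }
}

// Simplified stand-in for DataFileReader (hypothetical): on initialisation
// it hands the schema found in the file header to the datum reader.
class SketchFileReader {
    SketchFileReader(String schemaFromFileHeader, SketchDatumReader reader) {
        reader.setSchema(schemaFromFileHeader);
    }
}

class SchemaSketch {
    static String demo() {
        // Step 1: the caller constructs the datum reader with its own schema.
        SketchDatumReader reader = new SketchDatumReader("userSchema");
        // Step 2: opening the file overrides the writer schema only.
        new SketchFileReader("fileSchema", reader);
        return "writer=" + reader.getSchema() + " reader=" + reader.getExpected();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints: writer=fileSchema reader=userSchema
    }
}
```

The point of the sketch is the order of events: the constructor makes both
schemas equal, exactly as the Javadoc says, and the file reader then
replaces only the writer schema, leaving yours in place as reader schema.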

Does that clarify things for you?


Kind regards,
Oscar


On Wed, 20 Jul 2022 at 10:37, Ivan Tsyba <ivants...@gmail.com> wrote:

> Hello
>
> As stated in Avro Getting Started
> <https://avro.apache.org/docs/current/gettingstartedjava.html#Deserializing> 
> about
> deserialization without code generation: "The data will be read using the
> writer's schema included in the file, and the reader's schema provided to
> the GenericDatumReader". Here is how GenericDatumReader is created in the
> example
>
> DatumReader<GenericRecord> datumReader = new
> GenericDatumReader<GenericRecord>(schema);
>
> But when you look at this GenericDatumReader constructor Javadoc it states
> "Construct where the writer's and reader's schemas are the same." (and
> actual code corresponds to this).
>
> So the writer's schema isn’t taken from a serialized file but from a
> constructor parameter?
>


-- 

✉️ Oscar Westra van Holthe - Kind <os...@westravanholthe.nl>
