[
https://issues.apache.org/jira/browse/BEAM-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190947#comment-16190947
]
Etienne Chauchot commented on BEAM-2993:
----------------------------------------
Your questions are legitimate. In more detail, we use the lazy Avro coder I was
talking about. It is responsible for determining the schema at runtime and
delegating to the AvroCoder. The thing is that it also stores the obtained schema
in a network registry service. We think it is a bad idea to call the network
registry before writing just to get the schema back, when we could avoid passing
it to the write transform at all. But I know, this entails calling
{{GenericRecord.getSchema()}} once again (in addition to the call in the lazy
Avro coder).
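
To make the idea concrete, here is a rough sketch of a write that takes the schema from the records themselves instead of requiring it up front. This is not Beam or Avro code: {{SchemaCarryingRecord}}, {{SimpleRecord}} and {{LazySchemaSink}} are hypothetical stand-ins for {{GenericRecord}} and the write transform, and the schema is modelled as a plain string, so take it only as an illustration of the approach.

```java
import java.util.ArrayList;
import java.util.List;

public class LazySchemaWriteSketch {

    /** Hypothetical stand-in for GenericRecord: each record carries its own schema. */
    public interface SchemaCarryingRecord {
        String getSchema();
        String getPayload();
    }

    /** Minimal immutable implementation of the stand-in record. */
    public static final class SimpleRecord implements SchemaCarryingRecord {
        private final String schema;
        private final String payload;

        public SimpleRecord(String schema, String payload) {
            this.schema = schema;
            this.payload = payload;
        }

        @Override public String getSchema() { return schema; }
        @Override public String getPayload() { return payload; }
    }

    /** Hypothetical sink (not AvroIO): the schema is taken from the first record written. */
    public static final class LazySchemaSink {
        private String schema; // null until the first record arrives
        private final List<String> output = new ArrayList<>();

        public void write(SchemaCarryingRecord record) {
            if (schema == null) {
                // The schema comes from the record itself, so no registry lookup is needed.
                schema = record.getSchema();
                output.add("schema=" + schema);
            } else if (!schema.equals(record.getSchema())) {
                throw new IllegalArgumentException(
                    "Record schema differs from the schema of the first record");
            }
            output.add("record=" + record.getPayload());
        }

        public String getSchema() { return schema; }
        public List<String> getOutput() { return output; }
    }

    public static void main(String[] args) {
        LazySchemaSink sink = new LazySchemaSink(); // no schema passed at build time
        sink.write(new SimpleRecord("user-v1", "alice"));
        sink.write(new SimpleRecord("user-v1", "bob"));
        System.out.println(sink.getOutput());
    }
}
```

The sketch assumes all records in one output share a single schema and rejects a record whose schema differs; whether a real write transform should behave that way is an open design question.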
[~ryanskraba] feel free to comment if you have anything to add.
> AvroIO.write without specifying a schema
> ----------------------------------------
>
> Key: BEAM-2993
> URL: https://issues.apache.org/jira/browse/BEAM-2993
> Project: Beam
> Issue Type: Improvement
> Components: sdk-java-extensions
> Reporter: Etienne Chauchot
> Assignee: Etienne Chauchot
>
> Similarly to https://issues.apache.org/jira/browse/BEAM-2677, we should be
> able to write to avro files using {{AvroIO}} without specifying a schema at
> build time. Consider the following use case: a user has a
> {{PCollection<GenericRecord>}} but the schema is only known while running
> the pipeline. {{AvroIO.writeGenericRecords}} needs the schema, but the
> schema is already available in {{GenericRecord}}. We should be able to call
> {{AvroIO.writeGenericRecords()}} with no schema.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)