[ 
https://issues.apache.org/jira/browse/BEAM-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16188412#comment-16188412
 ] 

Eugene Kirpichov commented on BEAM-2993:
----------------------------------------

But when you do a schemaless read, you don't get a PCollection<GenericRecord>;
you get a collection of a custom type. And when writing a collection of a
custom type to Avro, you already don't need to specify a schema at
construction time: you can use DynamicDestinations to specify a different
schema for each group of elements. I may be missing something obvious, but can
you give a more detailed example of a pipeline that you'd like to write but
currently can't, or that is too cumbersome to write, with a before/after
comparison using the API that you propose?
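
To make the DynamicDestinations point concrete, here is a rough plain-Java
sketch of the idea only: each element is mapped to a destination, and the
schema is looked up per destination once the data is seen, not at
construction time. None of these types are Beam or Avro classes; Rec,
Destinations, and route are hypothetical stand-ins, and the method names
merely echo the concept.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of per-destination schemas; not Beam's or Avro's API. */
public class PerDestinationSchemaSketch {

    /** Hypothetical stand-in for a record that carries its own schema. */
    public static final class Rec {
        public final String schema;   // stand-in for an Avro schema (e.g. its name)
        public final String payload;

        public Rec(String schema, String payload) {
            this.schema = schema;
            this.payload = payload;
        }
    }

    /** Hypothetical stand-in for the role a dynamic-destinations object plays. */
    public interface Destinations<T, D> {
        D getDestination(T element);      // which group an element belongs to
        String getSchema(D destination);  // schema used for that group
    }

    /** Groups elements by destination, preserving first-seen order of destinations. */
    public static <T, D> Map<D, List<T>> route(List<T> elements, Destinations<T, D> dests) {
        Map<D, List<T>> groups = new LinkedHashMap<>();
        for (T element : elements) {
            groups.computeIfAbsent(dests.getDestination(element), d -> new ArrayList<>())
                  .add(element);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Destination == the schema the record already carries.
        Destinations<Rec, String> bySchema = new Destinations<Rec, String>() {
            @Override public String getDestination(Rec r) { return r.schema; }
            @Override public String getSchema(String destination) { return destination; }
        };

        List<Rec> input = Arrays.asList(
            new Rec("user", "u1"), new Rec("order", "o1"), new Rec("user", "u2"));

        // Print one line per destination with its schema and record count.
        route(input, bySchema).forEach((dest, recs) ->
            System.out.println(bySchema.getSchema(dest) + " -> " + recs.size() + " record(s)"));
    }
}
```

In this model no schema is passed when the routing is set up; it is derived
from the elements themselves, which is the property the issue asks for.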

> AvroIO.write without specifying a schema
> ----------------------------------------
>
>                 Key: BEAM-2993
>                 URL: https://issues.apache.org/jira/browse/BEAM-2993
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-extensions
>            Reporter: Etienne Chauchot
>            Assignee: Etienne Chauchot
>
> Similarly to https://issues.apache.org/jira/browse/BEAM-2677, we should be 
> able to write to Avro files using {{AvroIO}} without specifying a schema at 
> build time. Consider the following use case: a user has a 
> {{PCollection<GenericRecord>}}, but the schema is only known while the 
> pipeline is running. {{AvroIO.writeGenericRecords}} requires the schema, 
> even though it is already available in each {{GenericRecord}}. We should be 
> able to call {{AvroIO.writeGenericRecords()}} with no schema.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
