Removing the setCoder call breaks my pipeline:

No Coder has been manually specified; you may do so using .setCoder().
Inferring a Coder from the CoderRegistry failed: Unable to provide a
Coder for Data. Building a Coder using a registered CoderProvider failed.

The reason is that the code which builds the pipeline is based on Java
generics. The actual pipeline-building code sets a number of parameters
which are used to construct the pipeline:

PCollection<Data> stream =
    pipeline.apply(userProvidedTransform).get(outputTag).setCoder(userProvidedCoder);

So I guess I will need to provide some more information to the framework to
make the annotation work.

On Sun, Jan 9, 2022 at 1:39 PM Reuven Lax <[email protected]> wrote:

> If you annotate your POJO with @DefaultSchema(JavaFieldSchema.class), that
> will usually set up schema inference automatically (you'll have to remove
> the setCoder call).
>
> On Sun, Jan 9, 2022 at 1:32 PM gaurav mishra <[email protected]>
> wrote:
>
>> How do I set up my pipeline to use Beam's schema encoding?
>> In my current code I am doing something like this:
>>
>> PCollection<Data> data =
>> pipeline.apply(someTransform).get(outputTag).setCoder(AvroCoder.of(Data.class));
>>
>> On Sun, Jan 9, 2022 at 1:16 PM Reuven Lax <[email protected]> wrote:
>>
>>> I don't think we make any guarantees about AvroCoder. Can you use
>>> Beam's schema encoding instead?
>>>
>>> On Sun, Jan 9, 2022 at 1:14 PM gaurav mishra <
>>> [email protected]> wrote:
>>>
>>>> Is there a way to programmatically check for compatibility? I would
>>>> like to fail my unit tests if incompatible changes are made to the POJO.
>>>>
>>>> On Fri, Jan 7, 2022 at 4:49 PM Luke Cwik <[email protected]> wrote:
>>>>
>>>>> Check the schema of the Avro encoding for the POJO before and after
>>>>> the change to ensure that they are compatible as you expect.
>>>>>
>>>>> On Fri, Jan 7, 2022 at 4:12 PM gaurav mishra <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> This is more of a Dataflow question, I guess, but I am asking here
>>>>>> in hopes that someone has faced a similar problem and can help.
>>>>>> I am trying to use the "--update" option to update a running
>>>>>> Dataflow job. I am noticing that the compatibility checks fail any
>>>>>> time I add a new field to my data model. The error says:
>>>>>>
>>>>>> The Coder or type for step XYZ has changed
>>>>>>
>>>>>> I am using a Java POJO for the data and AvroCoder to serialize the
>>>>>> model. I read somewhere that adding new optional fields to the data
>>>>>> should work when updating the pipeline.
>>>>>>
>>>>>> I am fine with changing the coder or the implementation of the model
>>>>>> to something which allows me to update the pipeline in cases where I
>>>>>> add new optional fields to the existing model. Any suggestions?
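[A minimal sketch of one way to resolve the inference failure discussed at the top of the thread. When generic pipeline-building code hides the concrete element type, the schema-backed coder can be looked up explicitly from Beam's SchemaRegistry and passed in as the userProvidedCoder, keeping the @DefaultSchema annotation useful. The Data class and its fields here are hypothetical stand-ins for the POJO in the thread.]

```java
import org.apache.beam.sdk.coders.Coder;
import org.apache.beam.sdk.schemas.JavaFieldSchema;
import org.apache.beam.sdk.schemas.NoSuchSchemaException;
import org.apache.beam.sdk.schemas.SchemaRegistry;
import org.apache.beam.sdk.schemas.annotations.DefaultSchema;

public class SchemaCoderLookup {
  // Stand-in for the thread's POJO; the annotation lets Beam infer a
  // schema from the public fields.
  @DefaultSchema(JavaFieldSchema.class)
  public static class Data {
    public String id;
    public long count;
  }

  public static void main(String[] args) throws NoSuchSchemaException {
    // Resolve the schema-backed coder explicitly instead of relying on
    // coder inference, which fails inside generic code:
    Coder<Data> coder = SchemaRegistry.createDefault().getSchemaCoder(Data.class);
    System.out.println(coder.getClass().getSimpleName());
  }
}
```

[The resulting coder can then be handed to the generic builder, so the existing `.setCoder(userProvidedCoder)` call keeps working even though the surrounding code never names the concrete type.]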
