Removing the setCoder call breaks my pipeline:

No Coder has been manually specified;  you may do so using .setCoder().

  Inferring a Coder from the CoderRegistry failed: Unable to provide a
Coder for Data.

  Building a Coder using a registered CoderProvider failed.

The reason is that the code building the pipeline is based on Java
generics; the actual pipeline-building code takes a number of parameters
that are used to construct the pipeline:
PCollection<Data> stream =
pipeline.apply(userProvidedTransform).get(outputTag).setCoder(userProvidedCoder)
So I guess I will need to provide some more information to the framework to
make the annotation work.
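For context, here is a rough sketch of what I understand the annotation approach to look like. This assumes a POJO named Data with public fields (the field names here are illustrative, not my real model):

```java
import org.apache.beam.sdk.schemas.JavaFieldSchema;
import org.apache.beam.sdk.schemas.annotations.DefaultSchema;

// Annotating the POJO lets Beam infer a schema from its public fields,
// which should make the explicit setCoder(...) call unnecessary.
@DefaultSchema(JavaFieldSchema.class)
public class Data {
  public String id;
  public long count;
}
```

Since my transform is generic, the element type may be erased at the point where the PCollection is produced; if I understand the docs correctly, giving the output an explicit TypeDescriptor (e.g. `.setTypeDescriptor(TypeDescriptor.of(Data.class))` in place of the setCoder call) might let inference find the schema, but I have not verified this.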


On Sun, Jan 9, 2022 at 1:39 PM Reuven Lax <[email protected]> wrote:

> If you annotate your POJO with @DefaultSchema(JavaFieldSchema.class), that
> will usually automatically set up schema inference (you'll have to remove
> the setCoder call).
>
> On Sun, Jan 9, 2022 at 1:32 PM gaurav mishra <[email protected]>
> wrote:
>
>> How do I set up my pipeline to use Beam's schema encoding?
>> In my current code I am doing something like this
>>
>> PCollection<Data> =
>> pipeline.apply(someTransform).get(outputTag).setCoder(AvroCoder.of(Data.class))
>>
>>
>> On Sun, Jan 9, 2022 at 1:16 PM Reuven Lax <[email protected]> wrote:
>>
>>> I don't think we make any guarantees about Avro coder. Can you use
>>> Beam's schema encoding instead?
>>>
>>> On Sun, Jan 9, 2022 at 1:14 PM gaurav mishra <
>>> [email protected]> wrote:
>>>
>>>> Is there a way to programmatically check for compatibility? I
>>>> would like to fail my unit tests if incompatible changes are made to the POJO.
>>>>
>>>> On Fri, Jan 7, 2022 at 4:49 PM Luke Cwik <[email protected]> wrote:
>>>>
>>>>> Check the schema of the avro encoding for the POJO before and after
>>>>> the change to ensure that they are compatible as you expect.
>>>>>
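One way to automate the check Luke describes in a unit test would be Avro's SchemaCompatibility, comparing the POJO's current reflection-derived schema against a frozen copy of the previous one. A sketch (the Data class, method name, and the idea of loading the old schema from JSON are illustrative assumptions):

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaCompatibility;
import org.apache.avro.reflect.ReflectData;

class SchemaCompatibilityCheck {
  // Fails if the current POJO schema cannot read data written with the
  // previous schema (previousSchemaJson would be a checked-in snapshot).
  static void assertSchemaCompatible(String previousSchemaJson) {
    Schema current = ReflectData.get().getSchema(Data.class);
    Schema previous = new Schema.Parser().parse(previousSchemaJson);

    SchemaCompatibility.SchemaPairCompatibility result =
        SchemaCompatibility.checkReaderWriterCompatibility(current, previous);
    if (result.getType()
        != SchemaCompatibility.SchemaCompatibilityType.COMPATIBLE) {
      throw new AssertionError(
          "Incompatible schema change: " + result.getDescription());
    }
  }
}
```

Note this only checks Avro-level compatibility; whether Dataflow's update check accepts the change is a separate question, since it compares the step's coder itself.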
>>>>> On Fri, Jan 7, 2022 at 4:12 PM gaurav mishra <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> This is more of a Dataflow question I guess but asking here in hopes
>>>>>> someone has faced a similar problem and can help.
>>>>>> I am trying to use "--update" option to update a running Dataflow
>>>>>> job. I am noticing that compatibility checks fail any time I add a new
>>>>>> field to my data model. Error says
>>>>>>
>>>>>> The Coder or type for step XYZ  has changed
>>>>>>
>>>>>>
>>>>>> I am using a Java POJO for the data and AvroCoder to serialize the
>>>>>> model. I read somewhere that adding new optional fields to the data
>>>>>> should work when updating the pipeline.
>>>>>>
>>>>>> I am fine with updating the coder or the implementation of the model to
>>>>>> something which allows me to update the pipeline when I add new
>>>>>> optional fields to the existing model. Any suggestions?
>>>>>>
>>>>>>