How do I set up my pipeline to use Beam's schema encoding? In my current code I am doing something like this (two sketches, of the schema-coder setup and of a compatibility test, follow the quoted thread below):
PCollection<Data> data =
    pipeline.apply(someTransform).get(outputTag).setCoder(AvroCoder.of(Data.class));

On Sun, Jan 9, 2022 at 1:16 PM Reuven Lax <[email protected]> wrote:

> I don't think we make any guarantees about AvroCoder. Can you use Beam's
> schema encoding instead?
>
> On Sun, Jan 9, 2022 at 1:14 PM gaurav mishra <[email protected]> wrote:
>
>> Is there a way to programmatically check for compatibility? I would like
>> to fail my unit tests if incompatible changes are made to the POJO.
>>
>> On Fri, Jan 7, 2022 at 4:49 PM Luke Cwik <[email protected]> wrote:
>>
>>> Check the schema of the Avro encoding for the POJO before and after the
>>> change to ensure that they are compatible as you expect.
>>>
>>> On Fri, Jan 7, 2022 at 4:12 PM gaurav mishra <[email protected]> wrote:
>>>
>>>> This is more of a Dataflow question, I guess, but I am asking here in
>>>> hopes someone has faced a similar problem and can help.
>>>> I am trying to use the "--update" option to update a running Dataflow
>>>> job. I am noticing that the compatibility checks fail any time I add a
>>>> new field to my data model. The error says:
>>>>
>>>> The Coder or type for step XYZ has changed
>>>>
>>>> I am using a Java POJO for the data and AvroCoder to serialize the
>>>> model. I read somewhere that adding new optional fields to the data
>>>> should work when updating the pipeline.
>>>>
>>>> I am fine with changing the coder or the implementation of the model to
>>>> something which allows me to update the pipeline when I add new optional
>>>> fields to the existing model. Any suggestions?
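
Following up on Reuven's suggestion, here is a minimal sketch of what I
understand the schema-encoding setup to be. The field names and the
results/outputTag variables are placeholders of mine; the pieces I believe
are involved are the @DefaultSchema(JavaFieldSchema.class) annotation and
the pipeline's SchemaRegistry:

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.schemas.JavaFieldSchema;
import org.apache.beam.sdk.schemas.NoSuchSchemaException;
import org.apache.beam.sdk.schemas.annotations.DefaultSchema;
import org.apache.beam.sdk.values.PCollection;

// Annotate the POJO so Beam can infer a schema from its public fields.
@DefaultSchema(JavaFieldSchema.class)
public class Data {
  public String id;
  // A nullable field becomes an optional (nullable) field in the
  // inferred schema. "newField" is a hypothetical example.
  public @javax.annotation.Nullable String newField;
}

// During pipeline construction, instead of setCoder(AvroCoder.of(Data.class)):
PCollection<Data> data = results.get(outputTag);
try {
  data.setCoder(pipeline.getSchemaRegistry().getSchemaCoder(Data.class));
} catch (NoSuchSchemaException e) {
  throw new RuntimeException(e);
}

My understanding is that with the annotation in place Beam can often infer
the schema coder on its own, but for multi-output transforms like the
tuple-tag case above an explicit setCoder may still be needed.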
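And for Luke's earlier suggestion about catching incompatible changes in
unit tests, Avro itself ships a compatibility checker that can be run
against the reflect schema of the POJO. A sketch, assuming a golden copy
of the deployed schema is checked in as a test resource (the class name
and file path here are made up):

import java.io.File;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.SchemaCompatibility;
import org.apache.avro.reflect.ReflectData;
import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class DataSchemaCompatibilityTest {
  @Test
  public void currentSchemaCanReadDeployedData() throws IOException {
    // The schema the running job was launched with, frozen as a golden
    // file (hypothetical path).
    Schema deployed =
        new Schema.Parser().parse(new File("src/test/resources/Data-deployed.avsc"));
    // The schema AvroCoder derives for the POJO today (it uses reflection).
    Schema current = ReflectData.get().getSchema(Data.class);
    // The new (reader) schema must be able to decode records written
    // with the old (writer) schema.
    SchemaCompatibility.SchemaPairCompatibility result =
        SchemaCompatibility.checkReaderWriterCompatibility(current, deployed);
    assertEquals(SchemaCompatibility.SchemaCompatibilityType.COMPATIBLE,
        result.getType());
  }
}

Note Reuven's caveat still applies: Avro-level compatibility does not
guarantee that Dataflow's update check will accept the AvroCoder change.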
