Thanks Brian and Luke!

I'm curious whether Schema supports optional fields the way protobuf does. In
my use case, most of the fields will be optional, and my application only
accesses these fields when a value is present. Also, it seems that if I
want to use Schema to transfer data across SDKs, I need to define a Schema
in Java and a NamedTuple in Python, right?
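
To make the question concrete, here is a rough sketch of what I have in mind
on the Python side. The class and field names are hypothetical, chosen only
for illustration. It uses plain typing.NamedTuple with Optional fields and
leaves out the Beam-specific step of registering the type with RowCoder, so
it runs without Beam installed. Whether Optional maps to a nullable Schema
field in the way I need is exactly what I am asking.

```python
from typing import NamedTuple, Optional


class SourceDescriptor(NamedTuple):
    # Hypothetical fields, for illustration only.
    topic: str
    partition: int
    # Optional fields default to None, meaning "no value present".
    start_offset: Optional[int] = None
    start_timestamp_ms: Optional[int] = None


def describe(descriptor: SourceDescriptor) -> str:
    """Builds a summary string, reading optional fields only when set."""
    parts = [f"{descriptor.topic}-{descriptor.partition}"]
    if descriptor.start_offset is not None:
        parts.append(f"offset={descriptor.start_offset}")
    if descriptor.start_timestamp_ms is not None:
        parts.append(f"ts={descriptor.start_timestamp_ms}")
    return " ".join(parts)
```

The application code checks each optional field against None before using
it, which is the access pattern I described above.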

On Fri, Jun 12, 2020 at 4:11 PM Brian Hulette <bhule...@google.com> wrote:

> > are unknown fields propagated through if the user only reads/modifies a
> > row?
> I'm not sure I understand this question. Are you asking about handling
> schema changes?
> The wire format includes the number of fields in the schema, specifically
> so that we can detect when the schema changes. This is restricted to added
> or removed fields at the end of the schema. i.e. if we receive an element
> that says it has N more fields than the schema this coder was created with
> we assume the pipeline was updated with a schema that drops the last N
> fields and ignore the extra fields. Similarly if we receive an element with
> N fewer fields than we expect we'll just fill the last N fields with nulls.
> This logic is implemented in Python [1] and Java [2], but it's not
> exercised since no runners actually support pipeline update with schema
> changes.
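
As I read the description above, the trailing-field handling amounts to
truncating or padding the decoded values to the field count the coder
expects. A minimal sketch of that idea follows. The function name and
signature are my own, and this is not the actual code in row_coder.py or
RowCoderGenerator.java.

```python
from typing import Any, List, Optional, Sequence


def reconcile_fields(values: Sequence[Any],
                     expected_count: int) -> List[Optional[Any]]:
    """Aligns decoded field values with the coder's field count.

    Hypothetical helper: extra trailing fields are dropped, and missing
    trailing fields are filled with None.
    """
    values = list(values)
    if len(values) >= expected_count:
        # Element has N more fields than expected: ignore the last N.
        return values[:expected_count]
    # Element has N fewer fields than expected: pad the last N with None.
    return values + [None] * (expected_count - len(values))
```

Note that this only covers fields added or removed at the end of the
schema, matching the restriction described above.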
>
> > how does it work in a pipeline update scenario (downgrade / upgrade)?
> It's a standard coder with a defined spec [3] and tests in
> standard_coders.yaml [4] (although we could certainly use more coverage
> there) so I think pipeline update should work fine, unless I'm missing
> something.
>
> Brian
>
> [1]
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/coders/row_coder.py#L177-L189
> [2]
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/RowCoderGenerator.java#L341-L356
> [3]
> https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L833-L864
> [4]
> https://github.com/apache/beam/blob/master/model/fn-execution/src/main/resources/org/apache/beam/model/fnexecution/v1/standard_coders.yaml#L344-L364
>
> On Fri, Jun 12, 2020 at 3:32 PM Luke Cwik <lc...@google.com> wrote:
>
>> +Boyuan Zhang <boyu...@google.com>
>>
>> On Fri, Jun 12, 2020 at 3:32 PM Luke Cwik <lc...@google.com> wrote:
>>
>>> What is the update / compat story around schemas?
>>> * are unknown fields propagated through if the user only reads/modifies
>>> a row?
>>> * how does it work in a pipeline update scenario (downgrade / upgrade)?
>>>
>>> Boyuan has been working on a Kafka via SDF source and has been trying
>>> to figure out which interchange format to use for the "source descriptors"
>>> that feed into the SDF. Some obvious choices are JSON, Avro, proto, and
>>> Beam schemas, all with their caveats.
>>>
>>> On Fri, Jun 12, 2020 at 1:32 PM Brian Hulette <bhule...@google.com>
>>> wrote:
>>>
>>>> Thanks! I see there are jiras for SpannerIO and JdbcIO as part of that.
>>>> Are you planning on using row coder for them?
>>>> If so I want to make sure you're aware of
>>>> https://s.apache.org/beam-schema-io (sent to the dev list last week
>>>> [1]). +Scott Lukas <slu...@google.com> will be working on building out
>>>> the ideas there this summer. His work could be useful for making these IOs
>>>> cross-language (and you would get a mapping to SQL out of it without much
>>>> more effort).
>>>>
>>>> Brian
>>>>
>>>> [1]
>>>> https://lists.apache.org/thread.html/rc1695025d41c5dc38cdf7bc32bea0e7421379b1c543c2d82f69aa179%40%3Cdev.beam.apache.org%3E
>>>>
>>>> On Tue, Jun 2, 2020 at 9:30 AM Piotr Szuberski <
>>>> piotr.szuber...@polidea.com> wrote:
>>>>
>>>>> Sure, I'll do that
>>>>>
>>>>> On 2020/05/28 17:54:49, Chamikara Jayalath <chamik...@google.com>
>>>>> wrote:
>>>>> > Great. Thanks for working on this. Can you please add these tasks
>>>>> > and JIRAs to the cross-language transforms roadmap under
>>>>> > "Connector/transform support".
>>>>> > https://beam.apache.org/roadmap/connectors-multi-sdk/
>>>>> >
>>>>> > Happy to help if you run into any issues during this task.
>>>>> >
>>>>> > Thanks,
>>>>> > Cham
>>>>> >
>>>>> > On Thu, May 28, 2020 at 9:59 AM Piotr Szuberski <
>>>>> > piotr.szuber...@polidea.com> wrote:
>>>>> >
>>>>> > > I added a Jira task for creating cross-language wrappers for Java
>>>>> > > IOs. It will soon be in progress.
>>>>> > >
>>>>> >
>>>>>
>>>>
