On Tue, Sep 9, 2025 at 10:01 AM Yi Hu via dev <dev@beam.apache.org> wrote:

> Thanks all for the input. It helped a lot for the doc. From the feedback,
>

Thanks for pursuing this and all the iteration on the doc.


> - The main concern is that RawType/CoderLogicalType break the strong
> mapping of schema<->coder. This is a valid concern.
>
> - On the other hand, it is a way to make schemas the fundamental concept
> (which is a goal of Beam 3), given that Beam and its ecosystem have
> already evolved for years with many Beam pipelines using (non-portable)
> coders+custom types.
>
> Based on this feedback, I suggest we proceed with the CoderLogicalType
> approach, given the requirements noted in the "Requirement" section of the
> doc, and in addition,
>
> - We should clearly document that this approach, if implemented, should
> not be used to bypass the schema framework. We always encourage
> schema-fying structured types.
>
> I'll start drafting changes for each supported SDK.
>

I don't think the doc yet reflects the actual discussion, including what I
think the consensus was in
https://docs.google.com/document/d/1PggR27eg96Y8TzB9L29PszrMHPwL9u-JDTQ5N0Vc_5I/edit?disco=AAABqm7vQMU
(but it's worth fleshing this out to ensure we have the same idea of what
we think we're all agreeing on). I added this as option 2b.

It may seem trivial, but I also think we should avoid the name
"CoderLogicalType" and go with something like EncodedBytesLogicalType (open
to other suggestions) which more accurately reflects the fact that coders
may not be present in all SDKs.

- Robert



> On Wed, Sep 3, 2025 at 1:51 PM Yi Hu <ya...@google.com> wrote:
>
>> Hi all,
>>
>> Please find the following design doc for a portable RAW field type,
>> enabling arbitrary (serializable) data types to be included and to take
>> advantage of the Beam portable schema framework:
>>
>> https://s.apache.org/beam-portable-raw-type
>>
>> It aims to solve https://github.com/apache/beam/issues/23374 (as well as
>> https://github.com/apache/beam/issues/19817) as part of schema
>> improvement for Beam 3 (https://github.com/apache/beam/issues/34672).
>>
>> It also includes an appendix of term disambiguation between
>> Beam/Flink/Avro schema systems that some might find useful in general.
>>
>> I proposed two alternative designs. Any feedback is welcome!
>>
>> Regards,
>>
>> Yi
>>
>> --
>>
>> Yi Hu, (he/him/his)
>>
>> Software Engineer
>>
>>
>>
