Thanks. I have updated the doc and am writing a section. I meant that "CoderLogicalType" should go with `LogicalType<byte[], T>`, but the doc wasn't updated to reflect that. Sorry for the confusion. I will proceed with proposal 2b, including the "EncodedBytesLogicalType" naming.

Regards,
Yi

On Tue, Sep 9, 2025 at 3:37 PM Robert Bradshaw <rober...@waymo.com> wrote:

> On Tue, Sep 9, 2025 at 10:01 AM Yi Hu via dev <dev@beam.apache.org> wrote:
>
>> Thanks all for the input. It helped a lot for the doc. From the feedback,
>>
>
> Thanks for pursuing this and all the iteration on the doc.
>
>> - The main concern is that RawType/CoderLogicalType break the strong
>> mapping of schema<->coder. This is a valid concern.
>>
>> - On the other hand, it is a way to make schemas the fundamental concept
>> (which is a goal of Beam 3) under the situation that Beam and its ecosystem
>> has already evolved for years with many Beam pipelines using (non-portable)
>> coders+custom types.
>>
>> From these feedbacks, I suggest we proceed with CoderLogicalType
>> approach, given the requirements noted in "Requirement" section of the doc,
>> and in addition,
>>
>> - We should clearly document that this approach, if implemented, should
>> not be used to bypass the schema framework. We always encourage schema-fy
>> structured types.
>>
>> I'll start drafting changes for each supported SDK.
>>
>
> I don't think the doc yet reflects the actual discussion, including what I
> think the consensus was in
> https://docs.google.com/document/d/1PggR27eg96Y8TzB9L29PszrMHPwL9u-JDTQ5N0Vc_5I/edit?disco=AAABqm7vQMU
> (but it's worth fleshing this out to ensure we have the same idea of what
> we think we're all agreeing on). I added this as option 2b.
>
> It may seem trivial, but I also think we should avoid the name
> "CoderLogicalType" and go with something like EncodedBytesLogicalType (open
> to other suggestions) which more accurately reflects the fact that coders
> may not be present in all SDKs.
>
> - Robert
>
>> On Wed, Sep 3, 2025 at 1:51 PM Yi Hu <ya...@google.com> wrote:
>>
>>> Hi all,
>>>
>>> Please find the following design doc for a portable RAW field type
>>> enabling arbitrary (serializable) data type to be included and take
>>> advantage of the Beam portable schema framework
>>>
>>> https://s.apache.org/beam-portable-raw-type
>>>
>>> It aims to solve https://github.com/apache/beam/issues/23374 (as well
>>> as https://github.com/apache/beam/issues/19817) as part of schema
>>> improvement for Beam 3 (https://github.com/apache/beam/issues/34672).
>>>
>>> It also includes an appendix of term disambiguation between
>>> Beam/Flink/Avro schema systems that might find useful in general.
>>>
>>> I proposed two alternative designs. Any feedback is welcome!
>>>
>>> Regards,
>>>
>>> Yi
>>>
>>> --
>>>
>>> Yi Hu, (he/him/his)
>>>
>>> Software Engineer