My first thought is that this should go in contrib for now.

BTW, in the Java SDK, field access is integrated directly into ParDo. For example, you can write:
new DoFn<>() {
  @ProcessElement
  public void process(@FieldAccess("field1") Type1 field1,
                      @FieldAccess("field2") Type2 field2) {
    ...
  }
}

It also supports selecting wildcards (e.g. @FieldAccess("top.*")). I'm not sure how this pattern would translate into the Python SDK, though.

On Sat, May 10, 2025 at 3:35 AM Joey Tran <joey.t...@schrodinger.com> wrote:

> Not currently
>
> On Sat, May 10, 2025, 12:48 AM Reuven Lax <re...@google.com> wrote:
>
>> Does this work with nested fields? Can you specify input_field="a.b.c"?
>>
>> On Fri, May 9, 2025 at 7:18 PM Joey Tran <joey.t...@schrodinger.com>
>> wrote:
>>
>>> Sure!
>>>
>>> Given a DoFn that has...
>>>
>>> def process(self, sentence):
>>>     yield from sentence.split()
>>>
>>> You could use it with SchemadParDo as:
>>>
>>> (p | beam.Create([pvalue.Row(element="hello world", id="id")])
>>>    | SchemadParDo(SplitSentenceDoFn(), input_field="element",
>>>                   output_field="word"))
>>>
>>> And it'd produce Row(word="hello", id="id") and Row(word="world",
>>> id="id")
>>>
>>> On Fri, May 9, 2025, 9:57 PM Reuven Lax via dev <dev@beam.apache.org>
>>> wrote:
>>>
>>>> Can you explain a bit how SchemadParDo works?
>>>>
>>>> On Fri, May 9, 2025 at 4:49 PM Joey Tran <joey.t...@schrodinger.com>
>>>> wrote:
>>>>
>>>>> I've written a `SchemadParDo(input_field: str, output_field,
>>>>> dofn:DoFn)` transform for more easily writing a Schemad transform
>>>>> given a DoFn.
>>>>>
>>>>> Is this something worth upstreaming into the Beam Python SDK? I wrote
>>>>> it to make it easier to convert our current set of DoFns into
>>>>> schemad DoFns for use with the YAML SDK. Just wanted to gauge interest
>>>>> before setting up the dev env again.
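
For reference, the per-element behavior described in the quoted example can be sketched in plain Python. This is only an illustration under stated assumptions, not the actual SchemadParDo implementation: rows are modeled as dicts rather than Beam Rows, `schemad_apply` and `split_sentence` are hypothetical names, and the rule that the input field is dropped while the other fields are carried over is inferred from the example output in the thread.

```python
from typing import Any, Callable, Dict, Iterable, Iterator

# Stand-in for a Beam Row; using a dict is an assumption of this sketch.
Row = Dict[str, Any]


def split_sentence(sentence: str) -> Iterator[str]:
    """Mirrors the process() body quoted in the thread."""
    yield from sentence.split()


def schemad_apply(
    process: Callable[[Any], Iterable[Any]],
    row: Row,
    input_field: str,
    output_field: str,
) -> Iterator[Row]:
    """Hypothetical helper: feed one field to `process`, emit one row per output.

    Fields other than `input_field` are copied unchanged into each output row
    (inferred from the thread's example, where `id` survives and `element`
    does not).
    """
    passthrough = {k: v for k, v in row.items() if k != input_field}
    for value in process(row[input_field]):
        yield {output_field: value, **passthrough}


rows = list(
    schemad_apply(
        split_sentence,
        {"element": "hello world", "id": "id"},
        input_field="element",
        output_field="word",
    )
)
print(rows)
# [{'word': 'hello', 'id': 'id'}, {'word': 'world', 'id': 'id'}]
```

A nested selector such as "a.b.c" is not handled here, which matches the "Not currently" answer above.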