Sounds like you want a monad, heh. It would be nice if their DoFn took a generic type and you could pass it a selector function to pick out what it needs. If you can access their DoFn and it is not too complex, perhaps you could just use their processElement implementation directly?
e.g.

    class TheirDoFn .. { void processElement(...) { ... } }
    class YourDoFn .. { void processElement() { new TheirDoFn().processElement(...) } }

Depending on what annotations they're using in their processElement func, it could be trickier or not. You could pass in a mock implementation of OutputReceiver, so you can wrap the results and delegate.

On Sat, 12 Oct 2024 at 08:51, XQ Hu via user <user@beam.apache.org> wrote:

> This sounds like what CDC (Change Data Capture) typically does, which
> usually runs as a streaming pipeline.
>
> On Fri, Oct 11, 2024 at 3:51 PM Joey Tran <joey.t...@schrodinger.com>
> wrote:
>
>> Another basic pattern question for the user group.
>>
>> Say I have a database of records with an ID and some float property.
>> Another team has written and published a transform `SquareRoot`. I want to
>> write a pipeline that reads this database and outputs extended records that
>> have (ID, foo_prop, squareroot(foo)_prop). How do I do this?
>>
>> Of course I can strip my records of their ID and then pass in the
>> properties straight into `SquareRoot`, but then I have no way to link it
>> back to what record the square root corresponds to. Do I just need to ask
>> the other team to make their SquareRootDoFn public? Should they have
>> included a `SquareRoot.WithKey()` transform that ignores a key?
>>
>> This feels like it'd be a common pattern but how to approach it feels
>> awkward, not sure if I'm missing something obvious so thought I'd ask the
>> group.
>>
>> Cheers,
>> Joey
>>
>> --
>>
>> Joey Tran | Staff Developer | AutoDesigner TL
>>
>> *he/him*
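
To make the delegation idea from the reply at the top of this message a bit more concrete, here is a minimal sketch in plain Java. Everything in it is an assumption made for illustration: `OutputReceiver` is a one-method stand-in for Beam's `DoFn.OutputReceiver` (the real interface has more than `output`), and `TheirDoFn`, `YourDoFn`, `InputRecord`, `ExtendedRecord`, and `KeyedDelegationSketch` are hypothetical names, not real Beam classes. It only shows the shape of "wrap the receiver, delegate, re-attach the ID".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for Beam's DoFn.OutputReceiver, reduced to a single
// method so the sketch compiles without the Beam SDK on the classpath.
interface OutputReceiver<T> {
    void output(T value);
}

// Hypothetical stand-in for the other team's DoFn: emits sqrt of its input.
class TheirDoFn {
    void processElement(Double element, OutputReceiver<Double> out) {
        out.output(Math.sqrt(element));
    }
}

// Record shapes assumed for illustration: (ID, foo) in, (ID, foo, sqrt(foo)) out.
final class InputRecord {
    final String id;
    final double foo;

    InputRecord(String id, double foo) {
        this.id = id;
        this.foo = foo;
    }
}

final class ExtendedRecord {
    final String id;
    final double foo;
    final double sqrtFoo;

    ExtendedRecord(String id, double foo, double sqrtFoo) {
        this.id = id;
        this.foo = foo;
        this.sqrtFoo = sqrtFoo;
    }
}

// Wrapper: strips the ID, delegates to TheirDoFn, and re-attaches the ID to
// every value the delegate emits.
class YourDoFn {
    private final TheirDoFn delegate = new TheirDoFn();

    void processElement(InputRecord element, OutputReceiver<ExtendedRecord> out) {
        // The wrapping receiver captures the current record, so each delegate
        // output can be linked back to the ID it came from.
        OutputReceiver<Double> wrapping =
            result -> out.output(new ExtendedRecord(element.id, element.foo, result));
        delegate.processElement(element.foo, wrapping);
    }
}

class KeyedDelegationSketch {
    // Runs the wrapper over a list in memory; a real pipeline would use ParDo.
    static List<ExtendedRecord> run(List<InputRecord> records) {
        List<ExtendedRecord> results = new ArrayList<>();
        YourDoFn fn = new YourDoFn();
        for (InputRecord record : records) {
            fn.processElement(record, results::add);
        }
        return results;
    }

    public static void main(String[] args) {
        List<InputRecord> input =
            Arrays.asList(new InputRecord("a", 9.0), new InputRecord("b", 2.25));
        for (ExtendedRecord r : run(input)) {
            System.out.println(r.id + " " + r.foo + " " + r.sqrtFoo);
        }
    }
}
```

In a real pipeline this only works if their DoFn class and its processElement method are accessible, and any lifecycle methods or extra annotated parameters their DoFn relies on would also have to be forwarded by the wrapper, which is where it can get trickier.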