Sounds like you want a monad, heh.

It would be nice if their DoFn took a generic type and you could pass it a
selector func to pick out the field it needs.
If you can access their DoFn and it is not too complex, perhaps you could just
call their processElement implementation directly?

e.g.

class TheirDoFn ..{ void processElement(...){...} }

class YourDoFn .. {
  void processElement() {
    TheirDoFn().processElement(...)
  }
}

Depending on which annotations they use on their processElement func, this
could be more or less tricky. You could pass in your own (mock) implementation
of OutputReceiver that wraps each result and then delegates to the real one.
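
To make the wrap-and-delegate idea concrete, here is a rough sketch in plain
Java, with no Beam dependency. Everything in it is an assumption for
illustration: the OutputReceiver interface is a one-method stand-in for Beam's
DoFn.OutputReceiver (the real interface has more than one method, so you would
need a full implementation rather than a lambda), TheirDoFn is a guess at what
the other team's DoFn looks like, and InputRecord / ExtendedRecord are
hypothetical record types.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for Beam's DoFn.OutputReceiver<T>; only output() is modelled here.
interface OutputReceiver<T> {
    void output(T value);
}

// Hypothetical shape of the other team's DoFn: emits the square root of its input.
class TheirDoFn {
    void processElement(Double element, OutputReceiver<Double> out) {
        out.output(Math.sqrt(element));
    }
}

// Hypothetical input record: an ID plus one float property.
final class InputRecord {
    final String id;
    final double foo;

    InputRecord(String id, double foo) {
        this.id = id;
        this.foo = foo;
    }
}

// Hypothetical output record: the input fields plus the derived value.
final class ExtendedRecord {
    final String id;
    final double foo;
    final double sqrtFoo;

    ExtendedRecord(String id, double foo, double sqrtFoo) {
        this.id = id;
        this.foo = foo;
        this.sqrtFoo = sqrtFoo;
    }
}

class YourDoFn {
    private final TheirDoFn delegate = new TheirDoFn();

    void processElement(InputRecord element, OutputReceiver<ExtendedRecord> out) {
        // Hand their DoFn a wrapping receiver: for every value it emits,
        // re-attach the ID and forward the extended record to the real receiver.
        OutputReceiver<Double> wrapping =
                sqrt -> out.output(new ExtendedRecord(element.id, element.foo, sqrt));
        delegate.processElement(element.foo, wrapping);
    }
}
```

Calling new YourDoFn().processElement(record, receiver) then emits one
ExtendedRecord per value their DoFn outputs, with the ID carried along. In a
real pipeline you would also have to deal with whatever setup or lifecycle
methods their DoFn relies on, which this sketch ignores.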

On Sat, 12 Oct 2024 at 08:51, XQ Hu via user <user@beam.apache.org> wrote:

> This sounds like what CDC (Change Data Capture) typically does, which
> usually runs as a streaming pipeline.
>
> On Fri, Oct 11, 2024 at 3:51 PM Joey Tran <joey.t...@schrodinger.com>
> wrote:
>
>> Another basic pattern question for the user group.
>>
>> Say I have a database of records with an ID and some float property.
>> Another team has written and published a transform `SquareRoot`. I want to
>> write a pipeline that reads this database and outputs extended records that
>> have (ID, foo_prop, squareroot(foo)_prop). How do I do this?
>>
>> Of course I can strip my records of their ID and then pass in the
>> properties straight into `SquareRoot`, but then I have no way to link it
>> back to what record the square root corresponds to. Do I just need to ask
>> the other team to make their SquareRootDoFn public? Should they have
>> included a `SquareRoot.WithKey()` transform that ignores a key?
>>
>> This feels like it'd be a common pattern but how to approach it feels
>> awkward, not sure if I'm missing something obvious so thought I'd ask the
>> group.
>>
>> Cheers,
>> Joey
>>
>> --
>>
>> Joey Tran | Staff Developer | AutoDesigner TL
>>
>> *he/him*
>>
>> [image: Schrödinger, Inc.] <https://schrodinger.com/>
>>
>
