Thanks for the feedback! Sounds like it'll be a while before this is supported. Would it make sense to formally _not_ support this functionality by raising an early and clear exception if someone does try to use a side input with a combiner? As it stands now, the exception a user gets is very difficult to understand. My motivation for looking into this was a new user I was introducing to Beam ran into this issue and their first impression was that Beam had terrible error handling / messages.
Best, Joey On Thu, Apr 11, 2024 at 3:52 PM Valentyn Tymofieiev via dev < dev@beam.apache.org> wrote: > I took a look and mentioned the PR to a few folks. Couple of thoughts: > - We should avoid Beam adding a high-level functionality only for Batch. > - Supporting Windowing/Triggers would likely be non-trivial and worth > considering early in the design. > - If you'd like to continue working on this, I would suggest to start a > document and gradually cover the following (with help/feedback from others > in this list): > - the motivational example, available workarounds (if any) > - background on aspects of combiner implementation that are relevant. > - proposed approach and considered alternatives > - any runner-specific considerations. > > Thanks, > Valentyn > > On Fri, Mar 29, 2024 at 5:06 AM Joey Tran <joey.t...@schrodinger.com> > wrote: > >> I posted a PoC PR [1] for fixing deferred side inputs with combiners in >> the python SDK. Would someone be willing to take a look at it? >> >> I have it working but could use some feedback on where to take it next. >> It looks like bundle processor combiner operations don't currently support >> side inputs [2] so I added a conditional in `CombinePerKey` that checks >> whether it was instantiated with a side input and if so, use a ParDo-based >> version of the combiner so we can piggyback off of the Do operations >> implementation of side inputs rather than reimplementing it for the >> combiner operation. >> >> [1] https://github.com/apache/beam/pull/30743 >> [2] >> https://github.com/apache/beam/blob/e3fee5156b3515f96dc5ba90ea2fbc6f6be2bd34/sdks/python/apache_beam/runners/worker/operations.py#L1146 >> >