Hi Reuven,
I hadn't investigated that particular one, but looking into it now, it
looks like it is (same as the "classic" join library) built around CoGBK.
Is that correct? If yes, then it essentially means that it:
- works only for cases where both sides have the same WindowFn (that
is a limitation of the Flatten that precedes CoGBK)
- when using the global window, there has to be a trigger, and (afaik)
there is no trigger that would guarantee firing after each data element
(for early panes), because triggers are there to express the
cost-latency tradeoff, not semantics
Moreover, I'd like to define the join semantics so that when there are
available elements from both sides, the fired pane should be ON_TIME,
not EARLY. That essentially means that the fully general case would not
be built around (Co)GBK, but around a stateful ParDo. There are specific
cases where this fully general case "degrades" into forms that can be
efficiently expressed using (Co)GBK, that is true.
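
A minimal sketch of the stateful idea, in plain Python rather than the Beam
API. The class and method names (StatefulJoin, process) are hypothetical and
exist only for illustration. The sketch assumes an inner join on a single
global window and ignores timestamps, pane info and state cleanup. It only
shows that buffering both sides per key lets every arriving element emit its
matches immediately, without relying on a trigger:

```python
from collections import defaultdict


class StatefulJoin:
    """Hypothetical stand-in for a stateful ParDo: keeps per-key buffers
    for both sides and emits joined pairs as soon as an element arrives."""

    def __init__(self):
        # Per-key "state": everything seen so far on each side.
        self.left = defaultdict(list)
        self.right = defaultdict(list)

    def process(self, side, key, value):
        """Buffer the element and return the pairs it completes right now."""
        if side == "left":
            self.left[key].append(value)
            # Pair the new left value with every buffered right value.
            return [(key, value, r) for r in self.right[key]]
        if side == "right":
            self.right[key].append(value)
            # Pair the new right value with every buffered left value.
            return [(key, l, value) for l in self.left[key]]
        raise ValueError("side must be 'left' or 'right'")


join = StatefulJoin()
print(join.process("left", "k", 1))     # nothing to join with yet
print(join.process("right", "k", "a"))  # matches the buffered left value
print(join.process("left", "k", 2))     # matches the buffered right value
```

Over time the emitted pairs add up to the cartesian product of the values
per key, which is what the (Co)GBK-based join produces for a single window.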
Jan
On 11/22/19 6:47 PM, Reuven Lax wrote:
Have you seen the Join library that is part of schemas? I'm curious
whether this fits your needs, or there's something lacking there.
On Fri, Nov 22, 2019 at 12:31 AM Jan Lukavský <[email protected]> wrote:
Hi,
based on the roadmap [1], we would like to define and implement a full set
of (unified) stream-stream joins. That would include:
- joins (left, right, full outer) on global window with "immediate
trigger"
- joins with different windowing functions on left and right side
The approach would be to define these operations in a natural way, so
that the definition is aligned with how current joins work (same
windows, cartesian product of values with same keys, output timestamp
projected to the end of window, etc.). Because this should be a generic
approach, this effort should probably be part of the join library, which
can then be reused by other components, too (e.g. SQL).
The question is: is (or was) there any effort that we can build upon?
Or should this be designed from scratch?
Jan
[1] https://beam.apache.org/roadmap/euphoria/