The PR issued by Shunxin is based on the Windowed Operator and cannot be used if you don't use Beam's windowing semantics. Also, the PR only provides a framework for a generic merge of two input streams into one output stream. We may change the name from "join" to "merge" before we merge that PR, since "join" has a specific meaning and is usually associated with a key.
For specific join behaviors like outer/inner/theta joins, additional layers will be required (i.e. implementing JoinAccumulation) on top of what Shunxin has.

Hope this helps.

David

On Wed, Oct 5, 2016 at 11:05 PM, Thomas Weise <[email protected]> wrote:

> I would recommend you have a look at that PR and participate in the review
> as many of the challenges discussed apply to the work that you are looking
> to take up. For example, the state management based on event time is
> directly applicable to the same stateful transformation that happens in a
> batch job. You could imagine that your batch is a single window and will
> need to accumulate the potentially very large state before emitting the
> result, which is the equivalent of a watermark for the single large window.
>
> Thomas
>
>
> On Wed, Oct 5, 2016 at 10:08 PM, Chaitanya Chebolu <
> [email protected]> wrote:
>
> > Hi David,
> >
> > I am working on Outer Join Using Managed State. This will support event
> > time as well as tuple process time. This covers Left Outer, Right Outer,
> > Full Outer Join. I am planning to start design discussion on dev@apex.
> >
> > But, I have seen a PR (#414
> > <https://github.com/apache/apex-malhar/pull/414>) for windowed Join
> > Operator. I would like to know whether the outer join is also included in
> > this or not.
> >
> > I don't have much idea about this PR. So, I would like to know whether my
> > task is overlapping the Windowed Join Operator or not.
> >
> > Regards,
> > Chaitanya
