Hi,

Ken's approach of having a joint data type and unioning the streams is
good. This will work seamlessly with checkpoints. Timo (in CC) used the
same approach to implement a prototype of a multi-way join.

A Tuple won't work though because the Tuple serializer does not support
null fields. You can use a Row or implement a custom, Either-like type.
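
As a rough illustration of the Either-like option, here is a minimal sketch for three inputs. The class name JoinInput, the Source enum, and the factory and accessor names are all hypothetical and not part of Flink. The sketch is plain Java and leaves out how Flink would serialize such a type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical Either-like wrapper: carries exactly one non-null value from one of three inputs. */
public final class JoinInput<A, B, C> {

    /** Identifies which input stream the wrapped value came from. */
    public enum Source { FIRST, SECOND, THIRD }

    private final Source source;
    private final Object value; // never null, so no null fields need to be serialized

    private JoinInput(Source source, Object value) {
        this.source = source;
        this.value = Objects.requireNonNull(value, "value");
    }

    public static <A, B, C> JoinInput<A, B, C> ofFirst(A value) {
        return new JoinInput<>(Source.FIRST, value);
    }

    public static <A, B, C> JoinInput<A, B, C> ofSecond(B value) {
        return new JoinInput<>(Source.SECOND, value);
    }

    public static <A, B, C> JoinInput<A, B, C> ofThird(C value) {
        return new JoinInput<>(Source.THIRD, value);
    }

    public Source source() {
        return source;
    }

    @SuppressWarnings("unchecked")
    public A first() {
        require(Source.FIRST);
        return (A) value;
    }

    @SuppressWarnings("unchecked")
    public B second() {
        require(Source.SECOND);
        return (B) value;
    }

    @SuppressWarnings("unchecked")
    public C third() {
        require(Source.THIRD);
        return (C) value;
    }

    // Fails fast if the caller asks for a value this instance does not hold.
    private void require(Source expected) {
        if (source != expected) {
            throw new IllegalStateException("holds " + source + ", not " + expected);
        }
    }

    /** Stand-in for the join logic: dispatches on the source of each element. */
    public static String describe(JoinInput<String, Integer, Double> in) {
        switch (in.source()) {
            case FIRST:
                return "first: " + in.first();
            case SECOND:
                return "second: " + in.second();
            default:
                return "third: " + in.third();
        }
    }

    public static void main(String[] args) {
        // Stand-in for the unioned stream: every element has the same joint type.
        List<JoinInput<String, Integer, Double>> unioned = new ArrayList<>();
        unioned.add(JoinInput.ofFirst("order-1"));
        unioned.add(JoinInput.ofSecond(42));
        unioned.add(JoinInput.ofThird(3.5));
        for (JoinInput<String, Integer, Double> in : unioned) {
            System.out.println(describe(in));
        }
    }
}
```

Each input stream would be mapped to this joint type before the union, and the downstream function would switch on source() to decide which state to update.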

Best, Fabian


TechnoMage <mla...@technomage.com> wrote on Sat, Apr 7, 2018, 17:25:

> Thanks for the Tuple suggestion, I may use that.  I was asking about
> building a custom operator (just an idea).  I have since decided I can
> decompose the problem into pairs of streams and emit a stream to the next
> CoFlatMap to get the result I need.  Now to see if the idea works ...
>
> Michael
>
> > On Apr 7, 2018, at 1:10 PM, Ken Krugler <kkrugler_li...@transpac.com>
> wrote:
> >
> > Hi Michael,
> >
> > There isn’t an operator that takes three (or more) streams, AFAIK.
> >
> > There is a CoFlatMapFunction that takes two different streams in, which
> could be used for some types of joins.
> >
> > Streaming joins are (typically) windowed (bounded), by
> time/count/something, so if you can maintain the required windowed state in
> a ProcessFunction then you can implement whatever custom logic is required
> for your join case.
> >
> > And for creating a unioned stream of multiple data types, one easy way
> is via (e.g.) Tuple3<POJO1, POJO2, POJO3>, where only one of the three
> fields is non-null for each tuple.
> >
> > -- Ken
> >
> > PS - I think the u...@flink.apache.org
> list is probably a better forum for this question.
> >
> >> On Apr 7, 2018, at 10:47 AM, TechnoMage <mla...@technomage.com> wrote:
> >>
> >> In my case I have more elaborate logic to select data from the
> streams.  They are not all the same logical type, though I may be able to
> represent them as the same Java type.  My main question is whether it is
> technically feasible to have a single operator that takes multiple streams
> as input.  For example Operator(stream1, stream2, stream3) and produces an
> output stream.  Can the checkpointing and other logic accommodate this if I
> write sufficient custom code in the operator?
> >>
> >> Michael
> >>
> >>> On Apr 7, 2018, at 10:42 AM, Ken Krugler <kkrugler_li...@transpac.com>
> wrote:
> >>>
> >>> When you say “join” are you talking about a real join (so one or more
> fields can be used as a joining key), or some other operation?
> >>>
> >>> For more than two streams, you can do cascading window joins via
> multiple join()s that reduce your source streams down to a single stream.
> >>>
> >>> If the fields are the same across these streams, then a union()
> followed by say a ProcessFunction that implements your joining logic could
> work.
> >>>
> >>> Or you can convert all the streams to a common tuple format that
> consists of a union of the fields, so you can do a union() and then follow
> that with whatever logic is needed to actually do the join.
> >>>
> >>> Though I’m sure there are more elegant approaches :)
> >>>
> >>> — Ken
> >>>
> >>>
> >>>
> >>>> On Apr 6, 2018, at 5:04 PM, Michael Latta <mla...@technomage.com>
> wrote:
> >>>>
> >>>> I would like to “join” several streams (>3) in a custom operator. Is
> this feasible in Flink?
> >>>>
> >>>>
> >>>> Michael
> >>>
> >>> --------------------------------------------
> >>> http://about.me/kkrugler
> >>> +1 530-210-6378
> >>>
> >>
> >
> >
>
>
