Hi Hequn,

I am implementing both the broadcast join and the regular join. As you said, I need different functions for each. My question is more about whether I can have a single operator that decides between broadcast and regular join dynamically. I suppose I will have to extend the generic TwoInputStreamOperator in Flink. Do you have any suggestions?
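To make the question concrete, here is a minimal sketch of the decision itself, written in plain Java and deliberately not against the Flink API. Everything in it (AdaptiveJoinSketch, choose, the size threshold, the round-robin split) is a hypothetical name or assumption for illustration only; it does not show how a TwoInputStreamOperator would be wired up. It only shows that the two strategies differ in how rows are distributed across parallel instances, while producing the same join result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the strategy decision; none of these names are Flink API.
class AdaptiveJoinSketch {

    enum Strategy { BROADCAST, REPARTITION }

    // Hypothetical rule: broadcast only when the small side is under a threshold.
    static Strategy choose(int smallSideSize, int broadcastThreshold) {
        return smallSideSize <= broadcastThreshold ? Strategy.BROADCAST : Strategy.REPARTITION;
    }

    // Hash join inside one partition. Each row is {key, value}.
    static List<String> hashJoin(List<String[]> build, List<String[]> probe) {
        Map<String, List<String>> table = new HashMap<>();
        for (String[] b : build) {
            table.computeIfAbsent(b[0], k -> new ArrayList<>()).add(b[1]);
        }
        List<String> out = new ArrayList<>();
        for (String[] p : probe) {
            for (String v : table.getOrDefault(p[0], Collections.emptyList())) {
                out.add(p[0] + ":" + p[1] + ":" + v);
            }
        }
        return out;
    }

    // Split rows into n partitions, either by key hash or round-robin.
    static List<List<String[]>> partition(List<String[]> rows, int n, boolean byKey) {
        List<List<String[]>> parts = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            parts.add(new ArrayList<>());
        }
        for (int i = 0; i < rows.size(); i++) {
            int idx = byKey ? Math.floorMod(rows.get(i)[0].hashCode(), n) : i % n;
            parts.get(idx).add(rows.get(i));
        }
        return parts;
    }

    // Join `large` with `small` over n simulated parallel instances.
    static List<String> join(List<String[]> large, List<String[]> small, int n, int threshold) {
        Strategy s = choose(small.size(), threshold);
        boolean repartition = s == Strategy.REPARTITION;
        // Regular join: both sides are partitioned by key.
        // Broadcast join: the large side stays spread arbitrarily, the small side is replicated.
        List<List<String[]>> largeParts = partition(large, n, repartition);
        List<List<String[]>> smallParts = repartition ? partition(small, n, true) : null;
        List<String> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            List<String[]> build = repartition ? smallParts.get(i) : small;
            out.addAll(hashJoin(build, largeParts.get(i)));
        }
        Collections.sort(out); // sorted only so the two strategies can be compared
        return out;
    }

    public static void main(String[] args) {
        List<String[]> large = Arrays.asList(
                new String[]{"a", "o1"}, new String[]{"b", "o2"},
                new String[]{"a", "o3"}, new String[]{"c", "o4"});
        List<String[]> small = Arrays.asList(
                new String[]{"a", "x"}, new String[]{"b", "y"});
        System.out.println(choose(small.size(), 10) + " " + join(large, small, 2, 10));
        System.out.println(choose(small.size(), 1) + " " + join(large, small, 2, 1));
    }
}
```

In this toy model the choice is made once per call from a known size. On unbounded streams the size of either side is not known up front, so a real operator would need some running estimate, which is exactly the part this sketch leaves open.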
Thanks

On Wed, 14 Aug 2019, 03:59 Hequn Cheng, <chenghe...@gmail.com> wrote:

> Hi Felipe,
>
> > I want to implement a join operator which can use different strategies
> > for joining tuples.
> Not all kinds of join strategies can be applied to streaming jobs. Take
> sort-merge join as an example: it's impossible to sort unbounded data.
> However, you can perform a window join and use the sort-merge strategy to
> join the data within a window. Even so, I'm not sure it's worth doing,
> considering the performance.
>
> > Therefore, I am not sure if I will need to implement my own operator to
> > do this or if it is still possible to do with CoProcessFunction.
> You can't implement broadcast join with CoProcessFunction. But you can
> implement it with BroadcastProcessFunction or
> KeyedBroadcastProcessFunction, more details here[1].
>
> Furthermore, you can take a look at the implementation of both window join
> and non-window join in Table API & SQL[2]. The code can be found here[3].
> Hope this helps.
>
> Best, Hequn
>
> [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/broadcast_state.html
> [2] https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/sql.html#joins
> [3] https://github.com/apache/flink/tree/master/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/runtime/join
>
> On Tue, Aug 13, 2019 at 11:30 PM Felipe Gutierrez <felipe.o.gutier...@gmail.com> wrote:
>
>> Hi all,
>>
>> I want to implement a join operator which can use different strategies
>> for joining tuples. I saw that with CoProcessFunction I am able to
>> implement low-level joins [1]. However, I do not know how to decide between
>> different algorithms to join my tuples.
>>
>> On the other hand, to do a broadcast join I will need to use the
>> broadcast operator [2], which yields a BroadcastStream. Therefore, I am not
>> sure if I will need to implement my own operator to do this or if it is
>> still possible to do with CoProcessFunction.
>>
>> Does anyone have some clues for this matter?
>> Thanks
>>
>> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/stream/operators/process_function.html#low-level-joins
>> [2] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/broadcast_state.html
>>
>> --
>> -- Felipe Gutierrez
>> -- skype: felipe.o.gutierrez
>> -- https://felipeogutierrez.blogspot.com