Hi Felipe,

> I want to implement a join operator which can use different strategies
for joining tuples.
Not all kinds of join strategies can be applied to streaming jobs. Take
sort-merge join as an example, it's impossible to sort an unbounded data.
However, you can perform a window join and use the sort-merge strategy to
join the data within a window. Even though, I'm not sure it's worth to do
it considering the performance.

> Therefore, I am not sure if I will need to implement my own operator to
do this or if it is still possible to do with CoProcessFunction.
You can't implement broadcast join with CoProcessFunction. But you can
implement it with BroadcastProcessFunction or
KeyedBroadcastProcessFunction, more details here[1].

Furthermore, you can take a look at the implementation of both window join
and non-window join in Table API & SQL[2]. The code can be found here[3].
Hope this helps.

Best, Hequn

[1]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/broadcast_state.html
[2]
https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/sql.html#joins
[3]
https://github.com/apache/flink/tree/master/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/runtime/join


On Tue, Aug 13, 2019 at 11:30 PM Felipe Gutierrez <
felipe.o.gutier...@gmail.com> wrote:

> Hi all,
>
> I want to implement a join operator which can use different strategies for
> joining tuples. I saw that with CoProcessFunction I am able to implement
> low-level joins [1]. However, I do know how to decide between different
> algorithms to join my tuples.
>
> On the other hand, to do a broadcast join I will need to use the broadcast
> operator [2] which yields a BroadcastStream. Therefore, I am not sure if I
> will need to implement my own operator to do this or if it is still
> possible to do with CoProcessFunction.
>
> Does anyone have some clues for this matter?
> Thanks
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/stream/operators/process_function.html#low-level-joins
> [2]
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/broadcast_state.html
> *--*
> *-- Felipe Gutierrez*
>
> *-- skype: felipe.o.gutierrez*
> *--* *https://felipeogutierrez.blogspot.com
> <https://felipeogutierrez.blogspot.com>*
>

Reply via email to