Re: Limited join with stop condition

2019-10-11 Thread Alexey Romanenko
Many thanks for your ideas, everybody, I really appreciate it. I’m going to play with Stateful DoFn and see if it will work for us.

> And I have to ask, though, can you build indices instead of brute force for
> the join?

Answering your question, Kenn. Yes, potentially, we can build indices for

Re: Limited join with stop condition

2019-10-10 Thread Reza Rokni
Hi, Agreed with the others that this does not sound like a good fit... But to explore ideas... One possible (complicated and error-prone) way this could be done, ... Beam does not support cycles, but you could use an external unbounded source as a way of sending an impulse out and then back into

Re: Limited join with stop condition

2019-10-10 Thread Kenneth Knowles
Interesting! I agree with Luke that it seems not a great fit for Beam in the most rigorous sense. There are many considerations: 1. We assume ParDo has side effects by default. So the model actually *requires* eager evaluation, not lazy, in order to make all the side effects happen. But for your

Re: Limited join with stop condition

2019-10-10 Thread Luke Cwik
This doesn't seem like a good fit for Apache Beam, but have you tried:
* using a StatefulDoFn that performs all the joining and signals the service powering the sources to stop sending data once your criteria are met (most services powering these sources won't have a way to be controlled this way)?
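
To make the stateful-join idea a bit more concrete, here is a rough plain-Python sketch of the logic such a StatefulDoFn might carry. It is not Beam code and not anything proposed in the thread: `limited_join` and `limit` are hypothetical names, and the stop condition (a fixed number of matched pairs) is an assumption, since the actual criteria are not spelled out here. The per-key dictionaries stand in for per-key state, and returning early stands in for signalling the sources to stop.

```python
from collections import defaultdict
from itertools import zip_longest


def limited_join(left, right, limit):
    """Inner-join two iterables of (key, value) pairs, stopping after `limit` matches.

    Hypothetical helper: both inputs are read in alternation and buffered per key,
    so reading stops as soon as enough matches have been produced.
    Assumes no input element is None (None marks an exhausted side).
    """
    results = []
    if limit <= 0:
        return results

    left_seen = defaultdict(list)   # values seen so far on the left, per key
    right_seen = defaultdict(list)  # values seen so far on the right, per key

    for l_item, r_item in zip_longest(left, right):
        if l_item is not None:
            key, value = l_item
            left_seen[key].append(value)
            # Match the new left value against everything buffered on the right.
            for other in right_seen[key]:
                results.append((key, value, other))
                if len(results) >= limit:
                    return results  # stop condition met: stop reading inputs
        if r_item is not None:
            key, value = r_item
            right_seen[key].append(value)
            # Match the new right value against everything buffered on the left.
            for other in left_seen[key]:
                results.append((key, other, value))
                if len(results) >= limit:
                    return results
    return results


if __name__ == "__main__":
    left = [(1, "a"), (2, "b"), (3, "c")]
    right = [(2, "x"), (1, "y"), (3, "z")]
    print(limited_join(left, right, limit=2))
```

In a real pipeline the early return has no direct equivalent, which is the caveat raised above: the DoFn can stop emitting output, but the sources keep sending data unless the service behind them can be told to stop.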

Limited join with stop condition

2019-10-10 Thread Alexey Romanenko
Hello, We have a use case and it's not clear how it can be solved/implemented with Beam. I am counting on the community's help with this, maybe I am missing something that lies on the surface. Let’s say, there are two different bounded sources and one join transform (say GBK) downstream. This Join transform is