Hi,

This is a very interesting proposal! I read your comment about side inputs and I 
tend to agree, though I think that side inputs don’t have to be strictly 
streams. It’s easy to imagine a Beam where a side input is backed by an 
external system and accessing the side input simply goes through to that 
external system. In this world, it would be somewhat hard to reason about side 
input availability and to make sure that main input is only processed when the 
side input is available. Though it’s not unsolvable, I think.

What I like about your solution is that it is implementable as a DoFn, without 
any special support from the Runners. However, I think that in the Flink Runner 
it should be possible to execute this with the Async I/O operator and therefore 
get asynchronous access to the external system. I also think that this is not 
always better than batching, though.
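
To make the batching vs. async trade-off a bit more concrete, here is a rough 
sketch in plain Python. It does not use Beam or Flink at all: FakeKvStore, 
get_many, get_async, enrich_batched and enrich_async are names I made up for 
illustration, and the in-memory store only stands in for a real external 
system.

```python
import asyncio


class FakeKvStore:
    """Hypothetical in-memory stand-in for an external KV store.

    It counts round trips so the two access patterns can be compared.
    """

    def __init__(self, data):
        self.data = dict(data)
        self.round_trips = 0

    def get_many(self, keys):
        # One round trip serves a whole batch of keys.
        self.round_trips += 1
        return {k: self.data.get(k) for k in keys}

    async def get_async(self, key):
        # One round trip per key, but the caller can overlap many of them.
        self.round_trips += 1
        await asyncio.sleep(0)  # stands in for network latency
        return self.data.get(key)


def enrich_batched(store, elements, batch_size):
    """Buffer elements and look them up batch_size keys at a time."""
    out = []
    for start in range(0, len(elements), batch_size):
        batch = elements[start:start + batch_size]
        values = store.get_many(batch)
        out.extend((k, values[k]) for k in batch)
    return out


async def enrich_async(store, elements):
    """Issue one lookup per element and wait for all of them concurrently."""
    values = await asyncio.gather(*(store.get_async(k) for k in elements))
    return list(zip(elements, values))
```

With five elements and a batch size of two, the batched variant makes three 
round trips while the async variant makes five, although the async ones can 
overlap. Which of the two wins would depend on the latency of the external 
system and on whether it offers a multi-get at all.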

Best,
Aljoscha

> On 3. Jul 2017, at 04:36, JingsongLee <[email protected]> wrote:
> 
> Hi all:
> In some scenarios, the user needs to query some information from an external KV 
> store in the pipeline. I think we can have a good abstraction that lets 
> users deal as little as possible with the underlying details. Here is a doc 
> for this proposal; I would like to receive your feedback.
> https://docs.google.com/document/d/1B-XnUwXh64lbswRieckU0BxtygSV58hysqZbpZmk03A/edit?usp=sharing
> Best, Jingsong Lee
> 
