[ 
https://issues.apache.org/jira/browse/BEAM-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248295#comment-16248295
 ] 

James Xu commented on BEAM-3171:
--------------------------------

I like the ability to do an external join against a dimension table.

After reading the external join link you referenced, it seems [~kenn] objects 
to implementing it as a new PTransform in the Beam model. Instead, we should 
optimize SideInput so that a Beam program can get values by key, and let the 
runner decide whether to load all the data or query the remote KV store every 
time. I like this solution; it's cleaner.

But until we have the optimized SideInput, there's no harm in having an 
improved external join first.
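
To make the join-as-lookup idea concrete, here is a minimal sketch in plain 
Java. It does not use the Beam API: SeekableTable, InMemoryTable and lookupJoin 
are hypothetical names made up for illustration, and SeekableTable is only a 
stand-in for the proposed BeamSeekableTable, whose real signature may differ. 
The in-memory map takes the place of the remote KV store.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class JoinAsLookupSketch {

    /** Hypothetical stand-in for a seekable table: returns the rows matching a key. */
    public interface SeekableTable<K, V> {
        List<V> seek(K key);
    }

    /** In-memory dimension table; a real one would query a remote KV store. */
    public static class InMemoryTable<K, V> implements SeekableTable<K, V> {
        private final Map<K, List<V>> rows = new HashMap<>();

        public void put(K key, V value) {
            rows.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }

        @Override
        public List<V> seek(K key) {
            return rows.getOrDefault(key, Collections.emptyList());
        }
    }

    /**
     * Inner join expressed as a lookup: for every fact row, seek the dimension
     * rows by key instead of shuffling both sides of the join.
     */
    public static <F, K, V> List<String> lookupJoin(
            List<F> facts, Function<F, K> keyFn, SeekableTable<K, V> dim) {
        List<String> joined = new ArrayList<>();
        for (F fact : facts) {
            for (V dimRow : dim.seek(keyFn.apply(fact))) {
                joined.add(fact + "|" + dimRow);
            }
        }
        return joined;
    }
}
```

Fact rows with no match in the dimension table are dropped, so this behaves 
like an inner join. Whether seek() hits a local cache or a remote store is 
hidden behind the interface, which is roughly the choice the optimized 
SideInput would leave to the runner.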

> convert a join into lookup
> --------------------------
>
>                 Key: BEAM-3171
>                 URL: https://issues.apache.org/jira/browse/BEAM-3171
>             Project: Beam
>          Issue Type: New Feature
>          Components: dsl-sql
>            Reporter: Xu Mingmin
>            Assignee: Xu Mingmin
>              Labels: experimental
>
> We mostly use BeamSQL to run streaming jobs, and have added a join_as_lookup 
> improvement (internal branch) to cover the streaming-to-batch case (similar 
> to [1]). I could submit a PR as experimental if people are interested. 
> The rough solution is: if one source of a join node implements 
> {{BeamSeekableTable}} and the other does not, the join node is converted 
> to a fact-lookup operation.
> Ref:
> [1] 
> https://docs.google.com/document/d/1B-XnUwXh64lbswRieckU0BxtygSV58hysqZbpZmk03A/edit?usp=sharing
>  
> [~xumingming] [~takidau] for any comments



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
