[
https://issues.apache.org/jira/browse/BEAM-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969579#comment-15969579
]
Tyler Akidau commented on BEAM-1197:
------------------------------------
A related aspect to consider here is improving the support for temporal joins
via side inputs. [~julianhyde]'s [Streams, joins and temporal
tables|https://docs.google.com/document/d/1RvnLEEQK92axdAaZ9XIU5szpkbGqFMBtzYiIY4dHe0Q/edit]
doc discusses (in a SQL context) what robust semantics here would mean.
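For illustration only, here is a minimal sketch of what "temporal join" semantics could mean: each element is joined against the version of the slowly-changing dataset that was in effect at the element's event time. It is plain Python and not Beam API. The names `VersionedTable`, `add_snapshot`, `as_of` and `temporal_join` are hypothetical, integer timestamps are assumed, and the approach is not taken from the linked doc.

```python
from bisect import bisect_right
from typing import Any, Dict, List, Optional, Tuple


class VersionedTable:
    """Hypothetical in-memory stand-in for a slowly-changing external dataset."""

    def __init__(self) -> None:
        self._times: List[int] = []
        self._snapshots: List[Dict[str, Any]] = []

    def add_snapshot(self, effective_time: int, snapshot: Dict[str, Any]) -> None:
        # Snapshots must arrive in strictly increasing effective_time order
        # so that the binary search in as_of() stays valid.
        if self._times and effective_time <= self._times[-1]:
            raise ValueError("snapshots must be added in increasing time order")
        self._times.append(effective_time)
        self._snapshots.append(dict(snapshot))

    def as_of(self, event_time: int) -> Optional[Dict[str, Any]]:
        # Latest snapshot with effective_time <= event_time, or None if the
        # event predates every known snapshot.
        i = bisect_right(self._times, event_time)
        return self._snapshots[i - 1] if i else None


def temporal_join(
    events: List[Tuple[int, str]], table: VersionedTable
) -> List[Tuple[int, str, Any]]:
    """Join (event_time, key) pairs against the table version valid at event_time."""
    joined = []
    for event_time, key in events:
        snapshot = table.as_of(event_time)
        value = snapshot.get(key) if snapshot is not None else None
        joined.append((event_time, key, value))
    return joined


table = VersionedTable()
table.add_snapshot(0, {"a": 1})
table.add_snapshot(10, {"a": 2, "b": 5})
result = temporal_join([(5, "a"), (10, "a"), (12, "b"), (3, "b")], table)
```

In this sketch an element with event time 5 sees the first snapshot even if it is processed after the second snapshot was loaded. That is the difference between joining on event time and simply reading whatever version of the data happens to be current.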
> Slowly-changing external data as a side input
> ---------------------------------------------
>
> Key: BEAM-1197
> URL: https://issues.apache.org/jira/browse/BEAM-1197
> Project: Beam
> Issue Type: Wish
> Components: beam-model
> Reporter: Eugene Kirpichov
>
> I've repeatedly seen the following pattern: a user wants to join a
> PCollection against a slowly-changing external dataset, e.g. a file on GCS
> or a Bigtable table.
> Side inputs come to mind, but the current side input mechanisms don't allow
> for something like periodically reloading the side input.
> The best hacky solution I came up with for one use case is documented here:
> http://stackoverflow.com/questions/41254028/can-dataflow-sideinput-be-updated-per-window-by-reading-a-gcs-bucket/41271159#41271159
> We need to do better than this.