[
https://issues.apache.org/jira/browse/FLINK-685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Flink Jira Bot updated FLINK-685:
---------------------------------
Labels: github-import stale-minor (was: github-import)
> Add support for semi-joins
> --------------------------
>
> Key: FLINK-685
> URL: https://issues.apache.org/jira/browse/FLINK-685
> Project: Flink
> Issue Type: New Feature
> Components: API / DataSet
> Reporter: GitHub Import
> Assignee: pietro pinoli
> Priority: Minor
> Labels: github-import, stale-minor
>
> A semi-join is essentially a join that acts as a filter. One input is the
> "filtering" input and the other is the "filtered" input.
> A tuple of the "filtered" input is emitted exactly once if the "filtering"
> input has one or more tuples with matching join keys. This means that the
> output of a semi-join has the same type as the "filtered" input and the
> "filtering" input is discarded entirely.
> To support semi-joins, we need to add a physical execution strategy which
> ensures that a tuple of the "filtered" input is emitted only once, even if
> the "filtering" input has more than one tuple with matching keys.
> Furthermore, several optimizations over standard joins are possible, such
> as storing only the keys, rather than the full tuples, of the "filtering"
> input in a hash table.
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/issues/685
> Created by: [fhueske|https://github.com/fhueske]
> Labels: enhancement, java api, runtime,
> Milestone: Release 0.6 (unplanned)
> Created at: Mon Apr 14 12:05:29 CEST 2014
> State: open
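
The semi-join semantics described in the issue can be illustrated with a small
hash-based sketch in plain Java. This is not Flink code and not the proposed
execution strategy. The class SemiJoinSketch and the method semiJoin are
hypothetical names chosen for illustration, and the inputs are assumed to fit
in memory. The sketch only shows the two points the description makes: the
build side keeps keys only, and each "filtered" tuple is emitted at most once
no matter how many "filtering" tuples share its key.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

public class SemiJoinSketch {

    // Hypothetical helper for illustration; not part of any Flink API.
    static <L, R, K> List<L> semiJoin(List<L> filtered,
                                      List<R> filtering,
                                      Function<L, K> filteredKey,
                                      Function<R, K> filteringKey) {
        // Build side: store only the join keys of the "filtering" input.
        // Duplicate keys collapse in the set, so the full tuples are never kept.
        Set<K> keys = new HashSet<>();
        for (R r : filtering) {
            keys.add(filteringKey.apply(r));
        }

        // Probe side: one membership test per "filtered" tuple, so a tuple is
        // emitted at most once regardless of how many build-side matches exist.
        List<L> out = new ArrayList<>();
        for (L l : filtered) {
            if (keys.contains(filteredKey.apply(l))) {
                out.add(l);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "Filtered" input: records of the form "<id>:<name>".
        List<String> orders = List.of("1:apple", "2:pear", "3:plum");
        // "Filtering" input: key 1 appears twice on purpose.
        List<Integer> paidIds = List.of(1, 1, 3);

        List<String> result = semiJoin(
                orders,
                paidIds,
                o -> Integer.parseInt(o.substring(0, o.indexOf(':'))),
                id -> id);

        System.out.println(result); // prints [1:apple, 3:plum]
    }
}
```

A regular inner join on the same inputs would emit "1:apple" twice, because key
1 occurs twice in the "filtering" input. The set-based build side is what
removes that duplication here.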
--
This message was sent by Atlassian Jira
(v8.3.4#803005)