[ https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485014#comment-14485014 ]
ASF GitHub Bot commented on FLINK-703:
--------------------------------------
Github user fhueske commented on the pull request:
https://github.com/apache/flink/pull/572#issuecomment-90865966
Thanks @chiwanpark for the PR!
Using an IdentityKeySelector is not the best solution in this case. A
`KeySelector<X,Y>` transparently converts a `DataSet<X>` into a
`DataSet<Tuple2<Y,X>>` and uses the first tuple field as the key. Since the
KeySelector used here is an IdentityKeySelector, we end up with a
`DataSet<Tuple2<X,X>>`, which unnecessarily doubles the amount of data.
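
To illustrate the point, here is a minimal plain-Java sketch of the wrapping step described above. It does not use the Flink API: `attachKeys` and `IdentityKeyDemo` are hypothetical names, `java.util.function.Function` stands in for the KeySelector, and `Map.Entry` stands in for `Tuple2`. It only shows that with an identity selector every element ends up paired with itself.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class IdentityKeyDemo {

    // Hypothetical stand-in for the key-extraction step: pairs each element
    // with the key produced by the selector, so the key sits in the first slot.
    public static <X, Y> List<Map.Entry<Y, X>> attachKeys(List<X> input,
                                                          Function<X, Y> keySelector) {
        List<Map.Entry<Y, X>> keyed = new ArrayList<>();
        for (X element : input) {
            keyed.add(new SimpleEntry<>(keySelector.apply(element), element));
        }
        return keyed;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("flink", "join", "key");

        // With an identity selector, key and value are the same element,
        // so each record carries its payload twice.
        List<Map.Entry<String, String>> keyed =
                attachKeys(words, Function.identity());

        for (Map.Entry<String, String> pair : keyed) {
            System.out.println(pair.getKey() + " -> " + pair.getValue());
        }
    }
}
```

In this in-memory sketch both slots reference the same object, so nothing is actually copied; the doubling matters once each pair has to be serialized and shipped as a whole record.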
I will look at this PR in detail later and give some feedback on how it
could be improved. Thanks!
> Use complete element as join key.
> ---------------------------------
>
> Key: FLINK-703
> URL: https://issues.apache.org/jira/browse/FLINK-703
> Project: Flink
> Issue Type: Improvement
> Reporter: GitHub Import
> Assignee: Chiwan Park
> Priority: Trivial
> Labels: github-import
> Fix For: pre-apache
>
>
> In some situations such as semi-joins it could make sense to use a complete
> element as join key.
> Currently this can be done using a key-selector function, but we could offer
> a shortcut for that.
> This is not an urgent issue, but might be helpful.
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/issues/703
> Created by: [fhueske|https://github.com/fhueske]
> Labels: enhancement, java api, user satisfaction
> Milestone: Release 0.6 (unplanned)
> Created at: Thu Apr 17 23:40:00 CEST 2014
> State: open