[ https://issues.apache.org/jira/browse/FLINK-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045800#comment-14045800 ]
Fabian Hueske commented on FLINK-925:
-------------------------------------
The method signatures
{code}
public <K extends Comparable<K>> UnsortedGrouping<T> groupBy(KeySelector<T, K> keyExtractor)
public <K extends Comparable<K>> JoinOperatorSetsPredicate where(KeySelector<I1, K> keySelector)
{code}
restrict K to extend Comparable. This is the restriction that needs to be
removed to enable the use of KeySelectors which return Tuples (for some reason
CoGroup does not have the restriction, so we are good there...).
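To illustrate the effect of the bound, here is a minimal, self-contained sketch. It does not use Flink classes: `SignatureSketch`, `Pair`, `keyBounded`, and `keyRelaxed` are hypothetical names, `Pair` stands in for a two-field Tuple, and `java.util.function.Function` stands in for KeySelector.

```java
import java.util.function.Function;

// Hypothetical sketch, not Flink code: contrasts a bounded and an unbounded key type.
final class SignatureSketch {

    /** Stand-in for a two-field tuple; deliberately does not implement Comparable. */
    static final class Pair<A, B> {
        final A first;
        final B second;

        Pair(A first, B second) {
            this.first = first;
            this.second = second;
        }
    }

    /** Bounded variant, mirroring the restriction in the signatures quoted above. */
    static <T, K extends Comparable<K>> K keyBounded(T value, Function<T, K> keyExtractor) {
        return keyExtractor.apply(value);
    }

    /** Relaxed variant: any key type compiles, so comparability must be checked elsewhere. */
    static <T, K> K keyRelaxed(T value, Function<T, K> keyExtractor) {
        return keyExtractor.apply(value);
    }

    static String demo() {
        // A single String key satisfies the Comparable bound.
        String single = keyBounded("a,b", s -> s.split(",")[0]);

        // A composite key only compiles with the relaxed variant. Passing the same
        // lambda to keyBounded is rejected by the compiler, because Pair is not Comparable.
        Pair<String, String> composite =
                keyRelaxed("a,b", s -> new Pair<>(s.split(",")[0], s.split(",")[1]));

        return single + "|" + composite.first + "|" + composite.second;
    }
}
```

With the bound removed, the compiler no longer guarantees that the key can be compared, which is why a separate check is needed.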
The Tuple data type itself cannot be declared comparable, because whether a
Tuple is comparable depends on the types of its fields. Therefore, when the
program is constructed, we need to check whether we can construct a comparator
for the Tuple type that is returned by a KeySelector.
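A rough sketch of such a check, under stated assumptions: the key type is described by a plain list of field classes, and tuples are represented as Object[]. `TupleComparatorSketch` and `comparatorFor` are hypothetical names, and this is not how Flink's type system does it.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not Flink code: fail early if a composite key cannot be compared.
final class TupleComparatorSketch {

    /**
     * Returns a field-by-field (lexicographic) comparator for tuples given as Object[],
     * or throws if any declared field type does not implement Comparable.
     */
    static Comparator<Object[]> comparatorFor(List<Class<?>> fieldTypes) {
        // The check runs when the comparator is requested, before any data is processed.
        for (int i = 0; i < fieldTypes.size(); i++) {
            Class<?> fieldType = fieldTypes.get(i);
            if (!Comparable.class.isAssignableFrom(fieldType)) {
                throw new IllegalArgumentException(
                        "Key field " + i + " of type " + fieldType.getName() + " is not Comparable");
            }
        }

        final int arity = fieldTypes.size();
        return (left, right) -> {
            for (int i = 0; i < arity; i++) {
                // Safe under the assumption that the values match the declared field types.
                @SuppressWarnings("unchecked")
                int cmp = ((Comparable<Object>) left[i]).compareTo(right[i]);
                if (cmp != 0) {
                    return cmp;
                }
            }
            return 0;
        };
    }
}
```

A key of type (String, Integer) passes the check, while a key containing a plain Object field is rejected with an exception before any comparison happens.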
We need to be a bit careful with statements like "I understood that the
groupBy() has to be working with Tuple datatypes." Are you referring to the
type of the DataSet which is grouped or the type of the grouping key?
FieldPositionKeys can only be specified for Tuple DataSets (other datasets do
not have fields...).
> Support KeySelector function returning Tuples
> ---------------------------------------------
>
> Key: FLINK-925
> URL: https://issues.apache.org/jira/browse/FLINK-925
> Project: Flink
> Issue Type: Improvement
> Affects Versions: 0.6-incubating
> Reporter: Fabian Hueske
> Assignee: Tobias
> Priority: Minor
> Labels: starter
>
> KeySelector functions are used to extract keys on which DataSets can be
> grouped or joined.
> Currently, the key types returned by KeySelector functions are restricted to
> be comparable. However, Flink's Tuple data types are not comparable (because
> this depends on the types of its fields) which makes grouping and joining on
> composite keys difficult.
> We should change the signature of the groupBy(), join(), and coGroup()
> methods to allow also non-comparable keys as return types of a KeySelector
> function.
> Instead we will check at optimization time whether the returned type is
> comparable (which is true for tuples if all elements are comparable).