[
https://issues.apache.org/jira/browse/FLINK-1664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384904#comment-14384904
]
ASF GitHub Bot commented on FLINK-1664:
---------------------------------------
GitHub user fhueske opened a pull request:
https://github.com/apache/flink/pull/541
[FLINK-1664] Adds checks if selected sort key is sortable
- Adds checks that a selected sort key can actually be sorted.
- Defines the POJO type as non-sortable, because the sort order would depend
on the undefined order of the POJO's fields.
- Adds a few more tests for the API sort functions.
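To illustrate the kind of check described above, here is a minimal, self-contained sketch. It is not the code from the pull request: `Kind`, `TypeDesc`, `isSortable`, and `checkSortKey` are hypothetical names, and the rules for atomic and tuple types (atomic types always sortable, tuples sortable only if all of their fields are) are simplifying assumptions. Only the "POJO is not sortable" rule comes from the description.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SortKeyCheck {

    // Hypothetical, simplified type categories (not Flink's real type system).
    enum Kind { ATOMIC, TUPLE, POJO }

    // Hypothetical type descriptor: a kind plus the types of its nested fields.
    static final class TypeDesc {
        final Kind kind;
        final List<TypeDesc> fields;

        TypeDesc(Kind kind, TypeDesc... fields) {
            this.kind = kind;
            this.fields = Collections.unmodifiableList(Arrays.asList(fields));
        }
    }

    // Returns true if values of this type may be used as a sort key.
    static boolean isSortable(TypeDesc type) {
        switch (type.kind) {
            case ATOMIC:
                // Assumption: every atomic type has a well-defined order.
                return true;
            case TUPLE:
                // Assumption: tuple fields are ordered by position, so a tuple
                // is sortable if all of its fields are.
                for (TypeDesc field : type.fields) {
                    if (!isSortable(field)) {
                        return false;
                    }
                }
                return true;
            case POJO:
            default:
                // From the description: the order of POJO fields is undefined.
                return false;
        }
    }

    // Rejects a sort key early instead of sorting in an undefined order.
    static void checkSortKey(TypeDesc type) {
        if (!isSortable(type)) {
            throw new IllegalArgumentException(
                    "Selected sort key is not a sortable type.");
        }
    }

    public static void main(String[] args) {
        TypeDesc atomic = new TypeDesc(Kind.ATOMIC);
        TypeDesc pojo = new TypeDesc(Kind.POJO, atomic, atomic);

        System.out.println("atomic sortable: " + isSortable(atomic));
        System.out.println("pojo sortable:   " + isSortable(pojo));
    }
}
```

Failing at the time the sort key is selected means a job with an ill-defined sort is rejected when the program is built, rather than running and producing an arbitrary order.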
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/fhueske/flink sortOnPojo
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/541.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #541
----
commit e26d934eb1b2c14298900c53e8413487ce43a17a
Author: Fabian Hueske <[email protected]>
Date: 2015-03-27T20:37:59Z
[FLINK-1664] Adds check if a selected sort key is sortable
----
> Forbid sorting on POJOs
> -----------------------
>
> Key: FLINK-1664
> URL: https://issues.apache.org/jira/browse/FLINK-1664
> Project: Flink
> Issue Type: Bug
> Components: JobManager
> Affects Versions: 0.8.0, 0.9
> Reporter: Fabian Hueske
> Assignee: Fabian Hueske
>
> Flink's groupSort, partitionSort, and outputSort operators allow sorting
> partitions or groups of a DataSet.
> If the sort is defined on a POJO field, the sort order is not well defined.
> Internally, the POJO is recursively decomposed into atomic fields (primitives
> or generic types) and sorted by sorting these atomic fields. However, the
> order of these atomic fields is not well defined (I believe it is the
> lexicographic order of the POJO's member names).
> IMO, the best approach is to forbid sorting on POJO types for now. Instead,
> it is always possible to select the nested fields of the POJO that should be
> used for sorting. Later we can relax this restriction.
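
A small sketch of why the order matters, in plain Java rather than the Flink API. The `Person` class and both comparators are hypothetical. The first comparator mimics sorting by the atomic fields in lexicographic order of the member names (the behaviour the report only suspects), the second uses declaration order. The same two records come out in opposite orders:

```java
import java.util.Comparator;

public class PojoSortOrder {

    // Hypothetical POJO with two atomic fields, declared as (name, age).
    static final class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Fields compared in lexicographic order of member names: age, then name.
    static final Comparator<Person> BY_MEMBER_NAME_ORDER =
            Comparator.comparingInt((Person p) -> p.age)
                    .thenComparing((Person p) -> p.name);

    // Fields compared in declaration order: name, then age.
    static final Comparator<Person> BY_DECLARATION_ORDER =
            Comparator.comparing((Person p) -> p.name)
                    .thenComparingInt((Person p) -> p.age);

    public static void main(String[] args) {
        Person alice = new Person("Alice", 40);
        Person bob = new Person("Bob", 30);

        // Positive result: alice sorts after bob (40 > 30).
        System.out.println(BY_MEMBER_NAME_ORDER.compare(alice, bob) > 0);
        // Negative result: alice sorts before bob ("Alice" < "Bob").
        System.out.println(BY_DECLARATION_ORDER.compare(alice, bob) < 0);
    }
}
```

Selecting the nested fields explicitly, as proposed in the issue, corresponds to the caller choosing one of these comparators instead of leaving the field order to the system.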