[jira] [Commented] (FLINK-1664) Forbid sorting on POJOs
[ https://issues.apache.org/jira/browse/FLINK-1664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394968#comment-14394968 ]

ASF GitHub Bot commented on FLINK-1664:
---

Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/541

Forbid sorting on POJOs
---

Key: FLINK-1664
URL: https://issues.apache.org/jira/browse/FLINK-1664
Project: Flink
Issue Type: Bug
Components: JobManager
Affects Versions: 0.8.0, 0.9
Reporter: Fabian Hueske
Assignee: Fabian Hueske

Flink's groupSort, partitionSort, and outputSort operators allow sorting the partitions or groups of a DataSet. If the sort is defined on a POJO field, the sort order is not well defined. Internally, the POJO is recursively decomposed into atomic fields (primitives or generic types) and sorted by sorting these atomic fields. However, the order in which these atomic fields are compared is not well defined (I believe it is the lexicographic order of the POJO's member names).

IMO, the best approach is to forbid sorting on POJO types for now. Instead, it is always possible to select the nested fields of the POJO that should be used for sorting. Later we can relax this restriction.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
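To illustrate why the order of the atomic fields matters, here is a minimal sketch in plain Java. It does not use any Flink API. The `Point` class, its fields `a` and `b`, and the two comparators are hypothetical and only model the two plausible decomposition orders (declaration order versus lexicographic order of member names).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PojoSortOrderDemo {

    /** Hypothetical POJO with two atomic fields; not taken from Flink. */
    public static class Point {
        public int b; // declared first
        public int a; // declared second

        public Point() {}

        public Point(int b, int a) {
            this.b = b;
            this.a = a;
        }

        @Override
        public String toString() {
            return "(b=" + b + ", a=" + a + ")";
        }
    }

    /** Compares the atomic fields in declaration order: b first, then a. */
    public static final Comparator<Point> DECLARATION_ORDER =
            Comparator.<Point>comparingInt(p -> p.b).thenComparingInt(p -> p.a);

    /** Compares the atomic fields in lexicographic order of their names: a first, then b. */
    public static final Comparator<Point> NAME_ORDER =
            Comparator.<Point>comparingInt(p -> p.a).thenComparingInt(p -> p.b);

    /** Returns a sorted copy of the input; the input list is left unchanged. */
    public static List<Point> sorted(List<Point> in, Comparator<Point> cmp) {
        List<Point> out = new ArrayList<>(in);
        out.sort(cmp);
        return out;
    }

    public static void main(String[] args) {
        List<Point> points = Arrays.asList(new Point(1, 2), new Point(2, 1));
        // The same two records come out in opposite orders.
        System.out.println(sorted(points, DECLARATION_ORDER));
        System.out.println(sorted(points, NAME_ORDER));
    }
}
```

Both comparators are valid "sort by the POJO" orders, yet they disagree on the same input. Selecting the nested fields explicitly, as the issue proposes, removes that ambiguity.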
[jira] [Commented] (FLINK-1664) Forbid sorting on POJOs
[ https://issues.apache.org/jira/browse/FLINK-1664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394070#comment-14394070 ]

ASF GitHub Bot commented on FLINK-1664:
---

Github user hsaputra commented on a diff in the pull request:

    https://github.com/apache/flink/pull/541#discussion_r27715729

    --- Diff: flink-java/src/main/java/org/apache/flink/api/java/operators/DataSink.java ---
    @@ -208,6 +214,28 @@ public DataSink(DataSet<T> data, OutputFormat<T> format, TypeInformation<T> type
     		return this;
     	}

    +	private void isValidSortKeyType(int field) {
    --- End diff --

    There is repeated, near-identical code in the isValidSortKeyType methods that check whether a key is sortable. Could this code be moved to a utility class so that it can be reused/shared?
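As a rough sketch of what such a shared utility might look like: the code below is plain Java and is not the code from the pull request. The class name `SortKeyChecks`, the `TypeKind` enum, and both methods are hypothetical stand-ins. A real implementation would presumably inspect Flink's type information rather than a two-valued enum.

```java
public final class SortKeyChecks {

    /** Simplified, hypothetical stand-in for a key's type category; not a Flink type. */
    public enum TypeKind { ATOMIC, POJO }

    private SortKeyChecks() {
        // Utility class, not meant to be instantiated.
    }

    /** In this sketch, every kind except POJO is treated as having a well-defined order. */
    public static boolean isSortable(TypeKind kind) {
        return kind != TypeKind.POJO;
    }

    /** Throws if the selected key cannot be sorted; callers share this one check. */
    public static void requireSortable(TypeKind kind, String fieldName) {
        if (!isSortable(kind)) {
            throw new IllegalArgumentException(
                    "Selected sort key '" + fieldName + "' is not a sortable type: " + kind);
        }
    }
}
```

Each operator that accepts a sort key could then call `SortKeyChecks.requireSortable(...)` instead of carrying its own copy of the check.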
[jira] [Commented] (FLINK-1664) Forbid sorting on POJOs
[ https://issues.apache.org/jira/browse/FLINK-1664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393170#comment-14393170 ]

ASF GitHub Bot commented on FLINK-1664:
---

Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/541#issuecomment-89010189

    Will rebase and merge this in about 24h unless somebody raises a flag.
[jira] [Commented] (FLINK-1664) Forbid sorting on POJOs
[ https://issues.apache.org/jira/browse/FLINK-1664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384904#comment-14384904 ]

ASF GitHub Bot commented on FLINK-1664:
---

GitHub user fhueske opened a pull request:

    https://github.com/apache/flink/pull/541

    [FLINK-1664] Adds checks if selected sort key is sortable

    - Adds checks that a selected sort key can actually be sorted.
    - The POJO type is defined as non-sortable, because its order would depend on the undefined order of the POJO's fields.
    - Adds a few more tests for the API sort functions.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/fhueske/flink sortOnPojo

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/541.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #541

commit e26d934eb1b2c14298900c53e8413487ce43a17a
Author: Fabian Hueske fhue...@apache.org
Date: 2015-03-27T20:37:59Z

    [FLINK-1664] Adds check if a selected sort key is sortable
[jira] [Commented] (FLINK-1664) Forbid sorting on POJOs
[ https://issues.apache.org/jira/browse/FLINK-1664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355054#comment-14355054 ]

Maximilian Michels commented on FLINK-1664:
---

+1