GitHub user chiwanpark commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1585#discussion_r52098334
  
    --- Diff: flink-java/src/main/java/org/apache/flink/api/java/operators/SortPartitionOperator.java ---
    @@ -79,16 +112,33 @@ public SortPartitionOperator(DataSet<T> dataSet, String sortField, Order sortOrd
         * local partition sorting of the DataSet.
         *
         * @param field The field expression referring to the field of the additional sort order of
    -    *                 the local partition sorting.
    -    * @param order The order  of the additional sort order of the local partition sorting.
    +    *              the local partition sorting.
    +    * @param order The order of the additional sort order of the local partition sorting.
         * @return The DataSet with sorted local partitions.
         */
        public SortPartitionOperator<T> sortPartition(String field, Order order) {
    +           if (useKeySelector) {
    +                   throw new InvalidProgramException("Expression keys cannot be appended after selector function keys");
    +           }
    +
                int[] flatOrderKeys = getFlatFields(field);
                this.appendSorting(flatOrderKeys, order);
                return this;
        }
     
    +   /**
    +    * Appends an additional sort order with the specified field in the specified order to the
    +    * local partition sorting of the DataSet.
    +    *
    +    * @param keyExtractor The KeySelector function which extracts the key value of the additional
    +    *                     sort order of the local partition sorting.
    +    * @param order        The order of the additional sort order of the local partition sorting.
    +    * @return The DataSet with sorted local partitions.
    +    */
    +   public <K> SortPartitionOperator<T> sortPartition(KeySelector<T, K> keyExtractor, Order order) {
    --- End diff --
    
    If we remove this method, the following code can be executed:
    
    ```java
    DataSet<MyObject> data = ...
    DataSet<MyObject> result = data
      .sortPartition(new KeySelector<MyObject, Integer>() {
        @Override
        public Integer getKey(MyObject value) throws Exception {
          return value.myInt;
        }
      }, Order.ASCENDING)
      .sortPartition(new KeySelector<MyObject, String>() {
        @Override
        public String getKey(MyObject value) throws Exception {
          return value.myString;
        }
      }, Order.ASCENDING);
    ```
    
    In the above case, `data` is sorted twice (first by the Integer key, then by the String key), and the result of the first sort is discarded. I think this is quite confusing for users.
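
To make the difference concrete, here is a minimal sketch in plain `java.util`, not Flink. It contrasts appending the String key as a secondary sort order with simply sorting a second time. The class `SortChainSketch` and its methods are hypothetical names used only for illustration, and a stable in-memory list sort stands in for Flink's partition sort, which may behave differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only: models the two possible meanings of a
// chained sort with plain Java collections instead of the Flink DataSet API.
class SortChainSketch {

    static final class MyObject {
        final int myInt;
        final String myString;

        MyObject(int myInt, String myString) {
            this.myInt = myInt;
            this.myString = myString;
        }

        @Override
        public String toString() {
            return myInt + myString;
        }
    }

    static List<MyObject> sample() {
        return Arrays.asList(
            new MyObject(2, "a"), new MyObject(1, "b"), new MyObject(1, "a"));
    }

    // "Append" semantics: myInt is the primary key, myString breaks ties.
    static List<MyObject> appendedKeys(List<MyObject> input) {
        List<MyObject> out = new ArrayList<>(input);
        out.sort(Comparator.comparingInt((MyObject o) -> o.myInt)
            .thenComparing((MyObject o) -> o.myString));
        return out;
    }

    // "Sort twice" semantics: the second sort reorders everything by
    // myString, so the myInt ordering no longer holds for the result.
    static List<MyObject> sortedTwice(List<MyObject> input) {
        List<MyObject> out = new ArrayList<>(input);
        out.sort(Comparator.comparingInt((MyObject o) -> o.myInt));
        out.sort(Comparator.comparing((MyObject o) -> o.myString));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(appendedKeys(sample())); // [1a, 1b, 2a]
        System.out.println(sortedTwice(sample()));  // [1a, 2a, 1b]
    }
}
```

In the second result the `myInt` values come out as 1, 2, 1, so a user who expected the first key to stay the primary sort order would be surprised.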

