[jira] [Commented] (FLINK-3650) Add maxBy/minBy to Scala DataSet API

ASF GitHub Bot (JIRA) Wed, 15 Jun 2016 05:23:22 -0700

    [ https://issues.apache.org/jira/browse/FLINK-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331619#comment-15331619 ]


ASF GitHub Bot commented on FLINK-3650:
---------------------------------------

Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1856#discussion_r67150746
  
    --- Diff: flink-scala/src/main/scala/org/apache/flink/api/scala/DataSet.scala ---
    @@ -699,6 +700,55 @@ class DataSet[T: ClassTag](set: JavaDataSet[T]) {
       }
     
       /**
    +    * Selects an element with minimum value.
    +    *
    +    * The minimum is computed over the specified fields in lexicographical order.
    +    *
    +    * Example 1: Given a data set with elements [0, 1], [1, 0], the
    +    * results will be:
    +    *
    +    * minBy(0)[0, 1]
    +    * minBy(1)[1, 0]
    +    * Example 2: Given a data set with elements [0, 0], [0, 1], the
    +    * results will be:
    +    * minBy(0, 1)[0, 0]
    +    * If multiple values with minimum value at the specified fields exist, a random one will be
    +    * picked.
    +    * Internally, this operation is implemented as a {@link ReduceFunction}.
    +    */
    +  def minBy(fields: Int*) : DataSet[T]  = {
    +    if (!getType.isTupleType) {
    +      throw new InvalidProgramException("DataSet#minBy(int...) only works on Tuple types.")
    +    }
    +
    +    reduce(new SelectByMinFunction[T](getType.asInstanceOf[TupleTypeInfoBase[T]], fields.toArray))
    +  }
    +
    +  /**
    +    * Selects an element with maximum value.
    +    *
    +    * The maximum is computed over the specified fields in lexicographical order.
    +    *
    +    * Example 1: Given a data set with elements [0, 1], [1, 0], the
    +    * results will be:
    +    *
    +    * maxBy(0)[1, 0]
    +    * maxBy(1)[0, 1]
    +    * Example 2: Given a data set with elements [0, 0], [0, 1], the
    +    * results will be:
    +    * maxBy(0, 1)[0, 1]
    +    * If multiple values with maximum value at the specified fields exist, a random one will be
    +    * picked
    +    * Internally, this operation is implemented as a {@link ReduceFunction}.
    +    *
    +    */
    +  def maxBy(fields: Int*) : DataSet[T] = {
    +    if (!getType.isTupleType) {
    +      throw new InvalidProgramException("DataSet#maxBy(int...) only works on Tuple types.")
    +    }
    +    reduce(new SelectByMaxFunction[T](getType.asInstanceOf[TupleTypeInfoBase[T]], fields.toArray))
    +  }
    --- End diff --
    
    Add a new line
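
    For reference, the lexicographic selection that the Scaladoc in the diff describes can be sketched in plain Scala, without Flink. This is only an illustration of the documented semantics, not the code under review: `SelectBySketch`, `compareFields`, `minByFields`, and `maxByFields` are hypothetical names, and the sketch assumes that every selected tuple field holds a `Comparable` value.

```scala
// Plain-Scala sketch of the minBy/maxBy semantics from the Scaladoc in the diff.
// Not Flink code: SelectBySketch and its methods are hypothetical helpers.
object SelectBySketch {

  // Compares two tuples on the given field positions, in order.
  // The first position where the values differ decides; 0 means all are equal.
  // Assumes every selected field holds a Comparable value (e.g. Int, String).
  def compareFields[T <: Product](a: T, b: T, fields: Seq[Int]): Int =
    fields.iterator
      .map { f =>
        a.productElement(f).asInstanceOf[Comparable[Any]].compareTo(b.productElement(f))
      }
      .find(_ != 0)
      .getOrElse(0)

  // Pairwise reduction, loosely mirroring the reduce-based approach in the diff.
  // On a tie this sketch keeps the left element; the Scaladoc only promises
  // that some element with the minimum value is returned.
  def minByFields[T <: Product](data: Seq[T], fields: Int*): T =
    data.reduce((a, b) => if (compareFields(a, b, fields) <= 0) a else b)

  // Same as minByFields, but keeps the larger element of each pair.
  def maxByFields[T <: Product](data: Seq[T], fields: Int*): T =
    data.reduce((a, b) => if (compareFields(a, b, fields) >= 0) a else b)

  def main(args: Array[String]): Unit = {
    val example1 = Seq((0, 1), (1, 0)) // "Example 1" from the Scaladoc
    val example2 = Seq((0, 0), (0, 1)) // "Example 2" from the Scaladoc

    println(minByFields(example1, 0))    // (0,1)
    println(minByFields(example1, 1))    // (1,0)
    println(maxByFields(example1, 0))    // (1,0)
    println(minByFields(example2, 0, 1)) // (0,0)
    println(maxByFields(example2, 0, 1)) // (0,1)
  }
}
```

    The printed values match the examples given in the Scaladoc. With the methods from this diff, the corresponding calls on a tuple data set would be `minBy(0, 1)` and `maxBy(0, 1)`.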


> Add maxBy/minBy to Scala DataSet API
> ------------------------------------
>
>                 Key: FLINK-3650
>                 URL: https://issues.apache.org/jira/browse/FLINK-3650
>             Project: Flink
>          Issue Type: Improvement
>          Components: Java API, Scala API
>    Affects Versions: 1.1.0
>            Reporter: Till Rohrmann
>            Assignee: ramkrishna.s.vasudevan
>
> The stable Java DataSet API contains the API calls {{maxBy}} and {{minBy}}. 
> These methods are not supported by the Scala DataSet API. These methods 
> should be added in order to have a consistent API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
