[jira] [Commented] (FLINK-5768) Apply new aggregation functions for datastream and dataset tables

ASF GitHub Bot (JIRA) Tue, 28 Feb 2017 07:33:03 -0800

    [ 
https://issues.apache.org/jira/browse/FLINK-5768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15888222#comment-15888222
 ]


ASF GitHub Bot commented on FLINK-5768:
---------------------------------------

Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/3423#discussion_r103475723
  
    --- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetTumbleTimeWindowAggReduceCombineFunction.scala
 ---
    @@ -28,68 +30,74 @@ import org.apache.flink.types.Row
       * [[org.apache.flink.api.java.operators.GroupCombineOperator]].
       * It is used for tumbling time-window on batch.
       *
    -  * @param rowtimePos The rowtime field index in input row
    -  * @param windowSize Tumbling time window size
    -  * @param windowStartPos The relative window-start field position to the 
last field of output row
    -  * @param windowEndPos The relative window-end field position to the last 
field of output row
    -  * @param aggregates The aggregate functions.
    +  * @param windowSize       Tumbling time window size
    +  * @param windowStartPos   The relative window-start field position to 
the last field of output row
    +  * @param windowEndPos     The relative window-end field position to the 
last field of output row
    +  * @param aggregates       The aggregate functions.
       * @param groupKeysMapping The index mapping of group keys between 
intermediate aggregate Row
       *                         and output Row.
       * @param aggregateMapping The index mapping between aggregate function 
list and aggregated value
       *                         index in output Row.
    -  * @param intermediateRowArity The intermediate row field count
    -  * @param finalRowArity The output row field count
    +  * @param finalRowArity    The output row field count
       */
     class DataSetTumbleTimeWindowAggReduceCombineFunction(
    -    rowtimePos: Int,
         windowSize: Long,
         windowStartPos: Option[Int],
         windowEndPos: Option[Int],
    -    aggregates: Array[Aggregate[_ <: Any]],
    +    aggregates: Array[AggregateFunction[_ <: Any]],
         groupKeysMapping: Array[(Int, Int)],
         aggregateMapping: Array[(Int, Int)],
    -    intermediateRowArity: Int,
         finalRowArity: Int)
       extends DataSetTumbleTimeWindowAggReduceGroupFunction(
    -    rowtimePos,
         windowSize,
         windowStartPos,
         windowEndPos,
         aggregates,
         groupKeysMapping,
         aggregateMapping,
    -    intermediateRowArity,
         finalRowArity)
    -  with CombineFunction[Row, Row] {
    +    with CombineFunction[Row, Row] {
     
       /**
         * For sub-grouped intermediate aggregate Rows, merge all of them into 
aggregate buffer,
         *
    -    * @param records  Sub-grouped intermediate aggregate Rows iterator.
    +    * @param records Sub-grouped intermediate aggregate Rows iterator.
         * @return Combined intermediate aggregate Row.
         *
         */
       override def combine(records: Iterable[Row]): Row = {
     
    -    // initiate intermediate aggregate value.
    -    aggregates.foreach(_.initiate(aggregateBuffer))
    -
    -    // merge intermediate aggregate value to buffer.
         var last: Row = null
    -
         val iterator = records.iterator()
    +    val accumulatorList = Array.fill(aggregates.length) {
    +      new JArrayList[Accumulator]()
    +    }
    +
    +    // per each aggregator, collect its accumulators to a list
         while (iterator.hasNext) {
           val record = iterator.next()
    -      aggregates.foreach(_.merge(record, aggregateBuffer))
    +      for (i <- aggregates.indices) {
    +        accumulatorList(i).add(
    --- End diff --
    
    pairwise merging


> Apply new aggregation functions for datastream and dataset tables
> -----------------------------------------------------------------
>
>                 Key: FLINK-5768
>                 URL: https://issues.apache.org/jira/browse/FLINK-5768
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API & SQL
>            Reporter: Shaoxuan Wang
>            Assignee: Shaoxuan Wang
>
> Apply new aggregation functions for datastream and dataset tables
> This includes:
> 1. Change the implementation of the DataStream aggregation runtime code to 
> use new aggregation functions and aggregate dataStream API.
> 2. DataStream will be always running in incremental mode, as explained in 
> 06/Feb/2017 in FLINK5564.
> 2. Change the implementation of the Dataset aggregation runtime code to use 
> new aggregation functions.
> 3. Clean up unused class and method.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (FLINK-5768) Apply new aggregation functions for datastream and dataset tables

Reply via email to