dark2momo commented on issue #5179:
URL: https://github.com/apache/paimon/issues/5179#issuecomment-2701054770

   @fantasy2100 @LinMingQiang the case
   +I(1, 1, 1, 1);
   +I(1, 2, 2, 2);
   +I(1, 3, 3, 1);
   will be aggregate in such order
   +I(1, 3, 3, 1);
   +I(1, 1, 1, 1);
   +I(1, 2, 2, 2);
   Thus, the result is +I(1, 6, 2, 2)
   
   I have tested the logic in `org.apache.paimon.flink.PartialUpdateITCase`, 
FYI: 
   ```Java
       @Test
       public void testSequenceGroupAggregationFunction() {
           for (int i = 1; i <= 2; i++) {
               sql(
                       "CREATE TABLE t" + i +
                               "(k INT, vsum INT, vproduct INT," +
                               " vmax INT, vmin INT, vlast_value INT, 
vlast_non_null_value INT, vlistagg VARCHAR(10), " +
                               "vbool_and BOOLEAN, vbool_or BOOLEAN, " +
                               "vfirst_value INT, vfirst_non_null_value INT, " +
                               "vnested_update Array<Row<nested_id INT, 
nested_a INT>>, " +
                               "g_1 INT, " +
                               "PRIMARY KEY (k) NOT ENFORCED" +
                               ") WITH (" +
                               "'merge-engine' = 'partial-update'," +
                               "'fields.vsum.aggregate-function' = 'sum'," +
                               "'fields.vproduct.aggregate-function' = 
'product'," +
                               "'fields.vmax.aggregate-function' = 'max'," +
                               "'fields.vmin.aggregate-function' = 'min'," +
                               "'fields.vlast_value.aggregate-function' = 
'last_value'," +
                               
"'fields.vlast_non_null_value.aggregate-function' = 'last_non_null_value'," +
                               "'fields.vlistagg.aggregate-function' = 
'listagg'," +
                               "'fields.vbool_and.aggregate-function' = 
'bool_and'," +
                               "'fields.vbool_or.aggregate-function' = 
'bool_or'," +
                               "'fields.vfirst_value.aggregate-function' = 
'first_value'," +
                               
"'fields.vfirst_non_null_value.aggregate-function' = 'first_non_null_value'," +
                               "'fields.vnested_update.aggregate-function' = 
'nested_update'," +
                               "'fields.vnested_update.nested-key' = 
'nested_id'," +
                               "'fields.g_1.sequence-group' = 'vsum,vproduct," +
                               
"vmax,vmin,vlast_value,vlast_non_null_value,vlistagg,vbool_and,vbool_or," +
                               "vfirst_value,vfirst_non_null_value," +
                               "vnested_update');");
   
               sql("INSERT INTO t" + i + " VALUES (1, 1, 1, " +
                       "1, 1, 1, 1, '1', " +
                       "true, false," +
                       "1, 1, " +
                       "Array[ROW(1, 1)]," +
                       "1);");
   
               sql("INSERT INTO t" + i + " VALUES (1, 2, 2, " +
                       "2, 2, 2, 2, '2', " +
                       "true, false," +
                       "2, 2, " +
                       "Array[ROW(2, 2)]," +
                       "2);");
           }
   
           sql("INSERT INTO t1 VALUES (1, 4, 4, " +
                   "4, 0, 4, 4, '4', " +
                   "false, true," +
                   "3, 3, " +
                   "Array[ROW(1, 3)]," +
                   "1);");
   
           sql("INSERT INTO t2 VALUES (1, 4, 4, " +
                   "4, 0, 4, 4, '4', " +
                   "false, true," +
                   "3, 3, " +
                   "Array[ROW(1, 3)]," +
                   "CAST(NULL AS INT));");
   
           Row row = Row.of(2, 2);
           assertThat(sql("SELECT vlast_value, vlast_non_null_value FROM 
t1")).containsExactly(row);
           assertThat(sql("SELECT vlast_value, vlast_non_null_value FROM 
t2")).containsExactly(row);
           List<Row> expected = new ArrayList<>();
           expected.add(Row.of(1, 1));
           expected.add(Row.of(2, 2));
           assertThat(sql("SELECT nested_id, nested_a FROM t1, 
UNNEST(vnested_update)"))
                   .containsExactlyInAnyOrderElementsOf(expected);
           assertThat(sql("SELECT nested_id, nested_a FROM t2, 
UNNEST(vnested_update)"))
                   .containsExactlyInAnyOrderElementsOf(expected);
   
           Row t1Row = Row.of(7, 8, 4, 0, "4,1,2", false, true, 3, 3);
           Row t2Row = Row.of(3, 2, 2, 1, "1,2", true, false, 1, 1);
           assertThat(sql("SELECT vsum, vproduct, vmax, vmin, vlistagg, 
vbool_and, " +
                   "vbool_or, vfirst_value, vfirst_non_null_value FROM 
t1")).containsExactly(t1Row);
           assertThat(sql("SELECT vsum, vproduct, vmax, vmin, vlistagg, 
vbool_and, " +
                   "vbool_or, vfirst_value, vfirst_non_null_value FROM 
t2")).containsExactly(t2Row);
       }
   ```
   
   The following logic is consistent across all aggregation functions.
   
   ```text
   The records having same primary-key will be aggregated, and the logic is:
   1. when all of the `sequence-group` fields are NULL, discard the record.
   2. when some of the `sequence-group` fields are not NULL and those non-NULL 
fields valued smaller/greater than that of the preceding row, performs 
aggregation on the specified fields using the aggregate function 
(fields.<field-name>.aggregate-function) after reordering the rows based on the 
sequence-group field (fields.<field-name>.sequence-group).
   3. when some of the `sequence-group` fields are not NULL and those non-NULL 
fields valued equal to that of the preceding row, the corresponding field value 
to aggregate takes precedence over the previous one. A typical example of this 
behavior is the first_value function.
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to