[ 
https://issues.apache.org/jira/browse/FLINK-31060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-31060:
----------------------------------
    Description: 
This issue aims to verify FLINK-30542: Support adaptive local hash aggregate in 
runtime.

Adaptive local hash aggregation is an optimization of local hash aggregation 
that decides at runtime whether to keep aggregating locally, based on the 
distinct value rate of the sampled data. If the distinct value rate is higher 
than the configured threshold (see the parameter 
'table.exec.local-hash-agg.adaptive.distinct-value-rate-threshold'), the 
operator stops aggregating and sends the input data downstream after a simple 
projection. Otherwise, it continues to aggregate.
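
The decision described above can be sketched in plain Java. This is only a 
minimal illustration under stated assumptions, not the code Flink generates 
for the operator: the class name, the samplingThreshold parameter and the 
SUM-per-key aggregate are hypothetical, and the distinct value rate is assumed 
to be the number of distinct keys divided by the number of sampled rows.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; all names here are hypothetical, not Flink APIs. */
class AdaptiveLocalAggSketch {
    // Assumption: number of rows to sample before the decision is made.
    private final long samplingThreshold;
    // Stands in for 'table.exec.local-hash-agg.adaptive.distinct-value-rate-threshold'.
    private final double distinctRateThreshold;
    // Local partial aggregate: SUM(value) per key.
    private final Map<String, Long> partialSums = new HashMap<>();
    // Records sent downstream, rendered as "key=value" strings.
    private final List<String> emitted = new ArrayList<>();
    private long seen = 0;
    private boolean bypass = false;

    AdaptiveLocalAggSketch(long samplingThreshold, double distinctRateThreshold) {
        this.samplingThreshold = samplingThreshold;
        this.distinctRateThreshold = distinctRateThreshold;
    }

    void processRow(String key, long value) {
        if (bypass) {
            // Local aggregation was given up: forward the projected row as-is.
            emitted.add(key + "=" + value);
            return;
        }
        partialSums.merge(key, value, Long::sum);
        seen++;
        if (seen == samplingThreshold) {
            // Assumed definition: distinct keys seen / rows sampled.
            double distinctRate = (double) partialSums.size() / seen;
            if (distinctRate > distinctRateThreshold) {
                flush();       // emit what has been aggregated so far
                bypass = true; // stop aggregating from now on
            }
        }
    }

    /** Emits all buffered partial sums, e.g. at end of input. */
    void flush() {
        partialSums.forEach((k, v) -> emitted.add(k + "=" + v));
        partialSums.clear();
    }

    boolean isBypassing() {
        return bypass;
    }

    List<String> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        AdaptiveLocalAggSketch highCardinality = new AdaptiveLocalAggSketch(4, 0.5);
        AdaptiveLocalAggSketch lowCardinality = new AdaptiveLocalAggSketch(4, 0.5);
        for (int i = 0; i < 6; i++) {
            highCardinality.processRow("k" + i, 1L);      // every key is distinct
            lowCardinality.processRow("k" + (i % 2), 1L); // only two distinct keys
        }
        highCardinality.flush();
        lowCardinality.flush();
        System.out.println("high cardinality bypassing: " + highCardinality.isBypassing());
        System.out.println("low cardinality bypassing: " + lowCardinality.isBypassing());
    }
}
```

In this sketch, input with mostly distinct keys makes the local step switch to 
pass-through after sampling, while input with few distinct keys keeps being 
pre-aggregated. This is the difference the verification steps try to observe.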

We can verify it in the SQL client after building the flink-dist package.
 # First, create a source table. (Note: the source table needs to support 
different degrees of aggregation, meaning the distinct count can be controlled 
by the source connector. We recommend modifying the dataGen table source to 
produce data with different numbers of distinct rows.)
 # Verify the result with different distinct value rates. (See: 
table.exec.local-hash-agg.adaptive.distinct-value-rate-threshold)
 # Check the TaskManager ('TM') log to see whether the adaptive local hash 
aggregate works.

If you run into any problems, feel free to ping me directly.

  was:This issue aims to verify FLINK-30542: [Support adaptive local hash 
aggregate in runtime|https://issues.apache.org/jira/browse/FLINK-30542].


> Release Testing: Verify FLINK-30542 Support adaptive local hash aggregate in 
> runtime
> ------------------------------------------------------------------------------------
>
>                 Key: FLINK-31060
>                 URL: https://issues.apache.org/jira/browse/FLINK-31060
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table SQL / Runtime
>    Affects Versions: 1.17.0
>            Reporter: Yunhong Zheng
>            Priority: Major
>             Fix For: 1.17.0
>
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
