[
https://issues.apache.org/jira/browse/FLINK-32780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lijie Wang updated FLINK-32780:
-------------------------------
Description:
This issue aims to verify FLIP-324:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-324%3A+Introduce+Runtime+Filter+for+Flink+Batch+Jobs
Runtime filter can be enabled by setting
table.optimizer.runtime-filter.enabled: true
1. Create two tables, one small (little data) and one large (a lot of data),
and run a join query on them (such as the example in the FLIP doc:
SELECT * FROM fact, dim WHERE x = a AND z = 2). The Flink table planner must be
able to obtain statistics for both tables (for example, by using Hive tables).
The data volume of the small table should be less than
"table.optimizer.runtime-filter.max-build-data-size", and the data volume of
the large table should be larger than
"table.optimizer.runtime-filter.min-probe-data-size".
2. Show the plan of the join query. The plan should include nodes such as
LocalRuntimeFilterBuilder, GlobalRuntimeFilterBuilder and RuntimeFilter. The
plans for variants of the above query can be verified in the same way.
3. Execute the above plan, and:
* Check whether the data in the large table has been successfully filtered.
* Verify the execution result: it should be the same as the result of
executing the plan with runtime filter disabled.
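
The two checks in step 3 can be illustrated with a toy model. This is a minimal
sketch in plain Python, not Flink code: the column names x, a and z come from
the FLIP example query, while the rows, the column y and the helper names
(join, runtime_filter, canonical) are made up for illustration. An exact key set
stands in for whatever filter structure the runtime actually builds.

```python
# Toy model of a runtime filter; this is NOT Flink code.
# Columns x, a, z follow the FLIP example query; all rows are invented.

def join(fact, dim):
    """Inner join on fact.x = dim.a, keeping only dim rows with z = 2."""
    dim_by_key = {}
    for d in dim:
        if d["z"] == 2:
            dim_by_key.setdefault(d["a"], []).append(d)
    return [{**f, **d} for f in fact for d in dim_by_key.get(f["x"], [])]

def runtime_filter(fact, dim):
    """Drop fact rows whose join key cannot match any surviving dim row."""
    build_keys = {d["a"] for d in dim if d["z"] == 2}
    return [f for f in fact if f["x"] in build_keys]

def canonical(rows):
    """Order-independent form of a result set, for comparison."""
    return sorted(sorted(r.items()) for r in rows)

# Small build side (dim) and large probe side (fact).
dim = [{"a": 1, "z": 2}, {"a": 2, "z": 3}, {"a": 3, "z": 2}]
fact = [{"x": k % 10, "y": k} for k in range(1000)]

filtered_fact = runtime_filter(fact, dim)
with_filter = join(filtered_fact, dim)
without_filter = join(fact, dim)

# Check 1: the large table was actually filtered.
print(len(fact), len(filtered_fact))  # 1000 200
# Check 2: the result matches the run without the filter.
print(canonical(with_filter) == canonical(without_filter))  # True
```

In a real run the first check would rely on whatever record counts the job
reports for the large table's scan and the RuntimeFilter node, and the second
on comparing the query output with runtime filter enabled and disabled.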
> Release Testing: Verify FLIP-324: Introduce Runtime Filter for Flink Batch
> Jobs
> -------------------------------------------------------------------------------
>
> Key: FLINK-32780
> URL: https://issues.apache.org/jira/browse/FLINK-32780
> Project: Flink
> Issue Type: Sub-task
> Components: Tests
> Affects Versions: 1.18.0
> Reporter: Qingsheng Ren
> Assignee: dalongliu
> Priority: Major
> Fix For: 1.18.0
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)