[
https://issues.apache.org/jira/browse/IGNITE-22328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Iurii Gerzhedovich updated IGNITE-22328:
----------------------------------------
Epic Link: IGNITE-20729
> Improve test coverage for SQL planner optimization for JOIN
> -----------------------------------------------------------
>
> Key: IGNITE-22328
> URL: https://issues.apache.org/jira/browse/IGNITE-22328
> Project: Ignite
> Issue Type: Improvement
> Components: sql
> Reporter: Iurii Gerzhedovich
> Priority: Major
> Labels: ignite-3
>
> During implementation planner optimization for JOIN in SQL ( IGNITE-18749 )
> were added set of tests. However added set of tests is insufficient.
> Let's add the following test scenarios:
> h2. Performance Testing
> # Check bound intersection timings, i.e. between
> MAX_SIZE_OF_JOIN_TO_OPTIMIZE up to MAX_SIZE_OF_JOIN_TO_OPTIMIZE + 1 tables
> joining. Take into account that this approach is applies only to the planning
> *phase* of execution engine thus statement for N joins (with empty data rows
> in table) will need to consume equal time in comparison with an equal query
> but with N+1 joins instead.
> # Check there is no sufficient difference between involved N and N+1 tables.
> We need to store (somehow) performance results of such a checks from build to
> build
> h2. Functional and End-to-End Testing:
> pre requisites:
> Current implementation uses MAX_SIZE_OF_JOIN_TO_OPTIMIZE = 5; constant as a
> threshold for disabling JOIN COMMUTE rules, thus all sql statements need to
> have joins with more than 5 tables\sources.
>
> # All possible joins INNER, OUTER(LEFT, RIGHT), NATURAL, SELF need to be
> checked.
> # Not only tables can be used as a sources for JOIN operations, but
> subqueries(with and without table sources), *system_range* function and
> system views.
> # Due to non optimal plans for some statements are raised - fill distributed
> table (more than 3 nodes) with data step by step (10k 100k and so on rows)
> and run all from p1.
> # Mutate statements to change sequence of join order, i.e. for: ON T1.custId
> = T2.custId also need to be checked: ON T2.custId = T1.custId. Self check:
> you need to obtain two different plans (explain plan for sql statement):
> # Ignite.*Join
> ...
> Ignite.*Scan(table=[[PUBLIC, T1]]
> ...
> Ignite.*Scan(table=[[PUBLIC, T2]]
>
> # Ignite.*Join
> ...
> Ignite.*Scan(table=[[PUBLIC, T2]]
> ...
> Ignite.*Scan(table=[[PUBLIC, T1]]
>
> # Check over > 1000 tables
>
> Functional tests complete successfully if no timeout or any other exceptions
> are defined in a log and all statements (up to 50 different tables\sources)
> are passed.
> Example of SELF join:
> SELECT _column_name(s)_
> FROM _table1 T1, table1 T2_
> WHERE {_}condition{_};
>
> Example of join with subqueries:
> SELECT t1.a, t2.b from t1, (SELECT 1 as b) as t2 where t1.a=t2.b
> SELECT t1.a, t2.b from t1, (SELECT b as b from integers1 where b>1) as t2
> where t1.a=t2.b
--
This message was sent by Atlassian Jira
(v8.20.10#820010)