[jira] [Commented] (IGNITE-14588) Calcite integration. Wrong processing of nested aggregates

2021-05-04 Thread Aleksey Plekhanov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338782#comment-17338782
 ] 

Aleksey Plekhanov commented on IGNITE-14588:


[~korlov] thanks for the review! Merged to sql-calcite branch.

> Calcite integration. Wrong processing of nested aggregates
> --
>
> Key: IGNITE-14588
> URL: https://issues.apache.org/jira/browse/IGNITE-14588
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Reporter: Aleksey Plekhanov
>Assignee: Aleksey Plekhanov
>Priority: Major
>  Labels: calcite
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The wrong plan is created when nested aggregates are used. 
> For example, this query: 
> {{SELECT avg(salary) FROM (SELECT avg(salary) as salary FROM employer UNION 
> ALL SELECT salary FROM employer)}}
> Generates such a plan:
> {noformat}
> IgniteReduceHashAggregate(group=[{}], AVG(SALARY)=[AVG($0)])
>   IgniteExchange(distribution=[single])
> IgniteMapHashAggregate(group=[{}], AVG(SALARY)=[AVG($0)])
>   IgniteUnionAll(all=[true])
> IgniteSingleHashAggregate(group=[{}], SALARY=[AVG($0)])
>   IgniteIndexScan(table=[[PUBLIC, EMPLOYER]], index=[_key_PK], 
> requiredColumns=[{3}])
> IgniteIndexScan(table=[[PUBLIC, EMPLOYER]], index=[_key_PK], 
> requiredColumns=[{3}])
> {noformat}
> With this plan, in subquery data is aggregated locally on nodes and can 
> produce the wrong results.
> For example:
> {code:java}
> @Test
> public void aggregateNested() throws Exception {
> String cacheName = "employer";
> IgniteCache employer = client.getOrCreateCache(new 
> CacheConfiguration()
> .setName(cacheName)
> .setSqlSchema("PUBLIC")
> .setIndexedTypes(Integer.class, Employer.class)
> .setBackups(2)
> );
> awaitPartitionMapExchange(true, true, null);
> List keysNode0 = primaryKeys(grid(0).cache(cacheName), 2);
> List keysNode1 = primaryKeys(grid(1).cache(cacheName), 1);
> employer.putAll(ImmutableMap.of(
> keysNode0.get(0), new Employer("Igor", 1d),
> keysNode0.get(1), new Employer("Roman", 2d) ,
> keysNode1.get(0), new Employer("Nikolay", 3d)
> ));
> QueryEngine engine = Commons.lookupComponent(grid(1).context(), 
> QueryEngine.class);
> List>> qry = engine.query(null, "PUBLIC",
> "SELECT avg(salary) FROM " +
> "(SELECT avg(salary) as salary FROM employer UNION ALL SELECT 
> salary FROM employer)");
> assertEquals(1, qry.size());
> List> rows = qry.get(0).getAll();
> assertEquals(1, rows.size());
> assertEquals(2d, F.first(F.first(rows)));
> }
> {code}
> With this reproducer we should get 2 as a result (avg(1, 2, 3) = 2, avg(2, 1, 
> 2, 3) = 2), but actual result is 2.1 (avg(1, 2) = 1.5, avg (3) = 3, avg(1.5, 
> 3, 1, 2, 3) = 2.1).
> Root cause: default {{passThroughDistribution}} is not suitable for "reduce 
> aggregate" and "single aggregate" nodes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-14588) Calcite integration. Wrong processing of nested aggregates

2021-04-30 Thread Konstantin Orlov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337302#comment-17337302
 ] 

Konstantin Orlov commented on IGNITE-14588:
---

[~alex_pl], now it's all good.

> Calcite integration. Wrong processing of nested aggregates
> --
>
> Key: IGNITE-14588
> URL: https://issues.apache.org/jira/browse/IGNITE-14588
> Project: Ignite
>  Issue Type: Bug
>Reporter: Aleksey Plekhanov
>Assignee: Aleksey Plekhanov
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The wrong plan is created when nested aggregates are used. 
> For example, this query: 
> {{SELECT avg(salary) FROM (SELECT avg(salary) as salary FROM employer UNION 
> ALL SELECT salary FROM employer)}}
> Generates such a plan:
> {noformat}
> IgniteReduceHashAggregate(group=[{}], AVG(SALARY)=[AVG($0)])
>   IgniteExchange(distribution=[single])
> IgniteMapHashAggregate(group=[{}], AVG(SALARY)=[AVG($0)])
>   IgniteUnionAll(all=[true])
> IgniteSingleHashAggregate(group=[{}], SALARY=[AVG($0)])
>   IgniteIndexScan(table=[[PUBLIC, EMPLOYER]], index=[_key_PK], 
> requiredColumns=[{3}])
> IgniteIndexScan(table=[[PUBLIC, EMPLOYER]], index=[_key_PK], 
> requiredColumns=[{3}])
> {noformat}
> With this plan, in subquery data is aggregated locally on nodes and can 
> produce the wrong results.
> For example:
> {code:java}
> @Test
> public void aggregateNested() throws Exception {
> String cacheName = "employer";
> IgniteCache employer = client.getOrCreateCache(new 
> CacheConfiguration()
> .setName(cacheName)
> .setSqlSchema("PUBLIC")
> .setIndexedTypes(Integer.class, Employer.class)
> .setBackups(2)
> );
> awaitPartitionMapExchange(true, true, null);
> List keysNode0 = primaryKeys(grid(0).cache(cacheName), 2);
> List keysNode1 = primaryKeys(grid(1).cache(cacheName), 1);
> employer.putAll(ImmutableMap.of(
> keysNode0.get(0), new Employer("Igor", 1d),
> keysNode0.get(1), new Employer("Roman", 2d) ,
> keysNode1.get(0), new Employer("Nikolay", 3d)
> ));
> QueryEngine engine = Commons.lookupComponent(grid(1).context(), 
> QueryEngine.class);
> List>> qry = engine.query(null, "PUBLIC",
> "SELECT avg(salary) FROM " +
> "(SELECT avg(salary) as salary FROM employer UNION ALL SELECT 
> salary FROM employer)");
> assertEquals(1, qry.size());
> List> rows = qry.get(0).getAll();
> assertEquals(1, rows.size());
> assertEquals(2d, F.first(F.first(rows)));
> }
> {code}
> With this reproducer we should get 2 as a result (avg(1, 2, 3) = 2, avg(2, 1, 
> 2, 3) = 2), but actual result is 2.1 (avg(1, 2) = 1.5, avg (3) = 3, avg(1.5, 
> 3, 1, 2, 3) = 2.1).
> Root cause: default {{passThroughDistribution}} is not suitable for "reduce 
> aggregate" and "single aggregate" nodes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-14588) Calcite integration. Wrong processing of nested aggregates

2021-04-27 Thread Konstantin Orlov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17332999#comment-17332999
 ] 

Konstantin Orlov commented on IGNITE-14588:
---

[~alex_pl], the patch looks good in general, but let's add one more test. Since 
the problem is on planner side, let's add a planner test that verifies Single 
and Reduce nodes could have only SINGLE distribution.

> Calcite integration. Wrong processing of nested aggregates
> --
>
> Key: IGNITE-14588
> URL: https://issues.apache.org/jira/browse/IGNITE-14588
> Project: Ignite
>  Issue Type: Bug
>Reporter: Aleksey Plekhanov
>Assignee: Aleksey Plekhanov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The wrong plan is created when nested aggregates are used. 
> For example, this query: 
> {{SELECT avg(salary) FROM (SELECT avg(salary) as salary FROM employer UNION 
> ALL SELECT salary FROM employer)}}
> Generates such a plan:
> {noformat}
> IgniteReduceHashAggregate(group=[{}], AVG(SALARY)=[AVG($0)])
>   IgniteExchange(distribution=[single])
> IgniteMapHashAggregate(group=[{}], AVG(SALARY)=[AVG($0)])
>   IgniteUnionAll(all=[true])
> IgniteSingleHashAggregate(group=[{}], SALARY=[AVG($0)])
>   IgniteIndexScan(table=[[PUBLIC, EMPLOYER]], index=[_key_PK], 
> requiredColumns=[{3}])
> IgniteIndexScan(table=[[PUBLIC, EMPLOYER]], index=[_key_PK], 
> requiredColumns=[{3}])
> {noformat}
> With this plan, in subquery data is aggregated locally on nodes and can 
> produce the wrong results.
> For example:
> {code:java}
> @Test
> public void aggregateNested() throws Exception {
> String cacheName = "employer";
> IgniteCache employer = client.getOrCreateCache(new 
> CacheConfiguration()
> .setName(cacheName)
> .setSqlSchema("PUBLIC")
> .setIndexedTypes(Integer.class, Employer.class)
> .setBackups(2)
> );
> awaitPartitionMapExchange(true, true, null);
> List keysNode0 = primaryKeys(grid(0).cache(cacheName), 2);
> List keysNode1 = primaryKeys(grid(1).cache(cacheName), 1);
> employer.putAll(ImmutableMap.of(
> keysNode0.get(0), new Employer("Igor", 1d),
> keysNode0.get(1), new Employer("Roman", 2d) ,
> keysNode1.get(0), new Employer("Nikolay", 3d)
> ));
> QueryEngine engine = Commons.lookupComponent(grid(1).context(), 
> QueryEngine.class);
> List>> qry = engine.query(null, "PUBLIC",
> "SELECT avg(salary) FROM " +
> "(SELECT avg(salary) as salary FROM employer UNION ALL SELECT 
> salary FROM employer)");
> assertEquals(1, qry.size());
> List> rows = qry.get(0).getAll();
> assertEquals(1, rows.size());
> assertEquals(2d, F.first(F.first(rows)));
> }
> {code}
> With this reproducer we should get 2 as a result (avg(1, 2, 3) = 2, avg(2, 1, 
> 2, 3) = 2), but actual result is 2.1 (avg(1, 2) = 1.5, avg (3) = 3, avg(1.5, 
> 3, 1, 2, 3) = 2.1).
> Root cause: default {{passThroughDistribution}} is not suitable for "reduce 
> aggregate" and "single aggregate" nodes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)