Aleksey Plekhanov created IGNITE-14588:
------------------------------------------
Summary: Calcite integration: Wrong processing of nested aggregates
Key: IGNITE-14588
URL: https://issues.apache.org/jira/browse/IGNITE-14588
Project: Ignite
Issue Type: Bug
Reporter: Aleksey Plekhanov
Assignee: Aleksey Plekhanov
An incorrect plan is created when nested aggregates are used.
For example, this query:
{{SELECT avg(salary) FROM (SELECT avg(salary) as salary FROM employer UNION ALL
SELECT salary FROM employer)}}
generates the following plan:
{noformat}
IgniteReduceHashAggregate(group=[{}], AVG(SALARY)=[AVG($0)])
  IgniteExchange(distribution=[single])
    IgniteMapHashAggregate(group=[{}], AVG(SALARY)=[AVG($0)])
      IgniteUnionAll(all=[true])
        IgniteSingleHashAggregate(group=[{}], SALARY=[AVG($0)])
          IgniteIndexScan(table=[[PUBLIC, EMPLOYER]], index=[_key_PK], requiredColumns=[{3}])
        IgniteIndexScan(table=[[PUBLIC, EMPLOYER]], index=[_key_PK], requiredColumns=[{3}])
{noformat}
With this plan, the subquery aggregate ({{IgniteSingleHashAggregate}}) is computed
locally on each node instead of once over all rows, so the query can produce
wrong results.
For example:
{code:java}
@Test
public void aggregateNested() throws Exception {
    String cacheName = "employer";

    IgniteCache<Integer, Employer> employer = client.getOrCreateCache(
        new CacheConfiguration<Integer, Employer>()
            .setName(cacheName)
            .setSqlSchema("PUBLIC")
            .setIndexedTypes(Integer.class, Employer.class)
            .setBackups(2)
    );

    awaitPartitionMapExchange(true, true, null);

    // Two keys whose primary copy is on node 0, one key whose primary copy is on node 1.
    List<Integer> keysNode0 = primaryKeys(grid(0).cache(cacheName), 2);
    List<Integer> keysNode1 = primaryKeys(grid(1).cache(cacheName), 1);

    employer.putAll(ImmutableMap.of(
        keysNode0.get(0), new Employer("Igor", 1d),
        keysNode0.get(1), new Employer("Roman", 2d),
        keysNode1.get(0), new Employer("Nikolay", 3d)
    ));

    QueryEngine engine = Commons.lookupComponent(grid(1).context(), QueryEngine.class);

    List<FieldsQueryCursor<List<?>>> qry = engine.query(null, "PUBLIC",
        "SELECT avg(salary) FROM " +
        "(SELECT avg(salary) as salary FROM employer UNION ALL SELECT salary FROM employer)");

    assertEquals(1, qry.size());

    List<List<?>> rows = qry.get(0).getAll();

    assertEquals(1, rows.size());
    // Expected: avg(avg(1, 2, 3), 1, 2, 3) = 2.
    assertEquals(2d, F.first(F.first(rows)));
}
{code}
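With this reproducer the expected result is 2: the subquery yields
avg(1, 2, 3) = 2, and the outer query yields avg(2, 1, 2, 3) = 2. The actual
result is 2.1, because the subquery is averaged separately on each node
(avg(1, 2) = 1.5 and avg(3) = 3) and the outer query then computes
avg(1.5, 3, 1, 2, 3) = 2.1.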
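The arithmetic above can be checked with a small standalone sketch. It does not use any Ignite API: the class {{NestedAvgDemo}} and its methods are hypothetical names introduced here for illustration, and each inner list simply models the salaries whose primary copy is on one node, as assumed in the reproducer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Ignite.
public class NestedAvgDemo {
    /** Plain arithmetic mean of a non-empty list. */
    static double avg(List<Double> values) {
        double sum = 0;
        for (double v : values)
            sum += v;
        return sum / values.size();
    }

    /** Expected semantics: the inner AVG is computed once over all rows. */
    static double expected(List<List<Double>> nodes) {
        List<Double> all = new ArrayList<>();
        for (List<Double> node : nodes)
            all.addAll(node);

        // UNION ALL of the single inner average and every salary.
        List<Double> union = new ArrayList<>();
        union.add(avg(all));
        union.addAll(all);
        return avg(union);
    }

    /** Behaviour described in this issue: the inner AVG is computed per node. */
    static double perNode(List<List<Double>> nodes) {
        List<Double> union = new ArrayList<>();
        for (List<Double> node : nodes) {
            if (!node.isEmpty())
                union.add(avg(node)); // One inner average per node.
        }
        for (List<Double> node : nodes)
            union.addAll(node);
        return avg(union);
    }

    public static void main(String[] args) {
        // Node 0 holds salaries 1 and 2, node 1 holds salary 3.
        List<List<Double>> nodes = Arrays.asList(
            Arrays.asList(1d, 2d),
            Arrays.asList(3d));

        System.out.println("expected: " + expected(nodes));
        System.out.println("per-node: " + perNode(nodes));
    }
}
```

The two methods differ only in where the inner average is taken, which is the difference between aggregating the subquery once and aggregating it locally on every node.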
Root cause: the default {{passThroughDistribution}} is not suitable for "reduce
aggregate" and "single aggregate" nodes.