Andrew Mashenkov commented on IGNITE-3448:

Distinct aggregates are split incorrectly when the GROUP BY is not collocated. 
We miss the uniqueness check across nodes on the Map step, and as a consequence 
we get wrong numbers on the Reduce step.
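
To illustrate the failure mode, here is a minimal standalone sketch (plain Java, 
no Ignite API). It uses the keys and values from the reproducer in the issue 
description. The placement rule "key % 3" is only an assumption made for the 
example, it is not Ignite's real affinity function, and the class and method 
names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DistinctCountDemo {
    /** Keys and values taken from the reproducer in the issue description. */
    static final int[] KEYS = {0, 3, 5, 9, 1, 15, 8, 2, 12, 4, 6, 7, 10, 11, 13, 14};
    static final String[] VALS = {"v1", "v1", "v1", "v1", "v3", "v3", "v3", "v5",
        "v5", "v2", "v2", "v6", "v7", "v8", "v4", "v4"};

    /** Distinct values held by each node. ASSUMPTION: entry lives on node (key % nodes). */
    static List<Set<String>> distinctPerNode(int nodes) {
        List<Set<String>> res = new ArrayList<>();

        for (int i = 0; i < nodes; i++)
            res.add(new HashSet<>());

        for (int i = 0; i < KEYS.length; i++)
            res.get(KEYS[i] % nodes).add(VALS[i]);

        return res;
    }

    /** Broken plan: every node counts its own distinct values, the reducer sums the counts. */
    static int sumOfMapCounts(int nodes) {
        int sum = 0;

        for (Set<String> nodeVals : distinctPerNode(nodes))
            sum += nodeVals.size();

        return sum;
    }

    /** Correct plan: nodes send values, the reducer removes duplicates across nodes. */
    static int countOnReduce(int nodes) {
        Set<String> all = new HashSet<>();

        for (Set<String> nodeVals : distinctPerNode(nodes))
            all.addAll(nodeVals);

        return all.size();
    }

    public static void main(String[] args) {
        System.out.println(sumOfMapCounts(3)); // 14 under the assumed placement
        System.out.println(countOnReduce(3));  // 8, the number of distinct values
    }
}
```

A value stored on several nodes is counted once per node by the first method, 
which is why the sum exceeds the real number of distinct values.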

First, we check whether the query has at least one aggregate with DISTINCT. 
If so, we cannot propagate aggregates to the map step queries and must compute 
them on the reduce step instead. Same goes for grouping if non-collocated groups are 

DISTINCT in MIN and MAX aggregates does not affect the result, but MIN and MAX 
should be processed like the other aggregates if the query contains another 
(non-MIN, non-MAX) aggregate with DISTINCT.
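
The rule above can be sketched as a small decision function. This is only my 
reading of the rule written as standalone Java, not the actual splitter code: 
the types Func and Agg, the method names and the collocatedGroupBy flag are all 
hypothetical and exist only for the example.

```java
import java.util.Arrays;
import java.util.List;

public class AggregateSplitSketch {
    /** Hypothetical set of aggregate functions. */
    enum Func { COUNT, SUM, AVG, MIN, MAX }

    /** Hypothetical description of one aggregate in the SELECT list. */
    static final class Agg {
        final Func func;
        final boolean distinct;

        Agg(Func func, boolean distinct) {
            this.func = func;
            this.distinct = distinct;
        }
    }

    /** DISTINCT changes the result for every aggregate except MIN and MAX. */
    static boolean distinctMatters(Agg agg) {
        return agg.distinct && agg.func != Func.MIN && agg.func != Func.MAX;
    }

    /**
     * Returns true if all aggregates (MIN and MAX included) have to be computed
     * on the reduce step. ASSUMPTION: a collocated GROUP BY keeps map-side
     * aggregation safe, because each group lives on a single node.
     */
    static boolean computeOnReduceOnly(List<Agg> aggs, boolean collocatedGroupBy) {
        if (collocatedGroupBy)
            return false;

        for (Agg agg : aggs) {
            if (distinctMatters(agg))
                return true;
        }

        return false;
    }

    public static void main(String[] args) {
        List<Agg> onlyMin = Arrays.asList(new Agg(Func.MIN, true));
        List<Agg> minAndCnt = Arrays.asList(new Agg(Func.MIN, true), new Agg(Func.COUNT, true));

        System.out.println(computeOnReduceOnly(onlyMin, false));   // false
        System.out.println(computeOnReduceOnly(minAndCnt, false)); // true
    }
}
```

The decision is made once per query, so a single DISTINCT aggregate that 
matters moves every aggregate in that query to the reduce step.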

I've fixed the issue. Please let me know if I missed something.

> Wrong count returned by count distinct and similar queries.
> -----------------------------------------------------------
>                 Key: IGNITE-3448
>                 URL: https://issues.apache.org/jira/browse/IGNITE-3448
>             Project: Ignite
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.7
>            Reporter: Alexei Scherbakov
>            Assignee: Andrew Mashenkov
>             Fix For: 1.8
> Partitioned cache is deployed on 3 nodes.
> The code below outputs incorrect counts: 
> 14
> 14
> {code}
>        IgniteCache<Integer, Value> cache = grid(0).cache(null);
>         cache.put(0, new Value("v1"));
>         cache.put(3, new Value("v1"));
>         cache.put(5, new Value("v1"));
>         cache.put(9, new Value("v1"));
>         cache.put(1, new Value("v3"));
>         cache.put(15, new Value("v3"));
>         cache.put(8, new Value("v3"));
>         cache.put(2, new Value("v5"));
>         cache.put(12, new Value("v5"));
>         cache.put(4, new Value("v2"));
>         cache.put(6, new Value("v2"));
>         cache.put(7, new Value("v6"));
>         cache.put(10, new Value("v7"));
>         cache.put(11, new Value("v8"));
>         cache.put(13, new Value("v4"));
>         cache.put(14, new Value("v4"));
>         QueryCursor<List<?>> qry = cache.query(new SqlFieldsQuery("select count(distinct str) from Value"));
>         for (List<?> objects : qry)
>             System.out.println(objects.get(0));
>         qry = cache.query(new SqlFieldsQuery("select count(*) from (select 1 from Value group by str)"));
>         for (List<?> objects : qry)
>             System.out.println(objects.get(0));
> {code}
