[
https://issues.apache.org/jira/browse/DRILL-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14582357#comment-14582357
]
Sean Hsuan-Yi Chu commented on DRILL-3182:
------------------------------------------
https://reviews.apache.org/r/35365/
> Window function with DISTINCT qualifier returns seemingly incorrect result
> --------------------------------------------------------------------------
>
> Key: DRILL-3182
> URL: https://issues.apache.org/jira/browse/DRILL-3182
> Project: Apache Drill
> Issue Type: Bug
> Components: Functions - Drill
> Affects Versions: 1.0.0
> Reporter: Victoria Markman
> Assignee: Sean Hsuan-Yi Chu
> Labels: window_function
> Fix For: 1.1.0
>
>
> When used as window functions, count(distinct <colname>) and
> count(all <colname>) return the same result. This does not look correct to
> me, and I'm not sure what the correct behavior should be.
> (1) The latest Postgres does not support DISTINCT with window functions:
> postgres=# select a2, count(distinct b2) over(partition by a2) from t2;
> ERROR: DISTINCT is not implemented for window functions
> LINE 1: select a2, count(distinct b2) over(partition by a2) from t2;
>                    ^
> postgres=# select a2, avg(distinct a2) over(partition by a2) from t2;
> ERROR: DISTINCT is not implemented for window functions
> LINE 1: select a2, avg(distinct a2) over(partition by a2) from t2;
> (2) Calcite does not support this either:
> https://github.com/apache/incubator-calcite/blob/master/doc/reference.md
> Do we support it? If not, I think we should throw an error.
> {code}
> 0: jdbc:drill:schema=dfs> select * from t2;
> +-----+--------+-------------+
> | a2 | b2 | c2 |
> +-----+--------+-------------+
> | 0 | zzz | 2014-12-31 |
> | 1 | aaaaa | 2015-01-01 |
> | 2 | bbbbb | 2015-01-02 |
> | 2 | bbbbb | 2015-01-02 |
> | 2 | bbbbb | 2015-01-02 |
> | 3 | ccccc | 2015-01-03 |
> | 4 | ddddd | 2015-01-04 |
> | 5 | eeeee | 2015-01-05 |
> | 6 | fffff | 2015-01-06 |
> | 7 | ggggg | 2015-01-07 |
> | 7 | ggggg | 2015-01-07 |
> | 8 | hhhhh | 2015-01-08 |
> | 9 | iiiii | 2015-01-09 |
> +-----+--------+-------------+
> 13 rows selected (0.134 seconds)
> 0: jdbc:drill:schema=dfs> select a2, count(distinct b2) over(partition by a2)
> from t2;
> +-----+---------+
> | a2 | EXPR$1 |
> +-----+---------+
> | 0 | 1 |
> | 1 | 1 |
> | 2 | 3 |
> | 2 | 3 |
> | 2 | 3 |
> | 3 | 1 |
> | 4 | 1 |
> | 5 | 1 |
> | 6 | 1 |
> | 7 | 2 |
> | 7 | 2 |
> | 8 | 1 |
> | 9 | 1 |
> +-----+---------+
> 13 rows selected (0.224 seconds)
> 0: jdbc:drill:schema=dfs> select a2, count(b2) over(partition by a2) from t2;
> +-----+---------+
> | a2 | EXPR$1 |
> +-----+---------+
> | 0 | 1 |
> | 1 | 1 |
> | 2 | 3 |
> | 2 | 3 |
> | 2 | 3 |
> | 3 | 1 |
> | 4 | 1 |
> | 5 | 1 |
> | 6 | 1 |
> | 7 | 2 |
> | 7 | 2 |
> | 8 | 1 |
> | 9 | 1 |
> +-----+---------+
> 13 rows selected (0.219 seconds)
> {code}
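For comparison, here is a minimal sketch of what a per-partition DISTINCT count would presumably return for the t2 data above. It is not Drill code: it uses SQLite through Python's sqlite3 module as a stand-in engine, and it emulates the window by joining each row to a GROUP BY subquery, which is an assumption about the intended semantics. The column aliases cnt_all and cnt_distinct are hypothetical names chosen for illustration.

```python
import sqlite3

# Rows copied from the "select * from t2" output in the issue description.
rows = [
    (0, "zzz", "2014-12-31"),
    (1, "aaaaa", "2015-01-01"),
    (2, "bbbbb", "2015-01-02"),
    (2, "bbbbb", "2015-01-02"),
    (2, "bbbbb", "2015-01-02"),
    (3, "ccccc", "2015-01-03"),
    (4, "ddddd", "2015-01-04"),
    (5, "eeeee", "2015-01-05"),
    (6, "fffff", "2015-01-06"),
    (7, "ggggg", "2015-01-07"),
    (7, "ggggg", "2015-01-07"),
    (8, "hhhhh", "2015-01-08"),
    (9, "iiiii", "2015-01-09"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t2 (a2 INTEGER, b2 TEXT, c2 TEXT)")
conn.executemany("INSERT INTO t2 VALUES (?, ?, ?)", rows)

# Emulate "count(...) over (partition by a2)" without window syntax:
# aggregate once per a2, then join the aggregate back to every row.
# cnt_all and cnt_distinct are illustrative aliases, not Drill output names.
query = """
SELECT t2.a2, g.cnt_all, g.cnt_distinct
FROM t2
JOIN (
    SELECT a2,
           COUNT(b2)          AS cnt_all,
           COUNT(DISTINCT b2) AS cnt_distinct
    FROM t2
    GROUP BY a2
) AS g ON g.a2 = t2.a2
ORDER BY t2.a2
"""
result = conn.execute(query).fetchall()

for a2, cnt_all, cnt_distinct in result:
    print(a2, cnt_all, cnt_distinct)
```

Under this reading, the two counts differ only for a2 = 2 and a2 = 7, the partitions that contain duplicate b2 values. Those are the rows where the Drill output above reports 3 and 2 for both queries.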