[jira] [Updated] (DRILL-3188) Disable row range functionality for window functions

Victoria Markman (JIRA) Tue, 26 May 2015 16:58:57 -0700

     [ 
https://issues.apache.org/jira/browse/DRILL-3188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Victoria Markman updated DRILL-3188:
------------------------------------
    Description: 
We don't support row range with window functions. So we should disable this 
functionality, because currently we return default frame result.
The only frame we currently support is BETWEEN UNBOUNDED PRECEDING  AND CURRENT 
ROW.

If you don't specify frame in Calcite, this is exactly what you get:

{code}
0: jdbc:drill:schema=dfs> explain plan for select a2,b2,c2, cast(count(*) 
over(partition by a2, substr(b2,1,2),c2 order by cast(a2 as double) + 100 ) as 
bigint) from t2 order by a2;
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(a2=[$0], b2=[$1], c2=[$2], EXPR$3=[$3])
00-02        SelectionVectorRemover
00-03          Sort(sort0=[$0], dir0=[ASC])
00-04            Project(a2=[$0], b2=[$1], c2=[$2], EXPR$3=[$5])
00-05              Window(window#0=[window(partition {0, 2, 4} order by [3] 
range between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT()])])
00-06                SelectionVectorRemover
00-07                  Sort(sort0=[$0], sort1=[$2], sort2=[$4], sort3=[$3], 
dir0=[ASC], dir1=[ASC], dir2=[ASC], dir3=[ASC])
00-08                    Project(a2=[$1], b2=[$0], c2=[$2], 
$3=[+(CAST($1):DOUBLE, 100)], $4=[SUBSTR($0, 1, 2)])
00-09                      Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/aggregation/t2]], 
selectionRoot=/drill/testdata/aggregation/t2, numFiles=1, columns=[`a2`, `b2`, 
`c2`]]])
{code}

Row range should be disabled as well:
{code}
0: jdbc:drill:schema=dfs> select * from t2;
+-----+--------+-------------+
| a2  |   b2   |     c2      |
+-----+--------+-------------+
| 0   | zzz    | 2014-12-31  |
| 1   | aaaaa  | 2015-01-01  |
| 2   | bbbbb  | 2015-01-02  |
| 2   | bbbbb  | 2015-01-02  |
| 2   | bbbbb  | 2015-01-02  |
| 3   | ccccc  | 2015-01-03  |
| 4   | ddddd  | 2015-01-04  |
| 5   | eeeee  | 2015-01-05  |
| 6   | fffff  | 2015-01-06  |
| 7   | ggggg  | 2015-01-07  |
| 7   | ggggg  | 2015-01-07  |
| 8   | hhhhh  | 2015-01-08  |
| 9   | iiiii  | 2015-01-09  |
+-----+--------+-------------+
13 rows selected (0.123 seconds)

0: jdbc:drill:schema=dfs> select a2, sum(a2) over(partition by a2 order by a2 
rows between 1 preceding and 1 following ) from t2 order by a2;
+-----+---------+
| a2  | EXPR$1  |
+-----+---------+
| 0   | 0       |
| 1   | 1       |
| 2   | 6       |
| 2   | 6       |
| 2   | 6       |
| 3   | 3       |
| 4   | 4       |
| 5   | 5       |
| 6   | 6       |
| 7   | 14      |
| 7   | 14      |
| 8   | 8       |
| 9   | 9       |
+-----+---------+
13 rows selected (0.2 seconds)
{code}


  was:
We don't support row range with window functions. So we should disable this 
functionality, because currently we return default frame result.

{code}
0: jdbc:drill:schema=dfs> select * from t2;
+-----+--------+-------------+
| a2  |   b2   |     c2      |
+-----+--------+-------------+
| 0   | zzz    | 2014-12-31  |
| 1   | aaaaa  | 2015-01-01  |
| 2   | bbbbb  | 2015-01-02  |
| 2   | bbbbb  | 2015-01-02  |
| 2   | bbbbb  | 2015-01-02  |
| 3   | ccccc  | 2015-01-03  |
| 4   | ddddd  | 2015-01-04  |
| 5   | eeeee  | 2015-01-05  |
| 6   | fffff  | 2015-01-06  |
| 7   | ggggg  | 2015-01-07  |
| 7   | ggggg  | 2015-01-07  |
| 8   | hhhhh  | 2015-01-08  |
| 9   | iiiii  | 2015-01-09  |
+-----+--------+-------------+
13 rows selected (0.123 seconds)

0: jdbc:drill:schema=dfs> select a2, sum(a2) over(partition by a2 order by a2 
rows between 1 preceding and 1 following ) from t2 order by a2;
+-----+---------+
| a2  | EXPR$1  |
+-----+---------+
| 0   | 0       |
| 1   | 1       |
| 2   | 6       |
| 2   | 6       |
| 2   | 6       |
| 3   | 3       |
| 4   | 4       |
| 5   | 5       |
| 6   | 6       |
| 7   | 14      |
| 7   | 14      |
| 8   | 8       |
| 9   | 9       |
+-----+---------+
13 rows selected (0.2 seconds)
{code}



> Disable row range functionality for window functions
> ----------------------------------------------------
>
>                 Key: DRILL-3188
>                 URL: https://issues.apache.org/jira/browse/DRILL-3188
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.0.0
>            Reporter: Victoria Markman
>            Assignee: Jinfeng Ni
>            Priority: Critical
>
> We don't support row range with window functions. So we should disable this 
> functionality, because currently we return default frame result.
> The only frame we currently support is BETWEEN UNBOUNDED PRECEDING  AND 
> CURRENT ROW.
> If you don't specify frame in Calcite, this is exactly what you get:
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select a2,b2,c2, cast(count(*) 
> over(partition by a2, substr(b2,1,2),c2 order by cast(a2 as double) + 100 ) 
> as bigint) from t2 order by a2;
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(a2=[$0], b2=[$1], c2=[$2], EXPR$3=[$3])
> 00-02        SelectionVectorRemover
> 00-03          Sort(sort0=[$0], dir0=[ASC])
> 00-04            Project(a2=[$0], b2=[$1], c2=[$2], EXPR$3=[$5])
> 00-05              Window(window#0=[window(partition {0, 2, 4} order by [3] 
> range between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT()])])
> 00-06                SelectionVectorRemover
> 00-07                  Sort(sort0=[$0], sort1=[$2], sort2=[$4], sort3=[$3], 
> dir0=[ASC], dir1=[ASC], dir2=[ASC], dir3=[ASC])
> 00-08                    Project(a2=[$1], b2=[$0], c2=[$2], 
> $3=[+(CAST($1):DOUBLE, 100)], $4=[SUBSTR($0, 1, 2)])
> 00-09                      Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/aggregation/t2]], 
> selectionRoot=/drill/testdata/aggregation/t2, numFiles=1, columns=[`a2`, 
> `b2`, `c2`]]])
> {code}
> Row range should be disabled as well:
> {code}
> 0: jdbc:drill:schema=dfs> select * from t2;
> +-----+--------+-------------+
> | a2  |   b2   |     c2      |
> +-----+--------+-------------+
> | 0   | zzz    | 2014-12-31  |
> | 1   | aaaaa  | 2015-01-01  |
> | 2   | bbbbb  | 2015-01-02  |
> | 2   | bbbbb  | 2015-01-02  |
> | 2   | bbbbb  | 2015-01-02  |
> | 3   | ccccc  | 2015-01-03  |
> | 4   | ddddd  | 2015-01-04  |
> | 5   | eeeee  | 2015-01-05  |
> | 6   | fffff  | 2015-01-06  |
> | 7   | ggggg  | 2015-01-07  |
> | 7   | ggggg  | 2015-01-07  |
> | 8   | hhhhh  | 2015-01-08  |
> | 9   | iiiii  | 2015-01-09  |
> +-----+--------+-------------+
> 13 rows selected (0.123 seconds)
> 0: jdbc:drill:schema=dfs> select a2, sum(a2) over(partition by a2 order by a2 
> rows between 1 preceding and 1 following ) from t2 order by a2;
> +-----+---------+
> | a2  | EXPR$1  |
> +-----+---------+
> | 0   | 0       |
> | 1   | 1       |
> | 2   | 6       |
> | 2   | 6       |
> | 2   | 6       |
> | 3   | 3       |
> | 4   | 4       |
> | 5   | 5       |
> | 6   | 6       |
> | 7   | 14      |
> | 7   | 14      |
> | 8   | 8       |
> | 9   | 9       |
> +-----+---------+
> 13 rows selected (0.2 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (DRILL-3188) Disable row range functionality for window functions

Reply via email to