[ 
https://issues.apache.org/jira/browse/DRILL-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bridget Bevens resolved DRILL-3652.
-----------------------------------
       Resolution: Fixed
    Fix Version/s:     (was: Future)
                   1.2.0

Updated the Usage Notes section in Introduction to Window Functions with a note 
about this. 

> Need to document order of operations with window functions and flatten
> ----------------------------------------------------------------------
>
>                 Key: DRILL-3652
>                 URL: https://issues.apache.org/jira/browse/DRILL-3652
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Documentation
>    Affects Versions: 1.2.0
>            Reporter: Victoria Markman
>              Labels: window_function
>             Fix For: 1.2.0
>
>
> In standard SQL, window functions are the last set of operations performed in 
> a query except for the final order by clause. 
> Using window function with flatten is a bit confusing, because it appears as 
> an operator in the query plan and I expected flatten to run first followed by 
> a window function.
> This is not what is happening:
> {code}
> 0: jdbc:drill:schema=dfs> select * from `complex.json`;
> +----+-----------+----------+
> | x  |     y     |    z     |
> +----+-----------+----------+
> | 5  | a string  | [1,2,3]  |
> +----+-----------+----------+
> 1 row selected (0.128 seconds)
> 0: jdbc:drill:schema=dfs> select sum(x) over(), x , y, flatten(z) from 
> `complex.json`;
> +---------+----+-----------+---------+
> | EXPR$0  | x  |     y     | EXPR$3  |
> +---------+----+-----------+---------+
> | 5       | 5  | a string  | 1       |
> | 5       | 5  | a string  | 2       |
> | 5       | 5  | a string  | 3       |
> +---------+----+-----------+---------+
> 3 rows selected (0.152 seconds)
> 0: jdbc:drill:schema=dfs> explain plan for select sum(x) over(), x , y, 
> flatten(z) from `complex.json`;
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      ProjectAllowDup(EXPR$0=[$0], x=[$1], y=[$2], EXPR$3=[$3])
> 00-02        Project(w0$o0=[$3], x=[$0], y=[$1], EXPR$3=[$4])
> 00-03          Flatten(flattenField=[$4])
> 00-04            Project(EXPR$0=[$0], EXPR$1=[$1], EXPR$2=[$2], EXPR$3=[$3], 
> EXPR$5=[$2])
> 00-05              Project(x=[$1], y=[$2], z=[$3], w0$o0=[$4])
> 00-06                Window(window#0=[window(partition {} order by [] range 
> between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1)])])
> 00-07                  Project(T38¦¦*=[$0], x=[$1], y=[$2], z=[$3])
> 00-08                    Scan(groupscan=[EasyGroupScan 
> [selectionRoot=maprfs:/drill/testdata/subqueries/complex.json, numFiles=1, 
> columns=[`*`], files=[maprfs:///drill/testdata/subqueries/complex.json]]]
> {code}
> We should suggest to users to put flatten in a subquery if they want to run 
> window function on top of the result set returned by flatten.
> {code}
> 0: jdbc:drill:schema=dfs> select x, y, a, sum(x) over() from  ( select x , y, 
> flatten(z) as a from `complex.json`);
> +----+-----------+----+---------+
> | x  |     y     | a  | EXPR$3  |
> +----+-----------+----+---------+
> | 5  | a string  | 1  | 15      |
> | 5  | a string  | 2  | 15      |
> | 5  | a string  | 3  | 15      |
> +----+-----------+----+---------+
> 3 rows selected (0.145 seconds)
> {code}
> I suggest we document this issue in the window function section, perhaps in 
> "Usage notes".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to