Rahul Challapalli created DRILL-2232:
----------------------------------------
Summary: Flatten functionality not well defined when we use
flatten in an order by without projecting it
Key: DRILL-2232
URL: https://issues.apache.org/jira/browse/DRILL-2232
Project: Apache Drill
Issue Type: Bug
Components: Functions - Drill
Reporter: Rahul Challapalli
Assignee: Jason Altekruse
Priority: Critical
git.commit.id.abbrev=3d863b5
Data Set :
{code}
{
"id" : 1,
"lst" : [1,2,3,4]
}
{code}
The query below returns 4 rows instead of 1, even though flatten(lst) appears
only in the order by clause. The expected behavior in this case is not
documented.
{code}
select id from `data.json` where 2 in (select flatten(lst) from `data.json`)
order by flatten(lst);
+------------+
| id |
+------------+
| 1 |
| 1 |
| 1 |
| 1 |
+------------+
{code}
For comparison, the query below also projects flatten(lst) and returns one row
per list element.
{code}
0: jdbc:drill:schema=dfs_eea> select id, flatten(lst) from `temp.json` where 2
in (select flatten(lst) from `temp.json`) order by flatten(lst);
+------------+------------+
| id | EXPR$1 |
+------------+------------+
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 1 | 4 |
+------------+------------+
{code}
We can agree on one of the 3 possibilities when flatten is not projected:
1. Whether or not flatten is in the select list, a flatten in the order by
still expands the result to one record per list element
2. Flatten in the order by clause does not change the number of records we
return
3. Using flatten in an order by (or probably group by) is not supported
Whatever we agree on, we should document it more clearly. Let me know your
thoughts.
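To make the three options concrete, here is a rough Python model of what each
one would return for the sample record above. This is only a sketch of the
proposed semantics, not how Drill implements flatten. The helper names
(flatten_rows, passes_filter, option1, option2, option3) are made up for
illustration, and the filter is simplified to the single-record data set.

```python
# Toy model of the three proposals; not Drill's implementation.
record = {"id": 1, "lst": [1, 2, 3, 4]}


def flatten_rows(rec):
    """Emit one row per list element, mimicking flatten(lst)."""
    return [{"id": rec["id"], "elem": e} for e in rec["lst"]]


def passes_filter(rec):
    """Simplified stand-in for: where 2 in (select flatten(lst) ...)."""
    return 2 in rec["lst"]


def option1(rec):
    """Flatten in the order by expands rows even when it is not projected."""
    if not passes_filter(rec):
        return []
    rows = sorted(flatten_rows(rec), key=lambda r: r["elem"])
    return [r["id"] for r in rows]


def option2(rec):
    """Flatten in the order by never changes the number of records."""
    return [rec["id"]] if passes_filter(rec) else []


def option3(rec):
    """Flatten in the order by is rejected outright."""
    raise ValueError("flatten is not supported in an order by clause")


print(option1(record))  # [1, 1, 1, 1]  (matches the current behavior)
print(option2(record))  # [1]
```

Option 1 matches the 4 rows reported above. Option 2 returns the single row,
but it leaves open which list element would act as the sort key when there is
more than one input record, so that would need to be defined as well.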
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)