[
https://issues.apache.org/jira/browse/DRILL-2783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Altekruse updated DRILL-2783:
-----------------------------------
Description:
The flatten function in Drill works great for any level of nesting, however the
applying nested flatten can complicate query syntax as the # of nested levels
increase due to the need to use subqueries, especially when the structure
contains maps of repeated lists.
{code}
Consider JSON below
{
"Id":0,
"level1":[
{
"Id1":1,
"level2":[
{
"Id2":2,
"level3":[
{
"Id3":3,
"Num":1
}
]
}
]
}
]
}
{code}
Below is a query we need to write to flatten the 3 levels.
select count(t2.td.id3) from
(select flatten(t1.tp.level3) as td from
(select flatten(t.ts.level2) as tp from
(select flatten(level1) as ts from
dfs.`default`.`/Users/nrentachintala/Downloads/flattenex.json`) as t) as t1) as
t2;
An abbreviated/intuitive syntax to apply flatten on maps with arrays would make
working with this kind of data much easier.
For ex:
The above query could potentially be rewritten as below.
select flatten(flatten(flatten(level1).level2).level3) as ts from
dfs.`default`.`/Users/nrentachintala/Downloads/flattenex.json`
was:
The flatten function in Drill works great for any level of nesting, however the
applying nested flatten can complicate query syntax as the # of nested levels
increase due to the need to use subqueries, especially when the structure
contains maps of repeated lists.
Consider JSON below
{
"Id":0,
"level1":[
{
"Id1":1,
"level2":[
{
"Id2":2,
"level3":[
{
"Id3":3,
"Num":1
}
]
}
]
}
]
}
Below is a query we need to write to flatten the 3 levels.
select count(t2.td.id3) from
(select flatten(t1.tp.level3) as td from
(select flatten(t.ts.level2) as tp from
(select flatten(level1) as ts from
dfs.`default`.`/Users/nrentachintala/Downloads/flattenex.json`) as t) as t1) as
t2;
An abbreviated/intuitive syntax to apply flatten on maps with arrays would make
working with this kind of data much easier.
For ex:
The above query could potentially be rewritten as below.
select flatten(flatten(flatten(level1).level2).level3) as ts from
dfs.`default`.`/Users/nrentachintala/Downloads/flattenex.json`
> Abbreviated Flatten syntax for multi-level nested elements
> ----------------------------------------------------------
>
> Key: DRILL-2783
> URL: https://issues.apache.org/jira/browse/DRILL-2783
> Project: Apache Drill
> Issue Type: Improvement
> Components: Storage - JSON
> Affects Versions: 0.7.0
> Reporter: Neeraja
> Assignee: Steven Phillips
> Fix For: Future
>
>
> The flatten function in Drill works great for any level of nesting, however
> the applying nested flatten can complicate query syntax as the # of nested
> levels increase due to the need to use subqueries, especially when the
> structure contains maps of repeated lists.
> {code}
> Consider JSON below
> {
> "Id":0,
> "level1":[
> {
> "Id1":1,
> "level2":[
> {
> "Id2":2,
> "level3":[
> {
> "Id3":3,
> "Num":1
> }
> ]
> }
> ]
> }
> ]
> }
> {code}
> Below is a query we need to write to flatten the 3 levels.
> select count(t2.td.id3) from
> (select flatten(t1.tp.level3) as td from
> (select flatten(t.ts.level2) as tp from
> (select flatten(level1) as ts from
> dfs.`default`.`/Users/nrentachintala/Downloads/flattenex.json`) as t) as t1)
> as t2;
> An abbreviated/intuitive syntax to apply flatten on maps with arrays would
> make working with this kind of data much easier.
> For ex:
> The above query could potentially be rewritten as below.
> select flatten(flatten(flatten(level1).level2).level3) as ts from
> dfs.`default`.`/Users/nrentachintala/Downloads/flattenex.json`
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)