Re: Some issues with nested (JSON) queries

Ashwin Jayaprakash Fri, 16 Jan 2015 07:32:38 -0800

Rahul, thanks for your response. (I couldn't figure out a way to reply to
your actual email because I have a digest subscription.)


I think #4, a length function on arrays, would be a very useful feature
(does anyone remember XPath?).

The same goes for #3, predicate pushdown into a nested field. Otherwise the
only way to filter is to flatten/unnest first and then filter the result.
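
For reference, the flatten-then-filter workaround might look roughly like
the sketch below. It is untested. It assumes that a field of the flattened
map can be referenced from an outer query as t.tt.type; the aliases t and
tt are only illustrative, and only the topping array is flattened to keep
the example short.

```sql
-- Untested sketch: flatten in a subquery, then filter in the outer query.
-- Assumption: the flattened map's fields are addressable as t.tt.<field>.
select t.id, t.name, t.tt
from (
  select j.id id, j.name name, flatten(j.topping) tt
  from dfs.root.`/Users/ashwin.jayaprakash/Downloads/apache-drill-0.7.0/sample/test1.json` j
) t
where t.tt.type = 'Sugar';
```

This still materializes every topping row before the filter runs, which is
why pushing the predicate into the nested field would be preferable.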


On Thu, Jan 15, 2015 at 8:27 AM, Ashwin Jayaprakash <
[email protected]> wrote:

> Hello, I was trying to run some queries on a JSON document. I think I may
> have discovered some bugs. I was using Drill 0.7.0.
>
> This is the JSON document (test1.json):
>
> {
>     "id": "0001",
>     "type": "donut",
>     "name": "Cake",
>     "ppu": 0.55,
>     "batters":
>         {
>             "batter":
>                 [
>                     { "id": "1001", "type": "Regular" },
>                     { "id": "1002", "type": "Chocolate" },
>                     { "id": "1003", "type": "Blueberry" },
>                     { "id": "1004", "type": "Devil's Food" }
>                 ]
>         },
>     "topping":
>         [
>             { "id": "5001", "type": "None" },
>             { "id": "5002", "type": "Glazed" },
>             { "id": "5005", "type": "Sugar" },
>             { "id": "5007", "type": "Powdered Sugar" },
>             { "id": "5006", "type": "Chocolate with Sprinkles" },
>             { "id": "5003", "type": "Chocolate" },
>             { "id": "5004", "type": "Maple" }
>         ]
> }
>
>
> 1) I think the parser got confused with the various "type" fields. I think
> this query is valid as "j.type" is "donut" for the one and only row.
> Although there are other "type" fields, I believe my query should have
> worked.
>
> select j.id id, j.name name, flatten(j.topping) tt,
> flatten(j.batters.batter) bb from
> dfs.root.`/Users/ashwin.jayaprakash/Downloads/apache-drill-0.7.0/sample/test1.json`
> j where j.type = 'donut';
> Query failed: Query failed: Failure while running fragment., Trying to
> flatten a non-repeated filed.
>
>
> 2) The parser appears to be automatically converting "id" to a tinyint. I
> suppose this is correct, but wanted your opinion on this.
>
> select j.id id, j.name name, flatten(j.topping) tt,
> flatten(j.batters.batter) bb from
> dfs.root.`/Users/ashwin.jayaprakash/Downloads/apache-drill-0.7.0/sample/test1.json`
> j where id = 'donut';
> Query failed: Query failed: Failure while running fragment., index: -4,
> length: 4 (expected: range(0, 16384))
>
>
> 3) Isn't there a way to filter the records before the flattening happens,
> by specifying that the path "j.topping.type" should only be "Sugar"?
>
> select j.id id, j.name name, flatten(j.topping) tt,
> flatten(j.batters.batter) bb from
> dfs.root.`/Users/ashwin.jayaprakash/Downloads/apache-drill-0.7.0/sample/test1.json`
> j where j.topping.type = 'Sugar';
> Query failed: Query failed: Failure while running fragment.,
> org.apache.drill.exec.vector.complex.RepeatedMapVector cannot be cast to
> org.apache.drill.exec.vector.complex.MapVector
>
> 4) Is there a length function supported on the nested arrays?
>
> 5) There is a spelling mistake in the error message :) "Trying to flatten
> a non-repeated filed." - "filed"
>
> Thanks.
>
