Rahul, thanks for your response. (I couldn't figure out a way to reply to your actual email as I have "digest subscription").
I think #4 length function on arrays will be a very useful feature (does anyone remember XPath?) Same with #3 - predicate pushdown into a nested field. Otherwise the only way to filter would be to flatten/unnest and then filter it. On Thu, Jan 15, 2015 at 8:27 AM, Ashwin Jayaprakash < [email protected]> wrote: > Hello, I was trying to run some queries on a JSON document. I think I may > have discovered some bugs. I was using Drill 0.7.0. > > This is the JSON document (test1.json): > > { > "id": "0001", > "type": "donut", > "name": "Cake", > "ppu": 0.55, > "batters": > { > "batter": > [ > { "id": "1001", "type": "Regular" }, > { "id": "1002", "type": "Chocolate" }, > { "id": "1003", "type": "Blueberry" }, > { "id": "1004", "type": "Devil's Food" } > ] > }, > "topping": > [ > { "id": "5001", "type": "None" }, > { "id": "5002", "type": "Glazed" }, > { "id": "5005", "type": "Sugar" }, > { "id": "5007", "type": "Powdered Sugar" }, > { "id": "5006", "type": "Chocolate with Sprinkles" }, > { "id": "5003", "type": "Chocolate" }, > { "id": "5004", "type": "Maple" } > ] > } > > > 1) I think the parser got confused with the various "type" fields. I think > this query is valid as "j.type" is "donut" for the one and only row. > Although there are other "type" fields, I believe my query should have > worked. > > select j.id id, j.name name, flatten(j.topping) tt, > flatten(j.batters.batter) bb from > dfs.root.`/Users/ashwin.jayaprakash/Downloads/apache-drill-0.7.0/sample/test1.json` > j where j.type = 'donut'; > Query failed: Query failed: Failure while running fragment., Trying to > flatten a non-repeated filed. > > > 2) The parser appears to be automatically converting "id" to a tinyint. I > suppose this is correct, but wanted your opinion on this. > > select j.id id, j.name name, flatten(j.topping) tt, > flatten(j.batters.batter) bb from > dfs.root.`/Users/ashwin.jayaprakash/Downloads/apache-drill-0.7.0/sample/test1.json` > j where id = 'donut'; > Query failed: Query failed: Failure while running fragment., index: -4, > length: 4 (expected: range(0, 16384)) > > > 3) Isn't there a way to filter the records before the flattening happens > by specifying that the path "j.topping.type" should only be "Sugar". > > select j.id id, j.name name, flatten(j.topping) tt, > flatten(j.batters.batter) bb from > dfs.root.`/Users/ashwin.jayaprakash/Downloads/apache-drill-0.7.0/sample/test1.json` > j where j.topping.type = 'Sugar'; > Query failed: Query failed: Failure while running fragment., > org.apache.drill.exec.vector.complex.RepeatedMapVector cannot be cast to > org.apache.drill.exec.vector.complex.MapVector > > 4) Is there a length function supported on the nested arrays? > > 5) There is a spelling mistake in the error message :) "Trying to flatten > a non-repeated filed." - "filed" > > Thanks. >
