https://issues.apache.org/jira/browse/DRILL-2012 https://issues.apache.org/jira/browse/DRILL-2013
On Thu, Jan 15, 2015 at 9:51 AM, rahul challapalli < [email protected]> wrote: > Hi Ashwin, > > Thanks for trying out drill. Below are a few of my comments in blue. For > the issues found I will raise relevant jira's. Let us know if you have > further questions. > > > On Thu, Jan 15, 2015 at 8:27 AM, Ashwin Jayaprakash < > [email protected]> wrote: > >> Hello, I was trying to run some queries on a JSON document. I think I may >> have discovered some bugs. I was using Drill 0.7.0. >> >> This is the JSON document (test1.json): >> >> { >> "id": "0001", >> "type": "donut", >> "name": "Cake", >> "ppu": 0.55, >> "batters": >> { >> "batter": >> [ >> { "id": "1001", "type": "Regular" }, >> { "id": "1002", "type": "Chocolate" }, >> { "id": "1003", "type": "Blueberry" }, >> { "id": "1004", "type": "Devil's Food" } >> ] >> }, >> "topping": >> [ >> { "id": "5001", "type": "None" }, >> { "id": "5002", "type": "Glazed" }, >> { "id": "5005", "type": "Sugar" }, >> { "id": "5007", "type": "Powdered Sugar" }, >> { "id": "5006", "type": "Chocolate with Sprinkles" }, >> { "id": "5003", "type": "Chocolate" }, >> { "id": "5004", "type": "Maple" } >> ] >> } >> >> >> 1) I think the parser got confused with the various "type" fields. I think >> this query is valid as "j.type" is "donut" for the one and only row. >> Although there are other "type" fields, I believe my query should have >> worked. >> >> select j.id id, j.name name, flatten(j.topping) tt, >> flatten(j.batters.batter) bb from >> >> dfs.root.`/Users/ashwin.jayaprakash/Downloads/apache-drill-0.7.0/sample/test1.json` >> j where j.type = 'donut'; >> Query failed: Query failed: Failure while running fragment., Trying to >> flatten a non-repeated filed. >> > > This looks like a bug. You can overcome this by adding the filter column > to the list of columns in the 'select' part > *select j.type, j.id <http://j.id> id, j.name <http://j.name> name, > flatten(j.topping) tt, flatten(j.batters.batter) bb from `sample.json` j > where j.type = 'donut'; * > >> >> >> 2) The parser appears to be automatically converting "id" to a tinyint. I >> suppose this is correct, but wanted your opinion on this. >> >> select j.id id, j.name name, flatten(j.topping) tt, >> flatten(j.batters.batter) bb from >> >> dfs.root.`/Users/ashwin.jayaprakash/Downloads/apache-drill-0.7.0/sample/test1.json` >> j where id = 'donut'; >> Query failed: Query failed: Failure while running fragment., index: -4, >> length: 4 (expected: range(0, 16384)) >> >> To my knowledge drill only tries to do an implicit cast based on your > filter(or join) condition. If you compare a column with a string literal, > drill tries to cast the column as a string > The same query works without flatten or by changing your filter condition > so that the filter condition succeeds. So its most likely a flatten related > issue > *select j.id <http://j.id> id, j.name <http://j.name> name from > `sample.json` j where j.id <http://j.id> = 'donut';* > > *select j.id <http://j.id> id, j.name <http://j.name> name, > flatten(j.topping) tt, flatten(j.batters.batter) bb from `user.json` j > where j.id <http://j.id> = '0001';* > > >> >> 3) Isn't there a way to filter the records before the flattening happens >> by >> specifying that the path "j.topping.type" should only be "Sugar". >> >> select j.id id, j.name name, flatten(j.topping) tt, >> flatten(j.batters.batter) bb from >> >> dfs.root.`/Users/ashwin.jayaprakash/Downloads/apache-drill-0.7.0/sample/test1.json` >> j where j.topping.type = 'Sugar'; >> Query failed: Query failed: Failure while running fragment., >> org.apache.drill.exec.vector.complex.RepeatedMapVector cannot be cast to >> org.apache.drill.exec.vector.complex.MapVector >> > > You cannot do something like this as of today. However you can apply a > filter on a particular element in the array. > *select j.id <http://j.id> id, j.name <http://j.name> name, > flatten(j.topping) tt, flatten(j.batters.batter) bb from `user.json` j > where j.topping[2].type = 'Sugar'; * > >> >> 4) Is there a length function supported on the nested arrays? >> > I believe not. Some can correct me if I am wrong. > >> >> 5) There is a spelling mistake in the error message :) "Trying to flatten >> a >> non-repeated filed." - "filed" >> > > I will raise a JIRA to get this fixed. > >> >> Thanks. >> > >
