Actually I agree with Jiang. The result does seem unintuitive. If it is a
file with just a list, it does still make sense to return the ids in that
list as an array unless the user has configured Drill to automatically
flatten the first level.
Does anyone know how does the other systems behave for this use case? (for
ex: Mongo)



On Thu, Mar 10, 2016 at 4:21 PM, Nathan Griffith <[email protected]>
wrote:

> Hi Jiang,
>
> Think of it this way: If you had a file that was just the list:
>
> {"id":"1001","type":"Regular"}
> {"id":"1002","type":"Chocolate"}
> {"id":"1003","type":"Blueberry"}
> {"id":"1004","type":"Devil's Food"}
>
> What would you like it to return when you query:
>
> select id from dfs.`/path/to/sample_file.json`;
>
> ?
>
> When you enter the query that you're asking about, you're indicating
> exactly that structure of data. Does this explanation make sense?
>
> Best,
> Nathan
>
> On Thu, Mar 10, 2016 at 4:07 PM, Jiang Wu <[email protected]> wrote:
> > Drill version: 1.4.0.  Assuming 3 JSON objects with the following
> structure:
> >
> >   {  ...
> >     "batters":
> >       {
> >         "batter":
> >           [
> >             { "id": "1001", "type": "Regular" },
> >             { "id": "1002", "type": "Chocolate" },
> >             { "id": "1003", "type": "Blueberry" },
> >             { "id": "1004", "type": "Devil's Food" }
> >           ]
> >       },
> >     ...
> >   }
> >
> > Now running a few sample queries against the above data:
> >
> >
> > A)      select "batters" returns expected results, which are the values
> of "batters" from each row.
> >
> > 0: jdbc:drill:zk=local> select batters from dfs.`c:\tmp\sample.json`;
> > +---------+
> > | batters |
> > +---------+
> > |
> {"batter":[{"id":"1001","type":"Regular"},{"id":"1002","type":"Chocolate"},{"id":"1003","type":"Blueberry"},{"id":"1004","type":"Devil's
> Food"}]} |
> > | {"batter":[{"id":"1001","type":"Regular"}]} |
> > |
> {"batter":[{"id":"1001","type":"Regular"},{"id":"1002","type":"Chocolate"}]}
> |
> > +---------+
> > 3 rows selected (0.243 seconds)
> >
> >
> > B)      select "batters.batter" also returns the expected results, which
> are the array values for "batters.batter" from each row.
> >
> >
> > 0: jdbc:drill:zk=local> select t.batters.batter from
> dfs.`c:\tmp\sample.json` t;
> > +--------+
> > | EXPR$0 |
> > +--------+
> > |
> [{"id":"1001","type":"Regular"},{"id":"1002","type":"Chocolate"},{"id":"1003","type":"Blueberry"},{"id":"1004","type":"Devil's
> Food"}] |
> > | [{"id":"1001","type":"Regular"}] |
> > | [{"id":"1001","type":"Regular"},{"id":"1002","type":"Chocolate"}] |
> > +--------+
> > 3 rows selected (0.198 seconds)
> >
> >
> > C)  select "batters.batter.id" returns something unexpected:
> >
> > 0: jdbc:drill:zk=local> select t.batters.batter.id from
> dfs.`c:\tmp\sample.json` t;
> > +---------+
> > | EXPR$0  |
> > +---------+
> > | 1001    |
> > | 1002    |
> > | 1003    |
> > +---------+
> >
> > The above result doesn't make sense.  The result looks like the 3 values
> from row 1. Should the result be the following instead?
> >
> > +---------+
> > | EXPR$0  |
> > +---------+
> > | [1001, 1002, 1003, 1004]    |
> > | [1001]    |
> > | [1001, 1002]    |
> > +---------+
> >
> > Any hints on what is happening here?  Thanks.
> >
> > -- Jiang
> >
>

Reply via email to