RE: selecting JSON nested field in storage plugin

2016-03-07 Thread Jiang Wu
Thank you very much.

-- Jiang

-----Original Message-----
From: Jacques Nadeau [mailto:jacq...@dremio.com] 
Sent: Friday, March 4, 2016 10:23 PM
To: user <user@drill.apache.org>
Subject: Re: selecting JSON nested field in storage plugin

You'll need to put your leaf fields inside a map vector to refer to them in the 
way you want.
On Mar 4, 2016 5:04 PM, "Jiang Wu" <jiang...@numerxdata.com> wrote:

> We are working on a custom Drill storage plugin to retrieve data from 
> a proprietary JSON-based storage.  The Drill version being used is 1.4.0.
> Assume the data is JSON that looks like this:
>
> {
>   ...
>   "topping": {
>     "id": "5001, 5002, ...",
>     ...
>   }
>   ...
> }
>
> When we submit this query:
>
> select t.topping.id from meld.project1.event.table1 t;
>
> The plugin receives a SchemaPath object for "topping.id".  The plugin 
> then creates an output vector with the provided SchemaPath as the 
> field name, e.g.
>
> // assuming we always use this type
> final MajorType type = Types.optional(MinorType.VARCHAR);
> // schemaPath is given to the plugin
> final MaterializedField field = MaterializedField.create(schemaPath, type);
>
> final Class<? extends ValueVector> clazz = (Class<? extends ValueVector>)
>     TypeHelper.getValueVectorClass(type.getMinorType(), type.getMode());
>
> ValueVector vector = output.addField(field, clazz);
>
> The data is then added into the vector and returned. However, the 
> returned field cannot be matched with what Drill expects.  So we get 
> something like this:
>
> 0: jdbc:drill:zk=local> select t.topping.id from meld.project1.event.table1 t;
> +---------+
> | EXPR$0  |
> +---------+
> | null    |
> | null    |
> | null    |
> | null    |
> | null    |
> | null    |
> +---------+
>
> We are expecting to get this instead:
>
> 0: jdbc:drill:zk=local> select t.`topping.id` from meld.project1.event.table1 t;
> +-------------------------------------------+
> | topping.id                                |
> +-------------------------------------------+
> | 5001, 5002, 5003, 5004                    |
> | 5001, 5002, 5005, 5007, 5006, 5003, 5004  |
> | 5001, 5002, 5003, 5004                    |
> | 5001, 5002, 5005, 5007, 5006, 5003, 5004  |
> | 5001, 5002, 5005, 5003, 5004              |
> | 5001, 5002, 5005, 5003, 5004              |
> +-------------------------------------------+
>
> In the second query, the SchemaPath is a nested path.  Our plugin 
> accepts this specification and retrieves the same results.  So what are 
> we doing wrong here?  How do we correctly return values for a nested JSON 
> field?
>
> Thanks.
>
> -- Jiang
>


Re: selecting JSON nested field in storage plugin

2016-03-04 Thread Jacques Nadeau
You'll need to put your leaf fields inside a map vector to refer to them in
the way you want.
On Mar 4, 2016 5:04 PM, "Jiang Wu" wrote:

> We are working on a custom Drill storage plugin to retrieve data from a
> proprietary JSON-based storage.  The Drill version being used is 1.4.0.
> Assume the data is JSON that looks like this:
>
> {
>   ...
>   "topping": {
>     "id": "5001, 5002, ...",
>     ...
>   }
>   ...
> }
>
> When we submit this query:
>
> select t.topping.id from meld.project1.event.table1 t;
>
> The plugin receives a SchemaPath object for "topping.id".  The plugin
> then creates an output vector with the provided SchemaPath as the field
> name, e.g.
>
> // assuming we always use this type
> final MajorType type = Types.optional(MinorType.VARCHAR);
> // schemaPath is given to the plugin
> final MaterializedField field = MaterializedField.create(schemaPath, type);
>
> final Class<? extends ValueVector> clazz = (Class<? extends ValueVector>)
>     TypeHelper.getValueVectorClass(type.getMinorType(), type.getMode());
>
> ValueVector vector = output.addField(field, clazz);
>
> The data is then added into the vector and returned. However, the returned
> field cannot be matched with what Drill expects.  So we get something like
> this:
>
> 0: jdbc:drill:zk=local> select t.topping.id from meld.project1.event.table1 t;
> +---------+
> | EXPR$0  |
> +---------+
> | null    |
> | null    |
> | null    |
> | null    |
> | null    |
> | null    |
> +---------+
>
> We are expecting to get this instead:
>
> 0: jdbc:drill:zk=local> select t.`topping.id` from meld.project1.event.table1 t;
> +-------------------------------------------+
> | topping.id                                |
> +-------------------------------------------+
> | 5001, 5002, 5003, 5004                    |
> | 5001, 5002, 5005, 5007, 5006, 5003, 5004  |
> | 5001, 5002, 5003, 5004                    |
> | 5001, 5002, 5005, 5007, 5006, 5003, 5004  |
> | 5001, 5002, 5005, 5003, 5004              |
> | 5001, 5002, 5005, 5003, 5004              |
> +-------------------------------------------+
>
> In the second query, the SchemaPath is a nested path.  Our plugin accepts
> this specification and retrieves the same results.  So what are we doing
> wrong here?  How do we correctly return values for a nested JSON field?
>
> Thanks.
>
> -- Jiang
>
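
The reply above says the leaf field has to live inside a map vector. The sketch below is a toy model in plain Java, not Drill code, meant only to show why a flat column named "topping.id" could come back as null for t.topping.id yet work for t.`topping.id`. The class NestedFieldDemo and the methods resolve, flatRow, and nestedRow are hypothetical names. It assumes a dotted reference is resolved one segment at a time ("topping" first, then "id" inside it); that assumption matches the output shown in the thread but is not taken from Drill's source.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: plain maps stand in for Drill's vectors.
final class NestedFieldDemo {

    // Resolve a path one segment at a time (assumed behaviour, see lead-in).
    // Returns null as soon as a segment cannot be found.
    public static Object resolve(Map<String, Object> row, String... segments) {
        Object current = row;
        for (String segment : segments) {
            if (!(current instanceof Map)) {
                return null; // cannot descend into a missing or scalar value
            }
            current = ((Map<?, ?>) current).get(segment);
        }
        return current;
    }

    // Layout from the question: one top-level field literally named "topping.id".
    public static Map<String, Object> flatRow(String ids) {
        Map<String, Object> row = new HashMap<>();
        row.put("topping.id", ids);
        return row;
    }

    // Layout suggested in the reply: "id" is a child of a "topping" map.
    public static Map<String, Object> nestedRow(String ids) {
        Map<String, Object> topping = new HashMap<>();
        topping.put("id", ids);
        Map<String, Object> row = new HashMap<>();
        row.put("topping", topping);
        return row;
    }

    public static void main(String[] args) {
        String ids = "5001, 5002, 5003, 5004";
        // t.topping.id against the flat layout: no "topping" entry, so null
        System.out.println(resolve(flatRow(ids), "topping", "id"));
        // t.`topping.id` against the flat layout: single segment matches
        System.out.println(resolve(flatRow(ids), "topping.id"));
        // t.topping.id against the nested layout: both segments match
        System.out.println(resolve(nestedRow(ids), "topping", "id"));
    }
}
```

In an actual plugin, the analogous change would presumably be to register a map-typed field for "topping" and add the "id" vector as its child, rather than registering a single top-level field whose name contains a dot. The exact calls depend on the Drill version and are not shown here.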