Re: selecting JSON nested field in storage plugin

2016-03-04 Thread Jacques Nadeau
You'll need to put your leaf fields inside a map vector to refer to them in
the way you want.
On Mar 4, 2016 5:04 PM, "Jiang Wu"  wrote:

> We are working on a custom Drill storage plugin to retrieve data from a
> proprietary a JSON based storage.  The drill version being used is 1.4.0.
> Assuming the data is a JSON looking like this:
>
>  {
> ...
>   "topping": {
> "id": "5001, 5002, ...",
> ...
> }
> ...
>  }
>
> When we submit this query:
>
> select t.topping.id from meld.project1.event.table1 t;
>
> The plugin receives a SchemaPath object for "topping.id".  The plugin
> then creates an output vector with the provided SchemaPath as the field
> name, e.g.
>
> final MajorType type = Types.optional(MinorType.VARCHAR);   // assuming we
> always use this type
> final MaterializedField field = MaterializedField.create(schemaPath,
> type); // schemaPath is given to the plugin
>
> final Class clazz = (Class)
> TypeHelper.getValueVectorClass(type.getMinorType(), type.getMode());
>
> ValueVector vector = output.addField(field, clazz);
>
> The data is then added into the vector and returned. However, the returned
> field cannot be matched with what Drill expects.  So we get something like
> this:
>
> 0: jdbc:drill:zk=local> select t.topping.id from
> meld.project1.event.table1 t;
> +-+
> | EXPR$0  |
> +-+
> | null|
> | null|
> | null|
> | null|
> | null|
> | null|
> +-+
>
> When we are expecting to get this:
>
> 0: jdbc:drill:zk=local> select t.`topping.id` from
> meld.project1.event.table1 t;
> +---+
> |topping.id |
> +---+
> | 5001, 5002, 5003, 5004|
> | 5001, 5002, 5005, 5007, 5006, 5003, 5004  |
> | 5001, 5002, 5003, 5004|
> | 5001, 5002, 5005, 5007, 5006, 5003, 5004  |
> | 5001, 5002, 5005, 5003, 5004  |
> | 5001, 5002, 5005, 5003, 5004  |
> +---+
>
> In the second query, the SchemaPath is a nested path.  Our plugin accepts
> this specification and retrieve the same results.  So what are we doing
> wrong here?  How do we correctly return values for a nested JSON field?
>
> Thanks.
>
> -- Jiang
>


selecting JSON nested field in storage plugin

2016-03-04 Thread Jiang Wu
We are working on a custom Drill storage plugin to retrieve data from a 
proprietary a JSON based storage.  The drill version being used is 1.4.0.  
Assuming the data is a JSON looking like this:

 {
...
  "topping": {
"id": "5001, 5002, ...",
...
}
...
 }

When we submit this query:

select t.topping.id from meld.project1.event.table1 t;

The plugin receives a SchemaPath object for "topping.id".  The plugin then 
creates an output vector with the provided SchemaPath as the field name, e.g.

final MajorType type = Types.optional(MinorType.VARCHAR);   // assuming we 
always use this type
final MaterializedField field = MaterializedField.create(schemaPath, type); 
// schemaPath is given to the plugin

final Class clazz = (Class) 
TypeHelper.getValueVectorClass(type.getMinorType(), type.getMode());

ValueVector vector = output.addField(field, clazz);

The data is then added into the vector and returned. However, the returned 
field cannot be matched with what Drill expects.  So we get something like this:

0: jdbc:drill:zk=local> select t.topping.id from meld.project1.event.table1 t;
+-+
| EXPR$0  |
+-+
| null|
| null|
| null|
| null|
| null|
| null|
+-+

When we are expecting to get this:

0: jdbc:drill:zk=local> select t.`topping.id` from meld.project1.event.table1 t;
+---+
|topping.id |
+---+
| 5001, 5002, 5003, 5004|
| 5001, 5002, 5005, 5007, 5006, 5003, 5004  |
| 5001, 5002, 5003, 5004|
| 5001, 5002, 5005, 5007, 5006, 5003, 5004  |
| 5001, 5002, 5005, 5003, 5004  |
| 5001, 5002, 5005, 5003, 5004  |
+---+

In the second query, the SchemaPath is a nested path.  Our plugin accepts this 
specification and retrieve the same results.  So what are we doing wrong here?  
How do we correctly return values for a nested JSON field?

Thanks.

-- Jiang


Re: Is using an aggregate value in a where clause not supported?

2016-03-04 Thread Stefán Baxter
Yes, this works.

Thank you,
 -Stefán

On Fri, Mar 4, 2016 at 8:55 PM, Jinfeng Ni  wrote:

> Can you try this:
>
> select sold_to, count(*) as trans_count from dfs.asa.`/transactions`
> group by sold_to
> having count(*) > 70;
>
> This is because column alias in the SELECT clause could not be used in
> WHERE clause, GroupBy, or Having clause [1]
>
> In your original query, the "trans_count" in where/having clause is
> resolved to a regular column whose name is "trans_count" in the table.
>
>
> [1]
> http://stackoverflow.com/questions/2068682/why-cant-i-use-alias-in-a-count-column-and-reference-it-in-a-having-clause
>
> On Fri, Mar 4, 2016 at 12:40 PM, Stefán Baxter
>  wrote:
> > Having fails as well
> >
> > On Fri, Mar 4, 2016 at 8:00 PM, Bob Rumsby  wrote:
> >
> >> Without trying it or seeing your tables/files, I would expect this to
> work:
> >>
> >> select sold_to, count(*) as trans_count from dfs.asa.`/transactions`
> >> group by sold_to
> >> having trans_count > 70;
> >>
> >> On Fri, Mar 4, 2016 at 11:53 AM, Stefán Baxter <
> ste...@activitystream.com>
> >> wrote:
> >>
> >> > Hi,
> >> >
> >> > Having adds to the trouble and claims that the field needs to be
> grouped
> >> > and then fails the same way if it's added to group by.
> >> >
> >> > I ended up wrapping this in a "with <> as ()" but that is far from
> ideal.
> >> >
> >> > Regards,
> >> >  -Stefán
> >> >
> >> > On Fri, Mar 4, 2016 at 7:50 PM, Bob Rumsby 
> wrote:
> >> >
> >> > > Try using the HAVING clause. The WHERE clause cannot constrain the
> >> > results
> >> > > of aggregate functions.
> >> > > http://drill.apache.org/docs/having-clause/
> >> > >
> >> > > On Fri, Mar 4, 2016 at 11:34 AM, Stefán Baxter <
> >> > ste...@activitystream.com>
> >> > > wrote:
> >> > >
> >> > > > Hi,
> >> > > >
> >> > > > I'm using parquet+drill and the following statement works just
> fine:
> >> > > >
> >> > > > select sold_to, count(*) as trans_count from
> >> > > > dfs.asa.`/processed/venuepoint/transactions` where group by
> sold_to;
> >> > > >
> >> > > > When addin this where clause nothing is returned:
> >> > > >
> >> > > > select sold_to, count(*) as trans_count from
> dfs.asa.`/transactions`
> >> > > where
> >> > > > trans_count > 70 group by sold_to;
> >> > > >
> >> > > >
> >> > > > Is this a known limitation or a bug?
> >> > > >
> >> > > > Regards,
> >> > > >  -Stefán
> >> > > >
> >> > >
> >> >
> >>
>


Re: Is using an aggregate value in a where clause not supported?

2016-03-04 Thread Stefán Baxter
Having fails as well

On Fri, Mar 4, 2016 at 8:00 PM, Bob Rumsby  wrote:

> Without trying it or seeing your tables/files, I would expect this to work:
>
> select sold_to, count(*) as trans_count from dfs.asa.`/transactions`
> group by sold_to
> having trans_count > 70;
>
> On Fri, Mar 4, 2016 at 11:53 AM, Stefán Baxter 
> wrote:
>
> > Hi,
> >
> > Having adds to the trouble and claims that the field needs to be grouped
> > and then fails the same way if it's added to group by.
> >
> > I ended up wrapping this in a "with <> as ()" but that is far from ideal.
> >
> > Regards,
> >  -Stefán
> >
> > On Fri, Mar 4, 2016 at 7:50 PM, Bob Rumsby  wrote:
> >
> > > Try using the HAVING clause. The WHERE clause cannot constrain the
> > results
> > > of aggregate functions.
> > > http://drill.apache.org/docs/having-clause/
> > >
> > > On Fri, Mar 4, 2016 at 11:34 AM, Stefán Baxter <
> > ste...@activitystream.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I'm using parquet+drill and the following statement works just fine:
> > > >
> > > > select sold_to, count(*) as trans_count from
> > > > dfs.asa.`/processed/venuepoint/transactions` where group by sold_to;
> > > >
> > > > When addin this where clause nothing is returned:
> > > >
> > > > select sold_to, count(*) as trans_count from dfs.asa.`/transactions`
> > > where
> > > > trans_count > 70 group by sold_to;
> > > >
> > > >
> > > > Is this a known limitation or a bug?
> > > >
> > > > Regards,
> > > >  -Stefán
> > > >
> > >
> >
>


Re: Is using an aggregate value in a where clause not supported?

2016-03-04 Thread Bob Rumsby
Without trying it or seeing your tables/files, I would expect this to work:

select sold_to, count(*) as trans_count from dfs.asa.`/transactions`
group by sold_to
having trans_count > 70;

On Fri, Mar 4, 2016 at 11:53 AM, Stefán Baxter 
wrote:

> Hi,
>
> Having adds to the trouble and claims that the field needs to be grouped
> and then fails the same way if it's added to group by.
>
> I ended up wrapping this in a "with <> as ()" but that is far from ideal.
>
> Regards,
>  -Stefán
>
> On Fri, Mar 4, 2016 at 7:50 PM, Bob Rumsby  wrote:
>
> > Try using the HAVING clause. The WHERE clause cannot constrain the
> results
> > of aggregate functions.
> > http://drill.apache.org/docs/having-clause/
> >
> > On Fri, Mar 4, 2016 at 11:34 AM, Stefán Baxter <
> ste...@activitystream.com>
> > wrote:
> >
> > > Hi,
> > >
> > > I'm using parquet+drill and the following statement works just fine:
> > >
> > > select sold_to, count(*) as trans_count from
> > > dfs.asa.`/processed/venuepoint/transactions` where group by sold_to;
> > >
> > > When addin this where clause nothing is returned:
> > >
> > > select sold_to, count(*) as trans_count from dfs.asa.`/transactions`
> > where
> > > trans_count > 70 group by sold_to;
> > >
> > >
> > > Is this a known limitation or a bug?
> > >
> > > Regards,
> > >  -Stefán
> > >
> >
>


Re: Is using an aggregate value in a where clause not supported?

2016-03-04 Thread Stefán Baxter
Hi,

Having adds to the trouble and claims that the field needs to be grouped
and then fails the same way if it's added to group by.

I ended up wrapping this in a "with <> as ()" but that is far from ideal.

Regards,
 -Stefán

On Fri, Mar 4, 2016 at 7:50 PM, Bob Rumsby  wrote:

> Try using the HAVING clause. The WHERE clause cannot constrain the results
> of aggregate functions.
> http://drill.apache.org/docs/having-clause/
>
> On Fri, Mar 4, 2016 at 11:34 AM, Stefán Baxter 
> wrote:
>
> > Hi,
> >
> > I'm using parquet+drill and the following statement works just fine:
> >
> > select sold_to, count(*) as trans_count from
> > dfs.asa.`/processed/venuepoint/transactions` where group by sold_to;
> >
> > When addin this where clause nothing is returned:
> >
> > select sold_to, count(*) as trans_count from dfs.asa.`/transactions`
> where
> > trans_count > 70 group by sold_to;
> >
> >
> > Is this a known limitation or a bug?
> >
> > Regards,
> >  -Stefán
> >
>


Is using an aggregate value in a where clause not supported?

2016-03-04 Thread Stefán Baxter
Hi,

I'm using parquet+drill and the following statement works just fine:

select sold_to, count(*) as trans_count from
dfs.asa.`/processed/venuepoint/transactions` where group by sold_to;

When addin this where clause nothing is returned:

select sold_to, count(*) as trans_count from dfs.asa.`/transactions` where
trans_count > 70 group by sold_to;


Is this a known limitation or a bug?

Regards,
 -Stefán


Re: Drill crashes on simple 'like' query

2016-03-04 Thread Jacques Nadeau
Hey Assaf, is there anything in the logs? Can you email me them directly or
put on gist (can't attach to mailing list emails).

If nothing there, can you enable Java assertions by adding -ea to
drill-env.sh and see if we get a more helpful log or error messages?

thanks,
Jacques


--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Fri, Mar 4, 2016 at 2:46 AM, Assaf Lowenstein  wrote:

> Hello Drillers!
> My Drill setup is very simple, querying static gz files that hold jsons.
> Everything was running smooth but we're now seeing what seems to be a crash
> with a very simple query. here are the details.
>
> This query works
> *select `column1`, column2, myTable.`user`, column3 from dfs.drill.myTable
> limit 10;*
>
> but this one crashes drill for some reason:
> *select `column1 `, column2, myTable.`user`, column3 from
> dfs.drill. myTable where myTable.`user` = 'some--u...@hotmail.com
> ' limit 10;*
>
> I'm using web UI and in console I simply see -
> [..]
> 0: jdbc:drill:zk=local> SLF4J: Failed to load class
> "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
> details.
> *Killed*
> ​[..]​
> ​and prompt returns at this stage and I need to restart drill.
>
> Am I doing anything wrong? missing anything?
> Thanks!
>
> Assaf
>


Re: Drill crashes on simple 'like' query

2016-03-04 Thread Jeff Maass


Run the bad query multiple times.

Prior to each run, remove the first character of the like string.

Stop when the query no longer kills drill.

If the condition here, then, that would SEEM to mean that the problem is
in parsing the query.

If that doesn't work, change the 2nd query to = '%' limit 10;
If that doesn't work, change the limit from 10 to 1.
If either of the above changes causes the condition to resolve, then, that
would seem to mean that the problem is in opening, reading and parsing one
of the text files.


Do you have a way to validate the formatting of the text files?

You could also check that there isn't an invalid / unopenable gzip file.
We used to have a problem with another product where tons and tons of gzip
files would be involved.  If only 1 of the files was unopenable, our query
would die.  Sorry, I don't recall the formatting the command, but, we
would just run a bash oneliner upon the involved directories.



On 3/4/16, 4:46 AM, "Assaf Lowenstein"  wrote:

>Hello Drillers!
>My Drill setup is very simple, querying static gz files that hold jsons.
>Everything was running smooth but we're now seeing what seems to be a
>crash
>with a very simple query. here are the details.
>
>This query works
>*select `column1`, column2, myTable.`user`, column3 from dfs.drill.myTable
>limit 10;*
>
>but this one crashes drill for some reason:
>*select `column1 `, column2, myTable.`user`, column3 from
>dfs.drill. myTable where myTable.`user` = 'some--u...@hotmail.com
>' limit 10;*
>
>I'm using web UI and in console I simply see -
>[..]
>0: jdbc:drill:zk=local> SLF4J: Failed to load class
>"org.slf4j.impl.StaticLoggerBinder".
>SLF4J: Defaulting to no-operation (NOP) logger implementation
>SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
>details.
>*Killed*
>​[..]​
>​and prompt returns at this stage and I need to restart drill.
>
>Am I doing anything wrong? missing anything?
>Thanks!
>
>Assaf



Re: Drill crashes on simple 'like' query

2016-03-04 Thread John Omernik
is there a space between 1 and back tick in your first field?

On Friday, March 4, 2016, Assaf Lowenstein  wrote:

> Hello Drillers!
> My Drill setup is very simple, querying static gz files that hold jsons.
> Everything was running smooth but we're now seeing what seems to be a crash
> with a very simple query. here are the details.
>
> This query works
> *select `column1`, column2, myTable.`user`, column3 from dfs.drill.myTable
> limit 10;*
>
> but this one crashes drill for some reason:
> *select `column1 `, column2, myTable.`user`, column3 from
> dfs.drill. myTable where myTable.`user` = 'some--u...@hotmail.com
> 
> >' limit 10;*
>
> I'm using web UI and in console I simply see -
> [..]
> 0: jdbc:drill:zk=local> SLF4J: Failed to load class
> "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
> details.
> *Killed*
> ​[..]​
> ​and prompt returns at this stage and I need to restart drill.
>
> Am I doing anything wrong? missing anything?
> Thanks!
>
> Assaf
>


-- 
Sent from my iThing


Apache Drill, Hbase & Tableau on Ubuntu

2016-03-04 Thread Drugalea Ion
 Hello,
i have configured a cluster with 2 Slaves. I have already succesfull hadoop, 
apache hbase, apache drill installed and configured.In the next step i want to 
vizualize data from hbase with tableau, but i don't find any posibility to 
install Tableau on Ubuntu. 

How can i do this on ubuntu???

Sorry for my englisch.

Thank for your answer





Best regards
Drugalea Ion

Drill crashes on simple 'like' query

2016-03-04 Thread Assaf Lowenstein
Hello Drillers!
My Drill setup is very simple, querying static gz files that hold jsons.
Everything was running smooth but we're now seeing what seems to be a crash
with a very simple query. here are the details.

This query works
*select `column1`, column2, myTable.`user`, column3 from dfs.drill.myTable
limit 10;*

but this one crashes drill for some reason:
*select `column1 `, column2, myTable.`user`, column3 from
dfs.drill. myTable where myTable.`user` = 'some--u...@hotmail.com
' limit 10;*

I'm using web UI and in console I simply see -
[..]
0: jdbc:drill:zk=local> SLF4J: Failed to load class
"org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
details.
*Killed*
​[..]​
​and prompt returns at this stage and I need to restart drill.

Am I doing anything wrong? missing anything?
Thanks!

Assaf