Re: Drill crashes on simple 'like' query

2016-03-04 Thread Jeff Maass


Run the bad query multiple times.

Prior to each run, remove the first character of the like string.

Stop when the query no longer kills drill.

If the crash eventually stops as the string gets shorter, that would SEEM
to mean that the problem is in parsing the query.

If that doesn't work, change the 2nd query to = '%' limit 10;
If that doesn't work, change the limit from 10 to 1.
If either of the above changes causes the crash to stop, that would seem
to mean that the problem is in opening, reading, and parsing one of the
text files.
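The shrinking-pattern loop in the first set of steps can be scripted. This
is only a sketch that prints the candidate statements (the match value is a
hypothetical placeholder, since the real address is elided in the thread);
feed them to your client however you normally run queries:

```shell
# Emit one candidate statement per pass, dropping the first character
# of the match string each time. 's' is a hypothetical stand-in value.
s='someone@example.com'
while [ -n "$s" ]; do
  printf "select column2 from dfs.drill.myTable where myTable.\`user\` = '%s' limit 10;\n" "$s"
  s=${s#?}   # strip the leading character
done
```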


Do you have a way to validate the formatting of the text files?

You could also check that there isn't an invalid / unopenable gzip file.
We used to have a problem with another product where tons and tons of gzip
files would be involved.  If only 1 of the files was unopenable, our query
would die.  Sorry, I don't recall the exact command, but we would just run
a bash one-liner over the involved directories.
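A minimal sketch along those lines (a reconstruction, not the original
command; the temp directory here only demonstrates the check — point the
loop at your own data directories). `gzip -t` tests archive integrity
without extracting:

```shell
# Create a throwaway directory with one valid and one corrupt .gz,
# then flag anything gzip cannot open.
dir=$(mktemp -d)
echo ok | gzip > "$dir/good.gz"
echo "not really gzip" > "$dir/bad.gz"   # deliberately corrupt
for f in "$dir"/*.gz; do
  gzip -t "$f" 2>/dev/null || echo "unopenable: $f"
done
rm -rf "$dir"
```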



On 3/4/16, 4:46 AM, "Assaf Lowenstein"  wrote:

>Hello Drillers!
>My Drill setup is very simple, querying static gz files that hold jsons.
>Everything was running smoothly, but we're now seeing what seems to be a
>crash with a very simple query. Here are the details.
>
>This query works
>*select `column1`, column2, myTable.`user`, column3 from dfs.drill.myTable
>limit 10;*
>
>but this one crashes drill for some reason:
>*select `column1 `, column2, myTable.`user`, column3 from
>dfs.drill. myTable where myTable.`user` = 'some--u...@hotmail.com
>' limit 10;*
>
>I'm using the web UI, and in the console I simply see:
>[..]
>0: jdbc:drill:zk=local> SLF4J: Failed to load class
>"org.slf4j.impl.StaticLoggerBinder".
>SLF4J: Defaulting to no-operation (NOP) logger implementation
>SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
>details.
>*Killed*
>[..]
>and the prompt returns at this stage, and I need to restart Drill.
>
>Am I doing anything wrong? Missing anything?
>Thanks!
>
>Assaf



Re: Drill error with large sort

2016-02-25 Thread Jeff Maass

If you are open to changing the query:
  # try removing the functions on the 5th column
  # is there any way you could further limit the query?
  # does the query finish if you add a limit / top clause?
  # what do the logs say?
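One more thing worth noting: the `allocator limit` in the error below
(~170 MB) looks like the external sort's share of Drill's per-query memory
cap, which defaults to 2 GB. A hedged thing to try (the option name is from
the Drill docs; verify it against your version) is raising that cap before
rerunning:

```sql
-- Assumption: the sort is hitting planner.memory.max_query_memory_per_node.
-- 8589934592 bytes = 8 GB; keep this well under the 48 GB direct memory.
ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 8589934592;
```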


From: Paul Friedman 
Sent: Thursday, February 25, 2016 7:07:12 PM
To: user@drill.apache.org
Subject: Drill error with large sort

I’ve got a query reading from a large directory of parquet files (41 GB)
and I’m consistently getting this error:



Error: RESOURCE ERROR: One or more nodes ran out of memory while executing
the query.

Unable to allocate sv2 for 1023 records, and not enough batchGroups to
spill.
batchGroups.size 0
spilledBatchGroups.size 0
allocated memory 224287987
allocator limit 178956970
Fragment 0:0

[Error Id: 878d604c-4656-4a5a-8b46-ff38a6ae020d on
chai.dev.streetlightdata.com:31010] (state=,code=0)



Direct memory is set to 48GB and heap is 8GB.



The query is:



select probe_id, provider_id, is_moving, mode,  cast(convert_to(points,
'JSON') as varchar(1))
from dfs.`/home/paul/data`
where
  start_lat between 24.4873780449008 and 60.0108911181433 and
  start_lon between -139.065890469841 and -52.8305074899881 and
  provider_id = '343' and
  mod(abs(hash(probe_id)), 100) = 0
order by probe_id, start_time;



I’m also using the “example” drill-override configuration.



Any help would be appreciated.



Thanks.



---Paul


Add rest server to each drill node

2016-02-25 Thread Jeff Maass
What is the prescribed / appropriate way to do the following in Apache Drill?


We want to do what one can do with Elasticsearch:
  * write our REST service endpoint in Java
  * consume the Elasticsearch library
  * deploy our application
  * have an Elasticsearch cluster that also has our code running in it