If you are open to changing the query:
  # try removing the functions on the 5th column (the cast/convert_to on `points`)
  # is there any way you could further limit the query?
  # does the query finish if you add a LIMIT / TOP clause?
  # what do the logs say?
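A sketch of those suggestions applied to the posted query (assuming the same schema; the column list and predicates are taken from the original mail, and the 8 GB sort-memory value is illustrative, not a recommendation):

```sql
-- Hypothetical diagnostic rewrite: drop the convert_to/cast on the 5th
-- column and add a LIMIT so the sort has less to buffer. The allocator
-- limit in the error (~170 MB) is far below the 48 GB of direct memory,
-- which suggests the per-query sort cap, not direct memory, is binding;
-- raising it may also help (value below is illustrative only).
ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 8589934592;

select probe_id, provider_id, is_moving, mode, points
from dfs.`/home/paul/data`
where
  start_lat between 24.4873780449008 and 60.0108911181433 and
  start_lon between -139.065890469841 and -52.8305074899881 and
  provider_id = '343' and
  mod(abs(hash(probe_id)), 100) = 0
order by probe_id, start_time
limit 1000;
```

If the LIMITed query completes but the full query still fails, that points at sort spill/memory configuration rather than the scan itself.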

________________________________________
From: Paul Friedman <paul.fried...@streetlightdata.com>
Sent: Thursday, February 25, 2016 7:07:12 PM
To: user@drill.apache.org
Subject: Drill error with large sort

I’ve got a query reading from a large directory of parquet files (41 GB)
and I’m consistently getting this error:



Error: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.

Unable to allocate sv2 for 1023 records, and not enough batchGroups to spill.
batchGroups.size 0
spilledBatchGroups.size 0
allocated memory 224287987
allocator limit 178956970
Fragment 0:0

[Error Id: 878d604c-4656-4a5a-8b46-ff38a6ae020d on chai.dev.streetlightdata.com:31010] (state=,code=0)

Direct memory is set to 48GB and heap is 8GB.



The query is:

select probe_id, provider_id, is_moving, mode,
       cast(convert_to(points, 'JSON') as varchar(100000000))
from dfs.`/home/paul/data`
where
  start_lat between 24.4873780449008 and 60.0108911181433 and
  start_lon between -139.065890469841 and -52.8305074899881 and
  provider_id = '343' and
  mod(abs(hash(probe_id)), 100) = 0
order by probe_id, start_time;



I’m also using the “example” drill-override configuration.



Any help would be appreciated.



Thanks.



---Paul
