[ https://issues.apache.org/jira/browse/DRILL-5503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16052597#comment-16052597 ]

Paul Rogers commented on DRILL-5503:
------------------------------------

From the log:

{code}
ExternalSortBatch - Config: memory limit = 62,600,000, spill file size = 
268435456, spill batch size = 8388608, merge limit = 2147483647, merge batch 
size = 16777216
...
ExternalSortBatch - Actual batch schema & sizes {
  T0¦¦columns(type: VARCHAR, std col. size: 54, actual col. size: 19130, total 
size: 268992512, data size: 154876315, row capacity: 8191, density: -6)
  EXPR$1(type: VARCHAR, std col. size: 54, actual col. size: 9, total size: 
73728, data size: 71761, row capacity: 8191, density: 98)
  Records: 8096, Total size: 269082624, Gross row width:33238, Net row 
width:19139, Density:92}
2017-05-10 15:09:00,835 [26ec7089-5bc7-f6fd-42fc-254c8312d358:frag:0:0] INFO  
o.a.d.e.p.i.x.m.ExternalSortBatch - User Error Occurred: Unable to allocate sv2 
buffer (Unable to allocate sv2 buffer)
{code}

Note the two key numbers:

* Memory given to sort: 62,600,000
* Size of first batch: 269,082,624

Clearly, 270 MB of data cannot fit into 63 MB of memory. So the OOM error is 
expected in this case: there is nothing the sort itself can do to handle the 
situation.
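The per-row figures in the log can be cross-checked with a bit of arithmetic (a sanity-check sketch using the numbers quoted above, not Drill code; Drill's own rounding differs by a byte or two):

```python
# Figures reported by ExternalSortBatch in the log above.
records = 8096
total_size = 269_082_624                  # gross bytes, including unused vector capacity
data_size = 154_876_315 + 71_761          # actual data bytes across both columns

gross_row_width = total_size // records   # ~33,236; the log reports 33,238
net_row_width = data_size // records      # ~19,138; the log reports 19,139

memory_limit = 62_600_000                 # the sort's entire budget
print(gross_row_width, net_row_width)
print(total_size > memory_limit)          # True: one batch alone exceeds the budget
```

The last line is the whole story: the very first batch arriving at the sort is already more than four times the sort's memory budget.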

Solutions:

* Provide more memory to the sort. Preferably at least 10x the batch size (or 
3x at a bare minimum): we need at least two input batches and one output batch 
to make progress.
* Reduce the size of the input batch. The batch holds 8096 records, so each 
record is about 33 KB (according to the log).
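The "3x at a bare minimum" rule above can be sketched as a quick check (a hypothetical helper for illustration, not Drill's actual memory accounting):

```python
def sort_memory_ok(memory_limit: int, batch_size: int, factor: int = 3) -> bool:
    """Rough rule of thumb: the sort needs room for at least two input
    batches plus one output batch (factor=3); 10x is the comfortable target."""
    return memory_limit >= factor * batch_size

batch_size = 269_082_624                        # size of the first batch, from the log
print(sort_memory_ok(62_600_000, batch_size))   # False: the failing configuration
```

By this check the failing run is short by a wide margin, which is why the only lever available before DRILL-5211 is the memory setting itself.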

The second option will be possible as a result of DRILL-5211. So, the only 
workaround available today is to increase the memory:

{code}
Was:
alter session set `planner.memory.max_query_memory_per_node` = 62600000;
Change to:
alter session set `planner.memory.max_query_memory_per_node` = 200000000;
{code}

Closing this as not a bug.

> Disabling exchanges results in "Unable to allocate sv2 buffer" error within 
> the managed external sort code
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-5503
>                 URL: https://issues.apache.org/jira/browse/DRILL-5503
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 1.10.0
>            Reporter: Rahul Challapalli
>            Assignee: Paul Rogers
>         Attachments: drill5503.log, failure.sys.drill, success.sys.drill
>
>
> Setup :
> {code}
> git.commit.id.abbrev=1e0a14c
> No of drillbits : 1
> DRILL_MAX_DIRECT_MEMORY="32G"
> DRILL_MAX_HEAP="4G"
> {code}
> The below successfully completes
> {code}
> ALTER SESSION SET `exec.sort.disable_managed` = false;
> alter session set `planner.width.max_per_node` = 1;
> alter session set `planner.memory.max_query_memory_per_node` = 62600000;
> alter session set `planner.width.max_per_query` = 17;
> select count(*) from (select * from 
> dfs.`/drill/testdata/resource-manager/5kwidecolumns_500k.tbl` order by 
> columns[0]) d where d.columns[0] = '4041054511';
> +---------+
> | EXPR$0  |
> +---------+
> | 0       |
> +---------+
> 1 row selected (814.104 seconds)
> {code}
> However if I disable exchanges, I get the following error
> {code}
> alter session set `planner.disable_exchanges` = true;
> select count(*) from (select * from 
> dfs.`/drill/testdata/resource-manager/5kwidecolumns_500k.tbl` order by 
> columns[0]) d where d.columns[0] = '4041054511';
> +---------+
> | EXPR$0  |
> +---------+
> | 0       |
> +---------+
> 1 row selected (814.104 seconds)
> {code}
> I attached the profile and the log file. The data set used is too large to 
> attach here. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)