[
https://issues.apache.org/jira/browse/DRILL-5294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883418#comment-15883418
]
Paul Rogers edited comment on DRILL-5294 at 2/24/17 8:35 PM:
-------------------------------------------------------------
The original test case works fine with the latest code. Tested with the long query
using a single slice (all that can be done on the Mac) and 2 GB of sort memory.
{code}
Results: 0 records, 1 batches, 208,341 ms
{code}
Tested with an adaptation of the second query using the 18 GB "250wide.tbl"
file:
{code}
select * from (select * from `dfs.data`.`250wide.tbl` d
where cast(d.columns[1] as int) > 0 order by columns[0]) d1 where
d1.columns[0] = 'kjhf'
Results: 0 records, 1 batches, 356,243 ms
{code}
The second use case completes, but is slow because it does a binary merge:
it merges two batches, spills the result, and repeats until only two runs remain:
{code}
select * from (select * from `dfs.data`.`250wide-small.tbl` order by
columns[0])d where d.columns[0] = 'ljdfhwuehnoiueyf'
Results: 0 records, 1 batches, 26,753 ms
{code}
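To make the cost of the binary merge concrete, here is a minimal sketch of the pattern described above. It is an in-memory simulation for illustration only, not Drill's sort code: the function name `binary_merge`, the use of Python lists as "runs", and the spill counter are all assumptions made for the example.

```python
import heapq
from collections import deque


def binary_merge(runs):
    """Merge sorted runs two at a time until only two remain, then merge those.

    Hypothetical illustration: returns (sorted_rows, spill_count), where
    spill_count is the number of intermediate merged runs that a real
    external sort would have written back to disk.
    """
    pending = deque(list(r) for r in runs)
    spills = 0
    while len(pending) > 2:
        left = pending.popleft()
        right = pending.popleft()
        # Two runs become one longer run, which is "spilled" again.
        pending.append(list(heapq.merge(left, right)))
        spills += 1
    # Final merge of the remaining runs produces the sorted output.
    return list(heapq.merge(*pending)), spills


runs = [[1, 9], [3, 4], [2, 8], [5, 7], [0, 6]]
rows, spills = binary_merge(runs)
print(rows, spills)
```

In this sketch, N initial runs need N - 2 intermediate merges, and each one rewrites data that was already spilled once. That is why merging only two runs at a time is slow compared with merging many runs per pass.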
The third case also succeeds:
{code}
select * from (select * from `dfs.data`.`250wide_files` d
where cast(d.columns[1] as int) > 0 order by columns[0]) d1 where
d1.columns[0] = 'kjhf'
Results: 0 records, 1 batches, 9,987 ms
{code}
One minor issue was found; the fix will be pushed to the Sort-Rollup branch and
included in the DRILL-5284 PR.
> Managed External Sort throws an OOM during the merge and spill phase
> --------------------------------------------------------------------
>
> Key: DRILL-5294
> URL: https://issues.apache.org/jira/browse/DRILL-5294
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Relational Operators
> Reporter: Rahul Challapalli
> Assignee: Paul Rogers
> Fix For: 1.10.0
>
> Attachments: 2751ce6d-67e6-ae08-3b68-e33b29f9d2a3.sys.drill,
> drillbit.log, drillbit_scenario2.log, drillbit_scenario3.log,
> scenario2_profile.sys.drill, scenario3_profile.sys.drill
>
>
> commit # : 38f816a45924654efd085bf7f1da7d97a4a51e38
> The query below fails with the managed sort but succeeds with the old sort:
> {code}
> select * from (select columns[433] col433, columns[0],
> columns[1],columns[2],columns[3],columns[4],columns[5],columns[6],columns[7],columns[8],columns[9],columns[10],columns[11]
> from dfs.`/drill/testdata/resource-manager/3500cols.tbl` order by
> columns[450],columns[330],columns[230],columns[220],columns[110],columns[90],columns[80],columns[70],columns[40],columns[10],columns[20],columns[30],columns[40],columns[50])
> d where d.col433 = 'sjka skjf';
> Error: RESOURCE ERROR: External Sort encountered an error while spilling to
> disk
> Fragment 1:11
> [Error Id: 0aa20284-cfcc-450f-89b3-645c280f33a4 on qa-node190.qa.lab:31010]
> (state=,code=0)
> {code}
> Env :
> {code}
> No of Drillbits : 1
> DRILL_MAX_DIRECT_MEMORY="32G"
> DRILL_MAX_HEAP="4G"
> {code}
> The logs and profile are attached. The data is too large to attach to a JIRA.