[ https://issues.apache.org/jira/browse/DRILL-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deneche A. Hakim updated DRILL-2893:
------------------------------------
    Description: 
- set _drill.exec.memory.top.max_ in _drill-override.conf_ to some low value (I
used _75000000_)
- disable hash aggregate (set _planner.enable_hashagg_ to false)
- disable exchanges (set _planner.disable_exchanges_ to true)
- run the following query
{noformat}
select count(*) from (select * from dfs.data.`tpch1/lineitem.parquet` order by 
l_orderkey);
{noformat}
and you should get the following error message:
{noformat}
Query failed: SYSTEM ERROR: null

Fragment 0:0

[e05ff3c2-e130-449e-b721-b3442796e29b on 172.30.1.1:31010]
{noformat}
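
The bare "null" at the end of the error message is consistent with an
exception that carries no message: a NullPointerException created without one
returns null from _getMessage()_, and formatting that into the error string
prints "null". A quick illustration in plain Java (not Drill code):
{noformat}
public class NullMessageDemo {
  public static void main(String[] args) {
    // An NPE thrown without a message has a null getMessage().
    Throwable t = new NullPointerException();
    // String concatenation renders the null message as the text "null".
    System.out.println("Query failed: SYSTEM ERROR: " + t.getMessage());
    // prints: Query failed: SYSTEM ERROR: null
  }
}
{noformat}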

There are two problems here:

1st problem:
- ScanBatch detects that it can't allocate its field value vectors and, right
before returning _OUT_OF_MEMORY_ downstream, calls _clear()_ on the field
vectors
- one of those vectors had already thrown a NullPointerException from its
_allocateNew()_ method: it had cleared its existing buffer but couldn't
allocate a new one, leaving the buffer null
- when ScanBatch tries to clear that vector, the null buffer triggers a second
NullPointerException, which prevents ScanBatch from returning _OUT_OF_MEMORY_
and cancels the query instead (a minimal sketch follows this list)
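
A minimal, self-contained sketch of this first failure mode (all names and
signatures here are illustrative stand-ins, not Drill's actual
ValueVector/ScanBatch code):
{noformat}
public class NpeOnClearSketch {

  /** Stand-in for a direct-memory buffer handle. */
  static final class Buffer {
    void retain() { }
    void release() { }
  }

  /** Stand-in for a field value vector. */
  static final class Vector {
    Buffer buffer = new Buffer();

    void allocateNew(int size) {
      buffer.release();
      buffer = null;                // old buffer is gone at this point
      buffer = tryAllocate(size);   // returns null when memory is exhausted
      buffer.retain();              // NPE: allocation failed, buffer is null
    }

    void clear() {
      buffer.release();             // NPE again: buffer was left null above
      buffer = null;
    }

    static Buffer tryAllocate(int size) {
      return null;                  // simulate an exhausted allocator
    }
  }

  public static void main(String[] args) {
    Vector v = new Vector();
    try {
      v.allocateNew(1 << 20);       // first NPE, thrown from allocateNew()
    } catch (NullPointerException e) {
      // ScanBatch detects the failed allocation and, before returning
      // OUT_OF_MEMORY downstream, clears its field vectors...
    }
    v.clear();                      // ...but the second NPE escapes here,
                                    // so OUT_OF_MEMORY is never returned
  }
}
{noformat}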

2nd problem:
- once the query has been canceled, _ScanBatch.cleanup()_ throws yet another
_NullPointerException_ while clearing the field vectors, which prevents the
cleanup of the remaining resources and causes a memory leak (a defensive
_clear()_ is sketched below)
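
One defensive direction covering both symptoms (an illustrative sketch, not
necessarily the committed fix) is to make _clear()_ tolerate a vector whose
allocation already failed, so cleanup can keep going and release the
remaining resources:
{noformat}
    void clear() {
      if (buffer != null) {         // tolerate a failed allocateNew()
        buffer.release();
        buffer = null;
      }
    }
{noformat}
With a guard like this, the original failure from _allocateNew()_ can still be
surfaced as _OUT_OF_MEMORY_, and _ScanBatch.cleanup()_ can finish clearing the
other vectors instead of leaking them.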

  was:
- set _drill.exec.memory.top.max_ in _drill-override.conf_ to some low value (I
used _1000000000_)
- disable hash aggregate (set _planner.enable_hashagg_ to false)
- disable exchanges (set _planner.disable_exchanges_ to true)
- run the following query
{noformat}
select count(*) from (select * from dfs.data.`tpch1/lineitem.parquet` order by 
l_orderkey);
{noformat}
and you should get the following error message:
{noformat}
Query failed: SYSTEM ERROR: null

Fragment 0:0

[e05ff3c2-e130-449e-b721-b3442796e29b on 172.30.1.1:31010]
{noformat}

There are two problems here:

1st problem:
- ScanBatch detects that it can't allocate its field value vectors and, right
before returning _OUT_OF_MEMORY_ downstream, calls _clear()_ on the field
vectors
- one of those vectors had already thrown a NullPointerException from its
_allocateNew()_ method: it had cleared its existing buffer but couldn't
allocate a new one, leaving the buffer null
- when ScanBatch tries to clear that vector, the null buffer triggers a second
NullPointerException, which prevents ScanBatch from returning _OUT_OF_MEMORY_
and cancels the query instead

2nd problem:
- once the query has been canceled, _ScanBatch.cleanup()_ throws yet another
_NullPointerException_ while clearing the field vectors, which prevents the
cleanup of the remaining resources and causes a memory leak


> ScanBatch throws a NullPointerException instead of returning OUT_OF_MEMORY
> --------------------------------------------------------------------------
>
>                 Key: DRILL-2893
>                 URL: https://issues.apache.org/jira/browse/DRILL-2893
>             Project: Apache Drill
>          Issue Type: Sub-task
>          Components: Execution - Relational Operators
>            Reporter: Deneche A. Hakim
>            Assignee: Deneche A. Hakim
>             Fix For: 1.0.0
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
