[ 
https://issues.apache.org/jira/browse/DRILL-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5211:
-------------------------------
    Comment: was deleted

(was: Attached are two proposals. The first provides background information 
about the issue, including the solutions considered. The second is a 
detailed proposal for enforcing vector size limits at the lowest levels of the 
code: the vectors themselves and the "mutator" that writes data to the vectors.

Higher-level proposals will follow for creating a new version of the scan 
batch operator, and related mechanisms, so that readers can be retrofitted 
with the size-aware "mutator.")
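
For illustration only, a minimal sketch of the kind of check such a 
size-aware mutator could apply when writing values into a vector. All names 
here are hypothetical, not Drill's actual vector or mutator API; the real 
design is in the attached proposals.

{code:java}
// Hypothetical sketch of a size-aware mutator: before accepting a value,
// it checks a per-vector byte budget so a single vector never grows into
// the huge allocations that fragment direct memory. Illustrative only.
public class SizeAwareMutator {
  // A cap well below the 2 GB (Integer.MAX_VALUE) hard buffer limit; the
  // actual value would be chosen to keep batches small and easy to place.
  private static final int MAX_VECTOR_BYTES = 16 * 1024 * 1024;

  private int bytesUsed;

  /** Returns true if the value fits within this vector's size budget. */
  public boolean canWrite(int valueBytes) {
    return bytesUsed + valueBytes <= MAX_VECTOR_BYTES;
  }

  /** Writes a value, or signals the caller to start a new batch. */
  public void write(byte[] value) {
    if (!canWrite(value.length)) {
      // Caller should close out the current batch and begin a new one.
      throw new IllegalStateException("Vector size budget exceeded");
    }
    // ... append value bytes to the underlying vector buffer ...
    bytesUsed += value.length;
  }
}
{code}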

> Queries fail due to direct memory fragmentation
> -----------------------------------------------
>
>                 Key: DRILL-5211
>                 URL: https://issues.apache.org/jira/browse/DRILL-5211
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: 1.9.0
>
>         Attachments: ApacheDrillMemoryFragmentationBackground.pdf, 
> ApacheDrillVectorSizeLimits.pdf
>
>
> Consider a test of the external sort as follows:
> * Direct memory: 3GB
> * Input file: 18 GB, with one Varchar column of 8K width
> The sort runs, spilling to disk. Once all data arrives, the sort begins to 
> merge the results. But, to do that, it must first perform an intermediate 
> merge. For example, in this sort, there are 190 spill files, but only 19 can 
> be merged at a time. (Each merge file contains 128 MB batches, and only 19 
> such batches fit in memory, giving a total footprint of 19 x 128 MB, about 
> 2.5 GB, well below the 3 GB limit.)
> Yet, when loading batch xx, Drill fails with an OOM error. At that point, 
> total available direct memory is 3,817,865,216 bytes (obtained from 
> {{maxMemory}} in the {{Bits}} class in the JDK). 
> It appears that Drill wants to allocate 58,257,868 bytes, but 
> {{totalCapacity}} (again in {{Bits}}) is already at 3,800,769,206 bytes, 
> causing an OOM.
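> This matches the check that the JDK itself performs: a direct allocation 
> is refused once the JVM-wide total capacity plus the request would exceed 
> the direct memory limit, regardless of how much memory Drill's own 
> allocator considers free. A simplified sketch of that bookkeeping, using 
> the numbers above ({{DirectMemoryCheck}} is illustrative, not JDK code):
> {code:java}
> // Simplified form of the reservation check in java.nio.Bits (JDK 8 era):
> // an allocation of `size` bytes fails once the JVM-wide totalCapacity
> // plus the request would exceed maxMemory.
> public class DirectMemoryCheck {
>   static boolean wouldFitDirect(long totalCapacity, long maxMemory, long size) {
>     return totalCapacity + size <= maxMemory;
>   }
>
>   public static void main(String[] args) {
>     // Numbers from this report:
>     //   3,800,769,206 + 58,257,868 = 3,859,027,074 > 3,817,865,216
>     System.out.println(wouldFitDirect(
>         3_800_769_206L, 3_817_865_216L, 58_257_868L)); // prints false -> OOM
>   }
> }
> {code}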
> The problem is that, at this point, the external sort should not need to 
> ask the system for more memory. The allocator for the external sort stands 
> at just 1,192,350,366 bytes before the allocation request. Plenty of spare 
> memory should be available, released when the in-memory batches were 
> spilled to disk prior to merging. Indeed, earlier in the run, the sort had 
> reached a peak memory usage of 2,710,716,416 bytes. This memory should be 
> available for reuse during merging, and is more than sufficient to satisfy 
> the particular request in question.
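> One way to confirm this condition in a running Drillbit is to compare an 
> allocator's own accounting against the JVM-wide direct buffer counters, 
> which expose the same totals as the {{Bits}} fields through the standard 
> {{BufferPoolMXBean}} API. A diagnostic sketch (assuming a Drill 
> {{BufferAllocator}} handle, e.g. the sort operator's allocator):
> {code:java}
> import java.lang.management.BufferPoolMXBean;
> import java.lang.management.ManagementFactory;
> import org.apache.drill.exec.memory.BufferAllocator;
>
> public class DirectMemoryDiagnostics {
>   /** Logs what one Drill allocator thinks it holds next to what the JVM
>    *  has actually reserved in direct memory. A large gap points at memory
>    *  retained elsewhere: other allocators, or freed buffers that the
>    *  pooled allocator has not returned to the JVM. */
>   public static void report(BufferAllocator allocator) {
>     for (BufferPoolMXBean pool :
>         ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
>       if ("direct".equals(pool.getName())) {
>         System.out.printf("JVM direct totalCapacity: %,d bytes%n",
>             pool.getTotalCapacity());
>       }
>     }
>     System.out.printf("allocator allocated: %,d, peak: %,d bytes%n",
>         allocator.getAllocatedMemory(), allocator.getPeakMemoryAllocation());
>   }
> }
> {code}
> In the failure above, such a report would show the sort allocator at about 
> 1.2 GB while the JVM-wide pool sits near the 3.8 GB limit: the signature 
> of the fragmentation described here.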



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
