[ 
https://issues.apache.org/jira/browse/DRILL-5273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871341#comment-15871341
 ] 

Paul Rogers commented on DRILL-5273:
------------------------------------

The issue occurs with "Managed buffers". These are clever, but the timing is 
problematic.

* Each record reader can allocate a managed buffer to hold intermediate results.
* Managed buffers go onto a list managed by the operator context, the 
BufferManagerImpl in particular.
* When closing an operator, normally the operator closes its own allocator.
* However, during fragment shutdown, the FragmentManager calls close for 
each operator *before* closing the BufferManagerImpl.
* As a result, we get the errors mentioned above if we close the allocator 
during the operator close.
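The ordering problem above can be sketched with hypothetical stand-in classes (these are simplifications, not Drill's real Allocator or BufferManagerImpl): a managed buffer is tracked by both the operator's allocator and the fragment-level buffer manager, so whichever closes second trips over already-freed memory.

```java
import java.util.ArrayList;
import java.util.List;

public class ShutdownOrder {
  // Stand-in for a direct-memory buffer that may only be released once.
  static class Buffer {
    boolean released;
    void release() {
      if (released) throw new IllegalStateException("double release");
      released = true;
    }
  }

  // Stand-in for an operator's allocator: closing it frees everything it owns.
  static class Allocator {
    final List<Buffer> owned = new ArrayList<>();
    Buffer allocate() { Buffer b = new Buffer(); owned.add(b); return b; }
    void close() { owned.forEach(Buffer::release); }
  }

  // Stand-in for BufferManagerImpl: it ALSO tracks (and frees) the same buffers.
  static class BufferManager {
    final List<Buffer> managed = new ArrayList<>();
    Buffer getManagedBuffer(Allocator a) {
      Buffer b = a.allocate();
      managed.add(b);
      return b;
    }
    void close() { managed.forEach(Buffer::release); }
  }

  public static void main(String[] args) {
    Allocator opAllocator = new Allocator();
    BufferManager mgr = new BufferManager();
    mgr.getManagedBuffer(opAllocator);

    opAllocator.close();   // operator close frees the managed buffer first...
    try {
      mgr.close();         // ...then the buffer-manager close frees it again
      System.out.println("no error");
    } catch (IllegalStateException e) {
      System.out.println("error: " + e.getMessage()); // prints "error: double release"
    }
  }
}
```

Swapping the two close calls (buffer manager first, then allocator) would make the double release disappear, which is the ordering argued for below.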

Instead:

* Operators *should not* close their own allocators. Let the BufferManagerImpl 
do it.

This cleans up the managed buffers properly, and it is why we saw no memory 
leaks earlier.

However, very bad things happen with the {{ScanBatch}}. Each of the 5000 
record readers allocates just over 1 MB of managed buffers. These are released 
only when the {{ScanBatch}}'s {{BufferManagerImpl}} is closed. But, by then, 
we've consumed over 5 GB of memory.

Somehow, we've got to release the managed buffers earlier, or not use managed 
buffers.
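A back-of-the-envelope sketch of why releasing the buffers at reader EOF (rather than at fragment close) bounds the footprint; the buffer sizes come from the constants quoted below, everything else is hypothetical:

```java
public class EarlyRelease {
  // READ_BUFFER + WHITE_SPACE_BUFFER, per the constants in the issue below.
  static final long BUFFER_BYTES = 1024 * 1024 + 64 * 1024; // 1,114,112 bytes

  public static void main(String[] args) {
    int readers = 5000;

    // Current behavior: every reader's buffers live until fragment shutdown.
    long peakLate = (long) readers * BUFFER_BYTES;

    // Early release: each reader frees its buffers at EOF, so at most one
    // reader's buffers are live at any moment.
    long allocated = 0, peakEarly = 0;
    for (int i = 0; i < readers; i++) {
      allocated += BUFFER_BYTES;                  // reader setup
      peakEarly = Math.max(peakEarly, allocated);
      allocated -= BUFFER_BYTES;                  // reader close releases
    }

    System.out.println("late release peak:  " + peakLate);  // prints 5570560000 (> 5 GB)
    System.out.println("early release peak: " + peakEarly); // prints 1114112 (~1 MB)
  }
}
```

The per-reader peak fits easily in 4 GB of direct memory, while the accumulated peak does not, matching the OOM described in the issue.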

> CompliantTextReader exhausts 4 GB memory when reading 5000 small files
> ----------------------------------------------------------------------
>
>                 Key: DRILL-5273
>                 URL: https://issues.apache.org/jira/browse/DRILL-5273
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.10
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: 1.10
>
>
> A test case was created that consists of 5000 text files, each with a single 
> line with the file number: 1 to 5001. Each file has a single record, and at 
> most 4 characters per record.
> Run the following query:
> {code}
> SELECT * FROM `dfs.data`.`5000files/text`
> {code}
> The query will fail with an OOM in the scan batch on around record 3700 on a 
> Mac with 4GB of direct memory.
> The code to read records in {{ScanBatch}} is complex. The following appears to 
> occur:
> * Iterate over the record readers for each file.
> * For each reader, call its {{setup()}} method.
> The setup code is:
> {code}
>   public void setup(OperatorContext context, OutputMutator outputMutator)
>       throws ExecutionSetupException {
>     oContext = context;
>     readBuffer = context.getManagedBuffer(READ_BUFFER);
>     whitespaceBuffer = context.getManagedBuffer(WHITE_SPACE_BUFFER);
> {code}
> The two buffers are in direct memory. There is no code that releases the 
> buffers.
> The sizes are:
> {code}
>   private static final int READ_BUFFER = 1024*1024;
>   private static final int WHITE_SPACE_BUFFER = 64*1024;
> = 1,048,576 + 65536 = 1,114,112
> {code}
> This is exactly the amount of memory that accumulates per call to 
> {{ScanBatch.next()}}:
> {code}
> Ctor: 0  -- Initial memory in constructor
> Init setup: 1114112  -- After call to first record reader setup
> Entry Memory: 1114112  -- first next() call, returns one record
> Entry Memory: 1114112  -- second next(), eof and start second reader
> Entry Memory: 2228224 -- third next(), second reader returns EOF
> ...
> {code}
> If we leak 1 MB per file, with 5000 files we would leak 5 GB of memory, which 
> would explain the OOM when given only 4 GB.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
