paul-rogers commented on issue #2871:
URL: https://github.com/apache/drill/issues/2871#issuecomment-1895132614

   @weijunlu, thanks for this report. You've encountered one of the two 
memory-related issues that confuse many new users of Drill (or Java).
   
   Your report appears to say that you a) configure direct memory, b) run a 
query that uses quite a bit of memory, and c) find that Drill has not returned 
the memory to the OS after the query completes. As it turns out, this is by 
design.
   
   Drill uses the Netty memory manager. Netty will request memory from the OS as 
needed, but will not release it back. Once the memory is allocated, Netty adds 
that memory block to a free list inside Drill itself where Drill can reuse it 
for a later query.
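   
   To make the free-list idea concrete, here is a minimal sketch in Java. It 
is not Netty's or Drill's actual code: the class `ChunkPool` and its methods 
are hypothetical names made up for illustration, and the chunk size is tiny so 
the demo stays cheap. It only shows the pattern described above: ask for 
memory when needed, and on release keep the block for reuse instead of handing 
it back.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy pool (hypothetical): memory is requested on demand and never given back. */
final class ChunkPool {
    private final int chunkSize;
    private final Deque<ByteBuffer> freeList = new ArrayDeque<>();
    private int chunksRequested = 0; // counts new allocations only

    ChunkPool(int chunkSize) {
        this.chunkSize = chunkSize;
    }

    /** Reuse a free chunk if one exists; otherwise allocate a new direct buffer. */
    ByteBuffer acquire() {
        ByteBuffer chunk = freeList.pollFirst();
        if (chunk == null) {
            chunk = ByteBuffer.allocateDirect(chunkSize);
            chunksRequested++;
        }
        chunk.clear();
        return chunk;
    }

    /** "Freeing" a chunk only puts it on the free list inside the process. */
    void release(ByteBuffer chunk) {
        freeList.addFirst(chunk);
    }

    int chunksRequested() {
        return chunksRequested;
    }

    int freeChunks() {
        return freeList.size();
    }

    public static void main(String[] args) {
        ChunkPool pool = new ChunkPool(1024);
        // First "query": needs two chunks, both newly allocated.
        ByteBuffer a = pool.acquire();
        ByteBuffer b = pool.acquire();
        pool.release(a);
        pool.release(b);
        // Second "query": both chunks come from the free list.
        pool.acquire();
        pool.acquire();
        System.out.println("chunks requested: " + pool.chunksRequested());
    }
}
```

   In this sketch the second pair of `acquire()` calls allocates nothing new, 
which is the same reuse you should observe in Drill across query batches.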
   
   You can see this. After you run your first batch of queries, check the 
direct memory level. Then, without restarting Drill, run the same batch of 
queries a second time. You should see that the amount of memory allocated to 
Drill stays about the same: the direct memory was used, released, and then 
reused -- all within Drill. Memory might grow a bit as explained below.
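   
   If you want to read the direct memory level from inside a JVM, one option 
is the standard `BufferPoolMXBean`. The sketch below is an assumption about 
how you might probe it, not something Drill ships: the class name 
`DirectMemoryProbe` is hypothetical, the probe only sees the JVM it runs in 
(so for Drill it would have to run inside the Drillbit or be read over JMX), 
and depending on how Netty is configured some of its allocations may not be 
counted by this JVM counter.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

/** Hypothetical probe: reads the JVM's own accounting of direct buffers. */
final class DirectMemoryProbe {

    /** Bytes held by the JVM's "direct" buffer pool, or -1 if not reported. */
    static long directBytesUsed() {
        for (BufferPoolMXBean pool
                : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long before = directBytesUsed();
        // Hold a 1 MB direct buffer so the counter has something to show.
        ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 20);
        long after = directBytesUsed();
        System.out.println("direct bytes before: " + before);
        System.out.println("direct bytes after:  " + after);
        System.out.println("buffer capacity:     " + buffer.capacity());
    }
}
```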
   
   Now, it might be possible to release memory back to the OS, but only in a 
very lightly used system in which there are times when no queries are active. 
Why? Drill uses a "binary buddy" memory allocation starting with blocks of 16MB 
in size, then slicing memory up into smaller pieces as needed. If any portion 
of that 16MB is allocated, the 16MB block cannot be released back to the OS. 
Drill was designed for an environment with heavy usage, in which case every 
16MB block will have at least some of its memory in use at all times. Given 
that environment, it did not make sense to try to free 16MB blocks only to 
immediately request them again: Drill would just thrash the memory subsystem.
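   
   Here is a toy binary buddy allocator in Java that shows why one small 
allocation pins a whole 16MB block. It is a simplified sketch, not Netty's 
implementation (which is considerably more elaborate): `BuddyChunk` and its 
methods are hypothetical names, and the chunk only does bookkeeping rather 
than managing real memory.

```java
/** Toy binary buddy bookkeeping for one chunk (hypothetical, no real memory). */
final class BuddyChunk {

    private static final class Node {
        final int size;
        boolean used;     // true when handed out as a whole block
        Node left, right; // non-null when split into two buddies

        Node(int size) {
            this.size = size;
        }

        boolean isFree() {
            return !used && left == null;
        }
    }

    private final Node root;

    /** chunkSize must be a power of two, e.g. 16 * 1024 * 1024. */
    BuddyChunk(int chunkSize) {
        root = new Node(chunkSize);
    }

    /** Returns the offset of a block of at least size bytes, or -1 if none fits. */
    int allocate(int size) {
        int blockSize = size <= 1 ? 1 : Integer.highestOneBit(size - 1) << 1;
        return allocate(root, 0, blockSize);
    }

    private int allocate(Node n, int offset, int blockSize) {
        if (n.used || n.size < blockSize) {
            return -1;
        }
        if (n.left == null) {            // free, unsplit block
            if (n.size == blockSize) {
                n.used = true;
                return offset;
            }
            n.left = new Node(n.size / 2);  // slice into two buddies
            n.right = new Node(n.size / 2);
        }
        int result = allocate(n.left, offset, blockSize);
        if (result < 0) {
            result = allocate(n.right, offset + n.size / 2, blockSize);
        }
        return result;
    }

    /** Frees the block that starts at offset (as returned by allocate). */
    void free(int offset) {
        free(root, 0, offset);
    }

    private void free(Node n, int nodeOffset, int offset) {
        if (n.left == null) {
            n.used = false;
            return;
        }
        int half = n.size / 2;
        if (offset < nodeOffset + half) {
            free(n.left, nodeOffset, offset);
        } else {
            free(n.right, nodeOffset + half, offset);
        }
        if (n.left.isFree() && n.right.isFree()) { // merge free buddies
            n.left = null;
            n.right = null;
        }
    }

    /** Only a chunk with nothing allocated could be returned to the OS. */
    boolean isEntirelyFree() {
        return root.isFree();
    }

    public static void main(String[] args) {
        BuddyChunk chunk = new BuddyChunk(16 * 1024 * 1024);
        int big = chunk.allocate(1024 * 1024); // 1MB for one query
        int small = chunk.allocate(64 * 1024); // 64KB for another
        chunk.free(big);
        System.out.println(chunk.isEntirelyFree()); // false: 64KB still in use
        chunk.free(small);
        System.out.println(chunk.isEntirelyFree()); // true
    }
}
```

   In the demo, freeing the 1MB block is not enough: the chunk becomes 
releasable only once the last 64KB slice is freed, which rarely happens on a 
busy server.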
   
   Drill (actually Netty) will request more memory from the OS once the current 
set of 16MB blocks becomes allocated. So, in a busy system, memory will continue 
to grow. Many people see this and say, "memory leak!" But, it is not a leak, it 
is by design. Drill will continue to allocate memory until it reaches the limit 
you set in the configuration. If Drill still needs more memory after that, 
queries will fail with an out-of-memory (OOM) error.
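   
   The growth pattern can be sketched with plain accounting. This is an 
illustration only: `BoundedChunkAccounting` is a hypothetical class, no real 
memory is allocated, the 48MB limit is an arbitrary number for the demo, and 
`IllegalStateException` stands in for Drill's real out-of-memory error.

```java
/** Hypothetical accounting of 16MB blocks under a configured limit. */
final class BoundedChunkAccounting {
    private final long chunkBytes;
    private final long limitBytes;
    private long reservedBytes = 0; // memory obtained from the OS; never shrinks
    private long freeChunks = 0;    // blocks held for reuse

    BoundedChunkAccounting(long chunkBytes, long limitBytes) {
        this.chunkBytes = chunkBytes;
        this.limitBytes = limitBytes;
    }

    /** Take a free block if there is one; otherwise grow, up to the limit. */
    void acquire() {
        if (freeChunks > 0) {
            freeChunks--;
            return;
        }
        if (reservedBytes + chunkBytes > limitBytes) {
            throw new IllegalStateException(
                "out of memory: limit of " + limitBytes + " bytes reached");
        }
        reservedBytes += chunkBytes;
    }

    /** Releasing a block makes it reusable but does not reduce reservedBytes. */
    void release() {
        freeChunks++;
    }

    long reservedBytes() {
        return reservedBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        BoundedChunkAccounting memory = new BoundedChunkAccounting(16 * mb, 48 * mb);
        memory.acquire();
        memory.acquire();
        memory.acquire(); // now at the 48MB limit
        try {
            memory.acquire(); // a fourth block does not fit
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        memory.release();
        memory.acquire(); // succeeds by reusing the released block
        System.out.println("reserved MB: " + memory.reservedBytes() / mb);
    }
}
```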
   
   Given that Drill is now most often used for smaller use cases, some very 
clever person might be able to find a way to ask the Netty memory allocator to 
return to the OS any 16MB block which is entirely free. That's a good 
enhancement project.
   
   I mentioned that this is one of two issues that confuse people. The other is 
the normal Java heap. Java itself will continue to allocate more memory for the 
heap as Drill runs, up to the configured limit. Java typically does not release 
its memory either: the unused memory is simply available on the heap for later 
reuse within Java.
   
   The summary is that Drill will eventually use all the heap and direct memory 
you allocate to it. Once allocated, the memory will not go back to the OS, even 
if Drill is idle. This is why it is important to configure the memory level to 
suit your use case: Drill won't grab and free OS memory based on load.
   
   Finally, of course, Drill will release all its memory back to the OS when 
Drill exits. So, if you run a query only every once in a while, simply use a 
script to bounce the Drill server after it has been idle for a while.
   
   Does this make sense?

