paul-rogers commented on issue #2871:
URL: https://github.com/apache/drill/issues/2871#issuecomment-1895132614

@weijunlu, thanks for this report. You've encountered one of the two memory-related issues that confuse many new users of Drill (or Java).

Your report appears to say that you a) configure direct memory, b) run a query that uses quite a bit of memory, and c) find that Drill has not returned the memory to the OS after the query completes. As it turns out, this is by design.

Drill uses the Netty memory manager. Netty will request memory from the OS as needed, but will not release it back. Once the memory is allocated, Netty adds that memory block to a free list inside Drill itself, where Drill can reuse it for a later query.

You can see this for yourself. After you run your first batch of queries, check the direct memory level. Then, without restarting Drill, run the same batch of queries a second time. You should see that the amount of memory allocated to Drill stays about the same: the direct memory was used, released, and then reused -- all within Drill. Memory might grow a bit, as explained below.

Now, it might be possible to release memory back to the OS, but only in a very lightly used system in which there are times when no queries are active. Why? Drill uses a "binary buddy" memory allocator that starts with blocks of 16MB, then slices that memory into smaller pieces as needed. If any portion of a 16MB block is allocated, that block cannot be released back to the OS.

Drill was designed for an environment with heavy usage, in which every 16MB block will have at least some of its memory in use at all times. Given that environment, it did not make sense to try to free 16MB blocks only to immediately request them again: Drill would just thrash the memory subsystem.

Drill (actually Netty) will request more memory from the OS once the current set of 16MB blocks is used up. So, in a busy system, memory will continue to grow. Many people see this and say, "memory leak!" But it is not a leak; it is by design.
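
To make the behavior above concrete, here is a minimal toy model in Python. It is only a sketch, not Netty's actual algorithm: `ToyChunkPool`, `run_batch`, and the batch sizes are all hypothetical names and values invented for illustration, and the model uses a simple first-fit count per chunk rather than real binary-buddy splitting. It only aims to show three things described above: the footprint grows in 16MB steps up to a configured cap, freed memory is reused rather than returned, and a chunk would be releasable only if it were entirely free.

```python
CHUNK_MB = 16  # chunk size quoted in the comment above


class ToyChunkPool:
    """Hypothetical stand-in for a chunked allocator; NOT Netty's real implementation."""

    def __init__(self, limit_mb):
        self.limit_mb = limit_mb  # assumed cap on memory taken from the "OS"
        self.used = []            # MB currently in use inside each chunk

    def footprint_mb(self):
        """Memory held from the OS, whether or not any of it is in use."""
        return len(self.used) * CHUNK_MB

    def allocate(self, size_mb):
        """Reserve size_mb and return the index of the chunk that holds it."""
        if size_mb > CHUNK_MB:
            raise ValueError("toy model only handles requests up to one chunk")
        # First try to reuse space in a chunk we already hold.
        for i, used in enumerate(self.used):
            if used + size_mb <= CHUNK_MB:
                self.used[i] += size_mb
                return i
        # Otherwise grow by one whole chunk, if the cap allows it.
        if self.footprint_mb() + CHUNK_MB > self.limit_mb:
            raise MemoryError("toy pool: memory limit reached")
        self.used.append(size_mb)
        return len(self.used) - 1

    def free(self, chunk, size_mb):
        """Return space to the pool's own free space, not to the OS."""
        self.used[chunk] -= size_mb

    def releasable_chunks(self):
        """Chunks that are entirely free and so could, in principle, be given back."""
        return sum(1 for used in self.used if used == 0)


def run_batch(pool, sizes_mb):
    """Allocate everything, record the peak footprint, then free everything."""
    handles = [(pool.allocate(size), size) for size in sizes_mb]
    peak = pool.footprint_mb()
    for chunk, size in handles:
        pool.free(chunk, size)
    return peak


if __name__ == "__main__":
    pool = ToyChunkPool(limit_mb=64)
    batch = [8, 8, 4, 12, 6]  # made-up allocation sizes in MB

    run_batch(pool, batch)
    print("after first batch: ", pool.footprint_mb(), "MB held")
    run_batch(pool, batch)
    print("after second batch:", pool.footprint_mb(), "MB held")

    pool.allocate(1)  # one small, long-lived allocation
    print("releasable chunks: ", pool.releasable_chunks(), "of", len(pool.used))
```

In this toy run the pool holds 48 MB after the first batch and still holds 48 MB after the second, because the second batch fits entirely in the chunks freed by the first. The single 1 MB allocation at the end pins a whole 16MB chunk, which is the situation a busy server is in almost all of the time.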

Drill will continue to allocate memory until it reaches the limit you set in the configuration. If Drill still needs more memory after that, queries will fail with an out-of-memory (OOM) error.

Given that Drill is now most often used for smaller use cases, some very clever person might be able to find a way to ask the Netty memory allocator to return to the OS any 16MB block that is entirely free. That's a good enhancement project.

I mentioned that this is one of two issues that confuse people. The other is the normal Java heap. Java itself will continue to allocate more memory for the heap as Drill runs, up to the configured limit. Java never releases its memory either: the unused memory is simply available on the heap for later reuse within Java.

The summary is that Drill will eventually use all the heap and direct memory you allocate to it. Once allocated, the memory will not go back to the OS, even if Drill is idle. This is why it is important to configure the memory level to suit your use case: Drill won't grab and free OS memory based on load.

Finally, of course, Drill will release all its memory back to the OS when Drill exits. So, if you run a query only every once in a while, simply use a script to bounce the Drill server after it has been idle for a while.

Does this make sense?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org