Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/24402 )

Change subject: IMPALA-14900: Add support for turning off aggressive decommit
......................................................................

IMPALA-14900: Add support for turning off aggressive decommit

Impala has used TCMalloc's aggressive decommit setting for several
years, but it increases the OS allocation / deallocation rate and can
lead to contention on TCMalloc's central structures.

There are many pieces of code that still rely on malloc for their
memory, including performance sensitive pieces of query execution.
Retaining some malloc memory can accelerate those codepaths by
avoiding OS allocation / deallocation cycles. TCMalloc holds a lock
while allocating and deallocating memory, and retaining memory can
also avoid extreme cases with high lock contention. For example, there
have been previous issues using 1MB Parquet data pages, because the
large allocations bypass the thread caches and come directly from the
central structures.

There is a long history behind this setting:
- As of late 2015 / Impala 2.3, Impala would let tcmalloc retain
  memory. It had two mechanisms for releasing memory. The first was a
  periodic check to see if the overhead of tcmalloc exceeded the
  memory used. The second was a garbage collection function that ran
  when hitting the process memory limit. Both mechanisms would free
  ALL excess tcmalloc heap memory via a single call to
  ReleaseFreeMemory(). TCMalloc holds a lock for this call, which can
  stall other work until it completes. It could be freeing dozens of
  GBs and this could hold the lock for 15 seconds. This issue was
  reported via IMPALA-2800.
- In IMPALA-3162, Impala moved to gperftools 2.5, which had aggressive
  decommit enabled by default. This frees memory immediately, so the
  mechanisms to free memory had nothing to do. This solved
  IMPALA-2800. The obsolete code for the periodic check and garbage
  collection function was removed in IMPALA-5220.
- Gperftools only had aggressive decommit enabled by default for a
  short period of time. It was enabled by default in 2.4 and was
  disabled by default in 2.6.
- When Impala upgraded gperftools later, we added code to manually set
  aggressive decommit.

This adds back an option to turn off aggressive decommit. The shape is
similar to the old mechanisms: there is a background thread doing a
periodic check to manage the memory overhead and a garbage collection
function that gets called when hitting the process memory limit.

This has been redesigned to avoid the issue from IMPALA-2800 (based on
an early approach to IMPALA-2800 by Todd Lipcon):
- Both enforcement locations free a specific amount of memory rather
  than all accumulated memory (i.e. they call ReleaseToSystem() with a
  target amount of memory to free). The background thread maintains an
  overhead specified by the tcmalloc_max_free_bytes startup option.
  This can be an absolute value or a percentage of the process memory
  limit. It defaults to 5% of the process memory limit. The garbage
  collection function frees enough memory to avoid hitting the process
  memory limit, plus a bit extra (512MB) to avoid calling the GC
  function too frequently.
- Both enforcement locations free memory in small chunks to avoid
  holding the lock for extended periods of time. The chunk size is
  specified by the tcmalloc_garbage_collection_chunk_size startup
  option and defaults to 10MB.
- The implementation retains significantly less memory and frees it
  without holding the lock for extended periods of time.
- Other things have changed since then: The buffer pool retains memory
  and frees it gradually over time. This also reduces the need for
  freeing a large amount of memory immediately.

Turning off aggressive decommit is currently incompatible with the
madvise_huge_pages=true startup option. This modifies the startup
check so that aggressive decommit can be false if madvise_huge_pages
is false.
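
To illustrate the chunked release described in the redesign notes
above, here is a minimal C++ sketch. It is not the code from this
change: PercentOfLimit, BytesOverLimit and ReleaseInChunks are
hypothetical names, and release_fn is a stand-in for a call such as
gperftools' MallocExtension::instance()->ReleaseToSystem(). The sketch
assumes the percentage form of tcmalloc_max_free_bytes is resolved
against the process memory limit with simple integer arithmetic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical helper: resolves a percentage setting (e.g. 5) against the
// process memory limit. Integer division truncates, which is fine for a sketch.
inline int64_t PercentOfLimit(int64_t percent, int64_t process_mem_limit) {
  return process_mem_limit / 100 * percent;
}

// Hypothetical helper: bytes that must be released so that the allocator's
// free memory drops back to max_free_bytes. Never negative.
inline int64_t BytesOverLimit(int64_t free_bytes, int64_t max_free_bytes) {
  return std::max<int64_t>(0, free_bytes - max_free_bytes);
}

// Releases bytes_to_free in pieces of at most chunk_size bytes, so that each
// individual call to release_fn (the stand-in for ReleaseToSystem()) only
// covers one chunk. Returns the number of calls made.
inline int ReleaseInChunks(int64_t bytes_to_free, int64_t chunk_size,
    const std::function<void(int64_t)>& release_fn) {
  assert(chunk_size > 0);
  int calls = 0;
  while (bytes_to_free > 0) {
    const int64_t chunk = std::min(bytes_to_free, chunk_size);
    release_fn(chunk);
    bytes_to_free -= chunk;
    ++calls;
  }
  return calls;
}
```

In this sketch, the background thread would pass the result of
BytesOverLimit() to ReleaseInChunks(), while the garbage collection
path would pass the amount needed to get under the process memory
limit plus the extra 512MB. Other threads can then take the allocator
lock between chunks instead of waiting for one large release.
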
A future change may provide a way to mmap huge buffers so that
madvise_huge_pages=true and turning off aggressive decommit can work
together.

Change-Id: If6022f14093f362a5de9a854f4f4496c90b049b8
---
M be/src/common/daemon-env.cc
M be/src/common/global-flags.cc
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/mem-tracker-test.cc
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/test-env.cc
M be/src/util/malloc-util-gperftools.h
M be/src/util/malloc-util-libc.h
M be/src/util/malloc-util-sanitizers.h
M be/src/util/malloc-util.h
12 files changed, 229 insertions(+), 57 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/24402/1
--
To view, visit http://gerrit.cloudera.org:8080/24402
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If6022f14093f362a5de9a854f4f4496c90b049b8
Gerrit-Change-Number: 24402
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell <[email protected]>
