Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/24403 )

Change subject: IMPALA-14702: Add ability to build against Google Tcmalloc
......................................................................

IMPALA-14702: Add ability to build against Google Tcmalloc

Impala currently uses Gperftools TCMalloc, which was originally developed by Google but is now maintained by its own open source community. Google continued development internally and created a new open source project with their improved version. The biggest changes are:
 - Google TCMalloc uses Linux RSEQ functionality to use CPU caches rather than thread caches. This avoids stranding memory in inactive threads. It also avoids work when threads start and stop.
 - Google TCMalloc adds native huge page support. It backs most allocations with huge pages, which can reduce TLB misses.
There are many other changes across many other areas, including profiling and NUMA support.

This adds support for building against Google TCMalloc. It is currently controlled by the IMPALA_MALLOC_IMPL environment variable, which defaults to "gperftools". When set to "googletcmalloc", it builds against Google TCMalloc.

This is using a custom CMake build of Google TCMalloc with a couple of patches to make it work. Unlike the regular Google TCMalloc, this uses madvise() with MADV_HUGEPAGE to allow it to function on systems with only madvise huge page support. Google TCMalloc requires Abseil, so this adds an Abseil dependency.

Google TCMalloc retains unused memory, and Impala uses the same integration points as gperftools with aggressive decommit off. We start a background thread that periodically releases memory. Unlike gperftools, Google TCMalloc provides a MallocExtension::ProcessBackgroundActions() function that does various maintenance actions and releases memory periodically to control the memory overhead. Rather than implementing our own logic, we use that logic and rely on its decisions about retaining memory.
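
As an illustration of the madvise() with MADV_HUGEPAGE approach mentioned above, here is a minimal standalone C++ sketch. It is not code from this change or from Google TCMalloc: the helper names MapHugePageAligned and AdviseHugePages and the 2 MiB huge page size are assumptions made for the example. It maps a huge-page-aligned anonymous region on Linux and hints that the kernel should back it with transparent huge pages.

```cpp
#include <sys/mman.h>

#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Assumption for this sketch: 2 MiB huge pages (typical on x86-64).
constexpr std::size_t kHugePageSize = 2 * 1024 * 1024;

// Hypothetical helper: maps an anonymous region of `len` bytes whose start
// address is aligned to kHugePageSize. Returns nullptr on failure.
// `len` must be a non-zero multiple of kHugePageSize.
void* MapHugePageAligned(std::size_t len) {
  assert(len > 0 && len % kHugePageSize == 0);
  // Over-allocate by one huge page so an aligned start always fits.
  const std::size_t padded = len + kHugePageSize;
  void* raw = mmap(nullptr, padded, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (raw == MAP_FAILED) return nullptr;
  const auto start = reinterpret_cast<std::uintptr_t>(raw);
  const std::uintptr_t aligned =
      (start + kHugePageSize - 1) & ~(kHugePageSize - 1);
  // Unmap the unused head and tail around the aligned region.
  const std::size_t head = aligned - start;
  if (head > 0) munmap(raw, head);
  const std::size_t tail = padded - head - len;
  if (tail > 0) munmap(reinterpret_cast<void*>(aligned + len), tail);
  return reinterpret_cast<void*>(aligned);
}

// Hypothetical helper: asks the kernel to back [addr, addr + len) with
// transparent huge pages. Returns true if the hint was accepted. This is
// only a hint; the kernel decides whether huge pages are actually used.
bool AdviseHugePages(void* addr, std::size_t len) {
  if (madvise(addr, len, MADV_HUGEPAGE) == 0) return true;
  std::fprintf(stderr, "madvise(MADV_HUGEPAGE) failed: %s\n",
               std::strerror(errno));
  return false;
}
```

A caller would map a region with MapHugePageAligned(n * kHugePageSize), pass it to AdviseHugePages(), and treat a false return as non-fatal, since the hint is rejected on kernels built without transparent huge page support.
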
We also register a garbage collection function to free memory immediately when hitting the process memory limit.

Since Google TCMalloc is aware of huge pages, this changes the buffer pool's madvise_huge_page to avoid using madvise() when the malloc implementation natively supports huge pages.

Google TCMalloc's per-CPU caches rely on RSEQ support, and its use of RSEQ currently conflicts with glibc's use of RSEQ. This disables glibc's use of RSEQ via GLIBC_TUNABLES=glibc.pthread.rseq=0 when using Google TCMalloc in the dev environment. There will be future changes to package this properly.

Change-Id: I5a84eacb66eb0a216bfb2159542a0d7e4ddf8ec2
---
M CMakeLists.txt
M be/CMakeLists.txt
M be/src/common/global-flags.cc
M be/src/runtime/bufferpool/system-allocator.cc
M be/src/runtime/bufferpool/system-allocator.h
M be/src/util/CMakeLists.txt
A be/src/util/malloc-util-googletcmalloc.h
M be/src/util/malloc-util.cc
M be/src/util/malloc-util.h
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
M bin/run-binary.sh
M bin/run-jvm-binary.sh
M common/thrift/metrics.json
14 files changed, 413 insertions(+), 11 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/24403/1
--
To view, visit http://gerrit.cloudera.org:8080/24403
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5a84eacb66eb0a216bfb2159542a0d7e4ddf8ec2
Gerrit-Change-Number: 24403
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell <[email protected]>
