thirdparty: use libc++ instead of libstdc++ for TSAN builds

This all began because I wanted two things:
1. To use the new gcc 5 ABI on platforms that default to it (such as Ubuntu
   Xenial). Other applications compiled on these platforms will use the new
   ABI, and the fact that the Kudu client forces them to use the old ABI is
   quite unfriendly.
2. To have working local TSAN builds again, which broke following the gcc 5
   ABI transition in Xenial.

There are a number of interconnected issues at play:
A. Until 3.9, LLVM did not recognize gcc's new ABI tags, which prevented
   Kudu's codegen module from building properly against the new ABI.
B. For TSAN builds, we rebuild some thirdparty dependencies against the
   libstdc++ from thirdparty, but the LLVM libraries are not among them. This
   may work when the system libstdc++ and the thirdparty libstdc++ are of the
   same version, but becomes increasingly untenable as the versions diverge.
   Why? Because libstdc++ only guarantees forward compatibility; that is, a
   binary linked against libstdc++ can only be used with a libstdc++ of the
   same version or newer. Attempts to run with an older libstdc++ can lead to
   run time errors about missing GLIBCXX symbols. As a point of reference, on
   Xenial the two libraries are more than a major version apart.
C. Continuing B, libstdc++ from gcc 5 actually breaks backward compatibility
   for certain C++11-only symbols by moving them to an inline namespace (e.g.
   std::error_category is now std::_V2::error_category). The LLVM libraries
   use these symbols, which means LLVM built against a gcc 5 libstdc++ cannot
   link against the older libstdc++ in thirdparty.
D. As the libstdc++ in thirdparty is from gcc 4, it is not multilib and does
   not provide new ABI symbols (e.g. std::__cxx11::string). This means that
   if the rest of Kudu tried to use the new ABI, TSAN builds would fail
   because the libstdc++ in use lacks the new ABI symbols.

After upgrading LLVM, the path of least resistance was to upgrade libstdc++ in
thirdparty, but what a saga that turned out to be. After much trial and error,
I gave up; I could not build libstdc++ from gcc 5 with clang, and we must use
clang to get the latest -fsanitize=thread support.

Are there any alternatives? Well, we can follow Chromium's lead and use libc++
for TSAN instead of libstdc++. I think this makes sense for several reasons:
- The LLVM build, such as it is, is much more friendly than gcc's build.
  Building libstdc++ out of all of gcc was always a little hacky.
- There's at least one large open-source project (Chromium) that has
  successfully gone down this path.

That brings us to this patch, which is largely about replacing libstdc++ with
libc++. Here are additional interesting details:
o We now build the entire set of TSAN-duplicated dependencies with
  -fsanitize=thread, not just protobuf. It doesn't affect correctness much
  either way, but it's simpler and an easier concept to extend to future
  sanitizers that DO care (e.g. MSAN).
o We now build LLVM twice: once against the system libstdc++ for build tools
  and the regular LLVM libraries, and a second time against libc++ for
  instrumented LLVM libraries. The first build is a little hokey: it'd be
  more "pure" to build LLVM three times: once for build tools, once for LLVM
  libraries, and once for instrumented LLVM libraries. But these builds are
  super long, so we optimize by combining the first two. The downside is that
  the first build now places build tools in 'installed-deps' instead of
  'installed'. I played around with placing build tools in 'installed' while
  placing the libraries in 'installed-deps', but found that to be too hacky.
o The full thirdparty build is now quite a bit longer on account of the
  second LLVM library build. I tried to mitigate this by reducing the amount
  of extra cruft built each time. An upcoming patch will address this further
  by splitting thirdparty into separate modules.
o libc++ depends on libc++abi, so we build that first.
o The libc++ and libc++abi builds are done standalone rather than with the
  LLVM library build, because it isn't possible to do them together AND have
  the LLVM libraries depend on libc++.
o build_python may now be invoked more than once, so I've changed it to be
  idempotent within the same run of build-thirdparty.sh.
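
As a rough illustration of the idempotency described in the last bullet, here
is a minimal sketch of a "find once, reuse afterwards" guard. It is not the
code from this patch: find_python_once, SEARCH_COUNT, and the placeholder path
are hypothetical names made up for the example, and the fallback branch only
stands in for the point where the real script would build Python from source.

```shell
# Hypothetical sketch; names below are illustrative, not from the patch.
PYTHON_EXECUTABLE=""
SEARCH_COUNT=0

find_python_once() {
  # A previous call already resolved the interpreter, so there is nothing
  # left to do. This early return is what makes repeated calls cheap.
  if [ -n "$PYTHON_EXECUTABLE" ]; then
    return 0
  fi

  # Count how many times the (potentially expensive) search actually runs.
  SEARCH_COUNT=$((SEARCH_COUNT + 1))

  # Assumption: use whatever interpreter is on PATH, and fall back to a
  # placeholder where a real script would build Python instead.
  PYTHON_EXECUTABLE=$(command -v python3 || command -v python || echo "/placeholder/python")
}

# Two callers in the same run; only the first one performs the search.
find_python_once
find_python_once
echo "searches performed: $SEARCH_COUNT"
```

The guard keys off the same variable that callers consume, so a second caller
in the same run gets the cached result without any extra bookkeeping.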

Change-Id: Id9e68126ae21e04469053009c5b3e4b588415895
Reviewed-on: http://gerrit.cloudera.org:8080/4511
Reviewed-by: Dan Burkert <d...@cloudera.com>
Tested-by: Dan Burkert <d...@cloudera.com>
Tested-by: Kudu Jenkins

Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/fd8a50a8
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/fd8a50a8
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/fd8a50a8

Branch: refs/heads/master
Commit: fd8a50a81ade1f4f43e3f186514919a68ad830bc
Parents: f9a005f
Author: Adar Dembo <a...@cloudera.com>
Authored: Tue Sep 20 16:26:46 2016 -0700
Committer: Adar Dembo <a...@cloudera.com>
Committed: Thu Sep 29 17:43:30 2016 +0000

----------------------------------------------------------------------
 CMakeLists.txt                                 |  30 +++--
 build-support/dist_test.py                     |   2 +-
 build-support/run-test.sh                      |   2 +-
 build-support/run_dist_test.py                 |   2 +-
 cmake_modules/FindPmem.cmake                   |   4 +-
 src/kudu/codegen/CMakeLists.txt                |   2 +-
 thirdparty/build-definitions.sh                | 134 +++++++++++++------
 thirdparty/build-thirdparty.sh                 | 118 ++++++++--------
 thirdparty/download-thirdparty.sh              |  20 ---
 .../patches/libstdcxx-fix-string-dtor.patch    |  54 --------
 .../patches/libstdcxx-fix-tr1-shared-ptr.patch |  21 ---
 thirdparty/vars.sh                             |   9 --
 12 files changed, 179 insertions(+), 219 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/kudu/blob/fd8a50a8/CMakeLists.txt
----------------------------------------------------------------------
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 6ddba6f..3c01ef7 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -81,6 +81,11 @@ set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} "${CMAKE_SOURCE_DIR}/cmake_modules")
 include(CMakeParseArguments)
 set(BUILD_SUPPORT_DIR ${CMAKE_CURRENT_SOURCE_DIR}/build-support)
+set(THIRDPARTY_DIR ${CMAKE_CURRENT_SOURCE_DIR}/thirdparty)
+set(THIRDPARTY_INSTALL_DIR ${THIRDPARTY_DIR}/installed)
+set(THIRDPARTY_INSTALL_COMMON_DIR ${THIRDPARTY_INSTALL_DIR}/common) +set(THIRDPARTY_INSTALL_UNINSTRUMENTED_DIR ${THIRDPARTY_INSTALL_DIR}/uninstrumented) +set(THIRDPARTY_INSTALL_TSAN_DIR ${THIRDPARTY_INSTALL_DIR}/tsan) # Allow "make install" to not depend on all targets. # @@ -98,7 +103,7 @@ set(CMAKE_ENABLE_EXPORTS true) # Make sure thirdparty stuff is up-to-date. if ("$ENV{NO_REBUILD_THIRDPARTY}" STREQUAL "") execute_process( - COMMAND ${CMAKE_SOURCE_DIR}/thirdparty/build-if-necessary.sh + COMMAND ${THIRDPARTY_DIR}/build-if-necessary.sh RESULT_VARIABLE THIRDPARTY_SCRIPT_RESULT) if (NOT (${THIRDPARTY_SCRIPT_RESULT} EQUAL 0)) message(FATAL_ERROR "Thirdparty was built unsuccessfully, terminating.") @@ -351,11 +356,12 @@ if (${KUDU_USE_TSAN}) add_definitions("-D_GLIBCXX_EXTERN_TEMPLATE=0") # Compile and link against the thirdparty TSAN instrumented libstdcxx. - set(TSAN_GCC_DIR "${CMAKE_SOURCE_DIR}/thirdparty/installed/tsan/gcc") - set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -Wl,-rpath,${TSAN_GCC_DIR}/lib -fsanitize=thread") - set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -nostdinc++ -L${TSAN_GCC_DIR}/lib") - set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -isystem ${TSAN_GCC_DIR}/include/c++/4.9.3") - set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -isystem ${TSAN_GCC_DIR}/include/c++/4.9.3/backward") + set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -Wl,-rpath,${THIRDPARTY_INSTALL_TSAN_DIR}/lib") + set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -fsanitize=thread") + set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -stdlib=libc++") + set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -nostdinc++") + set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -L${THIRDPARTY_INSTALL_TSAN_DIR}/lib") + set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -isystem ${THIRDPARTY_INSTALL_TSAN_DIR}/include/c++/v1") # Strictly speaking, TSAN doesn't require dynamic linking. 
But it does # require all code to be position independent, and the easiest way to @@ -733,14 +739,11 @@ function(ADD_THIRDPARTY_LIB LIB_NAME) endfunction() # Look in thirdparty prefix paths before anywhere else for system dependencies. -set(THIRDPARTY_PREFIX ${CMAKE_CURRENT_SOURCE_DIR}/thirdparty/installed/common) -set(CMAKE_PREFIX_PATH ${THIRDPARTY_PREFIX} ${CMAKE_PREFIX_PATH}) +set(CMAKE_PREFIX_PATH ${THIRDPARTY_INSTALL_COMMON_DIR} ${CMAKE_PREFIX_PATH}) if (${KUDU_USE_TSAN}) - set(CMAKE_PREFIX_PATH ${CMAKE_CURRENT_SOURCE_DIR}/thirdparty/installed/tsan - ${CMAKE_PREFIX_PATH}) + set(CMAKE_PREFIX_PATH ${THIRDPARTY_INSTALL_TSAN_DIR} ${CMAKE_PREFIX_PATH}) else() - set(CMAKE_PREFIX_PATH ${CMAKE_CURRENT_SOURCE_DIR}/thirdparty/installed/uninstrumented - ${CMAKE_PREFIX_PATH}) + set(CMAKE_PREFIX_PATH ${THIRDPARTY_INSTALL_UNINSTRUMENTED_DIR} ${CMAKE_PREFIX_PATH}) endif() ## Cyrus SASL @@ -881,7 +884,6 @@ ADD_THIRDPARTY_LIB(crcutil ## llvm # Note that llvm has a unique cmake setup. See kudu/codegen/CMakeLists.txt # for details. -set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${THIRDPARTY_PREFIX}/share/llvm) find_package(LLVM REQUIRED CONFIG) if(${LLVM_PACKAGE_VERSION} VERSION_LESS 3.4) message(FATAL_ERROR "LLVM version (${LLVM_PACKAGE_VERSION}) must be at least 3.4") @@ -913,7 +915,7 @@ endif() set(Boost_NO_BOOST_CMAKE ON) # Look for Boost in thirdparty only. -set(BOOST_ROOT ${THIRDPARTY_PREFIX}) +set(BOOST_ROOT ${THIRDPARTY_INSTALL_COMMON_DIR}) # As of cmake 3.5, there's still no way to force FindBoost.cmake to omit system # paths from its search. 
No combination of Boost_NO_SYSTEM_PATHS, BOOST_ROOT, or http://git-wip-us.apache.org/repos/asf/kudu/blob/fd8a50a8/build-support/dist_test.py ---------------------------------------------------------------------- diff --git a/build-support/dist_test.py b/build-support/dist_test.py index 1cf7b7b..0be81fe 100755 --- a/build-support/dist_test.py +++ b/build-support/dist_test.py @@ -63,7 +63,7 @@ DEPS_FOR_ALL = \ "build-support/lsan-suppressions.txt", # The LLVM symbolizer is necessary for suppressions to work - "thirdparty/installed/common/bin/llvm-symbolizer", + "thirdparty/installed/uninstrumented/bin/llvm-symbolizer", # Tests that use the external minicluster require these. # TODO: declare these dependencies per-test. http://git-wip-us.apache.org/repos/asf/kudu/blob/fd8a50a8/build-support/run-test.sh ---------------------------------------------------------------------- diff --git a/build-support/run-test.sh b/build-support/run-test.sh index d3c8f25..ff33756 100755 --- a/build-support/run-test.sh +++ b/build-support/run-test.sh @@ -98,7 +98,7 @@ fi # Suppressions require symbolization. We'll default to using the symbolizer in # thirdparty. if [ -z "$ASAN_SYMBOLIZER_PATH" ]; then - export ASAN_SYMBOLIZER_PATH=$SOURCE_ROOT/thirdparty/clang-toolchain/bin/llvm-symbolizer + export ASAN_SYMBOLIZER_PATH=$SOURCE_ROOT/thirdparty/installed/uninstrumented/bin/llvm-symbolizer fi # Configure TSAN (ignored if this isn't a TSAN build). 
http://git-wip-us.apache.org/repos/asf/kudu/blob/fd8a50a8/build-support/run_dist_test.py ---------------------------------------------------------------------- diff --git a/build-support/run_dist_test.py b/build-support/run_dist_test.py index 3dc48d5..61083a7 100755 --- a/build-support/run_dist_test.py +++ b/build-support/run_dist_test.py @@ -128,7 +128,7 @@ def main(): env['GTEST_OUTPUT'] = 'xml:' + os.path.abspath( os.path.join(test_dir, "..", "test-logs")) + '/' - env['ASAN_SYMBOLIZER_PATH'] = os.path.join(ROOT, "thirdparty/installed/common/bin/llvm-symbolizer") + env['ASAN_SYMBOLIZER_PATH'] = os.path.join(ROOT, "thirdparty/installed/uninstrumented/bin/llvm-symbolizer") rc = subprocess.call([os.path.join(ROOT, "build-support/run-test.sh")] + args, env=env) sys.exit(rc) http://git-wip-us.apache.org/repos/asf/kudu/blob/fd8a50a8/cmake_modules/FindPmem.cmake ---------------------------------------------------------------------- diff --git a/cmake_modules/FindPmem.cmake b/cmake_modules/FindPmem.cmake index 84522c5..059075f 100644 --- a/cmake_modules/FindPmem.cmake +++ b/cmake_modules/FindPmem.cmake @@ -24,10 +24,10 @@ # PMEMOBJ_DEPS, dependencies required for using libpmemobj set(PMEM_SEARCH_LIB_PATH - ${THIRDPARTY_PREFIX}/lib + ${THIRDPARTY_INSTALL_COMMON_DIR}/lib ) set(PMEM_SEARCH_HEADER_PATHS - ${THIRDPARTY_PREFIX}/include + ${THIRDPARTY_INSTALL_COMMON_DIR}/include ) find_path(VMEM_INCLUDE_DIR libvmem.h PATHS ${PMEM_SEARCH_HEADER_PATHS} http://git-wip-us.apache.org/repos/asf/kudu/blob/fd8a50a8/src/kudu/codegen/CMakeLists.txt ---------------------------------------------------------------------- diff --git a/src/kudu/codegen/CMakeLists.txt b/src/kudu/codegen/CMakeLists.txt index f29a0d0..0357e38 100644 --- a/src/kudu/codegen/CMakeLists.txt +++ b/src/kudu/codegen/CMakeLists.txt @@ -51,7 +51,7 @@ llvm_map_components_to_libnames(llvm_LIBRARIES "${LLVM_REQ_COMPONENTS}") ####################################### ## Create .ll file for precompiled functions (and their 
dependencies) -set(CLANG_EXEC ${THIRDPARTY_PREFIX}/bin/clang++) +set(CLANG_EXEC ${THIRDPARTY_DIR}/clang-toolchain/bin/clang++) set(IR_SOURCE ${CMAKE_CURRENT_SOURCE_DIR}/precompiled.cc) set(IR_OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/precompiled.ll) set(IR_OUTPUT_CC ${IR_OUTPUT}.cc) http://git-wip-us.apache.org/repos/asf/kudu/blob/fd8a50a8/thirdparty/build-definitions.sh ---------------------------------------------------------------------- diff --git a/thirdparty/build-definitions.sh b/thirdparty/build-definitions.sh index 761a3eb..d9a5320 100644 --- a/thirdparty/build-definitions.sh +++ b/thirdparty/build-definitions.sh @@ -63,9 +63,56 @@ build_cmake() { popd } -build_llvm() { +build_libcxxabi() { + LIBCXXABI_BDIR=$TP_BUILD_DIR/llvm-$LLVM_VERSION.libcxxabi$MODE_SUFFIX + mkdir -p $LIBCXXABI_BDIR + pushd $LIBCXXABI_BDIR + rm -Rf CMakeCache.txt CMakeFiles/ + cmake \ + -DCMAKE_BUILD_TYPE=Release \ + -DCMAKE_INSTALL_PREFIX=$PREFIX \ + -DCMAKE_CXX_FLAGS="$EXTRA_CXXFLAGS $EXTRA_LDFLAGS" \ + -DLLVM_PATH=$LLVM_SOURCE \ + $LLVM_SOURCE/projects/libcxxabi + make -j$PARALLEL install + popd +} - # Build Python if necessary. +build_libcxx() { + local BUILD_TYPE=$1 + case $BUILD_TYPE in + "tsan") + SANITIZER_TYPE=Thread + ;; + *) + echo "Unknown build type: $BUILD_TYPE" + exit 1 + ;; + esac + + LIBCXX_BDIR=$TP_BUILD_DIR/llvm-$LLVM_VERSION.libcxx$MODE_SUFFIX + mkdir -p $LIBCXX_BDIR + pushd $LIBCXX_BDIR + rm -Rf CMakeCache.txt CMakeFiles/ + cmake \ + -DCMAKE_BUILD_TYPE=Release \ + -DCMAKE_INSTALL_PREFIX=$PREFIX \ + -DCMAKE_CXX_FLAGS="$EXTRA_CXXFLAGS $EXTRA_LDFLAGS" \ + -DLLVM_PATH=$LLVM_SOURCE \ + -DLIBCXX_CXX_ABI=libcxxabi \ + -DLIBCXX_CXX_ABI_INCLUDE_PATHS=$LLVM_SOURCE/projects/libcxxabi/include \ + -DLLVM_USE_SANITIZER=$SANITIZER_TYPE \ + $LLVM_SOURCE/projects/libcxx + make -j$PARALLEL install + popd +} + +build_or_find_python() { + if [ -n "$PYTHON_EXECUTABLE" ]; then + return + fi + + # Build Python only if necessary. if [[ $(python2.7 -V 2>&1) =~ "Python 2.7." 
]]; then PYTHON_EXECUTABLE=$(which python2.7) elif [[ $(python -V 2>&1) =~ "Python 2.7." ]]; then @@ -74,11 +121,41 @@ build_llvm() { PYTHON_BDIR=$TP_BUILD_DIR/$PYTHON_NAME$MODE_SUFFIX mkdir -p $PYTHON_BDIR pushd $PYTHON_BDIR - $PYTHON_SOURCE/configure --prefix=$PREFIX + $PYTHON_SOURCE/configure make -j$PARALLEL PYTHON_EXECUTABLE="$PYTHON_BDIR/python" popd fi +} + +build_llvm() { + local TOOLS_ARGS= + local BUILD_TYPE=$1 + + build_or_find_python + + # Always disabled; these subprojects are built standalone. + TOOLS_ARGS="$TOOLS_ARGS -DLLVM_TOOL_LIBCXX_BUILD=OFF" + TOOLS_ARGS="$TOOLS_ARGS -DLLVM_TOOL_LIBCXXABI_BUILD=OFF" + + case $BUILD_TYPE in + "normal") + # Default build: core LLVM libraries, clang, compiler-rt, and all tools. + ;; + "tsan") + # Build just the core LLVM libraries, dependent on libc++. + TOOLS_ARGS="$TOOLS_ARGS -DLLVM_ENABLE_LIBCXX=ON" + TOOLS_ARGS="$TOOLS_ARGS -DLLVM_INCLUDE_TOOLS=OFF" + TOOLS_ARGS="$TOOLS_ARGS -DLLVM_TOOL_COMPILER_RT_BUILD=OFF" + + # Configure for TSAN. + TOOLS_ARGS="$TOOLS_ARGS -DLLVM_USE_SANITIZER=Thread" + ;; + *) + echo "Unknown LLVM build type: $BUILD_TYPE" + exit 1 + ;; + esac LLVM_BDIR=$TP_BUILD_DIR/llvm-$LLVM_VERSION$MODE_SUFFIX mkdir -p $LLVM_BDIR @@ -98,50 +175,27 @@ build_llvm() { cmake \ -DCMAKE_BUILD_TYPE=Release \ -DCMAKE_INSTALL_PREFIX=$PREFIX \ + -DLLVM_INCLUDE_DOCS=OFF \ + -DLLVM_INCLUDE_EXAMPLES=OFF \ + -DLLVM_INCLUDE_TESTS=OFF \ + -DLLVM_INCLUDE_UTILS=OFF \ -DLLVM_TARGETS_TO_BUILD=X86 \ -DLLVM_ENABLE_RTTI=ON \ - -DLLVM_TOOL_LIBCXX_BUILD=OFF \ - -DLLVM_TOOL_LIBCXXABI_BUILD=OFF \ - -DCMAKE_CXX_FLAGS="$EXTRA_CXXFLAGS" \ + -DCMAKE_CXX_FLAGS="$EXTRA_CXXFLAGS $EXTRA_LDFLAGS" \ -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \ + $TOOLS_ARGS \ $LLVM_SOURCE make -j$PARALLEL install - # Create a link from Clang to thirdparty/clang-toolchain. This path is used - # for compiling Kudu with sanitizers. 
The link can't point to the Clang - # installed in the prefix directory, since this confuses CMake into believing - # the thirdparty prefix directory is the system-wide prefix, and it omits the - # thirdparty prefix directory from the rpath of built binaries. - ln -sfn $LLVM_BDIR $TP_DIR/clang-toolchain - popd -} - -build_libstdcxx() { - GCC_BDIR=$TP_BUILD_DIR/$GCC_NAME$MODE_SUFFIX - - # Remove the GCC build directory to remove cached build configuration. - rm -rf $GCC_BDIR - - mkdir -p $GCC_BDIR - pushd $GCC_BDIR - CFLAGS=$EXTRA_CFLAGS \ - CXXFLAGS=$EXTRA_CXXFLAGS \ - $GCC_SOURCE/libstdc++-v3/configure \ - --enable-multilib=no \ - --prefix="$PREFIX" - - # On Ubuntu distros (tested on 14.04 and 16.04), the configure script has a - # nasty habit of disabling TLS support when -fsanitize=thread is used. This - # appears to be an interaction between TSAN and the GCC_CHECK_TLS m4 macro - # used by configure. It doesn't manifest on el6 because the devtoolset - # causes an early conftest to fail, which passes the macro's smell test. - # - # This is a silly hack to force TLS support back on, but it's only temporary, - # as we're about to replace all of this with libc++. - sed -ie 's|/\* #undef HAVE_TLS \*/|#define HAVE_TLS 1|' config.h - - make -j$PARALLEL install + if [[ "$BUILD_TYPE" == "normal" ]]; then + # Create a link from Clang to thirdparty/clang-toolchain. This path is used + # for all Clang invocations. The link can't point to the Clang installed in + # the prefix directory, since this confuses CMake into believing the + # thirdparty prefix directory is the system-wide prefix, and it omits the + # thirdparty prefix directory from the rpath of built binaries. 
+ ln -sfn $LLVM_BDIR $TP_DIR/clang-toolchain + fi popd } http://git-wip-us.apache.org/repos/asf/kudu/blob/fd8a50a8/thirdparty/build-thirdparty.sh ---------------------------------------------------------------------- diff --git a/thirdparty/build-thirdparty.sh b/thirdparty/build-thirdparty.sh index af929e6..4bba95c 100755 --- a/thirdparty/build-thirdparty.sh +++ b/thirdparty/build-thirdparty.sh @@ -22,8 +22,7 @@ # used, corresponding to build type: # # * /thirdparty/installed/common - prefix directory for libraries and binary tools -# common to all build types, e.g. LLVM, Clang, and -# CMake. +# common to all build types, e.g. CMake, C dependencies. # * /thirdparty/installed/uninstrumented - prefix directory for libraries built with # normal options (no sanitizer instrumentation). # * /thirdparty/installed/tsan - prefix directory for libraries built @@ -47,7 +46,7 @@ source $TP_DIR/build-definitions.sh # read the docs) $TP_DIR/preflight.py -for PREFIX_DIR in $PREFIX_COMMON $PREFIX_DEPS $PREFIX_DEPS_TSAN $PREFIX_LIBSTDCXX $PREFIX_LIBSTDCXX_TSAN; do +for PREFIX_DIR in $PREFIX_COMMON $PREFIX_DEPS $PREFIX_DEPS_TSAN; do mkdir -p $PREFIX_DIR/lib mkdir -p $PREFIX_DIR/include @@ -127,7 +126,6 @@ else "crcutil") F_CRCUTIL=1 ;; "libunwind") F_LIBUNWIND=1 ;; "llvm") F_LLVM=1 ;; - "libstdcxx") F_LIBSTDCXX=1 ;; "trace-viewer") F_TRACE_VIEWER=1 ;; "nvml") F_NVML=1 ;; "boost") F_BOOST=1 ;; @@ -150,19 +148,14 @@ if [ -n "$F_ALL" -o -n "$F_CMAKE" ]; then build_cmake fi -if [ -n "$F_ALL" -o -n "$F_LLVM" ]; then - build_llvm -fi +save_env # Enable debug symbols so that stacktraces and linenumbers are available at -# runtime. CMake and LLVM are compiled without debug symbols since CMake is a -# compile-time only tool, and the LLVM debug symbols take up more than 20GiB of -# disk space. +# runtime. CMake is compiled without debug symbols since it is a compile-time +# only tool. 
EXTRA_CFLAGS="-g $EXTRA_CFLAGS" EXTRA_CXXFLAGS="-g $EXTRA_CXXFLAGS" -save_env - # Specifying -Wl,-rpath has different default behavior on GNU binutils ld vs. # the GNU gold linker. ld sets RPATH (due to defaulting to --disable-new-dtags) # and gold sets RUNPATH (due to defaulting to --enable-new-dtags). At the time @@ -234,8 +227,19 @@ PREFIX=$PREFIX_DEPS MODE_SUFFIX="" save_env + EXTRA_LDFLAGS="-Wl,-rpath,$PREFIX/lib $EXTRA_LDFLAGS" +if [ -n "$F_ALL" -o -n "$F_LLVM" ]; then + build_llvm normal +fi + +# Enable debug symbols so that stacktraces and linenumbers are available at +# runtime. LLVM is compiled without debug symbols as they take up more than +# 20GiB of disk space. +EXTRA_CFLAGS="-g $EXTRA_CFLAGS" +EXTRA_CXXFLAGS="-g $EXTRA_CXXFLAGS" + if [ -n "$F_ALL" -o -n "$F_GFLAGS" ]; then build_gflags fi @@ -270,23 +274,27 @@ restore_env if [ -n "$F_TSAN" ]; then - # Achieving good results with TSAN requires that the C++ standard - # library be instrumented with TSAN. Additionally, dependencies which - # internally use threads or synchronization should be instrumented. - # libstdc++ requires that all shared objects linked into an executable should - # be built against the same version of libstdc++. As a result, we must build - # libstdc++ twice: once instrumented, and once uninstrumented, in order to - # guarantee that the versions match. + # Achieving good results with TSAN requires that: + # 1. The C++ standard library should be instrumented with TSAN. + # 2. Dependencies which internally use threads or synchronization be + # instrumented with TSAN. + # 3. As a corollary to 1, the C++ standard library requires that all shared + # objects linked into an executable be built against the same version of + # the C++ standard library version. + # + # At the very least, we must build our own C++ standard library. We use libc++ + # because it's easy to build with clang, which has better TSAN support than gcc. 
# - # Currently protobuf is the only thirdparty dependency that we build with - # instrumentation. + # To satisfy all of the above requirements, we first build libc++ instrumented + # with TSAN, then build a second copy of every C++ dependency against that + # libc++. Later on in the build process, Kudu is also built against libc++. # # Special flags for TSAN builds: # * -fsanitize=thread - enable the thread sanitizer during compilation. - # * -L ... - add the instrumented libstdc++ to the library search paths. - # * -isystem ... - Add libstdc++ headers to the system header search paths. + # * -L ... - add the instrumented libc++ to the library search paths. + # * -isystem ... - Add libc++ headers to the system header search paths. # * -nostdinc++ - Do not automatically link the system C++ standard library. - # * -Wl,-rpath,... - Add instrumented libstdc++ location to the rpath so that + # * -Wl,-rpath,... - Add instrumented libc++ location to the rpath so that # it can be found at runtime. if which ccache >/dev/null ; then @@ -302,49 +310,47 @@ if [ -n "$F_TSAN" ]; then PREFIX=$PREFIX_DEPS_TSAN MODE_SUFFIX=".tsan" - if [ -n "$F_ALL" -o -n "$F_LIBSTDCXX" ]; then - save_env + save_env - # Build uninstrumented libstdcxx - PREFIX=$PREFIX_LIBSTDCXX - MODE_SUFFIX="" - EXTRA_CFLAGS= - EXTRA_CXXFLAGS= - build_libstdcxx + # Build libc++abi first as it is a dependency for libc++. Its build has no + # built-in support for sanitizers, so we build it regularly. + if [ -n "$F_ALL" -o -n "$F_LLVM" ]; then + build_libcxxabi + fi - # Build instrumented libstdxx - PREFIX=$PREFIX_LIBSTDCXX_TSAN - MODE_SUFFIX=".tsan" - EXTRA_CFLAGS="-fsanitize=thread" - EXTRA_CXXFLAGS="-fsanitize=thread" - build_libstdcxx + # The libc++ build needs to be able to find libc++abi. + EXTRA_CXXFLAGS="-L$PREFIX/lib $EXTRA_CXXFLAGS" + EXTRA_LDFLAGS="-Wl,-rpath,$PREFIX/lib $EXTRA_LDFLAGS" - restore_env + # Build libc++ with TSAN enabled. 
+ if [ -n "$F_ALL" -o -n "$F_LLVM" ]; then + build_libcxx tsan fi - # Build dependencies that require TSAN instrumentation + # Build the rest of the dependencies against the TSAN-instrumented libc++ + # instead of the system's C++ standard library. + EXTRA_CXXFLAGS="-nostdinc++ $EXTRA_CXXFLAGS" + EXTRA_CXXFLAGS="-stdlib=libc++ $EXTRA_CXXFLAGS" + EXTRA_CXXFLAGS="-isystem $PREFIX/include/c++/v1 $EXTRA_CXXFLAGS" - save_env + # Build the rest of the dependencies with TSAN instrumentation. EXTRA_CFLAGS="-fsanitize=thread $EXTRA_CFLAGS" - EXTRA_CXXFLAGS="-nostdinc++ -fsanitize=thread $EXTRA_CXXFLAGS" + EXTRA_CXXFLAGS="-fsanitize=thread $EXTRA_CXXFLAGS" EXTRA_CXXFLAGS="-DTHREAD_SANITIZER $EXTRA_CXXFLAGS" - EXTRA_CXXFLAGS="-isystem $PREFIX_LIBSTDCXX_TSAN/include/c++/$GCC_VERSION/backward $EXTRA_CXXFLAGS" - EXTRA_CXXFLAGS="-isystem $PREFIX_LIBSTDCXX_TSAN/include/c++/$GCC_VERSION $EXTRA_CXXFLAGS" - EXTRA_CXXFLAGS="-L$PREFIX_LIBSTDCXX_TSAN/lib $EXTRA_CXXFLAGS" - EXTRA_LDFLAGS="-Wl,-rpath,$PREFIX_LIBSTDCXX_TSAN/lib,-rpath,$PREFIX/lib $EXTRA_LDFLAGS" - if [ -n "$F_ALL" -o -n "$F_PROTOBUF" ]; then - build_protobuf + if [ -n "$F_ALL" -o -n "$F_LLVM" ]; then + build_llvm tsan fi - restore_env - # Build dependencies that do not require TSAN instrumentation + # Enable debug symbols so that stacktraces and linenumbers are available at + # runtime. LLVM is compiled without debug symbols because the LLVM debug symbols + # take up more than 20GiB of disk space. 
+ EXTRA_CFLAGS="-g $EXTRA_CFLAGS" + EXTRA_CXXFLAGS="-g $EXTRA_CXXFLAGS" - EXTRA_CXXFLAGS="-nostdinc++ $EXTRA_CXXFLAGS" - EXTRA_CXXFLAGS="-isystem $PREFIX_LIBSTDCXX/include/c++/$GCC_VERSION/backward $EXTRA_CXXFLAGS" - EXTRA_CXXFLAGS="-isystem $PREFIX_LIBSTDCXX/include/c++/$GCC_VERSION $EXTRA_CXXFLAGS" - EXTRA_CXXFLAGS="-L$PREFIX_LIBSTDCXX/lib $EXTRA_CXXFLAGS" - EXTRA_LDFLAGS="-Wl,-rpath,$PREFIX_LIBSTDCXX/lib,-rpath,$PREFIX/lib $EXTRA_LDFLAGS" + if [ -n "$F_ALL" -o -n "$F_PROTOBUF" ]; then + build_protobuf + fi if [ -n "$F_ALL" -o -n "$F_GFLAGS" ]; then build_gflags @@ -369,6 +375,8 @@ if [ -n "$F_TSAN" ]; then if [ -n "$F_ALL" -o -n "$F_CRCUTIL" ]; then build_crcutil fi + + restore_env fi echo "---------------------" http://git-wip-us.apache.org/repos/asf/kudu/blob/fd8a50a8/thirdparty/download-thirdparty.sh ---------------------------------------------------------------------- diff --git a/thirdparty/download-thirdparty.sh b/thirdparty/download-thirdparty.sh index f2bfe89..1a74a3e 100755 --- a/thirdparty/download-thirdparty.sh +++ b/thirdparty/download-thirdparty.sh @@ -218,26 +218,6 @@ if [ ! -d $LLVM_SOURCE ]; then echo fi -GCC_PATCHLEVEL=2 -delete_if_wrong_patchlevel $GCC_SOURCE $GCC_PATCHLEVEL -if [[ "$OSTYPE" =~ ^linux ]] && [[ ! -d $GCC_SOURCE ]]; then - fetch_and_expand gcc-${GCC_VERSION}.tar.gz - pushd $GCC_SOURCE/libstdc++-v3 - patch -p0 < $TP_DIR/patches/libstdcxx-fix-string-dtor.patch - patch -p0 < $TP_DIR/patches/libstdcxx-fix-tr1-shared-ptr.patch - cd .. - touch patchlevel-$GCC_PATCHLEVEL - popd - - # Configure libstdcxx to use posix threads by default. Normally this symlink - # would be created automatically while building libgcc as part of the overall - # GCC build, but since we are only building libstdcxx we must configure it - # manually. - ln -sf $GCC_SOURCE/libgcc/gthr-posix.h $GCC_SOURCE/libgcc/gthr-default.h - - echo -fi - LZ4_PATCHLEVEL=1 delete_if_wrong_patchlevel $LZ4_SOURCE $LZ4_PATCHLEVEL if [ ! 
-d $LZ4_SOURCE ]; then http://git-wip-us.apache.org/repos/asf/kudu/blob/fd8a50a8/thirdparty/patches/libstdcxx-fix-string-dtor.patch ---------------------------------------------------------------------- diff --git a/thirdparty/patches/libstdcxx-fix-string-dtor.patch b/thirdparty/patches/libstdcxx-fix-string-dtor.patch deleted file mode 100644 index 28978c5..0000000 --- a/thirdparty/patches/libstdcxx-fix-string-dtor.patch +++ /dev/null @@ -1,54 +0,0 @@ -Index: include/bits/basic_string.h -=================================================================== ---- include/bits/basic_string.h (revision 227400) -+++ include/bits/basic_string.h (working copy) -@@ -2601,11 +2601,32 @@ - - bool - _M_is_leaked() const _GLIBCXX_NOEXCEPT -- { return this->_M_refcount < 0; } -+ { -+#if defined(__GTHREADS) -+ // _M_refcount is mutated concurrently by _M_refcopy/_M_dispose, -+ // so we need to use an atomic load. However, _M_is_leaked -+ // predicate does not change concurrently (i.e. the string is either -+ // leaked or not), so a relaxed load is enough. -+ return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0; -+#else -+ return this->_M_refcount < 0; -+#endif -+ } - - bool - _M_is_shared() const _GLIBCXX_NOEXCEPT -- { return this->_M_refcount > 0; } -+ { -+#if defined(__GTHREADS) -+ // _M_refcount is mutated concurrently by _M_refcopy/_M_dispose, -+ // so we need to use an atomic load. Another thread can drop last -+ // but one reference concurrently with this check, so we need this -+ // load to be acquire to synchronize with release fetch_and_add in -+ // _M_dispose. -+ return __atomic_load_n(&this->_M_refcount, __ATOMIC_ACQUIRE) > 0; -+#else -+ return this->_M_refcount > 0; -+#endif -+ } - - void - _M_set_leaked() _GLIBCXX_NOEXCEPT -@@ -2654,6 +2675,14 @@ - { - // Be race-detector-friendly. For more info see bits/c++config. 
- _GLIBCXX_SYNCHRONIZATION_HAPPENS_BEFORE(&this->_M_refcount); -+ // Decrement of _M_refcount is acq_rel, because: -+ // - all but last decrements need to release to synchronize with -+ // the last decrement that will delete the object. -+ // - the last decrement needs to acquire to synchronize with -+ // all the previous decrements. -+ // - last but one decrement needs to release to synchronize with -+ // the acquire load in _M_is_shared that will conclude that -+ // the object is not shared anymore. - if (__gnu_cxx::__exchange_and_add_dispatch(&this->_M_refcount, - -1) <= 0) - { http://git-wip-us.apache.org/repos/asf/kudu/blob/fd8a50a8/thirdparty/patches/libstdcxx-fix-tr1-shared-ptr.patch ---------------------------------------------------------------------- diff --git a/thirdparty/patches/libstdcxx-fix-tr1-shared-ptr.patch b/thirdparty/patches/libstdcxx-fix-tr1-shared-ptr.patch deleted file mode 100644 index f21fa73..0000000 --- a/thirdparty/patches/libstdcxx-fix-tr1-shared-ptr.patch +++ /dev/null @@ -1,21 +0,0 @@ -diff -ur ./include/tr1/shared_ptr.h ../../gcc-4.9.3.patched/libstdc++-v3/include/tr1/shared_ptr.h ---- ./include/tr1/shared_ptr.h 2014-01-02 14:30:10.000000000 -0800 -+++ ../../gcc-4.9.3.patched/libstdc++-v3/include/tr1/shared_ptr.h 2016-02-01 22:45:11.808475373 -0800 -@@ -188,7 +188,7 @@ - { - // No memory barrier is used here so there is no synchronization - // with other threads. -- return const_cast<const volatile _Atomic_word&>(_M_use_count); -+ return __atomic_load_n(&_M_use_count, __ATOMIC_RELAXED); - } - - private: -@@ -230,7 +230,7 @@ - _M_add_ref_lock() - { - // Perform lock-free add-if-not-zero operation. 
-- _Atomic_word __count = _M_use_count; -+ _Atomic_word __count = _M_get_use_count(); - do - { - if (__count == 0) http://git-wip-us.apache.org/repos/asf/kudu/blob/fd8a50a8/thirdparty/vars.sh ---------------------------------------------------------------------- diff --git a/thirdparty/vars.sh b/thirdparty/vars.sh index b4b2011..6c5bfea 100644 --- a/thirdparty/vars.sh +++ b/thirdparty/vars.sh @@ -34,11 +34,6 @@ PREFIX_COMMON=$TP_DIR/installed/common PREFIX_DEPS=$TP_DIR/installed/uninstrumented PREFIX_DEPS_TSAN=$TP_DIR/installed/tsan -# libstdcxx needs its own prefix so that it is not inadvertently -# included in the library search path during non-TSAN builds. -PREFIX_LIBSTDCXX=$PREFIX_DEPS/gcc -PREFIX_LIBSTDCXX_TSAN=$PREFIX_DEPS_TSAN/gcc - GFLAGS_VERSION=2.1.2 GFLAGS_NAME=gflags-$GFLAGS_VERSION GFLAGS_SOURCE=$TP_SOURCE_DIR/$GFLAGS_NAME @@ -157,10 +152,6 @@ PYTHON_VERSION=2.7.10 PYTHON_NAME=python-$PYTHON_VERSION PYTHON_SOURCE=$TP_SOURCE_DIR/$PYTHON_NAME -GCC_VERSION=4.9.3 -GCC_NAME=gcc-$GCC_VERSION -GCC_SOURCE=$TP_SOURCE_DIR/$GCC_NAME - # Our trace-viewer repository is separate since it's quite large and # shouldn't change frequently. We upload the built artifacts (HTML/JS) # when we need to roll to a new revision.