Hello Dan Burkert, Todd Lipcon,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/4511

to review the following change.

Change subject: thirdparty: use libc++ instead libstdc++ for TSAN builds
......................................................................

thirdparty: use libc++ instead libstdc++ for TSAN builds

This all began because I wanted two things:
1. To use the new gcc 5 ABI on platforms that default to it (such as Ubuntu
   Xenial). Other applications compiled on these platforms will use the new
   ABI, and the fact that the Kudu client forces them to use the old ABI is
   quite unfriendly.
2. To have working local TSAN builds again, which broke following the gcc 5
   ABI transition in Xenial.

There are a number of interconnected issues at play:
A. Until 3.9, LLVM did not recognize gcc's new ABI tags, which prevented
   Kudu's codegen module from building properly against the new ABI.
B. For TSAN builds, we rebuild some thirdparty dependencies against the
   libstdc++ from thirdparty, but the LLVM libraries are not one of them.
   This may work when the system libstdc++ and the thirdparty libstdc++ are
   of the same version, but becomes increasingly unsafe as the versions
   differ. Why? Because libstdc++ only guarantees forward compatibility
   (i.e. a binary compiled against an old libstdc++ can be used with a new
   libstdc++). As an example, on Xenial the two libraries are more than a
   major version apart.
C. Continuing B, libstdc++ from gcc 5 actually breaks backward compatibility
   for certain C++11-only symbols by moving them to an inline namespace
   (e.g. std::error_category is now std::_V2::error_category). The LLVM
   libraries use these symbols, which means LLVM built against a gcc 5
   libstdc++ cannot link against the older libstdc++ in thirdparty.
D. As the libstdc++ in thirdparty is from gcc 4, it is not multilib and does
   not provide new ABI symbols (e.g. std::__cxx11::string). Meaning, if the
   rest of Kudu tried to use the new ABI, TSAN builds would fail because the
   libstdc++ in use lacks new ABI symbols.

After upgrading LLVM, the path of least resistance was to upgrade libstdc++
in thirdparty, but what a saga that turned out to be. After much trial and
error, I gave up; I could not build libstdc++ from gcc 5 with clang, and we
must use clang to realize the latest -fsanitize=thread support.

Are there any alternatives? Well, we can follow Chromium's lead and use
libc++ for TSAN instead of libstdc++. I think this makes sense for several
reasons:
- The LLVM build, such as it is, is much more friendly than gcc's build.
  Building libstdc++ out of all of gcc was always a little hacky.
- There's at least one large open-source project (Chromium) that's
  successfully gone down this path.

That brings us to this patch, which is largely about replacing libstdc++
with libc++. Here are additional interesting details:
o We now build entire set of TSAN-duplicated dependencies with
  -fsanitize=thread, not just protobuf. It doesn't affect correctness much
  either way, but it's simpler and an easier concept to extend to future
  sanitizers that DO care (e.g. MSAN).
o We now build LLVM twice: once against the system libstdc++ for build tools
  and the regular LLVM libraries, and a second time against libc++ for
  instrumented LLVM libraries. The first build is a little hokey: it'd be
  more "pure" to build LLVM three times: once for build tools, once for LLVM
  libraries, and once for instrumented LLVM libraries. But these builds are
  super long so we optimize by combining the first two. The downside is that
  the first build now places build tools in 'installed-deps' instead of
  'installed'. I played around with placing build tools in 'installed' while
  placing the libraries in 'installed-deps', but found that to be too hacky.
o The full thirdparty build is now quite a bit longer on account of the
  second LLVM library build. I tried to mitigate this by reducing the number
  of extra cruft built each time. An upcoming patch will address this
  further by splitting thirdparty into separate modules.
o libc++ depends on libc++abi, so we build that first.
o The libc++ and libc++abi builds are done standalone rather than with the
  LLVM library build, because it isn't possible to do them together AND have
  the LLVM libraries depend on libc++.
o build_python may now be invoked more than once, so I've changed it to be
  idempotent within the same run of build-thirdparty.sh.

Change-Id: Id9e68126ae21e04469053009c5b3e4b588415895
---
M CMakeLists.txt
M build-support/dist_test.py
M build-support/run-test.sh
M build-support/run_dist_test.py
M cmake_modules/FindPmem.cmake
M src/kudu/codegen/CMakeLists.txt
M thirdparty/build-definitions.sh
M thirdparty/build-thirdparty.sh
M thirdparty/download-thirdparty.sh
D thirdparty/patches/libstdcxx-fix-string-dtor.patch
D thirdparty/patches/libstdcxx-fix-tr1-shared-ptr.patch
M thirdparty/vars.sh
12 files changed, 195 insertions(+), 202 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/4511/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4511
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Id9e68126ae21e04469053009c5b3e4b588415895
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <d...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>

Reply via email to