Hello Dan Burkert, Todd Lipcon,
I'd like you to do a code review. Please visit
to review the following change.
Change subject: thirdparty: use libc++ instead of libstdc++ for TSAN builds
thirdparty: use libc++ instead of libstdc++ for TSAN builds
This all began because I wanted two things:
1. To use the new gcc 5 ABI on platforms that default to it (such as Ubuntu
Xenial). Other applications compiled on these platforms will use the new
ABI, and the fact that the Kudu client forces them to use the old ABI is
2. To have working local TSAN builds again, which broke following the gcc 5
ABI transition in Xenial.
There are a number of interconnected issues at play:
A. Until 3.9, LLVM did not recognize gcc's new ABI tags, which prevented
Kudu's codegen module from building properly against the new ABI.
B. For TSAN builds, we rebuild some thirdparty dependencies against the
libstdc++ from thirdparty, but the LLVM libraries are not among them.
This may work when the system libstdc++ and the thirdparty libstdc++ are
of the same version, but becomes increasingly unsafe as the versions
differ. Why? Because libstdc++ only guarantees forward compatibility
(i.e. a binary compiled against an old libstdc++ can be used with a new
libstdc++). As an example, on Xenial the two libraries are more than a
major version apart.
C. Continuing B, libstdc++ from gcc 5 actually breaks backward compatibility
for certain C++11-only symbols by moving them to an inline namespace
(e.g. std::error_category is now std::_V2::error_category). The LLVM
libraries use these symbols, which means LLVM built against a gcc 5
libstdc++ cannot link against the older libstdc++ in thirdparty.
D. As the libstdc++ in thirdparty is from gcc 4, it is not multilib and does
not provide new ABI symbols (e.g. std::__cxx11::string). This means that if
the rest of Kudu tried to use the new ABI, TSAN builds would fail because
the libstdc++ in use lacks new ABI symbols.
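To make points C and D concrete, here is a minimal sketch of how one might check whether a library references new ABI symbols. The helper name references_new_abi and the library path are hypothetical and not part of this patch. The check only looks for the std::__cxx11 inline namespace, so it is a rough heuristic rather than a complete ABI test.

```shell
#!/bin/bash
# Hypothetical helper (not part of the patch): reads symbol names on stdin
# and succeeds if any of them references the std::__cxx11 inline namespace,
# which libstdc++ uses for new ABI types such as std::__cxx11::string.
# The pattern matches both mangled and demangled symbol names.
references_new_abi() {
  grep -q -e '__cxx11'
}

# Example invocation, assuming binutils' nm and a placeholder library path:
#   nm -D /path/to/libexample.so | references_new_abi && echo "uses new ABI"
```

A library that passes this check cannot link against a gcc 4 libstdc++, which is the failure described in D.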
After upgrading LLVM, the path of least resistance was to upgrade libstdc++
in thirdparty, but what a saga that turned out to be. After much trial and
error, I gave up; I could not build libstdc++ from gcc 5 with clang, and we
must use clang to realize the latest -fsanitize=thread support.
Are there any alternatives? Well, we can follow Chromium's lead and use
libc++ for TSAN instead of libstdc++. I think this makes sense for several
reasons:
- The LLVM build, such as it is, is much more friendly than gcc's build.
Building libstdc++ out of all of gcc was always a little hacky.
- There's at least one large open-source project (Chromium) that's
successfully gone down this path.
That brings us to this patch, which is largely about replacing libstdc++
with libc++. Here are additional interesting details:
o We now build the entire set of TSAN-duplicated dependencies with
-fsanitize=thread, not just protobuf. It doesn't affect correctness much
either way, but it's simpler and an easier concept to extend to future
sanitizers that DO care (e.g. MSAN).
o We now build LLVM twice: once against the system libstdc++ for build tools
and the regular LLVM libraries, and a second time against libc++ for
instrumented LLVM libraries. The first build is a little hokey: it'd be
more "pure" to build LLVM three times: once for build tools, once for LLVM
libraries, and once for instrumented LLVM libraries. But these builds are
super long so we optimize by combining the first two. The downside is that
the first build now places build tools in 'installed-deps' instead of
'installed'. I played around with placing build tools in 'installed' while
placing the libraries in 'installed-deps', but found that to be too hacky.
o The full thirdparty build is now quite a bit longer on account of the
second LLVM library build. I tried to mitigate this by reducing the amount
of extra cruft built each time. An upcoming patch will address this
further by splitting thirdparty into separate modules.
o libc++ depends on libc++abi, so we build that first.
o The libc++ and libc++abi builds are done standalone rather than with the
LLVM library build, because it isn't possible to do them together AND have
the LLVM libraries depend on libc++.
o build_python may now be invoked more than once, so I've changed it to be
idempotent within the same run of build-thirdparty.sh.
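As an illustration of the last point, a function can be made idempotent within a single script run by guarding it with a shell variable. This is only a sketch of the general pattern: the guard variable PYTHON_BUILT and the function body are assumptions, not the code in this patch.

```shell
#!/bin/bash
# Illustrative sketch only; the real build_python in build-thirdparty.sh
# may look different. PYTHON_BUILT is a hypothetical guard variable.
PYTHON_BUILT=""

build_python() {
  # Return early if an earlier call in this run already did the work.
  if [ -n "$PYTHON_BUILT" ]; then
    return 0
  fi
  echo "building python"   # stand-in for the actual build steps
  PYTHON_BUILT=1
}
```

With this guard, calling build_python a second time in the same run is a no-op, so callers do not need to track whether it has already been invoked.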
12 files changed, 195 insertions(+), 202 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/4511/1
To view, visit http://gerrit.cloudera.org:8080/4511
Gerrit-Owner: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <d...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>