Dan Burkert created KUDU-2068:
---------------------------------
Summary: Kudu/Centos 7/devtoolset miscompilation
Key: KUDU-2068
URL: https://issues.apache.org/jira/browse/KUDU-2068
Project: Kudu
Issue Type: Bug
Components: build
Affects Versions: 1.4.0
Reporter: Dan Burkert
There are a number of issues related to building Kudu on Centos/RHEL 7 with
devtoolset, and on Centos/RHEL 7 machines with devtoolset installed (but not
enabled).
A couple of bits of background info:
1) RHEL 7 ships with system GCC/libstdcxx version 4.8.
2) devtoolset-3 ships with GCC/libstdcxx version 4.9.
3) devtoolset-6 ships with GCC/libstdcxx version 6.2.
4) GCC incompatibly changed the {{unordered_set}} and {{unordered_map}} ABIs
between libstdcxx version 4.8 and 4.9. In 4.8, {{sizeof(unordered_set<int>)}}
is 48 bytes, whereas in 4.9 and above it is 56 bytes (and similarly for
{{unordered_map}}).
5) Clang will automatically search for devtoolset installations, and if found,
use those headers as opposed to the system headers.
6) Clang 4.0 (the current version bundled by Kudu) will find devtoolset
versions [up to
devtoolset-4|https://github.com/llvm-mirror/clang/blob/release_40/lib/Driver/ToolChains.cpp#L1445-L1449].
The next version of Clang (Clang 5) will find devtoolset versions [up to
devtoolset-6|https://github.com/llvm-mirror/clang/blob/9a973f3ee99d42a283cafc26407f081daaa8ac21/lib/Driver/ToolChains/Gnu.cpp#L1717-L1720].
7) G++ compilers provided by a devtoolset will use the libstdcxx headers
provided by that devtoolset.
8) Among other things, Kudu uses its bundled clang to pre-compile C++ source
files into clang bitcode. One of the classes included in this pre-compilation
is kudu::Schema, which includes an {{unordered_set}} field.
As a result, with certain configurations, the precompiled codegen will cause
crashes at runtime:
Kudu compiled with system gcc on a Centos 7 machine with devtoolset-3
installed: Kudu will be compiled against the system headers, where
{{unordered_set}} is 48 bytes. The precompiled code will be compiled by clang
against the devtoolset-3 headers, where {{unordered_set}} is 56 bytes. This
results in runtime crashes when calling into code-generated functions
(codegen-test segfaults reliably).
Kudu compiled with devtoolset-6 gcc on Centos 7: Kudu will be compiled against
the devtoolset-6 headers, where {{unordered_set}} is 56 bytes. The precompiled
code will be compiled by clang against the system headers, where
{{unordered_set}} is 48 bytes (clang 4.0 will not find the devtoolset-6
installation). This results in runtime crashes when calling into
code-generated functions (codegen-test segfaults reliably).
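To illustrate why a size disagreement leads to crashes rather than just wasted space, here is a simplified model using Python's {{ctypes}}. It is only a sketch: the field names are hypothetical and this is not the real {{kudu::Schema}} layout. It assumes an object that holds an opaque container followed by one more field, and uses the 48- and 56-byte sizes cited above.

```python
import ctypes


def make_layout(container_size):
    """Model an object holding an opaque container of `container_size` bytes
    followed by one 8-byte field. Field names are hypothetical."""

    class Layout(ctypes.Structure):
        _fields_ = [
            ("container", ctypes.c_byte * container_size),
            ("next_field", ctypes.c_uint64),
        ]

    return Layout


# Layout as seen by code built against headers with a 48-byte container.
system_layout = make_layout(48)
# Layout as seen by code built against headers with a 56-byte container.
devtoolset_layout = make_layout(56)

print(system_layout.next_field.offset, ctypes.sizeof(system_layout))          # 48 56
print(devtoolset_layout.next_field.offset, ctypes.sizeof(devtoolset_layout))  # 56 64
```

In this model the two sides disagree about where {{next_field}} lives and how large the object is, so code built against one layout reads or writes the wrong bytes of an object created by the other.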
Passing the {{--gcc-toolchain}} flag to clang when precompiling sources, with a
value matching the currently enabled g++, will fix this issue, but I haven't
found a clean way to determine the 'correct' value from a script. For system
gcc the value should be {{/usr}}, and for devtoolset-6 the value should be
{{/opt/rh/devtoolset-6/root/usr}}. This corresponds to the {{--prefix}} value
in the configure flags that {{gcc -v}} prints, so maybe we can parse it from
that.
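A minimal sketch of that parsing idea, in Python. It assumes {{gcc -v}} writes a {{Configured with:}} line containing {{--prefix=...}} to stderr, as it does on the toolchains described here. The function names and the sample text are illustrative and not part of Kudu's build scripts.

```python
import os
import re
import subprocess


def parse_gcc_prefix(verbose_output):
    """Return the --prefix value from the 'Configured with:' line, or None."""
    for line in verbose_output.splitlines():
        if line.startswith("Configured with:"):
            match = re.search(r"--prefix=(\S+)", line)
            if match:
                return match.group(1)
    return None


def gcc_toolchain_for(compiler="gcc"):
    """Run `<compiler> -v` and extract its configure prefix (hypothetical helper)."""
    result = subprocess.run(
        [compiler, "-v"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,               # gcc -v reports on stderr
        universal_newlines=True,
        env=dict(os.environ, LC_ALL="C"),     # avoid translated messages
        check=True,
    )
    return parse_gcc_prefix(result.stderr)


# Illustrative sample text in the shape of `gcc -v` output, not captured output.
SAMPLE = (
    "Using built-in specs.\n"
    "Configured with: ../configure --prefix=/opt/rh/devtoolset-6/root/usr "
    "--mandir=/opt/rh/devtoolset-6/root/usr/share/man\n"
    "Thread model: posix\n"
)
print(parse_gcc_prefix(SAMPLE))  # /opt/rh/devtoolset-6/root/usr
```

The returned value could then be passed to clang as {{--gcc-toolchain=<prefix>}} when precompiling. A {{None}} result should probably be treated as "unknown" and fall back to clang's default search rather than guessing.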
Another option is to replace {{unordered_set}}/{{unordered_map}} fields in any
objects passed through codegen boundaries, but obviously this is brittle (and
there may be other types out there whose ABI is similarly unstable).
Finally, I'll note that this issue is a good reason to maintain the rule we
currently follow with the client: no C++11 types in public APIs.
Also related: [devtoolset ABI
guarantees|https://access.redhat.com/documentation/en-US/Red_Hat_Developer_Toolset/6/html/User_Guide/sect-GCC-CPP.html#sect-GCC-CPP-Compatibility].