Hello Michael Ho, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/8888
to look at the new patch set (#2).
Change subject: IMPALA-6330, IMPALA-5702: Avoid boost's trim() to workaround
crash after dynamic linking.
......................................................................
IMPALA-6330, IMPALA-5702: Avoid boost's trim() to workaround crash after
dynamic linking.
Replaces boost::algorithm::trim() with std::string methods when parsing
/proc/self/smaps and adds a trivial unit test for MemInfo::ParseSmaps().
I did *not* replace the other uses of trim() with equivalents from
be/src/gutil/strings/strip.h in this change.
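As an illustration of the kind of replacement described above, here is a
minimal sketch of a trim built only on std::string members, so that no
std::locale is consulted. It is not the code from the patch: the names
TrimAsciiWhitespace, SplitSmapsLine and ExampleUsage are hypothetical, and the
"Key:   value" layout is assumed from the usual format of /proc/self/smaps
field lines.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not from the patch): strips leading and trailing ASCII
// whitespace in place using only std::string members, so no std::locale or
// std::use_facet lookup is involved.
inline void TrimAsciiWhitespace(std::string* s) {
  static const char kWhitespace[] = " \t\n\r\f\v";
  const std::string::size_type first = s->find_first_not_of(kWhitespace);
  if (first == std::string::npos) {
    s->clear();  // Empty or all-whitespace input.
    return;
  }
  const std::string::size_type last = s->find_last_not_of(kWhitespace);
  s->erase(last + 1);  // Drop the trailing run first so `first` stays valid.
  s->erase(0, first);  // Then drop the leading run.
}

// Hypothetical helper: splits one "Key:   value" line into a trimmed key and
// a trimmed value. Returns false if the line has no ':' separator.
inline bool SplitSmapsLine(const std::string& line, std::string* key,
                           std::string* value) {
  const std::string::size_type colon = line.find(':');
  if (colon == std::string::npos) return false;
  *key = line.substr(0, colon);
  *value = line.substr(colon + 1);
  TrimAsciiWhitespace(key);
  TrimAsciiWhitespace(value);
  return true;
}

// Example usage with a line in the assumed "Key:   value" layout.
inline void ExampleUsage() {
  std::string key, value;
  const bool ok = SplitSmapsLine("Rss:                 132 kB\n", &key, &value);
  (void)ok;  // Avoid an unused-variable warning when NDEBUG is defined.
  assert(ok && key == "Rss" && value == "132 kB");
}
```

The trade-off is that only ASCII whitespace is recognized, which should be
sufficient for the kernel-generated contents of /proc.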
The backstory here is that
TestAdmissionControllerStress::test_admission_controller_with_flags
fails occasionally on dynamically linked builds of Impala. I was able
to reproduce the failure reliably (within 3 tries) with the following:
$ ./buildall.sh -notests -so -noclean
$ bin/start-impala-cluster.py --impalad_args="--memory_maintenance_sleep_time_ms=1"
$ impala-shell.sh --query 'select max(t.c1), avg(t.c2), min(t.c3), avg(c4),
avg(c5), avg(c6) from (select max(tinyint_col) over (order by int_col) c1,
avg(tinyint_col) over (order by smallint_col) c2, min(tinyint_col) over (order
by smallint_col desc) c3, rank() over (order by int_col desc) c4, dense_rank()
over (order by bigint_col) c5, first_value(tinyint_col) over (order by
bigint_col desc) c6 from functional.alltypes) t;'
The stack trace looks like:
(gdb) bt
#0 0x00007fe230df2428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1 0x00007fe230df402a in __GI_abort () at abort.c:89
#2 0x00007fe23312026d in __gnu_cxx::__verbose_terminate_handler() () at ../../../../gcc-4.9.2/libstdc++-v3/libsupc++/vterminate.cc:95
#3 0x00007fe2330d8b66 in __cxxabiv1::__terminate(void (*)()) (handler=<optimized out>) at ../../../../gcc-4.9.2/libstdc++-v3/libsupc++/eh_terminate.cc:47
#4 0x00007fe2330d8bb1 in std::terminate() () at ../../../../gcc-4.9.2/libstdc++-v3/libsupc++/eh_terminate.cc:57
#5 0x00007fe2330d8cb8 in __cxxabiv1::__cxa_throw(void*, std::type_info*, void (*)(void*)) (obj=0x8e54080, tinfo=0x7fe233356210 <typeinfo for std::bad_cast>, dest=0x7fe23311ea70 <std::bad_cast::~bad_cast()>) at ../../../../gcc-4.9.2/libstdc++-v3/libsupc++/eh_throw.cc:87
#6 0x00007fe233110332 in std::__throw_bad_cast() () at ../../../../../gcc-4.9.2/libstdc++-v3/src/c++11/functexcept.cc:63
#7 0x00007fe2330e8ad7 in std::use_facet<std::ctype<char> >(std::locale const&) (__loc=...) at /data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gcc/build-4.9.2/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/locale_classes.tcc:137
#8 0x00000000008d2cdf in void boost::algorithm::trim<std::string>(std::string&, std::locale const&) ()
#9 0x00007fe2396d5057 in impala::MemInfo::ParseSmaps() () at /home/philip/src/Impala/be/src/util/mem-info.cc:132
...
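For readers unfamiliar with frames #5 through #8: boost's trim() classifies
whitespace through the locale, which comes down to a
std::use_facet<std::ctype<char>> lookup, and std::use_facet throws
std::bad_cast when the locale does not hold the requested facet. The sketch
below spells out that lookup. IsSpaceViaFacet and TryIsSpace are hypothetical
names for illustration only, and the try/catch wrapper is not something this
change adds.

```cpp
#include <locale>
#include <typeinfo>

// Locale-aware whitespace test. std::isspace(c, loc) is specified as this
// facet lookup followed by a call on the returned std::ctype<char> facet.
inline bool IsSpaceViaFacet(char c, const std::locale& loc) {
  return std::use_facet<std::ctype<char>>(loc).is(std::ctype_base::space, c);
}

// Hypothetical defensive wrapper, not part of the change: reports whether the
// facet lookup succeeded instead of letting std::bad_cast escape.
inline bool TryIsSpace(char c, const std::locale& loc, bool* is_space) {
  try {
    *is_space = IsSpaceViaFacet(c, loc);
    return true;
  } catch (const std::bad_cast&) {
    return false;  // The locale did not provide std::ctype<char>.
  }
}
```

In a healthy process std::has_facet<std::ctype<char>>(std::locale()) is true
and the lookup cannot fail, which is why the bad_cast in the trace points at
corrupted locale state rather than at the input being trimmed.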
My best theory is that there is a race or bug in which the std::locale* static
initialization state is somehow 'reset' by the dynamic linker when more
libraries are linked in as a result of the query. My evidence for this theory
is scant, but I do notice that LD_DEBUG=all prints the following when the query
is executed (but not right at startup):
binding file /home/philip/src/Impala/toolchain/gcc-4.9.2/lib64/libstdc++.so.6 [0] to /home/philip/src/Impala/toolchain/gflags-2.2.0-p1/lib/libgflags.so.2.2 [0]: normal symbol `std::locale::facet::_S_destroy_c_locale(__locale_struct*&)'
Note that libgflags.so contains its own BSS symbols for some of the
std::locale::facet::* statics:
$ nm toolchain/gflags-2.2.0-p1/lib/libgflags.so | c++filt | grep facet | grep ' B '
00000000002e2d10 B std::locale::facet::_S_c_locale
00000000002e2d0c B std::locale::facet::_S_once
I'm not the first to run into variants of these issues, though the results
are fairly unhelpful:
http://www.boost.org/doc/libs/1_58_0/libs/locale/doc/html/faq.html
https://stackoverflow.com/questions/26990412/c-boost-crashes-while-using-locale
https://svn.boost.org/trac10/ticket/4671
http://clang-developers.42468.n3.nabble.com/std-use-facet-lt-std-ctype-lt-char-gt-gt-crashes-on-linux-td4033967.html
https://unix.stackexchange.com/questions/719/can-we-get-compiler-information-from-an-elf-binary
https://stackoverflow.com/questions/42376100/linking-with-library-causes-collate-facet-to-be-missing-from-char
http://lists.llvm.org/pipermail/cfe-dev/2012-July/023289.html
https://gcc.gnu.org/ml/libstdc++/2014-11/msg00122.html
Change-Id: I8dd807f869a9359d991ba515177fb2298054520e
---
M be/src/util/mem-info.cc
M be/src/util/proc-info-test.cc
2 files changed, 14 insertions(+), 5 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/8888/2
--
To view, visit http://gerrit.cloudera.org:8080/8888
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8dd807f869a9359d991ba515177fb2298054520e
Gerrit-Change-Number: 8888
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Philip Zeyliger <[email protected]>