Control: tags -1 patch

One way to reproduce this bug:
# apt-get install pocl-opencl-icd blender
$ gdb blender
open User Preferences -> System
This promptly crashes; beignet-opencl-icd 1.2.1-1 (LLVM 3.8, like pocl) also triggers it, but beignet-opencl-icd 1.2.1-2 and mesa-opencl-icd (LLVM 3.9, like the graphics part of mesa) don't.

#0  llvm::cl::Option::setArgStr(llvm::StringRef) ()
    at /build/llvm-toolchain-3.9-3.9.1/include/llvm/ADT/SmallPtrSet.h:224
No locals.
#1  0x00007fffbbe9aacb in llvm::cl::opt<(anonymous namespace)::HelpPrinter, true, llvm::cl::parser<bool> >::opt<char [17], llvm::cl::desc, llvm::cl::LocationClass<(anonymous namespace)::HelpPrinter>, llvm::cl::OptionHidden, llvm::cl::ValueExpected, llvm::cl::cat>(char const (&) [17], llvm::cl::desc const&, llvm::cl::LocationClass<(anonymous namespace)::HelpPrinter> const&, llvm::cl::OptionHidden const&, llvm::cl::ValueExpected const&, llvm::cl::cat const&) [clone .constprop.300] ()
    at /build/llvm-toolchain-3.8-3.8.1/include/llvm/Support/CommandLine.h:1041
No locals.
#2  0x00007fffbbe9ad08 in _GLOBAL__sub_I_CommandLine.cpp ()
    at /build/llvm-toolchain-3.8-3.8.1/lib/Support/CommandLine.cpp:1661
No locals.
#3  0x00007ffff7de95da in ?? () from /lib64/ld-linux-x86-64.so.2
No symbol table info available.
#4  0x00007ffff7de96eb in ?? () from /lib64/ld-linux-x86-64.so.2
No symbol table info available.
#5  0x00007ffff7dedc68 in ?? () from /lib64/ld-linux-x86-64.so.2
No symbol table info available.
#6  0x00007ffff7de9484 in ?? () from /lib64/ld-linux-x86-64.so.2
No symbol table info available.
#7  0x00007ffff7ded419 in ?? () from /lib64/ld-linux-x86-64.so.2
No symbol table info available.
#8  0x00007fffee0f0ee9 in ?? () from /lib/x86_64-linux-gnu/libdl.so.2
No symbol table info available.
#9  0x00007ffff7de9484 in ?? () from /lib64/ld-linux-x86-64.so.2
No symbol table info available.
#10 0x00007fffee0f1521 in ?? () from /lib/x86_64-linux-gnu/libdl.so.2
No symbol table info available.
#11 0x00007fffee0f0f82 in dlopen () from /lib/x86_64-linux-gnu/libdl.so.2
No symbol table info available.
#12 0x00007fffbf9f9212 in ?? () from /usr/lib/x86_64-linux-gnu/libOpenCL.so.1
No symbol table info available.
#13 0x00007fffbf9f9360 in ?? () from /usr/lib/x86_64-linux-gnu/libOpenCL.so.1
No symbol table info available.
#14 0x00007fffbf9f98d0 in ?? () from /usr/lib/x86_64-linux-gnu/libOpenCL.so.1
No symbol table info available.
#15 0x00007fffbf9fa0d3 in clGetPlatformIDs ()
   from /usr/lib/x86_64-linux-gnu/libOpenCL.so.1
No symbol table info available.

The relevant lines of the LLVM build log appear to be

Scanning dependencies of target LLVM
make[4]: Leaving directory '/«PKGBUILDDIR»/build-llvm'
/usr/bin/make -f tools/llvm-shlib/CMakeFiles/LLVM.dir/build.make tools/llvm-shlib/CMakeFiles/LLVM.dir/build
make[4]: Entering directory '/«PKGBUILDDIR»/build-llvm'
[ 76%] Building CXX object tools/llvm-shlib/CMakeFiles/LLVM.dir/libllvm.cpp.o
cd /«PKGBUILDDIR»/build-llvm/tools/llvm-shlib && /usr/bin/g++-6 -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/«PKGBUILDDIR»/build-llvm/tools/llvm-shlib -I/«PKGBUILDDIR»/tools/llvm-shlib -I/«PKGBUILDDIR»/build-llvm/include -I/«PKGBUILDDIR»/include -std=c++0x -gsplit-dwarf -Wl,-fuse-ld=gold -fPIC -fvisibility-inlines-hidden -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -Werror=date-time -std=c++11 -ffunction-sections -fdata-sections -O2 -g -DNDEBUG -fPIC -fno-exceptions -o CMakeFiles/LLVM.dir/libllvm.cpp.o -c /«PKGBUILDDIR»/tools/llvm-shlib/libllvm.cpp
[ 76%] Linking CXX shared library ../../lib/libLLVM-3.9.so
cd /«PKGBUILDDIR»/build-llvm/tools/llvm-shlib && /usr/bin/cmake -E cmake_link_script CMakeFiles/LLVM.dir/link.txt --verbose=1
/usr/bin/g++-6 -fPIC -std=c++0x -gsplit-dwarf -Wl,-fuse-ld=gold -fPIC -fvisibility-inlines-hidden -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -Werror=date-time -std=c++11 -ffunction-sections -fdata-sections -O2 -g -DNDEBUG -Wl,-O3 -Wl,--gc-sections -Wl,-z,relro -Wl,-z,defs -shared -Wl,-soname,libLLVM-3.9.so.1 -o ../../lib/libLLVM-3.9.so.1 CMakeFiles/LLVM.dir/libllvm.cpp.o -Wl,-rpath,"\$ORIGIN/../lib" -Wl,--whole-archive [big list of ../../lib/libLLVM(something).a libraries] -lrt -ldl -ltinfo -lpthread -lz -lm
cd /«PKGBUILDDIR»/build-llvm/tools/llvm-shlib && /usr/bin/cmake -E cmake_symlink_library ../../lib/libLLVM-3.9.so.1 ../../lib/libLLVM-3.9.so.1 ../../lib/libLLVM-3.9.so
make[4]: Leaving directory '/«PKGBUILDDIR»/build-llvm'
[ 76%] Built target LLVM

i.e. the main libLLVM shared library is linked without any version script at all. (Some of the other LLVM libraries do have "version scripts", but those only serve the scripts' other purpose, limiting which symbols are exported, and do not assign any versions.)

This suggests the following fix (warning: untested and not my area of expertise; also, if the build really uses the line patched below, why is there no -Wl,--no-whole-archive in the build log?):

--- a/tools/llvm-shlib/CMakeLists.txt
+++ b/tools/llvm-shlib/CMakeLists.txt
@@ -42,7 +42,7 @@
 list(REMOVE_DUPLICATES LIB_NAMES)
 if("${CMAKE_SYSTEM_NAME}" STREQUAL "Linux" OR "${CMAKE_SYSTEM_NAME}" STREQUAL "GNU" OR "${CMAKE_SYSTEM_NAME}" STREQUAL "kFreeBSD") # FIXME: It should be "GNU ld for elf"
   # GNU ld doesn't resolve symbols in the version script.
-  set(LIB_NAMES -Wl,--whole-archive ${LIB_NAMES} -Wl,--no-whole-archive)
+  set(LIB_NAMES -Wl,--version-script,../../../tools/llvm-shlib/simple_version_script.map -Wl,--whole-archive ${LIB_NAMES} -Wl,--no-whole-archive)
 elseif("${CMAKE_SYSTEM_NAME}" STREQUAL "Darwin")
   set(LIB_NAMES -Wl,-all_load ${LIB_NAMES})
 endif()
--- /dev/null
+++ b/tools/llvm-shlib/simple_version_script.map
@@ -0,0 +1,1 @@
+LLVM_3.9 { global: *; };

(This should also work with the obvious change for 3.8; I haven't checked 3.7. I am deliberately not making the LLVM_3.9 version "depend" on a 3.8 one, as the whole point is to make the linker treat them as separate libraries.)
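
One way to check whether a rebuilt libLLVM actually picked up the version tag would be to look at its dynamic symbol table: readelf prints a defined, versioned symbol as name@@VERSION, so any exported symbol without an "@" is still unversioned. The following is only a rough sketch of that check, not something I have run against the real packages; the function name is made up for illustration, and it assumes the usual eight-column layout of "readelf --dyn-syms --wide".

```python
import subprocess
import sys


def unversioned_exports(readelf_output):
    """Return defined dynamic symbols that carry no version tag.

    Assumes rows of the form: Num: Value Size Type Bind Vis Ndx Name
    """
    names = []
    for line in readelf_output.splitlines():
        fields = line.split()
        # Skip headers, blank lines and rows without a symbol name.
        if len(fields) < 8 or not fields[0].rstrip(":").isdigit():
            continue
        bind, ndx, name = fields[4], fields[6], fields[7]
        if ndx == "UND" or bind == "LOCAL":
            continue  # imported or local symbols are not exports
        if "@" not in name:
            names.append(name)
    return names


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 check_versions.py /path/to/libLLVM-3.9.so.1
    out = subprocess.run(
        ["readelf", "--dyn-syms", "--wide", sys.argv[1]],
        check=True, capture_output=True, text=True,
    ).stdout
    missing = unversioned_exports(out)
    print(f"{len(missing)} exported symbols without a version")
```

If the version script is applied as intended, the count should drop to (nearly) zero for the patched library, while the current unpatched builds should report essentially every exported symbol.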

Some LLVM-using libraries use -Bsymbolic to avoid similar problems (e.g. #768185), but I don't know whether enabling that on LLVM itself would help: it may well be papering over a problem that symbol versioning would really solve.

In case my patch doesn't work, should we (the OpenCL team, and potentially other LLVM rdeps) treat this bug's RC status as "this will be fixed, however long it takes", or as "fix this reasonably soon or all but one of the versions get removed"? Moving pocl-opencl-icd to LLVM 3.9 probably requires upgrading it to an upstream git snapshot (Fedora did), so we need to decide soon whether to do so.
