[ https://issues.apache.org/jira/browse/ARROW-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384879#comment-16384879 ]
Wes McKinney commented on ARROW-2247:
-------------------------------------

One possibility is that we could refactor the parquet-cpp build system to utilize the Arrow build system via a git submodule, so that the {{libparquet}} target can be used within a single unified build system. So we would go from two loosely-connected build systems to one.

> [Python] Statically-linking boost_regex in both libarrow and libparquet
> results in segfault
> -------------------------------------------------------------------------------------------
>
> Key: ARROW-2247
> URL: https://issues.apache.org/jira/browse/ARROW-2247
> Project: Apache Arrow
> Issue Type: Bug
> Reporter: Wes McKinney
> Priority: Major
>
> This is a backtrace loading {{libparquet.so}} on Ubuntu 14.04 using boost
> 1.66.1 from conda-forge. Both libarrow and libparquet contain {{boost_regex}}
> statically linked.
> {code}
> In [1]: import ctypes
> In [2]: ctypes.CDLL('libparquet.so')
> Program received signal SIGSEGV, Segmentation fault.
> 0x00007fffed4ad3fb in std::basic_string<char, std::char_traits<char>,
> std::allocator<char> >::basic_string(std::string const&) () from
> /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> (gdb) bt
> #0 0x00007fffed4ad3fb in std::basic_string<char, std::char_traits<char>,
> std::allocator<char> >::basic_string(std::string const&) () from
> /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> #1 0x00007fffed74c1fc in
> boost::re_detail_106600::cpp_regex_traits_char_layer<char>::init() ()
> from /home/wesm/cpp-toolchain/lib/libboost_regex.so.1.66.0
> #2 0x00007fffed794803 in
> boost::object_cache<boost::re_detail_106600::cpp_regex_traits_base<char>,
> boost::re_detail_106600::cpp_regex_traits_implementation<char> >
> >::do_get(boost::re_detail_106600::cpp_regex_traits_base<char> const&,
> unsigned long) () from /home/wesm/cpp-toolchain/lib/libboost_regex.so.1.66.0
> #3 0x00007fffed79e62b in boost::basic_regex<char, boost::regex_traits<char,
> boost::cpp_regex_traits<char> > >::do_assign(char const*, char const*,
> unsigned int) () from /home/wesm/cpp-toolchain/lib/libboost_regex.so.1.66.0
> #4 0x00007fffee58561b in boost::basic_regex<char, boost::regex_traits<char,
> boost::cpp_regex_traits<char> > >::assign (this=0x7fffffff3780,
> p1=0x7fffee600602
> "(.*?)\\s*(?:(version\\s*(?:([^(]*?)\\s*(?:\\(\\s*build\\s*([^)]*?)\\s*\\))?)?)?)",
> p2=0x7fffee60064a "", f=0) at
> /home/wesm/cpp-toolchain/include/boost/regex/v4/basic_regex.hpp:381
> #5 0x00007fffee5855a7 in boost::basic_regex<char, boost::regex_traits<char,
> boost::cpp_regex_traits<char> > >::assign (this=0x7fffffff3780,
> p=0x7fffee600602
> "(.*?)\\s*(?:(version\\s*(?:([^(]*?)\\s*(?:\\(\\s*build\\s*([^)]*?)\\s*\\))?)?)?)",
> f=0)
> at /home/wesm/cpp-toolchain/include/boost/regex/v4/basic_regex.hpp:366
> #6 0x00007fffee5683f3 in boost::basic_regex<char, boost::regex_traits<char,
> boost::cpp_regex_traits<char> > >::basic_regex (this=0x7fffffff3780,
> p=0x7fffee600602
> "(.*?)\\s*(?:(version\\s*(?:([^(]*?)\\s*(?:\\(\\s*build\\s*([^)]*?)\\s*\\))?)?)?)",
> f=0)
> at /home/wesm/cpp-toolchain/include/boost/regex/v4/basic_regex.hpp:335
> #7 0x00007fffee5656d0 in parquet::ApplicationVersion::ApplicationVersion (
> Python Exception <class 'gdb.error'> There is no member named _M_dataplus.:
> this=0x7fffee8f1fb8
> <parquet::ApplicationVersion::PARQUET_251_FIXED_VERSION>, created_by=)
> at ../src/parquet/metadata.cc:452
> #8 0x00007fffee41c271 in __cxx_global_var_init.1(void) () at
> ../src/parquet/metadata.cc:35
> #9 0x00007fffee41c44e in _GLOBAL__sub_I_metadata.tmp.wesm_desktop.4838.ii ()
> from /home/wesm/local/lib/libparquet.so
> #10 0x00007ffff7dea1da in call_init (l=<optimized out>, argc=argc@entry=2,
> argv=argv@entry=0x7fffffff5d88,
> env=env@entry=0x7fffffff5da0) at dl-init.c:78
> #11 0x00007ffff7dea2c3 in call_init (env=<optimized out>, argv=<optimized
> out>, argc=<optimized out>,
> l=<optimized out>) at dl-init.c:36
> #12 _dl_init (main_map=main_map@entry=0x13fb220, argc=2, argv=0x7fffffff5d88,
> env=0x7fffffff5da0)
> at dl-init.c:126
> {code}
> This seems to be caused by static initializations in libparquet:
> https://github.com/apache/parquet-cpp/blob/master/src/parquet/metadata.cc#L34
> We should see if removing these static initializations makes the problem go
> away. If not, then statically-linking boost_regex in both libraries is not
> advisable.
> For this reason and more, I really wish that Arrow and Parquet shared a
> common build system and monorepo structure -- it would make handling these
> toolchain and build-related issues much simpler.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)