> However, extension modules are always going to have to share the Python
> process, so this policy kind of says, you can't use external C++ extension
> code with conda.

This is a bit too extreme. What I meant is that you should try not to mix
C++ build toolchains. I think this is good advice even without
conda/conda-forge in the loop. If conda-forge were supplying the library /
build toolchain for the rest of your projects, then everything would be OK.

> Given the policy, it seems slightly better to link Boost dynamically.

We could do this, but it seems like a last-resort workaround to the core
problem, which is the mixed build toolchain issue. I don't know what Boost's
ABI guarantees are, but dynamic linking isn't guaranteed to solve the problem
of using two libraries built against different versions of Boost in the same
process. The boost-cpp package is also a pretty chunky runtime dependency.

We could give it a shot and see how it goes in the next release cycle.

- Wes

On Sat, Feb 17, 2018 at 4:06 PM, Alex Samuel <a...@alexsamuel.net> wrote:
> OK.
>
> I'll probably be able to work around this problem. Just a couple of
> thoughts for the long term:
>
> 1. It seems mostly reasonable to treat conda as a closed ecosystem as you
> describe; other C++ stuff can be deployed by other means. However,
> extension modules are always going to have to share the Python process, so
> this policy kind of says, you can't use external C++ extension code with
> conda.
>
> 2. Given the policy, it seems slightly better to link Boost dynamically. I
> had checked package metadata and shared lib dependencies, and didn't even
> realize it used Boost, until one of my colleagues actually looked at the
> symbol table and pointed this out. As it stands, Boost is an undeclared
> dependency, at least "dependency" in the sense of pinning a version.
>
> Thanks for your help.
> Alex
>
> On 02/17/2018 11:32 AM, Wes McKinney wrote:
>>
>> It sounds like for your use case that it would be better for you to
>> build your own Arrow packages that use the same Boost as the rest of
>> your repo.
>> You can possibly use the scourge tool that Phillip built to
>> help with this (we're using it to build nightlies).
>>
>> conda-forge is a fairly closed ecosystem under the present
>> circumstances -- the intent is that libraries within it are
>> interoperable with each other, and that packages built with
>> conda-forge binaries as their third-party dependencies (e.g. if you
>> were using the boost-cpp conda-forge package) will also be able to
>> work. Using the conda-forge stack as an add-on to a substantial
>> independent C++ library stack is not (IIUC) an intended use case.
>>
>> Note that there are libstdc++-related issues using conda-forge
>> binaries with Anaconda >= 5.0 due to the change in compilers.
>> Hopefully this will get fixed in the next few months.
>>
>> - Wes
>>
>> On Sat, Feb 17, 2018 at 11:14 AM, Alex Samuel <a...@alexsamuel.net> wrote:
>>>
>>> OK, though if both modules linked Boost statically, I believe they would
>>> have distinct copies of global variables. Whether or not this causes
>>> problems depends on whether they are purely internal or tied to external
>>> state. My hunch is that for Boost::regex, there wouldn't be an issue with
>>> two complete copies in the same process, as long as they remained separate.
>>> I still suspect our module is picking up some symbols from the copy of
>>> Boost statically linked to Parquet rather than the one it pulls in as a
>>> shared lib dependency. One thing I could try is to link ours statically as
>>> well.
>>>
>>> I wasn't aware of bcp; I'll take a look at that.
>>>
>>> It may or may not be possible for us to build our C++ stuff against
>>> conda-forge's Boost; I'm not sure. We have a large C++ codebase and
>>> distribute parts of it, particularly Python extension modules, via conda.
>>>
>>> In general, I suspect as conda-forge grows and packages more C++ code,
>>> issues like this are likely to become an increasing problem.
>>> Do you know if there is a general policy regarding how extension modules
>>> in conda packages should link common C++ libraries like Boost?
>>>
>>> Thanks,
>>> Alex
>>>
>>> On 02/17/2018 11:04 AM, Uwe L. Korn wrote:
>>>>
>>>> Static linking does not really solve all problems in the Boost case, as
>>>> there are global variables that are picked up across different Boost
>>>> versions. Thus, if you link it statically, it still has an effect on other
>>>> dependencies. In the case of Boost, you can get around this by using the
>>>> bcp tool
>>>> http://www.boost.org/doc/libs/1_66_0/tools/bcp/doc/html/index.html
>>>> to rename your local Boost to a different namespace to avoid collisions.
>>>> We will also do this in the future with our wheel in Arrow. But we will
>>>> not do this for conda packages, as there the assumption is that all
>>>> artefacts will be linked against the same Boost version.
>>>>
>>>> Uwe
>>>>
>>>>> On 17.02.2018 at 16:55, Alex Samuel <a...@alexsamuel.net> wrote:
>>>>>
>>>>> Yes, we do link our internal code with a different Boost version; we
>>>>> build and (conda) package it ourselves.
>>>>>
>>>>> Why should this matter if Parquet links it statically? If static linking
>>>>> won't allow us to use our own version, why bother linking statically at
>>>>> all?
>>>>>
>>>>> Thanks,
>>>>> Alex
>>>>>
>>>>>> On 02/17/2018 10:52 AM, Uwe L. Korn wrote:
>>>>>> Hi,
>>>>>> The issue here is not that Boost is linked statically in one and
>>>>>> dynamically in another, but that you link against two different Boost
>>>>>> versions. The stack trace shows links to Boost 1.55, whereas Arrow
>>>>>> should be linked against 1.65 or 1.66 (the one coming from conda-forge).
>>>>>> Arrow requires Boost 1.60 or later to work. My most likely guess is that
>>>>>> your internal modules are linked against the system Boost, not the
>>>>>> conda-provided Boost.
>>>>>> Uwe
>>>>>>>
>>>>>>> On 17.02.2018 at 16:47, Alex Samuel <a...@alexsamuel.net> wrote:
>>>>>>>
>>>>>>> Hi there,
>>>>>>>
>>>>>>> Sure, I'll append the top of the stack. You can see our internal
>>>>>>> function "asd::infra::util::start_of_date"; everything else is Boost,
>>>>>>> Python, or libstdc++.
>>>>>>>
>>>>>>> My understanding (though I haven't demonstrated this conclusively) is
>>>>>>> that, because Python loads extension modules RTLD_GLOBAL, an extension
>>>>>>> module can pick up symbols from another or its dependencies, even if
>>>>>>> the former "usually" satisfies relocations from its own shared lib
>>>>>>> dependencies. So, one module linking Boost statically may interfere
>>>>>>> with another that links it dynamically, by injecting its symbols.
>>>>>>>
>>>>>>> If necessary I can try to put together a minimal test case, but no
>>>>>>> guarantee it will actually trigger the bug. But it might be worth
>>>>>>> testing my theory above first, with gdb or by some other means.
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Alex >>>>>>> >>>>>>> >>>>>>> >>>>>>> #0 0x00007f6ddeba4c8b in std::basic_string<char, >>>>>>> std::char_traits<char>, std::allocator<char> >>>>>>>> >>>>>>>> ::basic_string(std::basic_string<char, std::char_traits<char>, >>>>>>> >>>>>>> std::allocator<char> > const&) () >>>>>>> >>>>>>> from >>>>>>> >>>>>>> /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/numexpr/../../../libstdc++.so.6 >>>>>>> >>>>>>> #1 0x00007f6dd4f978d6 in >>>>>>> boost::re_detail::cpp_regex_traits_char_layer<char>::init() () >>>>>>> >>>>>>> from >>>>>>> >>>>>>> /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/asd/infra/../../../.././libboost_regex.so.1.55.0 >>>>>>> >>>>>>> #2 0x00007f6dd4fdbd88 in >>>>>>> boost::object_cache<boost::re_detail::cpp_regex_traits_base<char>, >>>>>>> boost::re_detail::cpp_regex_traits_implementation<char> >>>>>>>> >>>>>>>> ::do_get(boost::re_detail::cpp_regex_traits_base<char> const&, >>>>>>>> unsigned >>>>>>> >>>>>>> long) () >>>>>>> >>>>>>> from >>>>>>> >>>>>>> /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/asd/infra/../../../.././libboost_regex.so.1.55.0 >>>>>>> >>>>>>> #3 0x00007f6dd4fe5bb5 in boost::basic_regex<char, >>>>>>> boost::regex_traits<char, boost::cpp_regex_traits<char> > >>>>>>> >::do_assign(char >>>>>>> const*, char const*, unsigned int) () >>>>>>> >>>>>>> from >>>>>>> >>>>>>> /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/asd/infra/../../../.././libboost_regex.so.1.55.0 >>>>>>> >>>>>>> #4 0x00007f6dd575a90e in >>>>>>> asd::infra::util::start_of_date(std::basic_string<char, >>>>>>> std::char_traits<char>, std::allocator<char> > const&, char const*) >>>>>>> () >>>>>>> >>>>>>> at >>>>>>> >>>>>>> /prod/sys/sysasd/opt/tudor-devtools/v1.3/Linux.el6.x86_64-corei7-avx-gcc4.83-anaconda2.0.1/include/boost/regex/v4/basic_regex.hpp:382 >>>>>>> >>>>>>> #5 0x00007f6dd2c36734 in >>>>>>> >>>>>>> boost::python::objects::caller_py_function_impl<boost::python::detail::caller<unsigned >>>>>>> long 
(*)(std::basic_string<char, std::char_traits<char>, >>>>>>> std::allocator<char> > const&, char const*), >>>>>>> boost::python::default_call_policies, boost::mpl::vector3<unsigned >>>>>>> long, >>>>>>> std::basic_string<char, std::char_traits<char>, std::allocator<char> >>>>>>> > >>>>>>> const&, char const*> > >::operator()(_object*, _object*) () >>>>>>> >>>>>>> from >>>>>>> >>>>>>> /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/asd/infra/util.so >>>>>>> >>>>>>> #6 0x00007f6dd52ac71a in >>>>>>> boost::python::objects::function::call(_object*, _object*) const () >>>>>>> >>>>>>> from >>>>>>> >>>>>>> /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/asd/infra/../../../../libboost_python.so.1.55.0 >>>>>>> >>>>>>> #7 0x00007f6dd52aca68 in >>>>>>> >>>>>>> boost::detail::function::void_function_ref_invoker0<boost::python::objects::(anonymous >>>>>>> namespace)::bind_return, >>>>>>> void>::invoke(boost::detail::function::function_buffer&) () >>>>>>> >>>>>>> from >>>>>>> >>>>>>> /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/asd/infra/../../../../libboost_python.so.1.55.0 >>>>>>> >>>>>>> #8 0x00007f6dd52b4cd3 in >>>>>>> >>>>>>> boost::python::detail::exception_handler::operator()(boost::function0<void> >>>>>>> const&) const () >>>>>>> >>>>>>> from >>>>>>> >>>>>>> /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/asd/infra/../../../../libboost_python.so.1.55.0 >>>>>>> >>>>>>> #9 0x00007f6dd2c32c03 in >>>>>>> >>>>>>> boost::detail::function::function_obj_invoker2<boost::_bi::bind_t<bool, >>>>>>> boost::python::detail::translate_exception<asd::infra::Exception, >>>>>>> void >>>>>>> (*)(asd::infra::Exception const&)>, boost::_bi::list3<boost::arg<1>, >>>>>>> boost::arg<2>, boost::_bi::value<void (*)(asd::infra::Exception >>>>>>> const&)> > >>>>>>>> >>>>>>>> , bool, boost::python::detail::exception_handler const&, >>>>>>> >>>>>>> boost::function0<void> >>>>>>> const&>::invoke(boost::detail::function::function_buffer&, 
>>>>>>> boost::python::detail::exception_handler const&, >>>>>>> boost::function0<void> >>>>>>> const&) () >>>>>>> >>>>>>> from >>>>>>> >>>>>>> /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/asd/infra/util.so >>>>>>> >>>>>>> #10 0x00007f6dd52b4a9d in >>>>>>> boost::python::handle_exception_impl(boost::function0<void>) () >>>>>>> >>>>>>> from >>>>>>> >>>>>>> /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/asd/infra/../../../../libboost_python.so.1.55.0 >>>>>>> >>>>>>> #11 0x00007f6dd52ab2b3 in function_call () >>>>>>> >>>>>>> from >>>>>>> >>>>>>> /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/asd/infra/../../../../libboost_python.so.1.55.0 >>>>>>> >>>>>>> #12 0x00007f6df2bc8e93 in PyObject_Call (func=0x2799850, arg=<value >>>>>>> optimized out>, >>>>>>> >>>>>>> kw=<value optimized out>) at Objects/abstract.c:2547 >>>>>>> >>>>>>> #13 0x00007f6df2c7b80d in do_call (f=<value optimized out>, >>>>>>> throwflag=<value optimized out>) >>>>>>> >>>>>>> at Python/ceval.c:4569 >>>>>>> >>>>>>> #14 call_function (f=<value optimized out>, throwflag=<value >>>>>>> optimized >>>>>>> out>) at Python/ceval.c:4374 >>>>>>> >>>>>>> #15 PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value >>>>>>> optimized out>) at Python/ceval.c:2989 >>>>>>> >>>>>>> >>>>>>> >>>>>>>> On 02/17/2018 10:31 AM, Uwe L. Korn wrote: >>>>>>>> Hello, >>>>>>>> I am not sure why we are linking statically in the conda-forge >>>>>>>> packages, as a gut feeling we should link dynamically there. Wes, >>>>>>>> can you >>>>>>>> remember why? >>>>>>>> Alex, would it be possible for you to send us the part of the >>>>>>>> segmentation fault that is not private to your modules. That would >>>>>>>> be a good >>>>>>>> indicator for us what is going wrong. >>>>>>>> Typically it is best when you enable coredumps with `ulimit -c >>>>>>>> unlimited` and then run your program as usual. There should be no >>>>>>>> performance penalty. 
>>>>>>>> When it segfaults, run `gdb python core` (note that the core file
>>>>>>>> might also be postfixed with the PID, but that depends on your
>>>>>>>> system). In gdb, type `thread apply all bt full`. Post the output of
>>>>>>>> that command and strip away the parts we should not see. Most relevant
>>>>>>>> will be the stack trace of the thread that segfaulted.
>>>>>>>> Uwe
>>>>>>>>>
>>>>>>>>> On 16.02.2018 at 23:17, Alex Samuel <a...@alexsamuel.net> wrote:
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> I am having some trouble using the Continuum PyArrow conda package
>>>>>>>>> dependencies in conjunction with internal C++ extension modules.
>>>>>>>>>
>>>>>>>>> Apparently, Arrow and Parquet link Boost statically. We have some
>>>>>>>>> internal packages containing C++ code that link Boost libs
>>>>>>>>> dynamically. If we import Feather as well as our own extension
>>>>>>>>> modules into the same Python process, we get random segfaults in
>>>>>>>>> Boost. I think what's happening is that our extension modules are
>>>>>>>>> picking up Boost's symbols from Arrow and Parquet already loaded into
>>>>>>>>> the process, rather than from our own Boost shared libs.
>>>>>>>>>
>>>>>>>>> Could anyone explain the policy for linking Boost in binary
>>>>>>>>> distributions, particularly conda packages? What is your expectation
>>>>>>>>> for how other C++ extension modules should be built?
>>>>>>>>>
>>>>>>>>> Thanks in advance,
>>>>>>>>> Alex
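
Alex's theory in the thread above (that an extension module may resolve Boost
symbols from another module already loaded into the process) can be partly
checked from inside the interpreter before reaching for gdb. The following is
a minimal sketch, not something posted in the thread. It assumes Linux (it
reads /proc/self/maps) and Python 3.3 or later (for the os.RTLD_* constants),
and the helper names dlopen_flag_names and mapped_libraries are made up for
illustration. It reports which RTLD_* flags the interpreter passes to dlopen()
for extension modules, and which mapped shared objects have "boost" in their
file name.

```python
import os
import sys


def dlopen_flag_names(flags):
    """Return the names of the RTLD_* flags that are set in `flags`."""
    names = []
    for name in ("RTLD_LAZY", "RTLD_NOW", "RTLD_GLOBAL",
                 "RTLD_NODELETE", "RTLD_DEEPBIND"):
        value = getattr(os, name, None)  # not every platform defines every flag
        if value and flags & value:
            names.append(name)
    return names


def mapped_libraries(maps_text, needle="boost"):
    """Return sorted mapped file paths whose basename contains `needle`.

    `maps_text` is expected to be in the format of /proc/<pid>/maps.
    """
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [path]
        fields = line.split(None, 5)
        if len(fields) == 6 and needle in os.path.basename(fields[5]):
            paths.add(fields[5].strip())
    return sorted(paths)


if __name__ == "__main__":
    # Import the modules under suspicion before this point.
    print("dlopen flags:", dlopen_flag_names(sys.getdlopenflags()))
    if os.path.exists("/proc/self/maps"):  # Linux only
        with open("/proc/self/maps") as f:
            for path in mapped_libraries(f.read()):
                print("mapped:", path)
```

Run it after importing both the Arrow-based package and your own extension
modules. More than one Boost version in the output confirms that two shared
Boost builds are live in the same process. A statically linked Boost does not
show up as a separate mapped file, so finding that copy still requires
inspecting the module's symbol table, as Alex's colleague did.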