> However, extension modules are always going to have to share the Python 
> process, so this policy kind of says, you can't use external C++ extension 
> code with conda.

This is a bit too extreme. What I meant is that you should try not to
mix C++ build toolchains. I think this is good advice even without
conda/conda-forge in the loop. If conda-forge were supplying the
library / build toolchain for the rest of your projects, then
everything would be OK.

> Given the policy, it seems slightly better to link Boost dynamically.

We could do this, but it seems like a last-resort workaround for the
core problem, which is the mixed build toolchain issue. I don't know
what Boost's ABI guarantees are, but dynamic linking isn't guaranteed
to solve the problem of using two libraries built against different
versions of Boost in the same process. The boost-cpp package is also a
pretty chunky runtime dependency. We could give it a shot and see how
it goes in the next release cycle.
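
For what it's worth, one way to see which Boost a given binary actually
carries or resolves is to inspect its symbols and runtime dependencies. A
rough diagnostic sketch only -- the libarrow.so path under $CONDA_PREFIX is
an assumption, not something taken from this thread:

    # does libarrow export its own (statically linked) Boost symbols?
    nm -D --defined-only $CONDA_PREFIX/lib/libarrow.so | c++filt | grep 'boost::' | head
    # which shared Boost libraries, if any, does it require at runtime?
    ldd $CONDA_PREFIX/lib/libarrow.so | grep boost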

- Wes

On Sat, Feb 17, 2018 at 4:06 PM, Alex Samuel <a...@alexsamuel.net> wrote:
> OK.
>
> I'll probably be able to work around this problem.  Just a couple of
> thoughts for the long term:
>
> 1. It seems mostly reasonable to treat conda as a closed ecosystem as you
> describe; other C++ stuff can be deployed by other means.  However,
> extension modules are always going to have to share the Python process, so
> this policy kind of says, you can't use external C++ extension code with
> conda.
>
> 2. Given the policy, it seems slightly better to link Boost dynamically.  I
> had checked package metadata and shared lib dependencies, and didn't even
> realize it used Boost, until one of my colleagues actually looked at the
> symbol table and pointed this out.  As it stands, Boost is an undeclared
> dependency, at least "dependency" in the sense of pinning a version.
>
> Thanks for your help.
> Alex
>
>
>
> On 02/17/2018 11:32 AM, Wes McKinney wrote:
>>
>> It sounds like for your use case that it would be better for you to
>> build your own Arrow packages that use the same Boost as the rest of
>> your repo. You can possibly use the scourge tool that Phillip built to
>> help with this (we're using it to build nightlies).
>>
>> conda-forge is a fairly closed ecosystem under the present
>> circumstances -- the intent is that libraries within it are
>> interoperable with each other, and that packages built with
>> conda-forge binaries as their third party dependencies (e.g. if you
>> were using the boost-cpp conda-forge package) will also be able to
>> work. Using the conda-forge stack as an add-on to a substantial
>> independent C++ library stack is not (IIUC) an intended use case.
>>
>> Note that there are libstdc++-related issues using conda-forge
>> binaries with Anaconda >= 5.0 due to the change in compilers.
>> Hopefully this will get fixed in the next few months.
>>
>> - Wes
>>
>> On Sat, Feb 17, 2018 at 11:14 AM, Alex Samuel <a...@alexsamuel.net> wrote:
>>>
>>> OK, though if both modules linked Boost statically, I believe they would
>>> have distinct copies of global variables.  Whether or not this causes
>>> problems depends on whether they are purely internal or tied to external
>>> state.  My hunch is that for Boost::regex, there wouldn't be an issue
>>> with
>>> two complete copies in the same process, as long as they remained
>>> separate.
>>> I still suspect our module is picking up some symbols from the copy of
>>> Boost
>>> statically linked to Parquet rather than the one it pulls in as a shared
>>> lib
>>> dependency.  One thing I could try is to link ours statically as well.
>>>
>>> I wasn't aware of bcp; I'll take a look at that.
>>>
>>> It may or may not be possible for us to build our C++ stuff against
>>> conda-forge's Boost; I'm not sure.  We have a large C++ codebase and
>>> distribute parts of it, particularly Python extension modules, via conda.
>>>
>>> In general, I suspect that as conda-forge grows and packages more C++ code,
>>> issues like this are likely to become an increasing problem.  Do you know
>>> if there is a general policy regarding how extension modules in conda
>>> packages should link common C++ libraries like Boost?
>>>
>>> Thanks,
>>> Alex
>>>
>>>
>>>
>>>
>>> On 02/17/2018 11:04 AM, Uwe L. Korn wrote:
>>>>
>>>>
>>>> Static linking does not really solve all problems in the Boost case, as
>>>> there are global variables that are picked up across different Boost
>>>> versions. Thus, even if you link it statically, it still has an effect on
>>>> other dependencies. In the case of Boost, you can get around this by using
>>>> the bcp tool (http://www.boost.org/doc/libs/1_66_0/tools/bcp/doc/html/index.html)
>>>> to rename your local Boost to a different namespace and avoid collisions.
>>>> We will also do this in the future for our Arrow wheels, but we will not
>>>> do it for conda packages, as there the assumption is that all artefacts
>>>> are linked against the same Boost version.
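>>>>
>>>> Purely for illustration, a hypothetical invocation might look roughly like
>>>> this (the output path and namespace name here are made up):
>>>>
>>>>     # copy boost::regex plus its dependencies into ./vendored-boost, renaming
>>>>     # the top-level namespace so it cannot collide with another copy of Boost
>>>>     bcp --namespace=myboost --namespace-alias regex ./vendored-boost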
>>>>
>>>> Uwe
>>>>
>>>>> On 17.02.2018 at 16:55, Alex Samuel <a...@alexsamuel.net> wrote:
>>>>>
>>>>> Yes, we do link our internal code with a different Boost version; we
>>>>> build and (conda) package it ourselves.
>>>>>
>>>>> Why should this matter if Parquet links it statically?  If static
>>>>> linking
>>>>> won't allow us to use our own version, why bother linking statically at
>>>>> all?
>>>>>
>>>>> Thanks,
>>>>> Alex
>>>>>
>>>>>
>>>>>
>>>>>> On 02/17/2018 10:52 AM, Uwe L. Korn wrote:
>>>>>> Hi,
>>>>>> The issue here is not that Boost is linked statically in one module and
>>>>>> dynamically in another, but that you link against two different Boost
>>>>>> versions. The stack trace shows frames in Boost 1.55, whereas Arrow
>>>>>> should be linked against 1.65 or 1.66 (the one coming from conda-forge).
>>>>>> Arrow requires at least Boost 1.60 to work. My best guess is that your
>>>>>> internal modules are linked against the system Boost, not the
>>>>>> conda-provided Boost.
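>>>>>>
>>>>>> To confirm, something along these lines might help (just a sketch; the
>>>>>> module path is taken from the stack trace further down):
>>>>>>
>>>>>>     # which Boost shared libraries does the internal module actually resolve?
>>>>>>     ldd /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/asd/infra/util.so | grep boost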
>>>>>> Uwe
>>>>>>>
>>>>>>>
>>>>>>> On 17.02.2018 at 16:47, Alex Samuel <a...@alexsamuel.net> wrote:
>>>>>>>
>>>>>>> Hi there,
>>>>>>>
>>>>>>> Sure, I'll append the top of the stack.  You can see our internal
>>>>>>> function
>>>>>>> "asd::infra::util::start_of_date"; everything else is Boost, Python,
>>>>>>> or
>>>>>>> libstdc++.
>>>>>>>
>>>>>>> My understanding (though I haven't demonstrated this conclusively) is
>>>>>>> that, because Python loads extension modules with RTLD_GLOBAL, an
>>>>>>> extension module can pick up symbols from another module or its
>>>>>>> dependencies, even though relocations are "usually" satisfied from a
>>>>>>> module's own shared lib dependencies.  So, one module linking Boost
>>>>>>> statically may interfere with another that links it dynamically, by
>>>>>>> injecting its symbols.
>>>>>>>
>>>>>>> If necessary I can try to put together a minimal test case, but no
>>>>>>> guarantee it will actually trigger the bug.  But it might be worth
>>>>>>> testing
>>>>>>> my theory above first, with gdb or by some other means.
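>>>>>>>
>>>>>>> One cheap way to test it (a sketch only; the import name is guessed from
>>>>>>> the .so path in the trace below) is to have the glibc loader log which
>>>>>>> object each symbol binding resolves to:
>>>>>>>
>>>>>>>     LD_DEBUG=bindings python -c "import asd.infra.util" 2>&1 | grep cpp_regex_traits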
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Alex
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> #0  0x00007f6ddeba4c8b in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
>>>>>>>     from /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/numexpr/../../../libstdc++.so.6
>>>>>>> #1  0x00007f6dd4f978d6 in boost::re_detail::cpp_regex_traits_char_layer<char>::init() ()
>>>>>>>     from /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/asd/infra/../../../.././libboost_regex.so.1.55.0
>>>>>>> #2  0x00007f6dd4fdbd88 in boost::object_cache<boost::re_detail::cpp_regex_traits_base<char>, boost::re_detail::cpp_regex_traits_implementation<char> >::do_get(boost::re_detail::cpp_regex_traits_base<char> const&, unsigned long) ()
>>>>>>>     from /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/asd/infra/../../../.././libboost_regex.so.1.55.0
>>>>>>> #3  0x00007f6dd4fe5bb5 in boost::basic_regex<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::do_assign(char const*, char const*, unsigned int) ()
>>>>>>>     from /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/asd/infra/../../../.././libboost_regex.so.1.55.0
>>>>>>> #4  0x00007f6dd575a90e in asd::infra::util::start_of_date(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, char const*) ()
>>>>>>>     at /prod/sys/sysasd/opt/tudor-devtools/v1.3/Linux.el6.x86_64-corei7-avx-gcc4.83-anaconda2.0.1/include/boost/regex/v4/basic_regex.hpp:382
>>>>>>> #5  0x00007f6dd2c36734 in boost::python::objects::caller_py_function_impl<boost::python::detail::caller<unsigned long (*)(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, char const*), boost::python::default_call_policies, boost::mpl::vector3<unsigned long, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, char const*> > >::operator()(_object*, _object*) ()
>>>>>>>     from /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/asd/infra/util.so
>>>>>>> #6  0x00007f6dd52ac71a in boost::python::objects::function::call(_object*, _object*) const ()
>>>>>>>     from /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/asd/infra/../../../../libboost_python.so.1.55.0
>>>>>>> #7  0x00007f6dd52aca68 in boost::detail::function::void_function_ref_invoker0<boost::python::objects::(anonymous namespace)::bind_return, void>::invoke(boost::detail::function::function_buffer&) ()
>>>>>>>     from /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/asd/infra/../../../../libboost_python.so.1.55.0
>>>>>>> #8  0x00007f6dd52b4cd3 in boost::python::detail::exception_handler::operator()(boost::function0<void> const&) const ()
>>>>>>>     from /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/asd/infra/../../../../libboost_python.so.1.55.0
>>>>>>> #9  0x00007f6dd2c32c03 in boost::detail::function::function_obj_invoker2<boost::_bi::bind_t<bool, boost::python::detail::translate_exception<asd::infra::Exception, void (*)(asd::infra::Exception const&)>, boost::_bi::list3<boost::arg<1>, boost::arg<2>, boost::_bi::value<void (*)(asd::infra::Exception const&)> > >, bool, boost::python::detail::exception_handler const&, boost::function0<void> const&>::invoke(boost::detail::function::function_buffer&, boost::python::detail::exception_handler const&, boost::function0<void> const&) ()
>>>>>>>     from /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/asd/infra/util.so
>>>>>>> #10 0x00007f6dd52b4a9d in boost::python::handle_exception_impl(boost::function0<void>) ()
>>>>>>>     from /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/asd/infra/../../../../libboost_python.so.1.55.0
>>>>>>> #11 0x00007f6dd52ab2b3 in function_call ()
>>>>>>>     from /space/asd/conda/envs/rd-20180212-0/lib/python2.7/site-packages/asd/infra/../../../../libboost_python.so.1.55.0
>>>>>>> #12 0x00007f6df2bc8e93 in PyObject_Call (func=0x2799850, arg=<value optimized out>, kw=<value optimized out>) at Objects/abstract.c:2547
>>>>>>> #13 0x00007f6df2c7b80d in do_call (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4569
>>>>>>> #14 call_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4374
>>>>>>> #15 PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:2989
>>>>>>>
>>>>>>>
>>>>>>>> On 02/17/2018 10:31 AM, Uwe L. Korn wrote:
>>>>>>>> Hello,
>>>>>>>> I am not sure why we are linking statically in the conda-forge
>>>>>>>> packages; my gut feeling is that we should link dynamically there. Wes,
>>>>>>>> can you remember why?
>>>>>>>> Alex, would it be possible for you to send us the part of the
>>>>>>>> segmentation fault stack trace that is not private to your modules? That
>>>>>>>> would give us a good indication of what is going wrong.
>>>>>>>> Typically it is best to enable coredumps with `ulimit -c
>>>>>>>> unlimited` and then run your program as usual. There should be no
>>>>>>>> performance penalty. When it segfaults, run `gdb python core` (note that
>>>>>>>> the core file might also be suffixed with the PID, depending on your
>>>>>>>> system). In gdb, type 'thread apply all bt full'. Post the output of that
>>>>>>>> command and strip away the parts we should not see. Most relevant will be
>>>>>>>> the stack trace of the thread that segfaulted.
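>>>>>>>>
>>>>>>>> In shell terms, roughly (a sketch; the script name is only a placeholder):
>>>>>>>>
>>>>>>>>     ulimit -c unlimited                               # enable core dumps
>>>>>>>>     python run_repro.py                               # run as usual until it segfaults
>>>>>>>>     gdb -ex 'thread apply all bt full' python core    # or core.<pid>, depending on the system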
>>>>>>>> Uwe
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 16.02.2018 at 23:17, Alex Samuel <a...@alexsamuel.net> wrote:
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> I am having some trouble using the Continuum PyArrow conda package and
>>>>>>>>> its dependencies in conjunction with internal C++ extension modules.
>>>>>>>>>
>>>>>>>>> Apparently, Arrow and Parquet link Boost statically.  We have some
>>>>>>>>> internal packages containing C++ code that link Boost libs dynamically.
>>>>>>>>> If we import Feather as well as our own extension modules into the same
>>>>>>>>> Python process, we get random segfaults in Boost.  I think what's
>>>>>>>>> happening is that our extension modules are picking up Boost's symbols
>>>>>>>>> from Arrow and Parquet, already loaded into the process, rather than
>>>>>>>>> from our own Boost shared libs.
>>>>>>>>>
>>>>>>>>> Could anyone explain the policy for linking Boost in binary
>>>>>>>>> distributions, particularly conda packages?  What is your
>>>>>>>>> expectation for
>>>>>>>>> how other C++ extension modules should be built?
>>>>>>>>>
>>>>>>>>> Thanks in advance,
>>>>>>>>> Alex
>>>>>>>>>
>>>>
>>>
>
