I just opened https://issues.apache.org/jira/browse/ARROW-7089 about
increasing transparency around which options cause third-party
dependencies to be required.

On Thu, Nov 7, 2019 at 10:05 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
> hi Richard,
>
> On Thu, Nov 7, 2019 at 9:59 AM Richard Bachmann
> <richard.bachm...@cern.ch> wrote:
> >
> > Hello,
> > I'm contacting you on behalf of the LCG Releases team at CERN. We
> > provide a common software stack for LHCb, ATLAS and others to be used at
> > CERN and the worldwide computing grid.
> >
> > Right now we're looking into optimizing the way we're building Apache
> > Arrow (C++ & Python) and its dependencies. Ideally we'd like to build
> > Arrow using only the minimum of necessary dependencies to run it, and to
> > use packages already installed in the stack to fulfill these
> > dependencies. The former would be nice to keep the stack clean, the
> > latter would help us avoid duplication and failing builds due to mirrors
> > going offline.
> >
> > Our builds currently run with the ARROW_DEPENDENCY_SOURCE=AUTO
> > <https://github.com/apache/arrow/blob/master/docs/source/developers/cpp.rst>
> > setting, which results in duplicate and non-essential packages being
> > downloaded by Arrow, as well as a dependency on external mirrors. Setting
> > it to SYSTEM allows us to avoid the downloads, but then the build
> > process fails because dependencies we do not use are missing.
>
> I'm surprised to hear this based on what I know about the build system
> and from extensive local development.
>
> Can you show the exact CMake invocation you are using and indicate
> which unused dependencies are being downloaded?
>
> This Docker minimal build shows (unless something has been recently
> broken) that the project can be built with only a small number of
> third-party dependencies:
>
> https://github.com/apache/arrow/tree/master/cpp/examples/minimal_build
>
> Note that we support a fully "offline" build to allow third-party
> dependencies to be built in an air-gapped environment:
>
> https://github.com/apache/arrow/blob/master/docs/source/developers/cpp.rst#offline-builds
>
> > Do you know if there is a recommended way to achieve this? The problem
> > seems to stem from the fact that all listed dependencies are downloaded,
> > whether they are needed or not. We have considered patching out the
> > non-essential dependencies ('double-conversion', 'GTEST', etc.) from the
> > dependency list, as well as formally adding the unneeded dependencies to
> > the stack in order to run with the SYSTEM setting. However, if there is
> > a proper way to do it we would of course prefer to follow that course of
> > action.
>
> We'll be able to know more based on how you're calling CMake and with
> what options, but the build system should not be downloading any
> dependencies that are not needed.
>
> >
> > Any help would be much appreciated.
> > Kind regards,
> >
> >      - Richard Bachmann
> >
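
For illustration only, here is a minimal Python sketch of how a CMake
invocation for such a build might be assembled so that every requested
feature and dependency source is spelled out explicitly. The option
ARROW_DEPENDENCY_SOURCE comes from the thread above; ARROW_BUILD_TESTS,
ARROW_PYTHON and the per-package <NAME>_SOURCE override are taken from
the Arrow C++ developer documentation as I understand it, so please check
the linked cpp.rst for the exact spelling in your Arrow version. The
helper name arrow_cmake_args and the package name "SomePackage" are
hypothetical placeholders, not part of Arrow.

```python
import shlex


def arrow_cmake_args(source_dir, dependency_source="SYSTEM",
                     features=None, package_overrides=None):
    """Assemble a CMake argument list for an Arrow C++ build (sketch)."""
    args = ["cmake", source_dir,
            f"-DARROW_DEPENDENCY_SOURCE={dependency_source}"]
    # Toggle optional components explicitly, so it is visible which
    # features (and therefore which dependencies) were requested.
    for name, enabled in sorted((features or {}).items()):
        args.append(f"-D{name}={'ON' if enabled else 'OFF'}")
    # Per-package override: take one package from a different source
    # than the global ARROW_DEPENDENCY_SOURCE setting.
    for package, source in sorted((package_overrides or {}).items()):
        args.append(f"-D{package}_SOURCE={source}")
    return args


if __name__ == "__main__":
    args = arrow_cmake_args(
        "arrow/cpp",
        features={"ARROW_BUILD_TESTS": False, "ARROW_PYTHON": True},
        # "SomePackage" is a placeholder; real package names depend on
        # the Arrow version in use.
        package_overrides={"SomePackage": "BUNDLED"},
    )
    # Print a shell-safe command line for inspection; nothing is executed.
    print(" ".join(shlex.quote(a) for a in args))
```

Posting the printed command line together with the CMake output would make
it easier to see which option pulls in which dependency.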
