Hi Micah,

On Thu, Sep 19, 2019 at 12:41 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
>
> >
> > * Should optional components be "opt in", "opt out", or a mix?
> > Currently it's a mix, and that's confusing for people. I think we
> > should make them all "opt in".
>
> Agreed, they should all be opt in by default.  I think active developers are
> quite adept at flipping the appropriate CMake flags.
>

Cool. I opened a tracking JIRA,
https://issues.apache.org/jira/browse/ARROW-6637, and attached many
issues to it. Sorry for the flood of new JIRAs.

>
> > * Do we want to bring the out-of-the-box core build down to zero
> > dependencies, including not depending on boost::filesystem and
> > possibly checking in the compiled Flatbuffers files? While it may be
> > slightly more maintenance work, I think the optics of a
> > "dependency-free" core build would be beneficial and help the project
> > marketing-wise.
>
> I'm -0.5 on checking in generated artifacts, but this is mostly stylistic.
> In the case of Flatbuffers it seems like we might be able to get away with
> vendoring, since it should mostly be header-only.
>
> I would prefer to try to come up with more granular components and be
> very conservative on what is "core".  I think it should be possible to have
> a zero-dependency build if only MemoryPool, Buffers, Arrays and
> ArrayBuilders are in a core package [1].  This, combined with the discussion
> Antoine started on an ABI-compatible C layer, would make basic inter-op
> within a process reasonable.  Moving up the stack to IPC and files, there is
> probably a way to package headers separately from implementations.  This
> would allow other projects wishing to integrate with Arrow to bring their
> own implementations without the baggage of boost::filesystem. Would this
> leave anything besides "flatbuffers" as a hard dependency to support IPC?
>

We could indeed split up libarrow into more shared libraries. This
would mean accepting a lot more maintenance effort though, on a team
that is already overburdened. I'm not too keen on that in the short
term.

> Thanks,
> Micah
>
>
> [1] It probably makes sense to go even further and separate out MemoryPool
> and Buffer, so we can break the circular relationship between parquet and
> arrow.

I don't think this is possible even then, particularly in light of my
recent work reading and writing Arrow columnar data "closer to the
metal" inside Parquet, which yielded performance improvements.

>
> On Wed, Sep 18, 2019 at 8:03 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
> > To be clear, I think we should make these changes right after 0.15.0 is
> > released so we aren't playing whack-a-mole with our packaging scripts.
> > I'm happy to take the lead on the work...
> >
> > On Wed, Sep 18, 2019 at 9:54 AM Antoine Pitrou <solip...@pitrou.net>
> > wrote:
> > >
> > > On Wed, 18 Sep 2019 09:46:54 -0500
> > > Wes McKinney <wesmck...@gmail.com> wrote:
> > > > I think these are both interesting areas to explore further. I'd like
> > > > to focus on a couple of immediate items I think we should address:
> > > >
> > > > * Should optional components be "opt in", "opt out", or a mix?
> > > > Currently it's a mix, and that's confusing for people. I think we
> > > > should make them all "opt in".
> > > > * Do we want to bring the out-of-the-box core build down to zero
> > > > dependencies, including not depending on boost::filesystem and
> > > > possibly checking in the compiled Flatbuffers files? While it may be
> > > > slightly more maintenance work, I think the optics of a
> > > > "dependency-free" core build would be beneficial and help the project
> > > > marketing-wise.
> > > >
> > > > Both of these issues must be addressed whether we undertake a Bazel
> > > > implementation or some other refactor of the C++ build system.
> > >
> > > I think checking in the Flatbuffers files (and also Protobuf and Thrift
> > > where applicable :-)) would be fine.
> > >
> > > As for boost::filesystem, getting rid of it wouldn't be a huge task.
> > > Still worth deciding whether we want to prioritize development time for
> > > it, because it's not entirely trivial either.
> > >
> > > Regards
> > >
> > > Antoine.
> > >
> > >
> >
