I think it's a great idea to make default / dev compile times faster and have clear guidelines for how to use dependencies.
Some low-hanging fruit could be moving some of the bigger development dependencies, e.g. those needed for criterion, into separate crates to reduce compile times. I created a PR to show this here:
https://github.com/apache/arrow/pull/9493

The same could be done for DataFusion, and for the dependencies needed for the examples as well.

On Sun, 14 Feb 2021 at 14:29, Ruan Pearce-Authers <r...@reservoirdb.com> wrote:

> I'd be interested in helping spec this out, it's especially tricky atm to
> track down issues when integrating DataFusion into the same binary as other
> medium/large dependencies.
>
> Recently hit a really specific issue where DataFusion depends on Parquet,
> which supports various compression algs, including Brotli, and actix-web
> also depends on a slightly different Rust implementation of Brotli. Both of
> these Brotli libs package the same underlying C lib separately, resulting
> in multiply-defined symbols compiling using msvc (and maybe on other
> platforms? didn't test in CI in the end).
>
> Got a quick interim hack [1] in place for my use case which doesn't really
> use Parquet, so it's not pressing, but would be awesome to sort this
> properly upstream.
>
> I guess the only major tradeoff of having a comprehensive feature setup is
> that it could make testing slightly harder, in terms of making sure no-one
> breaks the build for specific feature combinations; this can always be
> mitigated with more CI though (yay, unlimited Actions minutes for public
> repos).
>
> Also, unrelated, is there a schedule for the sync calls? Will try and
> carve out some free time for the next one :)
>
> [1]
> https://github.com/reservoirdb/arrow/commit/e63e157927a552ecf1a6f63ec401f0b6157b5468
>
> -----Original Message-----
> From: Andrew Lamb <al...@influxdata.com>
> Sent: 14 February 2021 11:14
> To: dev <dev@arrow.apache.org>
> Subject: [Rust] [DataFusion] Topic for next Rust Sync Call
>
> I would like to add the following item to the agenda call for the next
> Rust sync call:
>
> Dependencies
>
> Background: As the dependency stack gets larger, it will be harder to use
> DataFusion as an embedded query engine and the compile / dev times will get
> higher.
>
> As we expand the supported functions of DataFusion this problem is likely
> to get worse. For example
> https://github.com/apache/arrow/pull/9243#discussion_r575716759 and
> https://github.com/apache/arrow/pull/9139
>
> Proposal: Add Rust "features" to the datafusion crate and make many of the
> new dependencies optional (so that we had features like regex and unicode
> and hash which would only pull in the dependencies / have those functions
> if the features were enabled.) This approach has worked well for Arrow
> (which has only chrono and num as required dependencies)
>

--
Daniël Heres