Hi Andy,

Usually I'd say that development outside of ASF should be faster because
you can publish a new release even after each commit.
In ASF you need to do a VOTE and wait for 3 binding +1s and 72 hours.
A user of datafusion-substrait could use a git dependency to get the latest
version even without a published crate.
But I see that datafusion-substrait currently depends on datafusion 13.0
and I guess this is the main reason for moving it to arrow-datafusion.
Another solution would be for datafusion-substrait to depend on
arrow-datafusion master via a git dependency.

+1 to move it as a subproject to arrow-datafusion now!
This will avoid collecting [more] (I)CLAs later and I see that there are
plans to replace datafusion-proto with it at some point.

Martin

On Wed, Dec 7, 2022 at 2:50 AM Andy Grove <andygrov...@gmail.com> wrote:

> Hi,
>
> The DataFusion community has built an integration between DataFusion and
> Substrait under the datafusion-contrib GitHub organization [1].
>
> The project is now receiving regular contributions from NVIDIA (who are
> using it internally for a research project), and now GreptimeDB have
> expressed an interest in contributing as well [2].
>
> I think that we should consider moving development of this crate into
> DataFusion and wanted to see what others think of this idea.
>
> One reason for moving it into the official project is so that we can access
> this functionality from the DataFusion Python bindings without adding a
> dependency on a project that is outside ASF governance.
>
> I have created a GitHub issue [3] where we can discuss this more, or we can
> discuss it here on the mailing list.
>
> I look forward to hearing some opinions on this.
>
> Thanks,
>
> Andy.
>
> [1] https://github.com/datafusion-contrib/datafusion-substrait
> [2] https://github.com/datafusion-contrib/datafusion-substrait/pull/34
> [3] https://github.com/apache/arrow-datafusion/issues/4536
>
