> but will likely also need a method on PyArrow compute expressions to convert
> to a Substrait expression.

There is a C++ method to do this (one of the arrow::engine::ToProto
overloads takes an arrow::compute::Expression and returns a
substrait::Expression), but at the moment the method is internal because
we completely hide the Substrait/protobuf bindings (e.g. you just go
opaquely from bytes to an Arrow execution plan and back). Can you
describe a bit more what you'd want to accomplish with a Substrait
expression in Python?

On Mon, Mar 7, 2022 at 8:16 AM Will Jones <will.jones...@gmail.com> wrote:
>
> Thanks for starting that, Andy!
>
> > I also think it could be helpful with in-memory language interoperability,
> > such as passing query plans between Python and Rust.
>
> Yes! I prototyped a datafusion-python and pyarrow datasets integration[1] a
> few weeks ago that could really benefit from this. I'll have to look into
> it more, but will likely also need a method on PyArrow compute expressions
> to convert to a Substrait expression.
>
> [1] https://github.com/datafusion-contrib/datafusion-python/pull/21
>
> On Mon, Mar 7, 2022 at 8:40 AM Wang Xudong <wxd963996...@gmail.com> wrote:
>
> > Thank you!
> > This is a great idea, I'll try to contribute some code when I have time!
> >
> > ---
> > xudong
> >
> > On Tue, Mar 8, 2022 at 12:36 AM Gavin Ray <ray.gavi...@gmail.com> wrote:
> >
> > > Incredibly exciting! Following along eagerly =)
> > >
> > > On Mon, Mar 7, 2022 at 11:31 AM Andy Grove <andygrov...@gmail.com> wrote:
> > >
> > > > I created a new repo in the datafusion-contrib GitHub org over the
> > > > weekend with a starting point for supporting DataFusion as both a
> > > > producer and consumer of Substrait plans.
> > > >
> > > > https://github.com/datafusion-contrib/datafusion-substrait
> > > >
> > > > I am hopeful that we can eventually use Substrait in Ballista as a
> > > > replacement for the current query plan protobuf format, meaning that
> > > > the Ballista scheduler could potentially be used with engines other
> > > > than DataFusion.
> > > >
> > > > I also think it could be helpful with in-memory language
> > > > interoperability, such as passing query plans between Python and Rust.
> > > >
> > > > I plan on continuing to merge my own PRs here as I flesh out more of
> > > > this, at least until there are other contributors.
> > > >
> > > > Thanks,
> > > >
> > > > Andy.
> > > >
> > >
> >
