Will Jones created ARROW-16844:
----------------------------------
Summary: [C++][Python] Implement to/from substrait for Expression
Key: ARROW-16844
URL: https://issues.apache.org/jira/browse/ARROW-16844
Project: Apache Arrow
Issue Type: Improvement
Components: C++, Python
Reporter: Will Jones

DataFusion has the ability to convert between Substrait expressions and its
own internal expressions. (See:
[https://github.com/datafusion-contrib/datafusion-substrait].) It would be
cool if we had a similar conversion for Acero's Expression class.

This might allow datafusion-python to easily use PyArrow datasets, by
using Substrait as an intermediate format to pass filters and projections down
from DataFusion into the scanner. (See early draft here:
[https://github.com/datafusion-contrib/datafusion-python/pull/21].)

One problem is that it's unclear what type the Python object representing
the Substrait expression should have. IIUC Python doesn't have direct
bindings to the Substrait protobuf.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)