Hi, I am working with a storage format that is meant to be compatible with 
PyArrow Datasets. PyArrow datasets accept a filter written as a 
pyarrow.compute expression 
([link](https://arrow.apache.org/docs/dev/python/generated/pyarrow.dataset.Expression.html)).

Does the PyArrow API provide a mechanism to serialize compute expressions to a 
standard format such as Substrait? I want to analyze the filter expression and 
push part of its evaluation down to the storage engine.

Note that casting the filter expression to a string and parsing it is an 
option, but operators such as isin don't produce easy-to-parse strings.

```py
import pyarrow.compute as pc

x = pc.field("colA")
z = (x > 3) & x.isin([10, 11])
str(z)

# '((colA > 3) and is_in(colA, {value_set=int64:[\n  10,\n  11\n],
# skip_nulls=false}))'
```

Thank you,
Ishan
