kylebrooks-8451 commented on issue #362: URL: https://github.com/apache/arrow-datafusion-python/issues/362#issuecomment-1547819988
Hi @wjones127 - Thanks for reaching out. I would vote in favor of adding Substrait expressions to the Dataset API. I think it would allow plugging other execution engines into PyArrow and would standardize converting plans between frameworks. I read through the issue you linked, and I think what we really want here is option 1 from that thread: An interface for consuming data from a dataset-like object, without having to be a pyarrow.dataset.Dataset (or Scanner) instance. If Substrait gets us there, then great.
