wjones127 commented on PR #35568:
URL: https://github.com/apache/arrow/pull/35568#issuecomment-1591718615

   > So is the assumption here that the producer and the consumer (in your
   > diagram) are the same library? E.g. both are pyarrow (pyarrow has code for
   > producing datasets and for scanning datasets)? Or is the goal to be able to
   > produce datasets with one library and consume them with a different library?
   
   Different libraries. Producers are libraries like `lance`, `deltalake`, and 
`pyiceberg`. Consumers are libraries like `duckdb`, `polars`, `datafusion` and 
`dask`.
   
   You could say the status quo is that the consumer can be any library, but 
the producer is assumed to be pyarrow. This protocol helps open up the producer 
side to be other libraries.
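
   To illustrate the split, here is a minimal sketch in plain Python. It is not the API proposed in this PR: `Scannable`, `InMemoryProducer`, and `consume` are hypothetical names invented for the example, and rows are plain dicts rather than Arrow data. It only shows the idea that a consumer can depend on a shared interface instead of on a concrete producer type.

```python
from typing import Iterator, List, Protocol


class Scannable(Protocol):
    """Hypothetical producer-side interface (an assumption, not the PR's API)."""

    def scan(self, columns: List[str]) -> Iterator[dict]:
        ...


class InMemoryProducer:
    """Stand-in for a producer library; it does not inherit from Scannable."""

    def __init__(self, rows: List[dict]) -> None:
        self._rows = rows

    def scan(self, columns: List[str]) -> Iterator[dict]:
        # Yield only the requested columns from each row.
        for row in self._rows:
            yield {name: row[name] for name in columns}


def consume(dataset: Scannable, columns: List[str]) -> List[dict]:
    """Stand-in for a consumer; it relies only on the scan() method."""
    return list(dataset.scan(columns))


rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
result = consume(InMemoryProducer(rows), ["id"])
```

   Any object with a matching `scan` method could be passed to `consume`, which is the sense in which the producer side is opened up to other libraries.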


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.