wjones127 commented on issue #33986:
URL: https://github.com/apache/arrow/issues/33986#issuecomment-1546424922

   I have updated the document and created a rough sketch. I've also notified 
some devs from other projects, such as PyIceberg and dask-deltatable, to get 
more feedback.
   
   Basically, I think the API that we have now for Datasets is actually very 
good. So doing as Chang originally suggested and just making a 
`typing.Protocol` out of it seems like it would be sufficient. **I think that's 
what we want, but I'm honestly not 100% sure the best way to expose / publish 
this, so I would welcome feedback on that.**
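
   To make the idea concrete, here is a minimal sketch of what a structural protocol for datasets could look like. It is not the proposed API: `Scannable`, `InMemoryDataset`, and `count_rows`, along with the simplified `schema` / `scanner` signatures, are hypothetical names chosen for illustration. The point is only that a third-party class can satisfy the protocol without inheriting from anything.

```python
from typing import Any, Iterator, List, Optional, Protocol, runtime_checkable


@runtime_checkable
class Scannable(Protocol):
    """Hypothetical minimal dataset protocol (illustrative names only)."""

    @property
    def schema(self) -> Any: ...

    def scanner(self, columns: Optional[List[str]] = None) -> Iterator[dict]: ...


class InMemoryDataset:
    """Toy implementation; note that it does not inherit from Scannable."""

    def __init__(self, rows: List[dict]) -> None:
        self._rows = rows

    @property
    def schema(self) -> List[str]:
        # Assumption for this sketch: the schema is just the sorted column names.
        return sorted(self._rows[0]) if self._rows else []

    def scanner(self, columns: Optional[List[str]] = None) -> Iterator[dict]:
        # Yield each row, projected onto the requested columns (all if None).
        for row in self._rows:
            yield {key: row[key] for key in (columns or row)}


def count_rows(dataset: Scannable) -> int:
    """A consumer that depends only on the protocol, not on a concrete class."""
    return sum(1 for _ in dataset.scanner())
```

   A consumer typed against `Scannable` accepts any object with matching members, which is what would let projects such as PyIceberg or dask-deltatable plug in their own implementations. `runtime_checkable` only verifies that the members exist, not their signatures, so static type checking remains the main enforcement mechanism.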
   
   There are some possible extensions of it that could be made in the future, 
but I don't think they should block us from defining a protocol now.
   
   IMO, this is a good opportunity to define something that will work well 
enough for now. I don't think it will be something that will last the next 5-10 
years. But what we learn from pushing this API to its limits may inform us on 
the design of something that's more robust and includes input from a much wider 
part of the PyData ecosystem.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
