Fokko commented on PR #33974: URL: https://github.com/apache/arrow/pull/33974#issuecomment-1416108262
@jorisvandenbossche Updating table one is a good suggestion indeed. Updated that one as well 👍🏻

> (I think it's also read_table you are using in PyIceberg, and not ParquetDataset?)

Currently, [we use the `read_table` indeed](https://github.com/apache/iceberg/blob/master/python/pyiceberg/io/pyarrow.py#L510-L517). I've also played around with the ParquetDataset, and it looked very similar. However, we don't need the lazy nature of the dataset, so directly loading a table makes more sense in our situation.

> Yeah, I think we should prefer some duplication if that gives better docstrings. I agree the indirection for the user right now isn't very user friendly.
> We might be able to share some part of the docstring and inject that in multiple places to avoid duplicating the actual content, if that doesn't make things too complicated.

Makes a lot of sense, I think sharing would be best. Let me create a separate PR for that.
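
For reference, a rough sketch of what "share some part of the docstring and inject that in multiple places" could look like. Everything here is an assumption for illustration: `_FILTERS_DOC`, `inject_doc`, and the two `*_sketch` functions are hypothetical names, the fragment wording is made up, and this is not how PyArrow currently builds its docstrings.

```python
# Hypothetical shared fragment (illustrative wording, not PyArrow's real docstring).
# The continuation line is indented 8 spaces so it lines up once the fragment
# is substituted into a docstring whose body is indented 4 spaces.
_FILTERS_DOC = (
    "filters : list of tuples, optional\n"
    "        Rows that do not match the filter predicate are dropped."
)


def inject_doc(**fragments):
    """Return a decorator that fills ``{name}`` placeholders in a docstring."""
    def decorator(func):
        # Docstrings are None when Python runs with -OO, so guard for that.
        if func.__doc__ is not None:
            func.__doc__ = func.__doc__.format(**fragments)
        return func
    return decorator


@inject_doc(filters_doc=_FILTERS_DOC)
def read_table_sketch(source, filters=None):
    """Stand-in for an eager reader function (hypothetical).

    Parameters
    ----------
    source : str
        Path to the file.
    {filters_doc}
    """


@inject_doc(filters_doc=_FILTERS_DOC)
def dataset_sketch(path, filters=None):
    """Stand-in for a lazy dataset constructor (hypothetical).

    Parameters
    ----------
    path : str
        Path to the dataset directory.
    {filters_doc}
    """
```

With this approach the `filters` description lives in one place while `help()` on either function shows the full text, so the user does not have to follow an indirection. One caveat of using `str.format` is that any literal braces in a docstring would need to be doubled.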