jorisvandenbossche commented on pull request #10450: URL: https://github.com/apache/arrow/pull/10450#issuecomment-1074166073
It seems that the keyword is indeed not passed through correctly from `pq.read_table` to `pq.ParquetDataset`: it is only forwarded when falling back to `ParquetFile`, which means the keyword doesn't work in `pq.read_table` in most cases. In addition, the `pq.ParquetDataset` constructor does not accept the keyword (only the subclass does). That's something we should ideally fix before the 8.0 release.

Note that even when we add it to `pq.ParquetDataset`, it will only be for the new `pyarrow.dataset`-based implementation, and petastorm is still using the legacy implementation (which will also start raising warnings in pyarrow 8.0). See the issue I opened about that a while ago: https://github.com/uber/petastorm/issues/613
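
For illustration, the pass-through problem described above can be sketched in plain Python. Everything here is a hypothetical stand-in (the function names, the `unit` keyword, and the `dataset_available` flag are assumptions made for the example, not pyarrow APIs); it only shows the general pattern of a wrapper that forwards a keyword on the fallback path but drops it on the main path.

```python
# Hypothetical stand-ins for the two read paths; not pyarrow functions.
def read_via_dataset(path, unit=None):
    """Main path: returns what it received so the forwarding can be inspected."""
    return {"path": path, "unit": unit, "via": "dataset"}


def read_via_file(path, unit=None):
    """Fallback path: same shape of result as the main path."""
    return {"path": path, "unit": unit, "via": "file"}


def read_table_buggy(path, unit=None, dataset_available=True):
    """Wrapper that forwards `unit` only when falling back."""
    if dataset_available:
        # Bug pattern: the keyword is silently dropped on the main path.
        return read_via_dataset(path)
    return read_via_file(path, unit=unit)


def read_table_fixed(path, unit=None, dataset_available=True):
    """Wrapper that forwards `unit` on both paths."""
    if dataset_available:
        return read_via_dataset(path, unit=unit)
    return read_via_file(path, unit=unit)


if __name__ == "__main__":
    print(read_table_buggy("data.parquet", unit="ms"))
    print(read_table_fixed("data.parquet", unit="ms"))
```

With the buggy wrapper the caller's keyword has an effect only when the fallback path is taken, which is why such a bug can go unnoticed in tests that happen to exercise the fallback.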
