jorisvandenbossche commented on a change in pull request #7692:
URL: https://github.com/apache/arrow/pull/7692#discussion_r452408096
##########
File path: python/pyarrow/_dataset.pyx
##########
@@ -909,13 +909,24 @@ cdef class ParquetFileFragment(FileFragment):
     def __reduce__(self):
         buffer = self.buffer
+        if self.row_groups is not None:
Review comment:
Yeah, I actually realized we were not yet pickling the row group ids
when discussing this in the meeting we had, and I was planning to open a JIRA /
do a quick PR, but you already fixed it ;)
(It didn't fail ~~because we simply didn't include any row group information
in the serialization~~ because we only tested cases where `row_groups` was None.)
Preserving only the row group ids (as you do here) should be sufficient for
now.
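
To illustrate the gap in the tests, here is a rough sketch in plain Python. `ToyFragment`, its `path` / `row_group_ids` attributes and the `roundtrip` helper are all made up for this comment and are not the pyarrow classes; the helper just mimics what pickle does with the result of `__reduce__` (call the returned callable with the returned args).

```python
class ToyFragment:
    """Hypothetical stand-in for ParquetFileFragment (not the pyarrow class)."""

    def __init__(self, path, row_group_ids=None):
        self.path = path
        # None means "the whole file"; otherwise keep only the ids.
        self.row_group_ids = (
            None if row_group_ids is None else list(row_group_ids)
        )

    def __reduce__(self):
        # Return (callable, args); pickle calls callable(*args) on load.
        # Including row_group_ids here is what preserves the selection.
        return ToyFragment, (self.path, self.row_group_ids)


def roundtrip(obj):
    # Emulate the reconstruction step pickle performs on load.
    func, args = obj.__reduce__()
    return func(*args)


# The case the existing tests covered: no row group selection.
whole_file = roundtrip(ToyFragment("data.parquet"))

# The case that was missing: a fragment restricted to some row groups.
subset = roundtrip(ToyFragment("data.parquet", row_group_ids=[0, 2]))
```

If `__reduce__` dropped `row_group_ids`, the first round trip would still look fine and only the second would come back wrong, which is why a test with a non-None selection is the one that matters.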