[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #7692: ARROW-9321: [C++][Dataset] Populate statistics opportunistically

GitBox Fri, 10 Jul 2020 00:15:32 -0700


jorisvandenbossche commented on a change in pull request #7692:
URL: https://github.com/apache/arrow/pull/7692#discussion_r452408096




##########
File path: python/pyarrow/_dataset.pyx
##########
@@ -909,13 +909,24 @@ cdef class ParquetFileFragment(FileFragment):
 
     def __reduce__(self):
         buffer = self.buffer
+        if self.row_groups is not None:

Review comment:
       Yeah, I actually realized that we were not yet pickling the row group 
IDs when we discussed this in the meeting, and I was planning to open a JIRA / 
do a quick PR, but you already fixed it ;)
   
   (The tests didn't fail ~~because we simply didn't include any row group 
information in the serialization~~ because we only tested cases where 
`row_groups` was None.)
   
   Preserving only the row group IDs (as you do here) should be sufficient for 
now.
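
   To illustrate the point about test coverage, here is a minimal sketch of 
the pattern. It is not the actual pyarrow code: `Fragment` and 
`pickle_roundtrip` are hypothetical stand-ins, and the sketch assumes the row 
group IDs are plain integers. It covers both the `row_groups is None` case and 
the case with explicit IDs, which is the one the existing tests missed.

```python
import pickle


class Fragment:
    """Hypothetical stand-in for a file fragment; not the pyarrow class."""

    def __init__(self, path, row_groups=None):
        self.path = path
        # None means "all row groups"; otherwise a list of integer IDs.
        self.row_groups = None if row_groups is None else list(row_groups)

    def __reduce__(self):
        # Serialize only the row group IDs (plain ints), not any
        # per-row-group metadata or statistics. None passes through as-is.
        row_groups = self.row_groups
        if row_groups is not None:
            row_groups = [int(rg) for rg in row_groups]
        return Fragment, (self.path, row_groups)


def pickle_roundtrip(obj):
    """Serialize obj with pickle and load it back."""
    return pickle.loads(pickle.dumps(obj))
```

   A test along these lines would round-trip one fragment created with 
`row_groups=None` and one created with, say, `row_groups=[0, 2]`, and check 
that the attribute survives in both cases.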




