Re: [PR] [python] Support random sample for append table [paimon]

via GitHub Fri, 16 Jan 2026 02:58:06 -0800


discivigour commented on code in PR #7014:
URL: https://github.com/apache/paimon/pull/7014#discussion_r2698039587



##########
paimon-python/pypaimon/read/split_read.py:
##########
@@ -496,18 +520,6 @@ def _create_union_reader(self, need_merge_files: List[DataFileMeta]) -> RecordRe
         # Split field bunches
         fields_files = self._split_field_bunches(need_merge_files)
 
-        # Validate row counts and first row IDs
-        row_count = fields_files[0].row_count()
-        first_row_id = fields_files[0].files()[0].first_row_id
-
-        for bunch in fields_files:
-            if bunch.row_count() != row_count:
-                raise ValueError("All files in a field merge split should have the same row count.")

Review Comment:
   When sampling, only some of the blob files belonging to a data file may be 
filtered out, so the row counts of the field bunches can differ and this check 
no longer holds.
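
   A minimal sketch of the situation described above, not the pypaimon code. `FileStub`, `bunch_row_count`, the file names and the sampled row range are all hypothetical and only stand in for `DataFileMeta` and the field bunches:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileStub:
    """Hypothetical stand-in for DataFileMeta: only the fields used here."""
    name: str
    first_row_id: int
    row_count: int


def bunch_row_count(files: List[FileStub]) -> int:
    """Total rows covered by one field bunch."""
    return sum(f.row_count for f in files)


# One data file with 100 rows; its blob column is split across four blob files.
data_bunch = [FileStub("data-0.parquet", 0, 100)]
blob_bunch = [FileStub(f"blob-{i}.blob", i * 25, 25) for i in range(4)]

# Without sampling, both bunches cover the same 100 rows.
assert bunch_row_count(data_bunch) == bunch_row_count(blob_bunch)

# Assumed sampling outcome: only rows in [25, 75) survive, so blob files
# outside that range are dropped while the data file is kept whole.
sampled_blob_bunch = [f for f in blob_bunch if 25 <= f.first_row_id < 75]

counts = [bunch_row_count(data_bunch), bunch_row_count(sampled_blob_bunch)]
print(counts)  # [100, 50]
```

   With the removed equality check in place, the sampled case would raise `ValueError` even though the split is valid.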



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
