tustvold commented on code in PR #8733:
URL: https://github.com/apache/arrow-rs/pull/8733#discussion_r2483674920


##########
parquet/src/column/reader.rs:
##########
@@ -214,6 +219,49 @@ where
             let remaining_records = max_records - total_records_read;
             let remaining_levels = self.num_buffered_values - self.num_decoded_values;
 
+            if self.synthetic_page {

Review Comment:
   This might be a stupid idea, but could we construct a mask per leaf column? 
In the case of no row-group level pruning this would just be the same as 
normal, but otherwise it would be a selection vector that accounts for the 
pruning.
   
   I'm a bit wary of introducing artificial pages purely because the logic 
around record boundaries and page boundaries is rather subtle, and whilst I 
don't think this will break it, it does introduce the possibility.
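
To make the suggestion concrete, here is a minimal sketch of one way to read "a mask per leaf column". It assumes a plain boolean row mask and a list of pruned row ranges per leaf. `leaf_mask` and its parameters are hypothetical names for illustration only; they are not part of arrow-rs or of this PR.

```rust
use std::ops::Range;

/// Hypothetical helper, not an arrow-rs API: derive the mask for one leaf
/// column from the shared row mask (`base`) and the row ranges that were
/// pruned for that leaf (`pruned`).
fn leaf_mask(base: &[bool], pruned: &[Range<usize>]) -> Vec<bool> {
    // Start from the shared mask, so with no pruning the result is unchanged.
    let mut mask = base.to_vec();
    for range in pruned {
        // Clamp the range to the mask length so an over-long range cannot panic.
        let end = range.end.min(mask.len());
        let start = range.start.min(end);
        // Deselect every row that falls inside a pruned range.
        for selected in &mut mask[start..end] {
            *selected = false;
        }
    }
    mask
}
```

With an empty `pruned` slice the result equals `base`, which is the "same as normal" case. Otherwise the pruned rows are deselected in the mask itself, so no artificial page has to be introduced to represent them.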
