viirya commented on code in PR #3848:
URL: https://github.com/apache/arrow-rs/pull/3848#discussion_r1134640841


##########
parquet/src/arrow/arrow_reader/selection.rs:
##########
@@ -372,29 +371,63 @@ impl RowSelection {
         self
     }
 
+    /// Applies an offset to this [`RowSelection`], skipping the first 
`offset` selected rows
+    pub(crate) fn offset(mut self, offset: usize) -> Self {
+        if offset == 0 {
+            return self;
+        }
+
+        let mut selected_count = 0;
+        let mut skipped_count = 0;
+
+        // Find the index where the selector exceeds the row count
+        let find = self
+            .selectors
+            .iter()
+            .position(|selector| match selector.skip {
+                true => {
+                    skipped_count += selector.row_count;
+                    false
+                }
+                false => {
+                    selected_count += selector.row_count;
+                    selected_count > offset
+                }
+            });
+
+        let split_idx = match find {
+            Some(idx) => idx,
+            None => {
+                self.selectors.clear();
+                return self;
+            }
+        };
+
+        let mut selectors = Vec::with_capacity(self.selectors.len() - 
split_idx + 1);
+        selectors.push(RowSelector::skip(skipped_count + offset));
+        selectors.push(RowSelector::select(selected_count - offset));

Review Comment:
   Hmm, I meant the way you calculate skpped_count and selected_count. If 
original selectors are mixed with `skip` and `select`, is the logic still 
correct?
   
   E.g., 1st selector `skip` 5 row, 2nd selector `select` 1 row, 3rd selector 
`skip` 2 rows, 4 selector `select` 5 rows.
   
   With `offset` 2, this produces 1st selector `skip` 7 + 2 rows, 2nd selector 
`select` 6 - 2 rows. But original 2nd selector's 1 row is skipped now.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to