askoa commented on code in PR #3622:
URL: https://github.com/apache/arrow-rs/pull/3622#discussion_r1096541244


##########
arrow-array/src/array/run_array.rs:
##########
@@ -143,6 +143,102 @@ impl<R: RunEndIndexType> RunArray<R> {
             values,
         })
     }
+
+    /// Returns index to the physical array for the given index to the logical 
array.
+    /// Performs a binary search on the run_ends array for the input index.
+    #[inline]
+    pub fn get_physical_index(&self, logical_index: usize) -> Option<usize> {
+        if logical_index >= self.len() {
+            return None;
+        }
+        let mut st: usize = 0;
+        let mut en: usize = self.run_ends().len();
+        while st + 1 < en {
+            let mid: usize = (st + en) / 2;
+            if logical_index
+                < unsafe {
+                    // Safety:
+                    // The value of mid will always be between 1 and len - 1,
+                    // where len is length of run ends array.
+                    // This is based on the fact that `st` starts with 0 and
+                    // `en` starts with len. The condition `st + 1 < en` 
ensures
+                    // `st` and `en` differs atleast by two. So the value of 
`mid`
+                    // will never be either `st` or `en`
+                    self.run_ends().value_unchecked(mid - 1).as_usize()
+                }
+            {
+                en = mid
+            } else {
+                st = mid
+            }
+        }
+        Some(st)
+    }
+
+    /// Returns the physical indices of the input logical indices. Returns 
error if any of the logical
+    /// index cannot be converted to physical index. The logical indices are 
sorted and iterated along
+    /// with run_ends array to find matching physical index. The approach used 
here was chosen over
+    /// finding physical index for each logical index using binary search 
using the function
+    /// `get_physical_index`. Running benchmarks on both approaches showed 
that the approach used here
+    /// scaled well for larger inputs.
+    /// See 
<https://github.com/apache/arrow-rs/pull/3622#issuecomment-1407753727> for more 
details.
+    #[inline]
+    pub fn get_physical_indices<I>(

Review Comment:
   @tustvold thank you!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to