Re: [PR] Move RunArray::get_physical_indices to RunEndBuffer [arrow-rs]

via GitHub Mon, 22 Dec 2025 00:18:12 -0800


Jefffrey commented on code in PR #9027:
URL: https://github.com/apache/arrow-rs/pull/9027#discussion_r2639032334



##########
arrow-buffer/src/buffer/run.rs:
##########
@@ -268,6 +268,88 @@ where
     pub fn into_inner(self) -> ScalarBuffer<E> {
         self.run_ends
     }
+
+    /// Returns the physical indices corresponding to the provided logical indices.
+    ///
+    /// Given a slice of logical indices, this method returns a `Vec` containing the
+    /// corresponding physical indices into the run-ends buffer.
+    ///
+    /// This method operates by iterating the logical indices in sorted order, instead of
+    /// finding the physical index for each logical index using binary search via
+    /// the function [`RunEndBuffer::get_physical_index`].
+    ///
+    /// Running benchmarks on both approaches showed that the approach used here
+    /// scaled well for larger inputs.
+    ///
+    /// See <https://github.com/apache/arrow-rs/pull/3622#issuecomment-1407753727> for more details.
+    ///
+    /// # Errors
+    ///
+    /// Returns an error if any logical index is out of bounds (>= self.len()).
+    #[inline]
+    pub fn get_physical_indices<I>(&self, logical_indices: &[I]) -> Result<Vec<usize>, String>

Review Comment:
   Maybe the return type could be `Result<Vec<usize>, I>`, returning the 
offending logical index that caused the error instead of a string message? 
That would leave it up to consumers to format their own message 🤔 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
