Jefffrey commented on code in PR #9027:
URL: https://github.com/apache/arrow-rs/pull/9027#discussion_r2639032334
##########
arrow-buffer/src/buffer/run.rs:
##########
@@ -268,6 +268,88 @@ where
pub fn into_inner(self) -> ScalarBuffer<E> {
self.run_ends
}
+
+ /// Returns the physical indices corresponding to the provided logical indices.
+ ///
+ /// Given a slice of logical indices, this method returns a `Vec` containing the
+ /// corresponding physical indices into the run-ends buffer.
+ ///
+ /// This method operates by iterating the logical indices in sorted order, instead of
+ /// finding the physical index for each logical index using binary search via
+ /// the function [`RunEndBuffer::get_physical_index`].
+ ///
+ /// Running benchmarks on both approaches showed that the approach used here
+ /// scaled well for larger inputs.
+ ///
+ /// See <https://github.com/apache/arrow-rs/pull/3622#issuecomment-1407753727> for more details.
+ ///
+ /// # Errors
+ ///
+ /// Returns an error if any logical index is out of bounds (>= self.len()).
+ #[inline]
+ pub fn get_physical_indices<I>(&self, logical_indices: &[I]) -> Result<Vec<usize>, String>
Review Comment:
Maybe the return type could be `Result<Vec<usize>, I>`, returning the
offending logical index that caused the error instead of a string message?
That would leave it up to consumers to format their own message 🤔