alamb commented on code in PR #9079:
URL: https://github.com/apache/arrow-rs/pull/9079#discussion_r2687515684
##########
arrow-row/src/lib.rs:
##########
@@ -1156,6 +1161,29 @@ impl Rows {
}
}
+ /// Returns the length of the row at index `row` in bytes
+ pub fn row_len(&self, row: usize) -> usize {
+ assert!(row + 1 < self.offsets.len());
+
+ unsafe { self.row_len_unchecked(row) }
+ }
+
+ /// Returns the length of the row at `index` in bytes without bounds
checking
+ ///
+ /// # Safety
+ /// Caller must ensure that `index` is less than the number of offsets
(#rows + 1)
+ pub unsafe fn row_len_unchecked(&self, index: usize) -> usize {
Review Comment:
Why do we need this new pub `unsafe` API?
in terms of `pub` it doesn't seem to be used anywhere other than `row_len`
so we could at least make it non pul
In terms of `unsafe`, Given that `row_len` does an `assert`, what is the
rationale for using `unsafe` rather than just normal `array` access?
As in, why not
```rust
pub fn row_len(&self, row: usize) -> usize {
self.offsets[row+1] - self.offsets[row]
}
```
That would be simpler and not use unsafe
It in theory may have one extra bounds check, but unless the performance
gains are compelling I think we should avoid adding new `unsafe` when possible
##########
arrow-row/src/lib.rs:
##########
@@ -1156,6 +1161,29 @@ impl Rows {
}
}
+ /// Returns the length of the row at index `row` in bytes
Review Comment:
Can you please document that "length" means bytes here (not, for example,
columns)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]