alamb commented on code in PR #7757:
URL: https://github.com/apache/arrow-rs/pull/7757#discussion_r2163472338
##########
parquet-variant/src/variant/object.rs:
##########
@@ -134,7 +134,12 @@ impl<'m, 'v> VariantObject<'m, 'v> {
}
/// Get a field's value by index in `0..self.len()`
- pub fn field(&self, i: usize) -> Result<Variant<'m, 'v>, ArrowError> {
+ pub fn field(&self, i: usize) -> Option<Variant<'m, 'v>> {
+ self.field_err(i).ok()
Review Comment:
I worry that the `ok()` discards errors and will make it hard to debug
Also creating an `ArrowError` requires allocating a `String` so it is non
trivial in terms of cost to create an Err and then discard it via `ok()`
What would you think about making errors (which can happen with invalid
variant values) `panic`s ?
Something like
```suggestion
self.field_err(i).expect("Invalid variant value")
```
We may want to special case the check for out of bounds access to return
None explicitly for performance reasons.
I think the validation strategy @scovich and I have been discussing is that
the Variant would be validated during construction and then the `get` and other
APIs would then assume that validation has already been done.
Eventually we envision there may be a way to disable the validation on
construction for usecases where we know the variant is valid.
##########
parquet-variant/src/variant/object.rs:
##########
@@ -164,22 +174,22 @@ impl<'m, 'v> VariantObject<'m, 'v> {
fn iter_checked(
&self,
) -> impl Iterator<Item = Result<(&'m str, Variant<'m, 'v>), ArrowError>>
+ '_ {
- (0..self.num_elements).map(move |i| Ok((self.field_name(i)?,
self.field(i)?)))
+ (0..self.num_elements).map(move |i| Ok((self.field_name_err(i)?,
self.field_err(i)?)))
}
/// Returns the value of the field with the specified name, if any.
///
/// `Ok(None)` means the field does not exist; `Err` means the search
encountered an error.
Review Comment:
likewise here it would be good to describe the error behavior in comments
##########
parquet-variant/src/variant/list.rs:
##########
@@ -123,7 +123,13 @@ impl<'m, 'v> VariantList<'m, 'v> {
self.len() == 0
}
- pub fn get(&self, index: usize) -> Result<Variant<'m, 'v>, ArrowError> {
+ /// Returns element by index in `0..self.len()`, if any
+ pub fn get(&self, index: usize) -> Option<Variant<'m, 'v>> {
+ self.get_err(index).ok()
Review Comment:
Same comment as below about documenting error behavior and not ignoring that
error
However, for the case where `index` is out of bounds specifically, it might
make sense to return None (but not ignore other errors)
##########
parquet-variant/src/builder.rs:
##########
@@ -112,11 +112,11 @@ fn make_room_for_header(buffer: &mut Vec<u8>, start_pos:
usize, header_size: usi
/// panic!("unexpected variant type")
/// };
/// assert_eq!(
-/// variant_object.field_by_name("first_name").unwrap(),
+/// variant_object.get("first_name"),
Review Comment:
😍
##########
parquet-variant/src/variant/object.rs:
##########
@@ -245,22 +255,35 @@ mod tests {
assert!(!variant_obj.is_empty());
// Test field access
- let active_field = variant_obj.field_by_name("active").unwrap();
+ let active_field = variant_obj.get("active");
assert!(active_field.is_some());
assert_eq!(active_field.unwrap().as_boolean(), Some(true));
- let age_field = variant_obj.field_by_name("age").unwrap();
+ let age_field = variant_obj.get("age");
assert!(age_field.is_some());
assert_eq!(age_field.unwrap().as_int8(), Some(42));
- let name_field = variant_obj.field_by_name("name").unwrap();
+ let name_field = variant_obj.get("name");
assert!(name_field.is_some());
assert_eq!(name_field.unwrap().as_string(), Some("hello"));
// Test non-existent field
- let missing_field = variant_obj.field_by_name("missing").unwrap();
+ let missing_field = variant_obj.get("missing");
assert!(missing_field.is_none());
+ // Fixme: This assertion will panic! That is not good
+ // let missing_field_name = variant_obj.field_name(3);
+ // assert!(missing_field_name.is_none());
Review Comment:
I recommend we file a ticket
##########
parquet-variant/src/variant/object.rs:
##########
@@ -134,7 +134,12 @@ impl<'m, 'v> VariantObject<'m, 'v> {
}
/// Get a field's value by index in `0..self.len()`
- pub fn field(&self, i: usize) -> Result<Variant<'m, 'v>, ArrowError> {
+ pub fn field(&self, i: usize) -> Option<Variant<'m, 'v>> {
+ self.field_err(i).ok()
+ }
+
+ /// Fallible version of `field`. Returns field value by index, capturing
validation errors
+ fn field_err(&self, i: usize) -> Result<Variant<'m, 'v>, ArrowError> {
Review Comment:
What would you think about naming this `try_field` instead of `field_err`
(and similarly for the other error variants?)
I think the `try_` prefix is somewhat more Rust "standard" though I do agree
it is somewhat of an opinion
##########
parquet-variant/src/variant/object.rs:
##########
@@ -134,7 +134,12 @@ impl<'m, 'v> VariantObject<'m, 'v> {
}
/// Get a field's value by index in `0..self.len()`
- pub fn field(&self, i: usize) -> Result<Variant<'m, 'v>, ArrowError> {
+ pub fn field(&self, i: usize) -> Option<Variant<'m, 'v>> {
Review Comment:
I think it would be good here to describe the error behavior in comments
Like
```rust
/// # Panics
// If the variant is invalid
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]