alamb commented on code in PR #8838:
URL: https://github.com/apache/arrow-rs/pull/8838#discussion_r2543160023


##########
arrow-ord/src/ord.rs:
##########
@@ -296,6 +296,70 @@ fn compare_struct(
     Ok(f)
 }
 
+fn compare_union(
+    left: &dyn Array,
+    right: &dyn Array,
+    opts: SortOptions,
+) -> Result<DynComparator, ArrowError> {
+    let left = left.as_union();
+    let right = right.as_union();
+
+    let (left_fields, left_mode) = match left.data_type() {

Review Comment:
   It is awkward to have to re-check the `DataType`s here, since `as_union()` 
has already established that both arrays are unions.
   
   What would you think about adding `UnionArray::fields()` and 
`UnionArray::mode()` methods to make the code easier to work with?
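
   To illustrate the idea, here is a minimal sketch of what such accessors 
could look like. The `UnionArray`, `DataType`, and `UnionMode` types below are 
simplified stand-ins defined only for this example (they are not the real 
arrow-rs definitions), and the exact signatures of `fields()` and `mode()` are 
assumptions. The point is that the `match` on the data type happens once, 
inside the array type, instead of at every call site.

```rust
// Simplified stand-ins for illustration only; NOT the arrow-rs types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UnionMode {
    Sparse,
    Dense,
}

#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    // (type id, field name) pairs plus the union mode.
    Union(Vec<(i8, String)>, UnionMode),
}

struct UnionArray {
    data_type: DataType,
}

impl UnionArray {
    /// Hypothetical accessor: the union's fields.
    fn fields(&self) -> &[(i8, String)] {
        match &self.data_type {
            DataType::Union(fields, _) => fields.as_slice(),
            // A UnionArray is assumed to always carry a Union data type.
            _ => unreachable!("UnionArray must have a Union data type"),
        }
    }

    /// Hypothetical accessor: the union's mode.
    fn mode(&self) -> UnionMode {
        match &self.data_type {
            DataType::Union(_, mode) => *mode,
            _ => unreachable!("UnionArray must have a Union data type"),
        }
    }
}
```

   With accessors like these, the check in `compare_union` could compare 
`left.fields()` with `right.fields()` and `left.mode()` with `right.mode()` 
directly.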



##########
arrow-ord/src/ord.rs:
##########
@@ -296,6 +296,70 @@ fn compare_struct(
     Ok(f)
 }
 
+fn compare_union(
+    left: &dyn Array,
+    right: &dyn Array,
+    opts: SortOptions,
+) -> Result<DynComparator, ArrowError> {
+    let left = left.as_union();
+    let right = right.as_union();
+
+    let (left_fields, left_mode) = match left.data_type() {
+        DataType::Union(fields, mode) => (fields, mode),
+        _ => unreachable!(),
+    };
+    let (right_fields, right_mode) = match right.data_type() {
+        DataType::Union(fields, mode) => (fields, mode),
+        _ => unreachable!(),
+    };
+
+    if left_fields != right_fields || left_mode != right_mode {
+        return Err(ArrowError::InvalidArgumentError(
+            "Cannot compare UnionArrays with different fields or modes".to_string(),
+        ));
+    }
+
+    let c_opts = child_opts(opts);
+
+    let mut field_comparators = HashMap::with_capacity(left_fields.len());

Review Comment:
   Rather than a hash map, you could potentially use a 128-entry `Vec<>` 
indexed by the type ids. Since the type id is an `i8`, you know there can be 
at most 128 values, and a direct index might be faster than hashing and a 
hash table lookup.
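
   A rough sketch of the lookup-table idea follows. The `Comparator` alias, 
`build_lookup`, and `lookup` are hypothetical names made up for this example 
(`Comparator` only approximates the shape of `DynComparator`), and the code 
assumes that valid type ids are non-negative.

```rust
use std::cmp::Ordering;

// Hypothetical stand-in for a boxed comparator over row indices.
type Comparator = Box<dyn Fn(usize, usize) -> Ordering + Send + Sync>;

/// Builds a 128-slot table indexed by type id. Slots without a field stay `None`.
fn build_lookup(entries: Vec<(i8, Comparator)>) -> Vec<Option<Comparator>> {
    // `vec![None; 128]` would require `Clone`, which boxed closures lack.
    let mut table: Vec<Option<Comparator>> = (0..128).map(|_| None).collect();
    for (type_id, cmp) in entries {
        // Assumption: type ids are non-negative, so they fit in 0..=127.
        let idx = usize::try_from(type_id).expect("type id must be non-negative");
        table[idx] = Some(cmp);
    }
    table
}

/// Looks up the comparator for `type_id` by direct indexing, without hashing.
fn lookup(table: &[Option<Comparator>], type_id: i8) -> Option<&Comparator> {
    usize::try_from(type_id)
        .ok()
        .and_then(|idx| table.get(idx))
        .and_then(|slot| slot.as_ref())
}
```

   Whether this is actually faster than the `HashMap` would need to be 
measured with a benchmark.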



##########
arrow-ord/src/ord.rs:
##########
@@ -296,6 +296,70 @@ fn compare_struct(
     Ok(f)
 }
 
+fn compare_union(
+    left: &dyn Array,
+    right: &dyn Array,
+    opts: SortOptions,
+) -> Result<DynComparator, ArrowError> {
+    let left = left.as_union();
+    let right = right.as_union();
+
+    let (left_fields, left_mode) = match left.data_type() {
+        DataType::Union(fields, mode) => (fields, mode),
+        _ => unreachable!(),
+    };
+    let (right_fields, right_mode) = match right.data_type() {
+        DataType::Union(fields, mode) => (fields, mode),
+        _ => unreachable!(),
+    };
+
+    if left_fields != right_fields || left_mode != right_mode {
+        return Err(ArrowError::InvalidArgumentError(
+            "Cannot compare UnionArrays with different fields or modes".to_string(),

Review Comment:
   I recommend adding more detail to this message to help people who hit it. 
Specifically, I recommend:
   1. Using a separate message for different modes (and including the modes in 
the error message)
   2. Adding the fields (`{fields:?}` style) to the message
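
   As a sketch of what that could look like: the `UnionMode` enum, the 
`(i8, &str)` field representation, the `check_compatible` helper, and the 
`String` error type below are all simplified placeholders for this example 
rather than the arrow-rs types, and the exact wording of the messages is only 
a suggestion.

```rust
// Simplified stand-in for illustration only; NOT the arrow-rs type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UnionMode {
    Sparse,
    Dense,
}

/// Hypothetical helper: reports mode and field mismatches separately,
/// including the offending values in each message.
fn check_compatible(
    left_fields: &[(i8, &str)],
    left_mode: UnionMode,
    right_fields: &[(i8, &str)],
    right_mode: UnionMode,
) -> Result<(), String> {
    if left_mode != right_mode {
        return Err(format!(
            "Cannot compare UnionArrays with different modes: left={left_mode:?}, right={right_mode:?}"
        ));
    }
    if left_fields != right_fields {
        return Err(format!(
            "Cannot compare UnionArrays with different fields: left={left_fields:?}, right={right_fields:?}"
        ));
    }
    Ok(())
}
```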



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
