alamb edited a comment on issue #850:
URL: 
https://github.com/apache/arrow-datafusion/issues/850#issuecomment-899812406


   Thanks @novemberkilo 
   
   > Perhaps is this the same as checking that for each array that appears on 
https://github.com/apache/arrow-datafusion/blob/master/datafusion/src/physical_plan/hash_aggregate.rs#L371
 we check that array.null_count() == 0
   
   Yes, I think that is right. So rather than
   
   ```rust
               group_values
                   .iter()
                   .zip(group_state.group_by_values.iter())
                   .all(|(array, scalar)| scalar.eq_array(array, row))
   ```
   The idea would be to write something like this
   
   ```rust
               group_values
                   .iter()
                   .zip(group_state.group_by_values.iter())
                   .all(|(array, scalar)| {
                     // per-array check: only pay for null handling when needed
                     if array.null_count() > 0 {
                       scalar.eq_array(array, row)
                     } else {
                       scalar.eq_array_no_nulls(array, row)
                     }
                   })
   ```
   
   But `ScalarValue::eq_array_no_nulls` does not exist yet -- you would have to 
write it / test it
   
   Although now on second thought I think the `if` needs to be hoisted out of 
the loop:
   
   ```rust
             // decide once, outside the per-column comparison
             if group_values.iter().any(|array| array.null_count() > 0) {
               group_values
                   .iter()
                   .zip(group_state.group_by_values.iter())
                   .all(|(array, scalar)| scalar.eq_array(array, row))
             } else {
               // special case: no null values in any group-by array
               group_values
                   .iter()
                   .zip(group_state.group_by_values.iter())
                   .all(|(array, scalar)| scalar.eq_array_no_nulls(array, row))
             }
   ```
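
To make the difference between the two paths concrete, here is a rough, self-contained sketch using toy stand-ins rather than the real Arrow / DataFusion types. `ToyInt64Array`, `ToyScalar` and `group_matches` are hypothetical names invented for illustration, and the null semantics shown (a null scalar matches a null slot) are an assumption about what `eq_array` does for grouping:

```rust
/// Toy stand-in for a nullable Int64 array (NOT the arrow crate type).
struct ToyInt64Array {
    values: Vec<i64>,
    validity: Vec<bool>, // true = slot holds a value, false = null
}

impl ToyInt64Array {
    fn from_options(items: &[Option<i64>]) -> Self {
        ToyInt64Array {
            values: items.iter().map(|v| v.unwrap_or(0)).collect(),
            validity: items.iter().map(|v| v.is_some()).collect(),
        }
    }

    fn null_count(&self) -> usize {
        self.validity.iter().filter(|valid| !**valid).count()
    }
}

/// Toy stand-in for a scalar group key; `None` models a null key.
#[derive(Clone, Copy)]
struct ToyScalar(Option<i64>);

impl ToyScalar {
    /// General path: consults the validity flag on every call.
    /// Assumption: a null scalar is considered equal to a null slot.
    fn eq_array(&self, array: &ToyInt64Array, row: usize) -> bool {
        match (self.0, array.validity[row]) {
            (Some(v), true) => v == array.values[row],
            (None, false) => true,
            _ => false,
        }
    }

    /// Hypothetical fast path: the caller guarantees the array has no
    /// nulls, so the validity lookup is skipped entirely.
    fn eq_array_no_nulls(&self, array: &ToyInt64Array, row: usize) -> bool {
        self.0 == Some(array.values[row])
    }
}

/// Hoisted version: pick the comparison routine once, then compare
/// every group-by column against the stored key for this row.
fn group_matches(
    group_values: &[ToyInt64Array],
    group_by_values: &[ToyScalar],
    row: usize,
) -> bool {
    let any_nulls = group_values.iter().any(|array| array.null_count() > 0);
    if any_nulls {
        group_values
            .iter()
            .zip(group_by_values.iter())
            .all(|(array, scalar)| scalar.eq_array(array, row))
    } else {
        group_values
            .iter()
            .zip(group_by_values.iter())
            .all(|(array, scalar)| scalar.eq_array_no_nulls(array, row))
    }
}
```

In the real code the null check would probably be computed once per batch rather than once per row, since `null_count()` does not change between rows.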
   


