alamb commented on a change in pull request #984:
URL: https://github.com/apache/arrow-rs/pull/984#discussion_r757885028



##########
File path: arrow/src/compute/kernels/comparison.rs
##########
@@ -200,6 +201,42 @@ macro_rules! compare_op_scalar_primitive {
     }};
 }
 
+macro_rules! compare_dict_op_scalar {
+    ($left:expr, $right:expr, $op:expr) => {{
+        let null_bit_buffer = $left
+            .data()
+            .null_buffer()
+            .map(|b| b.bit_slice($left.offset(), $left.len()));
+
+        let values = $left
+            .values()
+            .as_any()
+            .downcast_ref::<StringArray>()

Review comment:
       The values array can be anything (not always a `StringArray`) -- perhaps 
this would be a good place to use the `dyn_XX` kernels -- to compare the values 
array with `$right`)

##########
File path: arrow/src/compute/kernels/comparison.rs
##########
@@ -200,6 +201,42 @@ macro_rules! compare_op_scalar_primitive {
     }};
 }
 
+macro_rules! compare_dict_op_scalar {
+    ($left:expr, $right:expr, $op:expr) => {{
+        let null_bit_buffer = $left
+            .data()
+            .null_buffer()
+            .map(|b| b.bit_slice($left.offset(), $left.len()));
+
+        let values = $left
+            .values()
+            .as_any()
+            .downcast_ref::<StringArray>()
+            .unwrap();
+
+        // Safety:
+        // `i < $left.len()`
+        let comparison = (0..$left.len()).map(|i| unsafe {
+            let key = $left.keys().value_unchecked(i).to_usize().unwrap();
+            $op(values.value_unchecked(key), $right)
+        });

Review comment:
       I think one of the main points of this ticket is to avoid the call here 
to `vaues.value_unchecked`
   
   I like to think about the goal in by thinking "what would happen with 
DictionaryArray with 1000000 entries but a dictionary of size 1?" -- the way 
you have this PR, I think we would call `$op` 1000000 times. The idea is to 
call `$op` 1 time.
   
   So the pattern I think we are looking, at least for the constant kernels is:
   
   In pseudo code:
   ```
   let values = dict_array.values();
   let comparison_result_on_values = apply_op_to_values();
   let result = dict_array.keys().iter().map(|index| 
comparison_result_on_values[index]).collect()
   ```
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to