alamb commented on code in PR #7057:
URL: https://github.com/apache/arrow-datafusion/pull/7057#discussion_r1273643821


##########
docs/source/user-guide/expressions.md:
##########
@@ -179,26 +179,28 @@ Unlike to some databases the math functions in Datafusion 
works the same way as
 
 ## Array Expressions
 
-| Function                             | Notes                                 
                                                                         |
-| ------------------------------------ | 
--------------------------------------------------------------------------------------------------------------
 |
-| array_append(array, element)         | Appends an element to the end of an 
array. `array_append([1, 2, 3], 4) -> [1, 2, 3, 4]`                        |
-| array_concat(array[, ..., array_n])  | Concatenates arrays. 
`array_concat([1, 2, 3], [4, 5, 6]) -> [1, 2, 3, 4, 5, 6]`                      
          |
-| array_has(array, element)            | Returns true if the array contains 
the element `array_has([1,2,3], 1) -> true`                                 |
-| array_has_all(array, sub-array)      | Returns true if all elements of 
sub-array exist in array `array_has_all([1,2,3], [1,3]) -> true`               |
-| array_has_any(array, sub-array)      | Returns true if any elements exist in 
both arrays `array_has_any([1,2,3], [1,4]) -> true`                      |
-| array_dims(array)                    | Returns an array of the array's 
dimensions. `array_dims([[1, 2, 3], [4, 5, 6]]) -> [2, 3]`                     |
-| array_fill(element, array)           | Returns an array filled with copies 
of the given value.                                                        |
-| array_length(array, dimension)       | Returns the length of the array 
dimension. `array_length([1, 2, 3, 4, 5]) -> 5`                                |
-| array_ndims(array)                   | Returns the number of dimensions of 
the array. `array_ndims([[1, 2, 3], [4, 5, 6]]) -> 2`                      |
-| array_position(array, element)       | Searches for an element in the array, 
returns first occurrence. `array_position([1, 2, 2, 3, 4], 2) -> 2`      |
-| array_positions(array, element)      | Searches for an element in the array, 
returns all occurrences. `array_positions([1, 2, 2, 3, 4], 2) -> [2, 3]` |
-| array_prepend(array, element)        | Prepends an element to the beginning 
of an array. `array_prepend(1, [2, 3, 4]) -> [1, 2, 3, 4]`                |
-| array_remove(array, element)         | Removes all elements equal to the 
given value from the array.                                                  |
-| array_replace(array, from, to)       | Replaces a specified element with 
another specified element.                                                   |
-| array_to_string(array, delimeter)    | Converts each element to its text 
representation. `array_to_string([1, 2, 3, 4], ',') -> 1,2,3,4`              |
-| cardinality(array)                   | Returns the total number of elements 
in the array. `cardinality([[1, 2, 3], [4, 5, 6]]) -> 6`                  |
-| make_array(value1, [value2 [, ...]]) | Returns an Arrow array using the 
specified input expressions. `make_array(1, 2, 3) -> [1, 2, 3]`               |
-| trim_array(array, n)                 | Removes the last n elements from the 
array.                                                                    |
+| Function                             | Notes                                 
                                                                                
                             |
+| ------------------------------------ | 
--------------------------------------------------------------------------------------------------------------------------------------------------
 |
+| array_append(array, element)         | Appends an element to the end of an 
array. `array_append([1, 2, 3], 4) -> [1, 2, 3, 4]`                             
                               |
+| array_concat(array[, ..., array_n])  | Concatenates arrays. 
`array_concat([1, 2, 3], [4, 5, 6]) -> [1, 2, 3, 4, 5, 6]`                      
                                              |
+| array_has(array, element)            | Returns true if the array contains 
the element `array_has([1,2,3], 1) -> true`                                     
                                |
+| array_has_all(array, sub-array)      | Returns true if all elements of 
sub-array exist in array `array_has_all([1,2,3], [1,3]) -> true`                
                                   |
+| array_has_any(array, sub-array)      | Returns true if any elements exist in 
both arrays `array_has_any([1,2,3], [1,4]) -> true`                             
                             |
+| array_dims(array)                    | Returns an array of the array's 
dimensions. `array_dims([[1, 2, 3], [4, 5, 6]]) -> [2, 3]`                      
                                   |
+| array_fill(element, array)           | Returns an array filled with copies 
of the given value.                                                             
                               |
+| array_length(array, dimension)       | Returns the length of the array 
dimension. `array_length([1, 2, 3, 4, 5]) -> 5`                                 
                                   |
+| array_ndims(array)                   | Returns the number of dimensions of 
the array. `array_ndims([[1, 2, 3], [4, 5, 6]]) -> 2`                           
                               |
+| array_position(array, element)       | Searches for an element in the array, 
returns first occurrence. `array_position([1, 2, 2, 3, 4], 2) -> 2`             
                             |
+| array_positions(array, element)      | Searches for an element in the array, 
returns all occurrences. `array_positions([1, 2, 2, 3, 4], 2) -> [2, 3]`        
                             |
+| array_prepend(array, element)        | Prepends an element to the beginning 
of an array. `array_prepend(1, [2, 3, 4]) -> [1, 2, 3, 4]`                      
                              |
+| array_remove(array, element)         | Removes one element from the array 
equal to the given value. `array_remove([1, 2, 2, 3, 1, 4], 2) -> [1, 2, 3, 1, 
4]`                              |
+| array_removes(array, element)        | Removes all elements from the array 
equal to the given value. `array_removes([1, 2, 2, 3, 1, 4], 2) -> [1, 3, 1, 
4]`                               |

Review Comment:
   Given there is a function named `array_has_all` that matches only if all 
arguments are present, it seems to me that more consistent names would be:
   
   * `array_removes`  --> `array_remove_all`
   * `array_replaces` --> `array_replace_all`



##########
docs/source/user-guide/expressions.md:
##########
@@ -179,26 +179,28 @@ Unlike to some databases the math functions in Datafusion 
works the same way as
 
 ## Array Expressions
 
-| Function                             | Notes                                 
                                                                         |
-| ------------------------------------ | 
--------------------------------------------------------------------------------------------------------------
 |
-| array_append(array, element)         | Appends an element to the end of an 
array. `array_append([1, 2, 3], 4) -> [1, 2, 3, 4]`                        |
-| array_concat(array[, ..., array_n])  | Concatenates arrays. 
`array_concat([1, 2, 3], [4, 5, 6]) -> [1, 2, 3, 4, 5, 6]`                      
          |
-| array_has(array, element)            | Returns true if the array contains 
the element `array_has([1,2,3], 1) -> true`                                 |
-| array_has_all(array, sub-array)      | Returns true if all elements of 
sub-array exist in array `array_has_all([1,2,3], [1,3]) -> true`               |
-| array_has_any(array, sub-array)      | Returns true if any elements exist in 
both arrays `array_has_any([1,2,3], [1,4]) -> true`                      |
-| array_dims(array)                    | Returns an array of the array's 
dimensions. `array_dims([[1, 2, 3], [4, 5, 6]]) -> [2, 3]`                     |
-| array_fill(element, array)           | Returns an array filled with copies 
of the given value.                                                        |
-| array_length(array, dimension)       | Returns the length of the array 
dimension. `array_length([1, 2, 3, 4, 5]) -> 5`                                |
-| array_ndims(array)                   | Returns the number of dimensions of 
the array. `array_ndims([[1, 2, 3], [4, 5, 6]]) -> 2`                      |
-| array_position(array, element)       | Searches for an element in the array, 
returns first occurrence. `array_position([1, 2, 2, 3, 4], 2) -> 2`      |
-| array_positions(array, element)      | Searches for an element in the array, 
returns all occurrences. `array_positions([1, 2, 2, 3, 4], 2) -> [2, 3]` |
-| array_prepend(array, element)        | Prepends an element to the beginning 
of an array. `array_prepend(1, [2, 3, 4]) -> [1, 2, 3, 4]`                |
-| array_remove(array, element)         | Removes all elements equal to the 
given value from the array.                                                  |
-| array_replace(array, from, to)       | Replaces a specified element with 
another specified element.                                                   |
-| array_to_string(array, delimeter)    | Converts each element to its text 
representation. `array_to_string([1, 2, 3, 4], ',') -> 1,2,3,4`              |
-| cardinality(array)                   | Returns the total number of elements 
in the array. `cardinality([[1, 2, 3], [4, 5, 6]]) -> 6`                  |
-| make_array(value1, [value2 [, ...]]) | Returns an Arrow array using the 
specified input expressions. `make_array(1, 2, 3) -> [1, 2, 3]`               |
-| trim_array(array, n)                 | Removes the last n elements from the 
array.                                                                    |
+| Function                             | Notes                                 
                                                                                
                             |
+| ------------------------------------ | 
--------------------------------------------------------------------------------------------------------------------------------------------------
 |
+| array_append(array, element)         | Appends an element to the end of an 
array. `array_append([1, 2, 3], 4) -> [1, 2, 3, 4]`                             
                               |
+| array_concat(array[, ..., array_n])  | Concatenates arrays. 
`array_concat([1, 2, 3], [4, 5, 6]) -> [1, 2, 3, 4, 5, 6]`                      
                                              |
+| array_has(array, element)            | Returns true if the array contains 
the element `array_has([1,2,3], 1) -> true`                                     
                                |
+| array_has_all(array, sub-array)      | Returns true if all elements of 
sub-array exist in array `array_has_all([1,2,3], [1,3]) -> true`                
                                   |
+| array_has_any(array, sub-array)      | Returns true if any elements exist in 
both arrays `array_has_any([1,2,3], [1,4]) -> true`                             
                             |
+| array_dims(array)                    | Returns an array of the array's 
dimensions. `array_dims([[1, 2, 3], [4, 5, 6]]) -> [2, 3]`                      
                                   |
+| array_fill(element, array)           | Returns an array filled with copies 
of the given value.                                                             
                               |
+| array_length(array, dimension)       | Returns the length of the array 
dimension. `array_length([1, 2, 3, 4, 5]) -> 5`                                 
                                   |
+| array_ndims(array)                   | Returns the number of dimensions of 
the array. `array_ndims([[1, 2, 3], [4, 5, 6]]) -> 2`                           
                               |
+| array_position(array, element)       | Searches for an element in the array, 
returns first occurrence. `array_position([1, 2, 2, 3, 4], 2) -> 2`             
                             |
+| array_positions(array, element)      | Searches for an element in the array, 
returns all occurrences. `array_positions([1, 2, 2, 3, 4], 2) -> [2, 3]`        
                             |
+| array_prepend(array, element)        | Prepends an element to the beginning 
of an array. `array_prepend(1, [2, 3, 4]) -> [1, 2, 3, 4]`                      
                              |
+| array_remove(array, element)         | Removes one element from the array 
equal to the given value. `array_remove([1, 2, 2, 3, 1, 4], 2) -> [1, 2, 3, 1, 
4]`                              |
+| array_removes(array, element)        | Removes all elements from the array 
equal to the given value. `array_removes([1, 2, 2, 3, 1, 4], 2) -> [1, 3, 1, 
4]`                               |
+| array_replace(array, from, to)       | Replaces one occurrence of the 
specified element with another specified element. `array_replace([1, 2, 2, 3, 
1, 4], 2, 5) -> [1, 5, 2, 3, 1, 4]`   |

Review Comment:
   ```suggestion
   | array_replace(array, from, to)       | Replaces the first occurrence of 
the specified element with another specified element. `array_replace([1, 2, 2, 
3, 1, 4], 2, 5) -> [1, 5, 2, 3, 1, 4]`   |
   ```



##########
datafusion/physical-expr/src/array_expressions.rs:
##########
@@ -1957,20 +2175,124 @@ mod tests {
 
     #[test]
     fn test_array_remove() {
-        // array_remove([1, 2, 3, 4], 3) = [1, 2, 4]
-        let list_array = return_array();
-        let arr = array_remove(&[
+        // array_remove([3, 1, 2, 3, 2, 3], 3) = [1, 2, 3, 2, 3]
+        let list_array = return_array_with_repeating_elements().into_array(1);
+        let array = array_remove(&[list_array, 
Arc::new(Int64Array::from_value(3, 1))])
+            .expect("failed to initialize function array_remove");
+        let result =
+            as_list_array(&array).expect("failed to initialize function 
array_remove");
+
+        assert_eq!(result.len(), 1);
+        assert_eq!(
+            &[1, 2, 3, 2, 3],
+            result
+                .value(0)
+                .as_any()
+                .downcast_ref::<Int64Array>()
+                .unwrap()
+                .values()
+        );
+
+        // array_remove([3, 1, 2, 3, 2, 3], 3, 2) = [1, 2, 2, 3]
+        let list_array = return_array_with_repeating_elements().into_array(1);
+        let array = array_remove(&[
             list_array,
-            ColumnarValue::Scalar(ScalarValue::Int64(Some(3))),
+            Arc::new(Int64Array::from_value(3, 1)),
+            Arc::new(Int64Array::from_value(2, 1)),
+        ])
+        .expect("failed to initialize function array_remove");
+        let result =
+            as_list_array(&array).expect("failed to initialize function 
array_remove");
+
+        assert_eq!(result.len(), 1);
+        assert_eq!(
+            &[1, 2, 2, 3],
+            result
+                .value(0)
+                .as_any()
+                .downcast_ref::<Int64Array>()
+                .unwrap()
+                .values()
+        );
+    }
+
+    #[test]
+    fn test_nested_array_remove() {
+        // array_remove(
+        //     [[1, 2, 3, 4], [5, 6, 7, 8], [1, 2, 3, 4], [9, 10, 11, 12], [5, 
6, 7, 8]],
+        //     [1, 2, 3, 4],
+        // ) = [[5, 6, 7, 8], [1, 2, 3, 4], [9, 10, 11, 12], [5, 6, 7, 8]]
+        let list_array = 
return_nested_array_with_repeating_elements().into_array(1);
+        let element_array = return_array().into_array(1);
+        let array = array_remove(&[list_array, element_array])
+            .expect("failed to initialize function array_remove");
+        let result =
+            as_list_array(&array).expect("failed to initialize function 
array_remove");
+
+        assert_eq!(result.len(), 1);
+        let data = vec![

Review Comment:
   I think we should really be testing with NULLs as well



##########
docs/source/user-guide/expressions.md:
##########
@@ -179,26 +179,28 @@ Unlike to some databases the math functions in Datafusion 
works the same way as
 
 ## Array Expressions
 
-| Function                             | Notes                                 
                                                                         |
-| ------------------------------------ | 
--------------------------------------------------------------------------------------------------------------
 |
-| array_append(array, element)         | Appends an element to the end of an 
array. `array_append([1, 2, 3], 4) -> [1, 2, 3, 4]`                        |
-| array_concat(array[, ..., array_n])  | Concatenates arrays. 
`array_concat([1, 2, 3], [4, 5, 6]) -> [1, 2, 3, 4, 5, 6]`                      
          |
-| array_has(array, element)            | Returns true if the array contains 
the element `array_has([1,2,3], 1) -> true`                                 |
-| array_has_all(array, sub-array)      | Returns true if all elements of 
sub-array exist in array `array_has_all([1,2,3], [1,3]) -> true`               |
-| array_has_any(array, sub-array)      | Returns true if any elements exist in 
both arrays `array_has_any([1,2,3], [1,4]) -> true`                      |
-| array_dims(array)                    | Returns an array of the array's 
dimensions. `array_dims([[1, 2, 3], [4, 5, 6]]) -> [2, 3]`                     |
-| array_fill(element, array)           | Returns an array filled with copies 
of the given value.                                                        |
-| array_length(array, dimension)       | Returns the length of the array 
dimension. `array_length([1, 2, 3, 4, 5]) -> 5`                                |
-| array_ndims(array)                   | Returns the number of dimensions of 
the array. `array_ndims([[1, 2, 3], [4, 5, 6]]) -> 2`                      |
-| array_position(array, element)       | Searches for an element in the array, 
returns first occurrence. `array_position([1, 2, 2, 3, 4], 2) -> 2`      |
-| array_positions(array, element)      | Searches for an element in the array, 
returns all occurrences. `array_positions([1, 2, 2, 3, 4], 2) -> [2, 3]` |
-| array_prepend(array, element)        | Prepends an element to the beginning 
of an array. `array_prepend(1, [2, 3, 4]) -> [1, 2, 3, 4]`                |
-| array_remove(array, element)         | Removes all elements equal to the 
given value from the array.                                                  |
-| array_replace(array, from, to)       | Replaces a specified element with 
another specified element.                                                   |
-| array_to_string(array, delimeter)    | Converts each element to its text 
representation. `array_to_string([1, 2, 3, 4], ',') -> 1,2,3,4`              |
-| cardinality(array)                   | Returns the total number of elements 
in the array. `cardinality([[1, 2, 3], [4, 5, 6]]) -> 6`                  |
-| make_array(value1, [value2 [, ...]]) | Returns an Arrow array using the 
specified input expressions. `make_array(1, 2, 3) -> [1, 2, 3]`               |
-| trim_array(array, n)                 | Removes the last n elements from the 
array.                                                                    |
+| Function                             | Notes                                 
                                                                                
                             |
+| ------------------------------------ | 
--------------------------------------------------------------------------------------------------------------------------------------------------
 |
+| array_append(array, element)         | Appends an element to the end of an 
array. `array_append([1, 2, 3], 4) -> [1, 2, 3, 4]`                             
                               |
+| array_concat(array[, ..., array_n])  | Concatenates arrays. 
`array_concat([1, 2, 3], [4, 5, 6]) -> [1, 2, 3, 4, 5, 6]`                      
                                              |
+| array_has(array, element)            | Returns true if the array contains 
the element `array_has([1,2,3], 1) -> true`                                     
                                |
+| array_has_all(array, sub-array)      | Returns true if all elements of 
sub-array exist in array `array_has_all([1,2,3], [1,3]) -> true`                
                                   |
+| array_has_any(array, sub-array)      | Returns true if any elements exist in 
both arrays `array_has_any([1,2,3], [1,4]) -> true`                             
                             |
+| array_dims(array)                    | Returns an array of the array's 
dimensions. `array_dims([[1, 2, 3], [4, 5, 6]]) -> [2, 3]`                      
                                   |
+| array_fill(element, array)           | Returns an array filled with copies 
of the given value.                                                             
                               |
+| array_length(array, dimension)       | Returns the length of the array 
dimension. `array_length([1, 2, 3, 4, 5]) -> 5`                                 
                                   |
+| array_ndims(array)                   | Returns the number of dimensions of 
the array. `array_ndims([[1, 2, 3], [4, 5, 6]]) -> 2`                           
                               |
+| array_position(array, element)       | Searches for an element in the array, 
returns first occurrence. `array_position([1, 2, 2, 3, 4], 2) -> 2`             
                             |
+| array_positions(array, element)      | Searches for an element in the array, 
returns all occurrences. `array_positions([1, 2, 2, 3, 4], 2) -> [2, 3]`        
                             |
+| array_prepend(array, element)        | Prepends an element to the beginning 
of an array. `array_prepend(1, [2, 3, 4]) -> [1, 2, 3, 4]`                      
                              |
+| array_remove(array, element)         | Removes one element from the array 
equal to the given value. `array_remove([1, 2, 2, 3, 1, 4], 2) -> [1, 2, 3, 1, 
4]`                              |

Review Comment:
   Maybe we could explicitly say "Removes the first element from the array 
equal to the given value":
   
   ```suggestion
   | array_remove(array, element)         | Removes the first element from the 
array equal to the given value. `array_remove([1, 2, 2, 3, 1, 4], 2) -> [1, 2, 
3, 1, 4]`                              |
   ```



##########
docs/source/user-guide/sql/scalar_functions.md:
##########
@@ -1777,24 +1779,72 @@ _Alias of [array_prepend](#array_prepend)._
 
 ### `array_remove`
 
-Removes all elements equal to the given value from the array.
+Removes one element from the array equal to the given value.

Review Comment:
   ```suggestion
   Removes the first element from the array equal to the given value.
   ```



##########
datafusion/core/tests/sqllogictests/test_files/array.slt:
##########
@@ -907,14 +907,122 @@ select array_positions(column1, make_array(4, 5, 6)), 
array_positions(make_array
 [6] []
 [1] []
 
-## array_replace
+## array_replace (replaces one occurrence of an element) (aliases: 
`list_replace`)
 
-# array_replace scalar function
+# array_replace scalar function #1
 query ???
 select array_replace(make_array(1, 2, 3, 4), 2, 3), 
array_replace(make_array(1, 4, 4, 5, 4, 6, 7), 4, 0), 
array_replace(make_array(1, 2, 3), 4, 0);
 ----
+[1, 3, 3, 4] [1, 0, 4, 5, 4, 6, 7] [1, 2, 3]
+
+# array_replace scalar function #2 (with optional argument)
+query ???
+select array_replace(make_array(1, 2, 3, 4), 2, 3, 2), 
array_replace(make_array(1, 4, 4, 5, 4, 6, 7), 4, 0, 2), 
array_replace(make_array(1, 2, 3), 4, 0, 3);

Review Comment:
   > What do you think of these features, @alamb and @jayzhan211? Should they 
be implemented or not?
   
   
   I found the fact that `array_replace` has 4 arguments very confusing 
initially. All "replace" APIs I can think of (e.g. Rust's 
[`replace`](https://doc.rust-lang.org/std/primitive.str.html#method.replace)) 
take only three arguments: the value, the pattern to search for and the 
replacement value. 
   
   If we want to implement "replace the first N" values, I think we should add 
a new function like `array_replace_n` (modeled after 
[`replacen`](https://doc.rust-lang.org/std/primitive.str.html#method.replacen)) 
   
   It would be great to share the implementation as you have done in this PR
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to