jkosh44 commented on code in PR #14532:
URL: https://github.com/apache/datafusion/pull/14532#discussion_r1949114839
##########
datafusion/expr-common/src/signature.rs:
##########
@@ -227,25 +226,13 @@ impl Display for TypeSignatureClass {
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Hash)]
pub enum ArrayFunctionSignature {
- /// Specialized Signature for ArrayAppend and similar functions
- /// The first argument should be List/LargeList/FixedSizedList, and the
second argument should be non-list or list.
- /// The second argument's list dimension should be one dimension less than
the first argument's list dimension.
- /// List dimension of the List/LargeList is equivalent to the number of
List.
- /// List dimension of the non-list is 0.
- ArrayAndElement,
- /// Specialized Signature for ArrayPrepend and similar functions
- /// The first argument should be non-list or list, and the second argument
should be List/LargeList.
- /// The first argument's list dimension should be one dimension less than
the second argument's list dimension.
- ElementAndArray,
- /// Specialized Signature for Array functions of the form (List/LargeList,
Index+)
- /// The first argument should be List/LargeList/FixedSizedList, and the
next n arguments should be Int64.
- ArrayAndIndexes(NonZeroUsize),
- /// Specialized Signature for Array functions of the form (List/LargeList,
Element, Optional Index)
- ArrayAndElementAndOptionalIndex,
- /// Specialized Signature for ArrayEmpty and similar functions
- /// The function takes a single argument that must be a
List/LargeList/FixedSizeList
- /// or something that can be coerced to one of those types.
- Array,
+ /// A function takes at least one List/LargeList/FixedSizeList argument.
+ Array {
+ /// A full list of the arguments accepted by this function.
+ arguments: Vec<ArrayFunctionArgument>,
+ /// Whether any of the input arrays are modified.
+ mutability: ArrayFunctionMutability,
Review Comment:
> and the return type is determined by the mutability setting. Isn't it 🤔 ?
I'm still unfamiliar with much of the code base, so please take everything
I'm saying with a grain of salt. The mutability of the function is only ever
looked at by the `get_valid_types()` function, which is described in the code
as "Returns a Vec of all possible valid argument types for the given
signature.".
https://github.com/jkosh44/datafusion/blob/53b7ae53af30cc7b8734a6c292cc3e04a993afdc/datafusion/expr/src/type_coercion/functions.rs#L352-L363
From that description, I would conclude that mutability determines the
accepted argument types, **_not_** the return type.
However, `ScalarUDFImpl` has a couple of trait functions, like `fn
return_type(&self, arg_types: &[DataType]) -> Result<DataType>`, that determine
the return type as a function of the argument types.
https://github.com/jkosh44/datafusion/blob/53b7ae53af30cc7b8734a6c292cc3e04a993afdc/datafusion/expr/src/udf.rs#L540-L596
If we take a look at some of the implementations of `return_type()` for
array functions, many of them blindly pass through the argument type of the
input array.
https://github.com/jkosh44/datafusion/blob/53b7ae53af30cc7b8734a6c292cc3e04a993afdc/datafusion/functions-nested/src/extract.rs#L396-L398
https://github.com/jkosh44/datafusion/blob/53b7ae53af30cc7b8734a6c292cc3e04a993afdc/datafusion/functions-nested/src/extract.rs#L704-L706
https://github.com/jkosh44/datafusion/blob/53b7ae53af30cc7b8734a6c292cc3e04a993afdc/datafusion/functions-nested/src/extract.rs#L811-L813
https://github.com/jkosh44/datafusion/blob/53b7ae53af30cc7b8734a6c292cc3e04a993afdc/datafusion/functions-nested/src/concat.rs#L108-L110
https://github.com/jkosh44/datafusion/blob/53b7ae53af30cc7b8734a6c292cc3e04a993afdc/datafusion/functions-nested/src/concat.rs#L196-L198
So by modifying the accepted argument types we are indirectly modifying the
return types.
> Instead of modeling "mutability," we can explicitly define the desired
type in the function signature. This type can be either List or FixedSizeList,
and we coerce the input accordingly
It might be a better approach to not modify the accepted argument types
(i.e. don't convert `FixedSizeList` to `List` in `get_valid_types()`), and
instead move the logic to `return_type()`. Then functions can be explicit about
not returning `FixedSizeList`s.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]