izveigor opened a new issue, #7580: URL: https://github.com/apache/arrow-datafusion/issues/7580
### Is your feature request related to a problem or challenge?

Follow-up to https://github.com/apache/arrow-datafusion/issues/6559.

## Argument quality:

- definite (`def`) and undefined (`undef`);
- equal (`eq`) and unequal (`uneq`);

Combine qualities:

- `eq-def` (equal definite)
- `eq-undef` (equal undefined)
- `uneq-def` (unequal definite)
- `uneq-undef` (unequal undefined)

## Algebra

Kleene algebra {+, ·, \*}

| TypeSignature | Kleene algebra                      |
| ------------- | ----------------------------------- |
| Variadic      | (`uneq-def1` + `uneq-def2` + ...)\* |
| VariadicEqual | (`eq-undef`)\*                      |
| VariadicAny   | (`uneq-undef`)\*                    |
| Uniform       | (`uneq-def1` + `uneq-def2` + ...)^n |
| Exact         | (`uneq-def1` · `uneq-def2` · ...)   |
| Any           | (`uneq-undef`)^n                    |

In the tables below:

- `(+)`: present in the current version;
- `(-)`: not present in the current version;

### Undefined

| Kleene algebra/Quality | `eq-undef`          | `uneq-undef`      |
| ---------------------- | ------------------- | ----------------- |
| (`arg`)\*              | `VariadicEqual` (+) | `VariadicAny` (+) |
| (`arg`)^n              | `Equal` (-)         | `Any` (+)         |

### Definite

| Kleene algebra/Quality    | `eq-def`                            | `uneq-def`                             |
| ------------------------- | ----------------------------------- | -------------------------------------- |
| (`arg1` + `arg2` + ...)\* | `Variadic` with single argument (+) | `Variadic` with multiple arguments (+) |
| (`arg1` + `arg2` + ...)^n | `Uniform` with single argument (+)  | `Uniform` with multiple arguments (+)  |
| (`arg1` · `arg2` · ...)   | `Exact` with the same data type (+) | `Exact` with different data types (+)  |

## Meta Algebra

Kleene algebra {+, ·, \*}

| TypeSignature | Kleene algebra          | Usage                                                       |
| ------------- | ----------------------- | ----------------------------------------------------------- |
| OneOf (+)     | `expr1` + `expr2` + ... | used with any `TypeSignature` if it makes sense             |
| Concat (-)    | `expr1` · `expr2` · ... | `Exact`, `Uniform`, `Equal`, `Any` (without Kleene closure) |

## Argument expansion

```
/// A function's nested argument.
pub enum TypeNestedArgument {
    /// `DataType::List`
    List,
    /// `DataType::Struct`
    Struct,
}
```

```
/// A function's type argument, which defines the function's valid data types.
pub enum TypeArgument {
    /// Supports arbitrarily nested data types (of any depth), such as `DataType::List`.
    /// `arg`: the outermost nested type
    /// `base_types`: base data types for the nested type
    /// `include_nested`: other nested data types allowed inside (all defined nested data types are arbitrary)
    ///
    /// # Examples:
    /// `TypeArgument::Nested { arg: TypeNestedArgument::List, base_types: vec![DataType::Int8, DataType::UInt8], include_nested: vec![TypeNestedArgument::Struct] }` can accept
    /// data types: `DataType::List(DataType::UInt8)`, `DataType::List(DataType::Int8)`,
    /// `DataType::List(DataType::Struct(DataType::Int8))`, `DataType::List(DataType::Struct(DataType::Struct(DataType::UInt8)))` and so on.
    Nested {
        arg: TypeNestedArgument,
        base_types: Vec<DataType>,
        include_nested: Vec<TypeNestedArgument>,
    },
    /// Supports non-nested data types
    Common(Vec<DataType>),
}
```

### Variadic

Input:

```
Variadic(
    vec![
        TypeArgument::Nested {
            arg: TypeNestedArgument::List,
            base_types: vec![DataType::Int8, DataType::UInt8],
            include_nested: vec![],
        },
        TypeArgument::Common(vec![DataType::Int8, DataType::UInt8]),
        TypeArgument::Nested {
            arg: TypeNestedArgument::Struct,
            base_types: vec![DataType::Int8, DataType::UInt8],
            include_nested: vec![],
        }
    ]
)
```

Output:

```
(DataType::List(DataType::Int8 + DataType::UInt8 + DataType::List(recursive)) + DataType::Int8 + DataType::UInt8 + DataType::Struct(DataType::Int8 + DataType::UInt8 + DataType::Struct(recursive)))*
```

### Uniform

Input:

```
Uniform(
    n,
    vec![
        TypeArgument::Nested {
            arg: TypeNestedArgument::List,
            base_types: vec![DataType::Int8, DataType::UInt8],
            include_nested: vec![],
        },
        TypeArgument::Common(vec![DataType::Int8, DataType::UInt8]),
        TypeArgument::Nested {
            arg: TypeNestedArgument::Struct,
            base_types: vec![DataType::Int8, DataType::UInt8],
            include_nested: vec![],
        }
    ]
)
```

Output:

```
(DataType::List(DataType::Int8 + DataType::UInt8 + DataType::List(recursive)) + DataType::Int8 + DataType::UInt8 + DataType::Struct(DataType::Int8 + DataType::UInt8 + DataType::Struct(recursive))) ^ n
```

### Exact

Input:

```
Exact(
    vec![
        TypeArgument::Nested {
            arg: TypeNestedArgument::List,
            base_types: vec![DataType::Int8, DataType::UInt8],
            include_nested: vec![],
        },
        TypeArgument::Common(vec![DataType::Int8, DataType::UInt8]),
        TypeArgument::Nested {
            arg: TypeNestedArgument::Struct,
            base_types: vec![DataType::Int8, DataType::UInt8],
            include_nested: vec![]
        }
    ]
)
```

Output:

```
DataType::List(DataType::Int8 + DataType::UInt8 + DataType::List(recursive)) | DataType::Int8 | DataType::UInt8 | DataType::Struct(DataType::Int8 + DataType::UInt8 + DataType::Struct(recursive))
```

Proposed code for future features:

```
BuiltinScalarFunction::ArrayAppend
| BuiltinScalarFunction::ArrayPositions
| BuiltinScalarFunction::ArrayRemove
| BuiltinScalarFunction::ArrayRemoveAll
| BuiltinScalarFunction::ArrayHas => Signature::concat(
    vec![
        Exact(vec![TypeArgument::Nested {
            arg: TypeNestedArgument::List,
            base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
            include_nested: vec![],
        }]),
        Uniform(1, vec![
            TypeArgument::Common(
                array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
            ),
            TypeArgument::Nested {
                arg: TypeNestedArgument::List,
                base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
                include_nested: vec![],
            },
        ]),
    ],
    self.volatility(),
),
BuiltinScalarFunction::ArrayPopBack
| BuiltinScalarFunction::ArrayDims
| BuiltinScalarFunction::ArrayEmpty
| BuiltinScalarFunction::Flatten
| BuiltinScalarFunction::ArrayNdims
| BuiltinScalarFunction::Cardinality => Signature::exact(
    vec![TypeArgument::Nested {
        arg: TypeNestedArgument::List,
        base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
        include_nested: vec![],
    }],
    self.volatility(),
),
BuiltinScalarFunction::ArrayConcat => Signature::variadic(
    vec![TypeArgument::Nested {
        arg: TypeNestedArgument::List,
        base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
        include_nested: vec![],
    }],
    self.volatility(),
),
BuiltinScalarFunction::ArrayHasAll
| BuiltinScalarFunction::ArrayHasAny => {
    Signature::exact(
        vec![
            TypeArgument::Nested {
                arg: TypeNestedArgument::List,
                base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
                include_nested: vec![],
            },
            TypeArgument::Nested {
                arg: TypeNestedArgument::List,
                base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
                include_nested: vec![],
            },
        ],
        self.volatility(),
    )
}
BuiltinScalarFunction::ArrayElement => Signature::exact(
    vec![
        TypeArgument::Nested {
            arg: TypeNestedArgument::List,
            base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
            include_nested: vec![],
        },
        TypeArgument::Common(vec![DataType::Int64]),
    ],
    self.volatility(),
),
BuiltinScalarFunction::ArraySlice => Signature::exact(
    vec![
        TypeArgument::Nested {
            arg: TypeNestedArgument::List,
            base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
            include_nested: vec![],
        },
        TypeArgument::Common(vec![DataType::Int64, DataType::Int64]),
    ],
    self.volatility(),
),
BuiltinScalarFunction::ArrayLength => Signature::one_of(
    vec![
        Exact(vec![TypeArgument::Nested {
            arg: TypeNestedArgument::List,
            base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
            include_nested: vec![],
        }]),
        Exact(vec![
            TypeArgument::Nested {
                arg: TypeNestedArgument::List,
                base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
                include_nested: vec![],
            },
            TypeArgument::Common(vec![DataType::Int64]),
        ]),
    ],
    self.volatility(),
),
BuiltinScalarFunction::ArrayPosition => Signature::one_of(
    vec![Exact(vec![
        TypeArgument::Nested {
            arg: TypeNestedArgument::List,
            base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
            include_nested: vec![],
        },
        TypeArgument::Common(vec![DataType::Int64, DataType::Int64]),
    ])],
    self.volatility(),
),
BuiltinScalarFunction::ArrayPrepend => Signature::concat(
    vec![
        Uniform(1, vec![
            TypeArgument::Common(
                array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
            ),
            TypeArgument::Nested {
                arg: TypeNestedArgument::List,
                base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
                include_nested: vec![],
            },
        ]),
        Exact(vec![TypeArgument::Nested {
            arg: TypeNestedArgument::List,
            base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
            include_nested: vec![],
        }]),
    ],
    self.volatility(),
),
BuiltinScalarFunction::ArrayRepeat => Signature::concat(
    vec![
        Uniform(1, vec![
            TypeArgument::Common(
                array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
            ),
            TypeArgument::Nested {
                arg: TypeNestedArgument::List,
                base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
                include_nested: vec![],
            },
        ]),
        Exact(vec![TypeArgument::Common(vec![DataType::Int64])]),
    ],
    self.volatility(),
),
BuiltinScalarFunction::ArrayRemoveN => Signature::concat(
    vec![
        Exact(vec![TypeArgument::Nested {
            arg: TypeNestedArgument::List,
            base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
            include_nested: vec![],
        }]),
        Uniform(1, vec![
            TypeArgument::Common(
                array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
            ),
            TypeArgument::Nested {
                arg: TypeNestedArgument::List,
                base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
                include_nested: vec![],
            },
        ]),
        Exact(vec![TypeArgument::Common(vec![DataType::Int64])]),
    ],
    self.volatility(),
),
BuiltinScalarFunction::ArrayReplace
| BuiltinScalarFunction::ArrayReplaceAll => Signature::concat(
    vec![
        Exact(vec![TypeArgument::Nested {
            arg: TypeNestedArgument::List,
            base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
            include_nested: vec![],
        }]),
        Uniform(2, vec![
            TypeArgument::Common(
                array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
            ),
            TypeArgument::Nested {
                arg: TypeNestedArgument::List,
                base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
                include_nested: vec![],
            },
        ]),
    ],
    self.volatility(),
),
BuiltinScalarFunction::ArrayReplaceN => Signature::concat(
    vec![
        Exact(vec![TypeArgument::Nested {
            arg: TypeNestedArgument::List,
            base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
            include_nested: vec![],
        }]),
        Uniform(2, vec![
            TypeArgument::Common(
                array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
            ),
            TypeArgument::Nested {
                arg: TypeNestedArgument::List,
                base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
                include_nested: vec![],
            },
        ]),
        Exact(vec![TypeArgument::Common(vec![DataType::Int64])]),
    ],
    self.volatility(),
),
BuiltinScalarFunction::ArrayToString => Signature::concat(
    vec![
        Exact(vec![TypeArgument::Nested {
            arg: TypeNestedArgument::List,
            base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
            include_nested: vec![],
        }]),
        OneOf(vec![
            Exact(vec![TypeArgument::Common(vec![DataType::Utf8])]),
            Exact(vec![TypeArgument::Common(vec![
                DataType::Utf8,
                DataType::Utf8,
            ])]),
        ]),
    ],
    self.volatility(),
),
BuiltinScalarFunction::MakeArray => Signature::concat(
    vec![Variadic(vec![
        TypeArgument::Nested {
            arg: TypeNestedArgument::List,
            base_types: array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
            include_nested: vec![],
        },
        TypeArgument::Common(
            array_expressions::SUPPORTED_ARRAY_TYPES.to_vec(),
        ),
    ])],
    self.volatility(),
),
```

### Describe the solution you'd like

_No response_

### Describe alternatives you've considered

_No response_

### Additional context

_No response_
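As a rough illustration of the argument expansion proposed in this issue, the sketch below checks whether a data type belongs to the expansion of a `TypeArgument`, and whether a list of argument types satisfies `Variadic`, `Uniform` or `Exact`. It rests on assumptions: `DataType` here is a simplified stand-in for arrow's type (with `Struct` wrapping a single inner type, as in the notation of this issue), and `peel`, `accepts`, `any_accepts`, `matches`, `list`, `strukt` and `int_lists` are hypothetical helper names, not existing DataFusion APIs.

```rust
// Stand-in for arrow's `DataType` (an assumption for this sketch); `Struct`
// holds one inner type, mirroring the simplified notation of the issue.
#[derive(Debug, Clone, PartialEq)]
enum DataType { Int8, UInt8, Int64, List(Box<DataType>), Struct(Box<DataType>) }

#[derive(Debug, Clone, Copy, PartialEq)]
enum TypeNestedArgument { List, Struct }

enum TypeArgument {
    Nested {
        arg: TypeNestedArgument,
        base_types: Vec<DataType>,
        include_nested: Vec<TypeNestedArgument>,
    },
    Common(Vec<DataType>),
}

enum TypeSignature {
    Variadic(Vec<TypeArgument>),
    Uniform(usize, Vec<TypeArgument>),
    Exact(Vec<TypeArgument>),
}

/// Splits a nested type into its kind and inner type; `None` for flat types.
fn peel(dt: &DataType) -> Option<(TypeNestedArgument, &DataType)> {
    match dt {
        DataType::List(inner) => Some((TypeNestedArgument::List, inner.as_ref())),
        DataType::Struct(inner) => Some((TypeNestedArgument::Struct, inner.as_ref())),
        _ => None,
    }
}

impl TypeArgument {
    /// Hypothetical helper: is `dt` part of the expansion of this argument?
    fn accepts(&self, dt: &DataType) -> bool {
        match self {
            TypeArgument::Common(types) => types.contains(dt),
            TypeArgument::Nested { arg, base_types, include_nested } => {
                // The outermost layer must be `arg` itself.
                let Some((kind, mut inner)) = peel(dt) else { return false };
                if kind != *arg {
                    return false;
                }
                // Below it, any depth of `arg` or `include_nested` layers is allowed.
                while let Some((kind, next)) = peel(inner) {
                    if kind != *arg && !include_nested.contains(&kind) {
                        return false;
                    }
                    inner = next;
                }
                // The innermost type must be one of the base types.
                base_types.contains(inner)
            }
        }
    }
}

/// The `+` of the algebra: `dt` is accepted by at least one alternative.
fn any_accepts(alts: &[TypeArgument], dt: &DataType) -> bool {
    alts.iter().any(|a| a.accepts(dt))
}

impl TypeSignature {
    /// Hypothetical helper: do the argument types satisfy the signature?
    fn matches(&self, args: &[DataType]) -> bool {
        match self {
            // (a1 + a2 + ...)*: read literally, the star also admits zero arguments.
            TypeSignature::Variadic(alts) => args.iter().all(|dt| any_accepts(alts, dt)),
            // (a1 + a2 + ...)^n: exactly n arguments, each from any alternative.
            TypeSignature::Uniform(n, alts) => {
                args.len() == *n && args.iter().all(|dt| any_accepts(alts, dt))
            }
            // (a1 · a2 · ...): one argument per position.
            TypeSignature::Exact(pos) => {
                args.len() == pos.len() && pos.iter().zip(args).all(|(a, dt)| a.accepts(dt))
            }
        }
    }
}

fn list(inner: DataType) -> DataType { DataType::List(Box::new(inner)) }
fn strukt(inner: DataType) -> DataType { DataType::Struct(Box::new(inner)) }

/// The example from the `TypeArgument::Nested` doc comment in this issue.
fn int_lists() -> TypeArgument {
    TypeArgument::Nested {
        arg: TypeNestedArgument::List,
        base_types: vec![DataType::Int8, DataType::UInt8],
        include_nested: vec![TypeNestedArgument::Struct],
    }
}

fn main() {
    let args = [list(strukt(DataType::Int8)), list(DataType::UInt8)];
    let variadic = TypeSignature::Variadic(vec![int_lists()]);
    let uniform = TypeSignature::Uniform(2, vec![int_lists()]);
    let exact = TypeSignature::Exact(vec![int_lists(), TypeArgument::Common(vec![DataType::Int64])]);
    println!("variadic: {}", variadic.matches(&args));
    println!("uniform:  {}", uniform.matches(&args));
    println!("exact:    {}", exact.matches(&args));
}
```

With these two list arguments, the `Variadic` and `Uniform(2, ...)` signatures match, while the `Exact` one does not because its second position expects `Int64`. Whether `Variadic` should really accept an empty argument list is left open here; the sketch simply follows the Kleene star as written.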
