jayzhan211 commented on code in PR #14440: URL: https://github.com/apache/datafusion/pull/14440#discussion_r1961771860
##########
datafusion/expr-common/src/signature.rs:
##########
@@ -466,6 +551,186 @@ fn get_data_types(native_type: &NativeType) -> Vec<DataType> {
     }
 }
 
+/// Represents type coercion rules for function arguments, specifying both the desired type
+/// and optional implicit coercion rules for source types.
+///
+/// # Examples
+///
+/// ```
+/// use datafusion_expr_common::signature::{Coercion, TypeSignatureClass};
+/// use datafusion_common::types::{NativeType, logical_binary, logical_string};
+///
+/// // Exact coercion that only accepts timestamp types
+/// let exact = Coercion::new_exact(TypeSignatureClass::Timestamp);
+///
+/// // Implicit coercion that accepts string types but can coerce from binary types
+/// let implicit = Coercion::new_implicit(
+///     TypeSignatureClass::Native(logical_string()),
+///     vec![TypeSignatureClass::Native(logical_binary())],
+///     NativeType::String
+/// );
+/// ```
+///
+/// There are two variants:
+///
+/// * `Exact` - Only accepts arguments that exactly match the desired type
+/// * `Implicit` - Accepts the desired type and can coerce from specified source types
+#[derive(Debug, Clone, Eq, PartialOrd)]
+pub enum Coercion {
+    /// Coercion that only accepts arguments exactly matching the desired type.
+    Exact {
+        /// The required type for the argument
+        desired_type: TypeSignatureClass,
+    },
+
+    /// Coercion that accepts the desired type and can implicitly coerce from other types.
+    Implicit {
+        /// The primary desired type for the argument
+        desired_type: TypeSignatureClass,
+        /// Rules for implicit coercion from other types
+        implicit_coercion: ImplicitCoercion,
+    },

Review Comment:
> Let's consider example of substr(s, i) function. The call to substr should succeed for i being any integer type coercible to UInt64 or Int64.

You’ve defined that the second argument of substr can be any integer type coercible to Int64. Isn’t this part of the function definition? By doing so, the function knows what coercion is needed.
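To make "part of the function definition" concrete, here is a minimal, self-contained sketch. It is a toy model only, not the API added in this PR: `ToyType`, `ToyCoercion`, `resolve`, and `substr_index_rule` are hypothetical names invented for illustration, and the real `Coercion` enum works on `TypeSignatureClass` rather than a flat list of types.

```rust
/// Toy stand-in for argument types (hypothetical, not DataFusion's `DataType`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ToyType {
    Int32,
    Int64,
    Utf8,
}

/// Toy coercion rule, loosely shaped like the two variants in the diff above.
#[derive(Debug, Clone)]
enum ToyCoercion {
    /// Accept only the desired type.
    Exact { desired: ToyType },
    /// Accept the desired type, or coerce from one of the listed source types.
    Implicit {
        desired: ToyType,
        allowed_sources: Vec<ToyType>,
    },
}

impl ToyCoercion {
    /// Returns the type the argument is coerced to, or `None` if it is rejected.
    fn resolve(&self, actual: ToyType) -> Option<ToyType> {
        match self {
            ToyCoercion::Exact { desired } => {
                if actual == *desired {
                    Some(*desired)
                } else {
                    None
                }
            }
            ToyCoercion::Implicit {
                desired,
                allowed_sources,
            } => {
                if actual == *desired || allowed_sources.contains(&actual) {
                    Some(*desired)
                } else {
                    None
                }
            }
        }
    }
}

/// Hypothetical rule for the index argument of a `substr`-like function:
/// the function itself states that it wants Int64 and accepts Int32 as a source.
fn substr_index_rule() -> ToyCoercion {
    ToyCoercion::Implicit {
        desired: ToyType::Int64,
        allowed_sources: vec![ToyType::Int32],
    }
}

fn main() {
    let rule = substr_index_rule();
    // Int32 is a listed source, so it is coerced to the desired Int64.
    println!("{:?}", rule.resolve(ToyType::Int32));
    // Utf8 is not listed, so this particular definition rejects it.
    println!("{:?}", rule.resolve(ToyType::Utf8));

    // A different definition can make a different choice, e.g. exact Int32 only.
    let strict = ToyCoercion::Exact {
        desired: ToyType::Int32,
    };
    println!("{:?}", strict.resolve(ToyType::Int64));
}
```

In this sketch the caller never decides how to coerce; each function definition carries its own rule, and two definitions are free to disagree.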
However, it's not enough to just make this definition possible. If we want to allow coercion from a string to an integer type, or if we only expect coercion to Int32, those options should be possible as well. Given this, we can't decide how coercion should happen without being told by the user or by the function definition.

> For example, integer types should be coercible to broader integer types the same way regardless whether it's in context of UNION ALL, EXCEPT, a function call, or an operator.

For internal DataFusion use cases, this rule works well. However, from an extensibility perspective, restricting coercion rules reduces flexibility. What if users don't want a broader integer type for a specific workflow because they know the maximum value fits in i32? We should prioritize maximum flexibility while also providing built-in options for the general use cases, so the common path stays easy to use.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

For additional commands, e-mail: github-h...@datafusion.apache.org