rafafrdz commented on code in PR #17485:
URL: https://github.com/apache/datafusion/pull/17485#discussion_r2341172796


##########
datafusion/spark/src/function/url/parse_url.rs:
##########
@@ -47,23 +46,7 @@ impl Default for ParseUrl {
 impl ParseUrl {
     pub fn new() -> Self {
         Self {
-            signature: Signature::one_of(
-                vec![
-                    TypeSignature::Uniform(
-                        1,
-                        vec![DataType::Utf8View, DataType::Utf8, DataType::LargeUtf8],
-                    ),
-                    TypeSignature::Uniform(
-                        2,
-                        vec![DataType::Utf8View, DataType::Utf8, DataType::LargeUtf8],
-                    ),
-                    TypeSignature::Uniform(
-                        3,
-                        vec![DataType::Utf8View, DataType::Utf8, DataType::LargeUtf8],
-                    ),
-                ],
-                Volatility::Immutable,
-            ),
+            signature: Signature::user_defined(Volatility::Immutable),

Review Comment:
   After rereading this several times, my understanding is that when you pass 
a dictionary array with string values, DataFusion attempts to match it against 
the `String` signature. However, `parse_url` is defined to accept only **plain 
string** arguments 
([ref](https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.functions.parse_url.html)); 
it does not expect dictionary inputs.
   
   We mark the UDF’s signature as `user_defined` to enable coercion across the 
string types (`Utf8`, `Utf8View`, `LargeUtf8`). A dictionary array is still not 
one of those types, so it isn’t coerced and the call won’t match.
   
   In short, even if the `String` signature seems to "capture" dictionaries 
with string values, `parse_url` will still reject them, because the underlying 
physical type is a dictionary, not a string.
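
To make the distinction concrete, here is a toy sketch of the behavior 
described above. It does not use DataFusion's or Arrow's real types: `ToyType`, 
`is_plain_string`, and `coerce_parse_url_args` are hypothetical names invented 
for illustration, and the actual coercion logic in the PR may differ. It only 
models the idea that the three plain string types pass, while a dictionary is 
rejected even when its values are strings.

```rust
// Toy model only: hypothetical stand-ins, not DataFusion's or Arrow's API.

/// Simplified stand-in for the data types discussed in this thread.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum ToyType {
    Utf8,
    LargeUtf8,
    Utf8View,
    Int32,
    /// Dictionary with a key type and a value type.
    Dictionary(Box<ToyType>, Box<ToyType>),
}

/// True only for the three plain string types.
fn is_plain_string(t: &ToyType) -> bool {
    matches!(t, ToyType::Utf8 | ToyType::LargeUtf8 | ToyType::Utf8View)
}

/// Hypothetical coercion check for a `parse_url`-like function.
/// Every argument must be a plain string type; arity checks are omitted.
/// A dictionary is rejected even if its value type is a string.
fn coerce_parse_url_args(args: &[ToyType]) -> Result<Vec<ToyType>, String> {
    for (i, arg) in args.iter().enumerate() {
        if !is_plain_string(arg) {
            return Err(format!(
                "argument {} has type {:?}, expected a plain string type",
                i, arg
            ));
        }
    }
    // Plain string arguments are returned unchanged in this sketch.
    Ok(args.to_vec())
}
```

Under these assumptions, `[Utf8, Utf8View]` is accepted, while 
`[Dictionary(Int32, Utf8), Utf8]` returns an error that names the offending 
argument.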



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

