uros-db commented on code in PR #46503:
URL: https://github.com/apache/spark/pull/46503#discussion_r1597932167
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/inputFileBlock.scala:
##########
@@ -39,7 +40,7 @@ case class InputFileName() extends LeafExpression with Nondeterministic {
   override def nullable: Boolean = false
-  override def dataType: DataType = StringType
+  override def dataType: DataType = SQLConf.get.defaultStringType
Review Comment:
### Example 1 (incorrect behaviour with respect to collation)
The user wants to check whether the input file name (which is a string) begins
with "abc":
```
SELECT startsWith(input_file_name(), "abc");
```
- Suppose the user has set the session-level default collation to "UNICODE".
- The `input_file_name` expression returns `StringType(0)`, i.e. collation ID 0
("UTF8_BINARY"); let's say its value is "ABCDE".
- The `startsWith` expression takes (StringTypeAnyCollation,
StringTypeAnyCollation).
- The manually specified literal "abc" gets the default collation, which here
is "UNICODE".

In this example query, `startsWith` therefore executes with the collation of
its first parameter (collation ID 0, which is "UTF8_BINARY"), because the
literal's collation is ignored under the type casting rules. That is not what
the user expects, since the session-level default collation is "UNICODE".
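To make the mismatch concrete, here is a minimal, Spark-free sketch of the
reasoning above. It is a toy model, not Spark's actual coercion code: `Arg`,
`effectiveCollation`, `before` and `after` are hypothetical names, and the
single rule it encodes (a literal's collation yields to a non-literal
argument's collation) is only my reading of the type casting rules as
described in this comment.

```scala
// Toy model only: none of these types are Spark classes.
object CollationSketch {
  // Collation names used in the example above.
  val Utf8Binary = "UTF8_BINARY"
  val Unicode = "UNICODE"

  // One argument of startsWith: its collation, and whether it is a literal.
  final case class Arg(sql: String, collation: String, isLiteral: Boolean)

  // Assumed rule (from the comment): a literal's collation is ignored when
  // a non-literal argument is present; otherwise fall back to the first arg.
  def effectiveCollation(args: Seq[Arg]): String =
    args.find(!_.isLiteral).getOrElse(args.head).collation

  // Before this change: input_file_name() is hard-coded to StringType (ID 0).
  def before(sessionDefault: String): String =
    effectiveCollation(Seq(
      Arg("input_file_name()", Utf8Binary, isLiteral = false),
      Arg("\"abc\"", sessionDefault, isLiteral = true)))

  // After this change: input_file_name() follows the session default.
  def after(sessionDefault: String): String =
    effectiveCollation(Seq(
      Arg("input_file_name()", sessionDefault, isLiteral = false),
      Arg("\"abc\"", sessionDefault, isLiteral = true)))

  def main(args: Array[String]): Unit = {
    println(s"before: ${before(Unicode)}") // before: UTF8_BINARY
    println(s"after:  ${after(Unicode)}")  // after:  UNICODE
  }
}
```

With the session default set to "UNICODE", the sketch resolves to
"UTF8_BINARY" before the change and "UNICODE" after it, which is the
behaviour the one-line diff is meant to give.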
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]