qlong commented on code in PR #53458:
URL: https://github.com/apache/spark/pull/53458#discussion_r2743641823
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala:
##########
@@ -639,10 +633,15 @@ case class Cast(
}
override def nullable: Boolean = if (!isTryCast) {
- child.nullable || Cast.forceNullable(child.dataType, dataType)
+ // In LEGACY mode with validation enabled, casting BinaryType to StringType can return null
+ val binaryToStringInLegacy = !ansiEnabled && validateUtf8 &&
+ child.dataType == BinaryType && dataType.isInstanceOf[StringType]
Review Comment:
Thanks for reviewing, @attilapiros. Good point about `Cast.forceNullable()`,
which I had overlooked. Here are the changes:
- Moved the type check into `Cast.forceNullable()` as suggested. This ensures
that the analysis-time nullability determination (for complex types) matches
the actual runtime Cast behavior.
- Kept the check in `nullable` for try_cast mode.
- Added new tests for complex types.
- Updated `_as_string_type()` in `base.py` to query the actual Spark DataFrame
schema after the cast.
- Updated and added Python tests.
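
To illustrate why the per-type rule matters for complex types, here is a toy
Python model of the idea. It is only a sketch and not Spark code: the classes
and the functions `force_nullable` and `elements_may_be_null` are hypothetical
stand-ins, and the flag names merely mirror `ansiEnabled` and `validateUtf8`
from the hunk above.

```python
from dataclasses import dataclass


# Toy stand-ins for Spark SQL types; these are NOT the real Catalyst classes.
@dataclass(frozen=True)
class BinaryType:
    pass


@dataclass(frozen=True)
class StringType:
    pass


@dataclass(frozen=True)
class ArrayType:
    element_type: object
    contains_null: bool


def force_nullable(src, dst, ansi_enabled, validate_utf8):
    """Hypothetical per-type rule: can a non-null src value cast to null?

    Models only the BinaryType -> StringType case from the diff hunk.
    """
    return bool(
        not ansi_enabled
        and validate_utf8
        and isinstance(src, BinaryType)
        and isinstance(dst, StringType)
    )


def elements_may_be_null(src, dst, ansi_enabled, validate_utf8):
    """For an array-to-array cast, decide whether result elements may be null.

    Because this consults the same per-type rule that a top-level cast would
    use, the nested answer agrees with the scalar answer.
    """
    return src.contains_null or force_nullable(
        src.element_type, dst.element_type, ansi_enabled, validate_utf8
    )
```

In this model, casting `ArrayType(BinaryType(), False)` to an array of strings
reports possibly-null elements only when ANSI mode is off and validation is on,
which is the same condition the scalar rule uses.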
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]