qlong commented on code in PR #53458:
URL: https://github.com/apache/spark/pull/53458#discussion_r2743641823


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala:
##########
@@ -639,10 +633,15 @@ case class Cast(
   }
 
   override def nullable: Boolean = if (!isTryCast) {
-    child.nullable || Cast.forceNullable(child.dataType, dataType)
+    // In LEGACY mode with validation enabled, casting BinaryType to StringType can return null
+    val binaryToStringInLegacy = !ansiEnabled && validateUtf8 &&
+      child.dataType == BinaryType && dataType.isInstanceOf[StringType]

Review Comment:
   Thanks for reviewing, @attilapiros. Good point about `Cast.forceNullable()`, which I had overlooked. These are the changes:
   - Moved the type check into `Cast.forceNullable()` as suggested. This ensures that the nullability determined at analysis time (for complex types) matches the actual runtime `Cast` behavior.
   - Kept the check in `nullable` for try_cast mode
   - Added new tests for complex types
   - Updated `_as_string_type()` in `base.py` to query the actual Spark DataFrame schema after the cast
   - Updated and added Python tests
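
   To illustrate why the check matters for complex types, here is a rough pure-Python sketch of the rule. It is not the actual patch: `BinaryT`, `StringT`, `ArrayT`, `force_nullable` and `cast_result_type` are hypothetical stand-ins, and the element-nullability propagation is my assumption about how the result type of an array cast is derived.

```python
from dataclasses import dataclass


# Hypothetical toy types; they only mimic the shape of Spark's data types.
@dataclass(frozen=True)
class BinaryT:
    pass


@dataclass(frozen=True)
class StringT:
    pass


@dataclass(frozen=True)
class ArrayT:
    element: object
    contains_null: bool


def force_nullable(src, dst, ansi_enabled, validate_utf8):
    """Toy type check: can this cast turn a non-null input into null?

    Models only the case discussed here: binary to string in LEGACY
    (non-ANSI) mode with UTF-8 validation enabled.
    """
    return (
        not ansi_enabled
        and validate_utf8
        and isinstance(src, BinaryT)
        and isinstance(dst, StringT)
    )


def cast_result_type(src, dst, ansi_enabled, validate_utf8):
    """Derive the result type of a cast, widening array element nullability."""
    if isinstance(src, ArrayT) and isinstance(dst, ArrayT):
        element = cast_result_type(
            src.element, dst.element, ansi_enabled, validate_utf8
        )
        # Assumption: elements become nullable if they already were,
        # or if the element-level cast itself can produce null.
        contains_null = src.contains_null or force_nullable(
            src.element, dst.element, ansi_enabled, validate_utf8
        )
        return ArrayT(element, contains_null)
    return dst


# array<binary> (no null elements) cast to array<string> in LEGACY mode.
legacy_result = cast_result_type(
    ArrayT(BinaryT(), False), ArrayT(StringT(), False),
    ansi_enabled=False, validate_utf8=True,
)
# The same cast with ANSI mode enabled.
ansi_result = cast_result_type(
    ArrayT(BinaryT(), False), ArrayT(StringT(), False),
    ansi_enabled=True, validate_utf8=True,
)
```

   In this toy model, `legacy_result.contains_null` is `True` while `ansi_result.contains_null` stays `False`. Because the check sits in the shared `force_nullable` helper, the nested case picks it up without a separate code path.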



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

