stefankandic commented on code in PR #48663:
URL: https://github.com/apache/spark/pull/48663#discussion_r1848608804


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CollationTypeCoercion.scala:
##########
@@ -193,88 +144,208 @@ object CollationTypeCoercion {
     case other => other
   }
 
+  /**
+   * If childType is collated and target is UTF8_BINARY, the collation of the output
+   * should be that of the childType.
+   */
+  private def shouldRemoveCast(cast: Cast): Boolean = {
+    val isUserDefined = cast.getTagValue(Cast.USER_SPECIFIED_CAST).isDefined
+    val isChildTypeCollatedString = cast.child.dataType match {
+      case st: StringType => !st.isUTF8BinaryCollation
+      case _ => false
+    }
+    val targetType = cast.dataType
+
+    isUserDefined && isChildTypeCollatedString && targetType == StringType
+  }
+
   /**
    * Extracts StringTypes from filtered hasStringType
    */
   @tailrec
-  private def extractStringType(dt: DataType): StringType = dt match {
-    case st: StringType => st
+  private def extractStringType(dt: DataType): Option[StringType] = dt match {
+    case st: StringType => Some(st)
     case ArrayType(et, _) => extractStringType(et)

Review Comment:
   For arrays, I think `array_append` would be a good example:
`array_append(array('a' collate unicode), 'b')` should return an array with
unicode collation, because the winning context has the data type array of unicode string.
   
   I'm not sure this can work for maps/structs, though: how would you decide
which part of the type to take, or which part to cast to a different type later?
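
To make the array case and the map/struct ambiguity concrete, here is a minimal, self-contained sketch. It does not use Spark's classes: `DataType`, `StringType(collation)`, `ArrayType`, `MapType`, `CollationSketch`, and `withCollation` are simplified, hypothetical stand-ins, and the fallback `case _ => None` is an assumption, since the hunk above is cut off after the `ArrayType` case.

```scala
import scala.annotation.tailrec

// Simplified stand-ins for Spark's type hierarchy (hypothetical, illustration only).
sealed trait DataType
final case class StringType(collation: String) extends DataType
case object IntegerType extends DataType
final case class ArrayType(elementType: DataType) extends DataType
final case class MapType(keyType: DataType, valueType: DataType) extends DataType

object CollationSketch {
  // Same shape as the Option-returning extractStringType in the diff.
  // An array has exactly one element type, so the recursion is unambiguous.
  @tailrec
  def extractStringType(dt: DataType): Option[StringType] = dt match {
    case st: StringType => Some(st)
    case ArrayType(et)  => extractStringType(et)
    // Assumed fallback: a map has two candidates (key and value),
    // so there is no single string type to pick.
    case _ => None
  }

  // Applies the winning collation to the string type nested inside arrays.
  def withCollation(dt: DataType, collation: String): DataType = dt match {
    case _: StringType => StringType(collation)
    case ArrayType(et) => ArrayType(withCollation(et, collation))
    case other         => other
  }
}
```

With these stand-ins, `extractStringType(ArrayType(StringType("UNICODE")))` yields the unicode string type, and `withCollation` can lift a plain `'b'` argument to it, which is the `array_append` behaviour described above. For `MapType(StringType("UNICODE"), StringType("UTF8_BINARY"))` the extractor returns `None`, which is exactly the open question for maps and structs.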



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

