cloud-fan commented on a change in pull request #35158:
URL: https://github.com/apache/spark/pull/35158#discussion_r781757984
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
##########
@@ -353,7 +353,7 @@ case class DataSource(
       case (dataSource: RelationProvider, Some(schema)) =>
         val baseRelation =
           dataSource.createRelation(sparkSession.sqlContext, caseInsensitiveOptions)
-        if (baseRelation.schema != schema) {
+        if (baseRelation.schema.withoutMetadata != schema.withoutMetadata) {
Review comment:
The user-specified schema is ignored in the end, so technically we could skip
this check entirely. The check only exists to make sure the user-specified
schema is not too far off from the actual one. If we can ignore the metadata,
it seems to me that compatible nullability can also be ignored: e.g. if the
actual column is not nullable but is marked as nullable in the user-specified
schema, the check should still pass.
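
To illustrate the looser comparison being suggested, here is a minimal sketch. The `normalizeForCheck` helper is hypothetical and not part of this PR; it recursively drops per-field metadata and erases nullability (the simplest way to "ignore" it, per the comment above) before comparing the two schemas:

```scala
import org.apache.spark.sql.types._

// Hypothetical helper, for illustration only: recursively drop field metadata
// and force everything nullable, so schemas that differ only in metadata or
// nullability compare as equal.
def normalizeForCheck(dt: DataType): DataType = dt match {
  case st: StructType =>
    StructType(st.fields.map { f =>
      f.copy(
        dataType = normalizeForCheck(f.dataType),
        nullable = true,            // ignore nullability differences
        metadata = Metadata.empty)  // ignore metadata differences
    })
  case ArrayType(et, _) =>
    ArrayType(normalizeForCheck(et), containsNull = true)
  case MapType(kt, vt, _) =>
    MapType(normalizeForCheck(kt), normalizeForCheck(vt), valueContainsNull = true)
  case other => other
}

// The check in the diff above would then look roughly like:
//   if (normalizeForCheck(baseRelation.schema) != normalizeForCheck(schema)) { ... }
```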