lokesh-lingarajan-0310 commented on code in PR #9743:
URL: https://github.com/apache/hudi/pull/9743#discussion_r1357713769
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala:
##########
@@ -578,17 +582,25 @@ object HoodieSparkSqlWriter {
     } else {
       if (!shouldValidateSchemasCompatibility) {
         // if no validation is enabled, check for col drop
-        // if col drop is allowed, go ahead. if not, check for projection, so that we do not allow dropping cols
-        if (allowAutoEvolutionColumnDrop || canProject(latestTableSchema, canonicalizedSourceSchema)) {
+        if (allowAutoEvolutionColumnDrop) {
           canonicalizedSourceSchema
         } else {
-          log.error(
-            s"""Incoming batch schema is not compatible with the table's one.
-               |Incoming schema ${sourceSchema.toString(true)}
-               |Incoming schema (canonicalized) ${canonicalizedSourceSchema.toString(true)}
-               |Table's schema ${latestTableSchema.toString(true)}
-               |""".stripMargin)
-          throw new SchemaCompatibilityException("Incoming batch schema is not compatible with the table's one")
+          val reconciledSchema = if (addNullForDeletedColumns) {
+            AvroSchemaEvolutionUtils.reconcileSchema(canonicalizedSourceSchema, latestTableSchema)
Review Comment:
Since we are using a single `AvroSchemaEvolutionUtils.reconcileSchema` API for both soft (OOB) and hard evolution (`schema.on.read`), should we just remove the code path for `DataSourceWriteOptions.RECONCILE_SCHEMA.key()` and make this function less cluttered?
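
For context, the reconcile step the comment refers to can be sketched conceptually as follows. This is a simplified illustration, not Hudi's actual implementation: real `reconcileSchema` operates on Avro `Schema` objects, while this sketch models schemas as plain `{name: type}` dicts and the helper name is assumed for illustration.

```python
# Conceptual sketch (assumption, not Hudi's actual code) of what
# AvroSchemaEvolutionUtils.reconcileSchema does when
# addNullForDeletedColumns is enabled: columns present in the table
# schema but missing from the incoming batch are re-added as nullable
# fields, so the write does not silently drop them.

def reconcile_schema(source_schema: dict, table_schema: dict) -> dict:
    """Return the source schema with table-only columns added back as nullable."""
    reconciled = dict(source_schema)
    for name, avro_type in table_schema.items():
        if name not in reconciled:
            # Re-add the dropped column as a union with "null", mirroring
            # how Avro schema evolution keeps readers compatible.
            reconciled[name] = ["null", avro_type]
    return reconciled

table = {"id": "long", "name": "string", "email": "string"}
incoming = {"id": "long", "name": "string"}  # batch dropped "email"

print(reconcile_schema(incoming, table))
# {'id': 'long', 'name': 'string', 'email': ['null', 'string']}
```

Because both soft and hard evolution funnel through this one reconcile step in the new code, the separate `RECONCILE_SCHEMA` branch the comment questions may indeed be redundant.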
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]