[GitHub] [hudi] nsivabalan commented on a change in pull request #2927: [HUDI-1129] Adding support to ingest records with old schema after table's schema is evolved

GitBox Sun, 16 May 2021 20:00:31 -0700


nsivabalan commented on a change in pull request #2927:
URL: https://github.com/apache/hudi/pull/2927#discussion_r633193730




##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala
##########
@@ -333,6 +333,14 @@ object DataSourceWriteOptions {
   val META_SYNC_CLIENT_TOOL_CLASS = "hoodie.meta.sync.client.tool.class"
   val DEFAULT_META_SYNC_CLIENT_TOOL_CLASS = classOf[HiveSyncTool].getName
 
+  /**
+   * When a new batch of write has records with old schema, but latest table schema got evolved, this config will
+   * upgrade the records to leverage latest table schema(default vals will be injected to missing fields).
+   * If not, the batch would fail.
+   */
+  val UPGRADE_OLD_SCHEMA_RECORDS_TO_LATEST_TABLE_SCHEMA_OPT_KEY = "hoodie.datasource.write.upgrade.old.schema.records.to.latest.table.schema"

Review comment:
       I am OK with this renaming, but this was my rationale for naming it
that way: in a general sense, a schema mismatch could mean anything. It could
mean the addition of new columns, a few missing columns, the renaming of
columns, etc., so I wanted to be specific.
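
To make the distinction concrete, here is a minimal sketch in plain Scala. It is not Hudi's actual implementation; `Field`, `SchemaUpgradeSketch`, `upgrade`, and `latestSchema` are hypothetical names, and records are modeled as simple maps. It assumes that only the "old schema" case (fields missing from the record, with defaults available in the table schema) is upgraded, while other kinds of mismatch, such as a renamed column, are rejected.

```scala
// Hypothetical, simplified model of a table schema field (not a Hudi or Avro type).
final case class Field(name: String, default: Option[Any])

object SchemaUpgradeSketch {
  // Example "latest" table schema: `city` was added later and carries a default.
  val latestSchema: Seq[Field] =
    Seq(Field("id", None), Field("name", None), Field("city", Some("unknown")))

  // Rewrites a record against the latest schema, filling missing fields with
  // their defaults. Returns Left(reason) when the record is not a plain
  // "old schema" record (unknown fields, or missing fields without defaults).
  def upgrade(record: Map[String, Any], latest: Seq[Field]): Either[String, Map[String, Any]] = {
    val known = latest.map(_.name).toSet
    val unknown = record.keySet.diff(known)
    val missingNoDefault = latest.filter(f => !record.contains(f.name) && f.default.isEmpty)

    if (unknown.nonEmpty) {
      Left("fields not in table schema: " + unknown.toSeq.sorted.mkString(", "))
    } else if (missingNoDefault.nonEmpty) {
      Left("missing fields without defaults: " + missingNoDefault.map(_.name).mkString(", "))
    } else {
      // The default is only evaluated for missing fields, all of which have one here.
      Right(latest.map(f => f.name -> record.getOrElse(f.name, f.default.get)).toMap)
    }
  }
}
```

With this sketch, a record carrying only `id` and `name` is upgraded by injecting the default for `city`, whereas a record that uses `full_name` instead of `name` is rejected, which is the kind of mismatch the more specific config name deliberately excludes.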




