n3nash commented on a change in pull request #2927:
URL: https://github.com/apache/hudi/pull/2927#discussion_r632983301



##########
File path: 
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala
##########
@@ -333,6 +333,14 @@ object DataSourceWriteOptions {
   val META_SYNC_CLIENT_TOOL_CLASS = "hoodie.meta.sync.client.tool.class"
   val DEFAULT_META_SYNC_CLIENT_TOOL_CLASS = classOf[HiveSyncTool].getName
 
+  /**
+   * When a new batch of writes contains records with an older schema but the
+   * latest table schema has evolved, this config upgrades those records to the
+   * latest table schema (default values are injected for missing fields).
+   * If disabled, such a batch fails.
+   */
+  val UPGRADE_OLD_SCHEMA_RECORDS_TO_LATEST_TABLE_SCHEMA_OPT_KEY = "hoodie.datasource.write.upgrade.old.schema.records.to.latest.table.schema"

Review comment:
       nit: rename to `HANDLE_SCHEMA_MISMATCH_FOR_INPUT_BATCH` and change the key to `hoodie.datasource.write.handle.schema.mismatch`? The Javadoc is descriptive enough. Once we expose these configs, they are available forever, so it's good to be a little generic upfront in case you want to make more changes.
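For illustration only, here is a sketch of how a user might enable the option in a Spark datasource write, assuming the reviewer's suggested key name were adopted (the key actually proposed in this PR is `hoodie.datasource.write.upgrade.old.schema.records.to.latest.table.schema`; `df` and `basePath` are hypothetical placeholders):

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}

// Hypothetical write using the reviewer's suggested key name; not merged API.
def writeWithSchemaMismatchHandling(df: DataFrame, basePath: String): Unit = {
  df.write
    .format("hudi")
    // Allow incoming records with an older schema to be upgraded to the
    // latest table schema instead of failing the batch.
    .option("hoodie.datasource.write.handle.schema.mismatch", "true")
    .mode(SaveMode.Append)
    .save(basePath)
}
```

This is a configuration fragment, not a runnable example: it requires a Spark session with the Hudi bundle on the classpath.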




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

