andreacfm commented on code in PR #8528:
URL: https://github.com/apache/iceberg/pull/8528#discussion_r1324335822
##########
docs/spark-writes.md:
##########
@@ -313,6 +313,28 @@ data.writeTo("prod.db.table")
.createOrReplace()
```
+### Schema Merge
+
+While inserting or updating, Iceberg is capable of resolving schema mismatches
+at runtime. If configured accordingly, Iceberg performs automatic schema
+evolution as follows:
+
+* A new column is present in the source but not in the target table. The new
+column is added to the target table, and its value is set to NULL for all
+rows already present in the table.
+* A column is present in the target but not in the source. The target column
+value is set to NULL when inserting a row and left unchanged when updating it.
+
+The target table must be configured with the `accept-any-schema` property:
+
+```sql
+ALTER TABLE prod.db.sample SET TBLPROPERTIES (
+ 'write.spark.accept-any-schema'='true'
+)
+```
+The writer must enable the `schema-merge` option.
Review Comment:
Oops. I think `schema-merge` is the equivalent in PySpark. Since the sample is
in Scala, we should probably use `mergeSchema`. Is this correct @RussellSpitzer?
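
For reference, a minimal Scala sketch of the full flow, assuming `mergeSchema` is indeed the right spelling for the DataFrameWriterV2 option (the source table `staging.db.sample_updates` is hypothetical):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("schema-merge-example").getOrCreate()

// Allow writes whose schema differs from the current table schema.
spark.sql(
  """ALTER TABLE prod.db.sample SET TBLPROPERTIES (
    |  'write.spark.accept-any-schema'='true'
    |)""".stripMargin)

// Hypothetical source table with extra columns relative to prod.db.sample.
val data = spark.table("staging.db.sample_updates")

// Assumption per the comment above: the Scala/DataFrameWriterV2 option is
// spelled `mergeSchema` rather than `schema-merge`.
data.writeTo("prod.db.sample")
  .option("mergeSchema", "true")
  .append()
```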
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]