This is an automated email from the ASF dual-hosted git repository.

russellspitzer pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/iceberg.git


The following commit(s) were added to refs/heads/master by this push:
     new 11a708af9d Docs: Spark Schema Merge docs (#8528)
11a708af9d is described below

commit 11a708af9d3417a1840968f46b231248b3388018
Author: Andrea Campolonghi <[email protected]>
AuthorDate: Thu Sep 14 18:31:49 2023 +0200

    Docs: Spark Schema Merge docs (#8528)
---
 docs/spark-writes.md | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/docs/spark-writes.md b/docs/spark-writes.md
index ea62c4b333..db641fc9b9 100644
--- a/docs/spark-writes.md
+++ b/docs/spark-writes.md
@@ -313,6 +313,33 @@ data.writeTo("prod.db.table")
     .createOrReplace()
 ```
 
+### Schema Merge
+
+While inserting or updating, Iceberg can resolve schema mismatches at runtime. If configured, Iceberg will perform automatic schema evolution as follows:
+
+
+* A new column is present in the source but not in the target table.
+    
+  The new column is added to the target table. Column values are set to `NULL` in all the rows already present in the table.
+
+* A column is present in the target but not in the source. 
+
+  The target column value is set to `NULL` when inserting, or left unchanged when updating the row.
+
+The target table must be configured to accept any schema change by setting the property `write.spark.accept-any-schema` to `true`.
+
+```sql
+ALTER TABLE prod.db.sample SET TBLPROPERTIES (
+  'write.spark.accept-any-schema'='true'
+)
+```
+
+The writer must enable the `mergeSchema` option.
+
+```scala
+data.writeTo("prod.db.sample").option("mergeSchema","true").append()
+```
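The NULL-filling rules described above can be sketched in plain Scala, with no Spark required. The helper names `mergeSchemas` and `align` are hypothetical, used only to illustrate the semantics; they are not part of Iceberg's API:

```scala
// Illustrative sketch of mergeSchema semantics (hypothetical helpers, not
// Iceberg's actual implementation). A row is a Map from column name to an
// Option value; None models SQL NULL.

// New source columns are appended to the end of the target schema.
def mergeSchemas(target: Seq[String], source: Seq[String]): Seq[String] =
  target ++ source.filterNot(target.contains)

// Align a row to the merged schema: columns the row lacks become NULL (None).
def align(row: Map[String, Option[String]],
          schema: Seq[String]): Map[String, Option[String]] =
  schema.map(col => col -> row.getOrElse(col, None)).toMap

// The source carries a new "email" column that the target lacks.
val merged = mergeSchemas(Seq("id", "name"), Seq("id", "name", "email"))

// A row already in the target table: the new column is back-filled with NULL.
val existing = align(Map("id" -> Some("1"), "name" -> Some("a")), merged)

// An incoming source row carries a value for every column.
val incoming = align(
  Map("id" -> Some("2"), "name" -> Some("b"), "email" -> Some("[email protected]")),
  merged)
```

In the sketch, `existing("email")` is `None` (the back-filled NULL) while `incoming("email")` keeps its value, mirroring the two bullet cases above.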
+
+
 ## Writing Distribution Modes
 
 Iceberg's default Spark writers require that the data in each spark task is clustered by partition values. This 
