This is an automated email from the ASF dual-hosted git repository.
szehon pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/iceberg.git
The following commit(s) were added to refs/heads/main by this push:
new 4a0ae22199 Docs: Clarify defaults for distribution mode (#10575)
4a0ae22199 is described below
commit 4a0ae22199375e34f9033bed9781da3dc90d53c6
Author: Szehon Ho <[email protected]>
AuthorDate: Tue Jul 16 13:58:13 2024 -0700
Docs: Clarify defaults for distribution mode (#10575)
---
docs/docs/configuration.md | 2 +-
docs/docs/spark-configuration.md | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/docs/docs/configuration.md b/docs/docs/configuration.md
index 117adca09f..264b9edfa7 100644
--- a/docs/docs/configuration.md
+++ b/docs/docs/configuration.md
@@ -67,7 +67,7 @@ Iceberg tables support table properties to configure table behavior, like the de
| write.metadata.metrics.column.col1 | (not set) | Metrics mode for column 'col1' to allow per-column tuning; none, counts, truncate(length), or full |
| write.target-file-size-bytes | 536870912 (512 MB) | Controls the size of files generated to target about this many bytes |
| write.delete.target-file-size-bytes | 67108864 (64 MB) | Controls the size of delete files generated to target about this many bytes |
-| write.distribution-mode | none | Defines distribution of write data: __none__: don't shuffle rows; __hash__: hash distribute by partition key ; __range__: range distribute by partition key or sort key if table has an SortOrder |
+| write.distribution-mode | none, see engines for specific defaults, for example [Spark Writes](spark-writes.md#writing-distribution-modes) | Defines distribution of write data: __none__: don't shuffle rows; __hash__: hash distribute by partition key ; __range__: range distribute by partition key or sort key if table has an SortOrder |
| write.delete.distribution-mode | hash | Defines distribution of write delete data |
| write.update.distribution-mode | hash | Defines distribution of write update data |
| write.merge.distribution-mode | none | Defines distribution of write merge data |
diff --git a/docs/docs/spark-configuration.md b/docs/docs/spark-configuration.md
index 9ff7396498..5b281b1989 100644
--- a/docs/docs/spark-configuration.md
+++ b/docs/docs/spark-configuration.md
@@ -190,6 +190,7 @@ df.write
| compression-codec | Table write.(fileformat).compression-codec | Overrides this table's compression codec for this write |
| compression-level | Table write.(fileformat).compression-level | Overrides this table's compression level for Parquet and Avro tables for this write |
| compression-strategy | Table write.orc.compression-strategy | Overrides this table's compression strategy for ORC tables for this write |
+| distribution-mode | See [Spark Writes](spark-writes.md#writing-distribution-modes) for defaults | Override this table's distribution mode for this write |
CommitMetadata provides an interface to add custom metadata to a snapshot
summary during a SQL execution, which can be beneficial for purposes such as
auditing or change tracking. If properties start with `snapshot-property.`,
then that prefix will be removed from each property. Here is an example:
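
The prefix handling described in the context paragraph above can be illustrated with a minimal sketch. This is not Iceberg code: the helper name `strip_snapshot_prefix` and the sample keys are hypothetical, and leaving keys without the prefix unchanged is an assumption, since the text only says what happens to keys that do start with `snapshot-property.`.

```python
# Hypothetical illustration of the prefix rule stated above; not the Iceberg implementation.
PREFIX = "snapshot-property."


def strip_snapshot_prefix(properties):
    """Return a copy of `properties` with a leading PREFIX removed from each key.

    Assumption: keys that do not start with PREFIX are kept as they are.
    """
    return {
        (key[len(PREFIX):] if key.startswith(PREFIX) else key): value
        for key, value in properties.items()
    }


# Sample input with made-up keys: one prefixed, one not.
example = {"snapshot-property.audit-user": "alice", "job-id": "42"}
stripped = strip_snapshot_prefix(example)
```

Under these assumptions, `stripped` holds the key `audit-user` in place of `snapshot-property.audit-user`, while `job-id` is carried over untouched.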