rdblue commented on a change in pull request #3820:
URL: https://github.com/apache/iceberg/pull/3820#discussion_r777621881
##########
File path: site/docs/spark-ddl.md
##########
@@ -360,3 +360,29 @@ ALTER TABLE prod.db.sample WRITE ORDERED BY category ASC NULLS LAST, id DESC NUL
!!! Note
Table write order does not guarantee data order for queries. It only
affects how data is written to the table.
+Only local sorting can be set at the same time, use `LOCALLY ORDERED BY`
+
+```sql
+ALTER TABLE prod.db.sample WRITE LOCALLY ORDERED BY category, id
+-- use optional ASC/DESC keyword to specify sort order of each field (default ASC)
+ALTER TABLE prod.db.sample WRITE LOCALLY ORDERED BY category ASC, id DESC
+-- use optional NULLS FIRST/NULLS LAST keyword to specify null order of each field (default FIRST)
+ALTER TABLE prod.db.sample WRITE LOCALLY ORDERED BY category ASC NULLS LAST, id DESC NULLS FIRST
+```
+### `ALTER TABLE ... WRITE DISTRIBUTED BY PARTITION`
+
+Iceberg tables can be configured with a hash distribution where tuples that share the same values for clustering expressions are
Review comment:
The requirement is to distribute by partition. Hash distribution is an
implementation detail. Instead, I think this should state that `WRITE
DISTRIBUTED BY PARTITION` will guarantee that a given partition is handled by
one writer.
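
For illustration, the example under this heading could look something like the sketch below. It reuses the `prod.db.sample` table from the hunk above and assumes the Iceberg Spark SQL extensions are enabled. The comments restate the guarantee described in this comment, not how the distribution is implemented.

```sql
-- Request that writes be distributed by partition, so that each
-- partition is handled by one writer.
ALTER TABLE prod.db.sample WRITE DISTRIBUTED BY PARTITION

-- Distribution by partition combined with a local sort, so rows are
-- also ordered by category and id within each write task.
ALTER TABLE prod.db.sample WRITE DISTRIBUTED BY PARTITION LOCALLY ORDERED BY category, id
```

Wording the docs around the per-partition guarantee keeps them accurate even if the underlying distribution mechanism changes later.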
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]