This is an automated email from the ASF dual-hosted git repository.
dkuzmenko pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/hive-site.git
The following commit(s) were added to refs/heads/main by this push:
new 012556d HIVE-29135: Add Write Ordering feature documentation for
Iceberg tables (#70)
012556d is described below
commit 012556d57b683932aee9e10c4d4c6cda22646e8b
Author: kokila-19 <[email protected]>
AuthorDate: Sat Nov 1 15:47:45 2025 +0530
HIVE-29135: Add Write Ordering feature documentation for Iceberg tables
(#70)
---
content/docs/latest/language/languagemanual.md | 1 +
content/docs/latest/language/writeordering.md | 171 +++++++++++++++++++++++++
2 files changed, 172 insertions(+)
diff --git a/content/docs/latest/language/languagemanual.md
b/content/docs/latest/language/languagemanual.md
index 46ea7f5..1509246 100644
--- a/content/docs/latest/language/languagemanual.md
+++ b/content/docs/latest/language/languagemanual.md
@@ -24,6 +24,7 @@ This is the Hive Language Manual. For other Hive
documentation, see the Hive w
* Data Definition Statements
+ [DDL Statements]({{< ref "languagemanual-ddl" >}})
- [Bucketed Tables]({{< ref "languagemanual-ddl-bucketedtables"
>}})
+ - [Write Ordering (Regular & Z-Order)]({{< ref "writeordering"
>}})
+ [Statistics (Analyze and Describe)]({{< ref "statsdev" >}})
+ [Indexes]({{< ref "languagemanual-indexing" >}})
+ [Archiving]({{< ref "languagemanual-archiving" >}})
diff --git a/content/docs/latest/language/writeordering.md
b/content/docs/latest/language/writeordering.md
new file mode 100644
index 0000000..76b399c
--- /dev/null
+++ b/content/docs/latest/language/writeordering.md
@@ -0,0 +1,171 @@
+---
+title: "Apache Hive : Write Ordering"
+date: 2025-10-31
+---
+
+# Apache Hive : Write Ordering
+
+## Overview
+
+Write ordering controls the physical layout of data within table files. Unlike
`SORT BY` which orders data during query execution, write ordering is applied
at write time and persists in the stored files.
+
+Write ordering is supported for Iceberg tables and can be specified during
table creation.
+
+Hive supports two write ordering strategies:
+* **Regular Ordering**: Sort by one or more columns in a specified order
+* **Z-Ordering**: Multi-dimensional clustering using space-filling curves
+
+---
+
+## Regular Column Ordering
+
+### Version
+
+Introduced in Hive version
[4.1.0](https://issues.apache.org/jira/browse/HIVE-28586)
+
+### Syntax
+
+```sql
+CREATE TABLE table_name (column_definitions)
+WRITE [LOCALLY] ORDERED BY column_name [ASC | DESC] [NULLS FIRST | NULLS LAST]
+ [, column_name [ASC | DESC] [NULLS FIRST | NULLS LAST] ]*
+STORED BY ICEBERG
+[STORED AS file_format];
+```
+
+### Options
+
+* Sort Order
+ * `ASC`: Ascending order (default)
+ * `DESC`: Descending order
+* Null Order
+ * `NULLS FIRST`: Null values sorted before non-null values
+ * `NULLS LAST`: Null values sorted after non-null values
+
+### Examples
+
+Single column:
+
+```sql
+CREATE TABLE events (
+ event_id BIGINT,
+ event_date DATE,
+ event_type STRING
+)
+WRITE LOCALLY ORDERED BY event_date DESC
+STORED BY ICEBERG
+STORED AS ORC;
+```
+
+Multiple columns with null handling:
+
+```sql
+CREATE TABLE orders (
+ order_id BIGINT,
+ order_date DATE,
+ customer_id INT,
+ amount DECIMAL(10,2)
+)
+WRITE ORDERED BY order_date DESC NULLS FIRST, order_id ASC
+STORED BY ICEBERG;
+```
+
+### Use Cases
+
+Regular ordering is most effective for:
+
+* Time-series data with temporal access patterns
+* Range queries on sorted columns
+* Queries with consistent ORDER BY clauses
+* Single-dimensional access patterns
+
+---
+
+## Z-Ordering
+
+### Version
+
+Introduced in Hive version
[4.2.0](https://issues.apache.org/jira/browse/HIVE-29133)
+
+### Overview
+
+Z-order applies a multi-dimensional clustering technique based on
space-filling curves. This approach interleaves column values to co-locate
related records across multiple dimensions, enabling efficient filtering on
various column combinations.
+
+### Syntax
+
+```sql
+CREATE TABLE table_name (column_definitions)
+WRITE [LOCALLY] ORDERED BY ZORDER(column_name [, column_name ]*)
+STORED BY ICEBERG
+[STORED AS file_format];
+```
+
+### Examples
+
+Two columns:
+
+```sql
+CREATE TABLE user_events (
+ user_id INT,
+ event_date DATE,
+ event_type STRING,
+ value DOUBLE
+)
+WRITE LOCALLY ORDERED BY ZORDER(user_id, event_date)
+STORED BY ICEBERG
+STORED AS ORC;
+```
+
+Multiple columns:
+
+```sql
+CREATE TABLE analytics (
+ customer_id INT,
+ activity_date DATE,
+ country STRING,
+ product_id INT
+)
+WRITE ORDERED BY ZORDER(customer_id, activity_date, country)
+STORED BY ICEBERG;
+```
+
+### Table Properties Method
+
+Z-ordering can alternatively be specified using table properties.
+
+```sql
+CREATE TABLE table_name (column_definitions)
+STORED BY ICEBERG
+TBLPROPERTIES (
+ 'sort.order' = 'zorder',
+ 'sort.columns' = 'column1,column2'
+);
+```
+
+### Use Cases
+
+Z-order is most effective for:
+
+* Multi-dimensional analytical queries
+* Ad-hoc queries with varying filter patterns
+* Queries filtering on different column combinations
+---
+
+## Comparison with SORT BY
+
+| Feature | WRITE ORDERED BY | SORT BY |
+|---------|------------------|---------|
+| Application | Write time | Query time |
+| Persistence | Permanent in files | Query result only |
+| Scope | Physical file layout | Query execution |
+| Table Support | Iceberg tables | All table types |
+
+---
+
+## Limitations and Considerations
+
+* Write ordering only applies to Iceberg tables
+* Write operations incur ordering overhead:
+ * Regular ordering: Sort cost
+ * Z-order: Sort cost plus z-value computation
+* Column selection should be based on query workload analysis