(iceberg) branch main updated: Doc: Remove Hive 2.x/3.x related docs in hive.md (#12700)

pvary Wed, 09 Apr 2025 00:30:06 -0700

This is an automated email from the ASF dual-hosted git repository.

pvary pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/iceberg.git



The following commit(s) were added to refs/heads/main by this push:
     new 665baa5ac2 Doc: Remove Hive 2.x/3.x related docs in hive.md (#12700)
665baa5ac2 is described below

commit 665baa5ac23f7743642ff51d6b7594952e60198e
Author: jackylee <[email protected]>
AuthorDate: Wed Apr 9 15:29:24 2025 +0800

    Doc: Remove Hive 2.x/3.x related docs in hive.md (#12700)
---
 docs/docs/hive.md                 | 94 +++++++++------------------------------
 site/docs/multi-engine-support.md |  8 ++--
 2 files changed, 24 insertions(+), 78 deletions(-)

diff --git a/docs/docs/hive.md b/docs/docs/hive.md
index 5810feffaf..3abf8d89ce 100644
--- a/docs/docs/hive.md
+++ b/docs/docs/hive.md
@@ -24,41 +24,20 @@ Iceberg supports reading and writing Iceberg tables through 
[Hive](https://hive.
 a 
[StorageHandler](https://cwiki.apache.org/confluence/display/Hive/StorageHandlers).
 
 ## Feature support
-The following features matrix illustrates the support for different features 
across Hive releases for Iceberg tables - 
-
-| Feature support                                                 | Hive 2 / 3 | Hive 4 |
-|-----------------------------------------------------------------|------------|--------|
-| [SQL create table](#create-table)                               | ✔️          | ✔️      |
-| [SQL create table as select (CTAS)](#create-table-as-select)    | ✔️          | ✔️      |
-| [SQL create table like table (CTLT)](#create-table-like-table)  | ✔️          | ✔️      |
-| [SQL drop table](#drop-table)                                   | ✔️          | ✔️      |
-| [SQL insert into](#insert-into)                                 | ✔️          | ✔️      |
-| [SQL insert overwrite](#insert-overwrite)                       | ✔️          | ✔️      |
-| [SQL delete from](#delete-from)                                 |            | ✔️      |
-| [SQL update](#update)                                           |            | ✔️      |
-| [SQL merge into](#merge-into)                                   |            | ✔️      |
-| [Branches and tags](#branches-and-tags)                         |            | ✔️      |
-
-Iceberg compatibility with Hive 2.x and Hive 3.1.2/3 supports the following 
features:
-
-* Creating a table
-* Dropping a table
-* Reading a table
-* Inserting into a table (INSERT INTO)
 
-!!! warning
-    DML operations work only with MapReduce execution engine.
-
-Hive supports the following additional features with Hive version 4.0.0 and 
above:
+Hive supports the following features with Hive version 4.0.0 and above:
 
-* Creating an Iceberg identity-partitioned table
-* Creating an Iceberg table with any partition spec, including the various 
transforms supported by Iceberg
-* Creating a table from an existing table (CTAS table)
-* Altering a table while keeping Iceberg and Hive schemas in sync
-* Altering the partition schema (updating columns)
-* Altering the partition schema by specifying partition transforms
+* Creating an Iceberg table.
+* Creating an Iceberg identity-partitioned table.
+* Creating an Iceberg table with any partition spec, including the various 
transforms supported by Iceberg.
+* Creating a table from an existing table (CTAS table).
+* Dropping a table.
+* Altering a table while keeping Iceberg and Hive schemas in sync.
+* Altering the partition schema (updating columns).
+* Altering the partition schema by specifying partition transforms.
 * Truncating a table / partition, dropping a partition.
-* Migrating tables in Avro, Parquet, or ORC (Non-ACID) format to Iceberg
+* Migrating tables in Avro, Parquet, or ORC (Non-ACID) format to Iceberg.
+* Reading an Iceberg table.
 * Reading the schema of a table.
 * Querying Iceberg metadata tables.
 * Time travel applications.
@@ -66,11 +45,11 @@ Hive supports the following additional features with Hive 
version 4.0.0 and abov
 * Inserting data overwriting existing data (INSERT OVERWRITE) in a table / 
partition.
 * Copy-on-write support for delete, update and merge queries, CRUD support for 
Iceberg V1 tables.
 * Altering a table with expiring snapshots.
-* Create a table like an existing table (CTLT table)
-* Support adding parquet compression type via Table properties [Compression 
types](https://spark.apache.org/docs/2.4.3/sql-data-sources-parquet.html#configuration)
+* Create a table like an existing table (CTLT table).
+* Support adding parquet compression type via Table properties [Compression 
types](https://spark.apache.org/docs/2.4.3/sql-data-sources-parquet.html#configuration).
 * Altering a table metadata location.
 * Supporting table rollback.
-* Honors sort orders on existing tables when writing a table [Sort orders 
specification](../../spec.md#sort-orders)
+* Honors sort orders on existing tables when writing a table [Sort orders 
specification](../../spec.md#sort-orders).
 * Creating, writing to and dropping an Iceberg branch / tag.
 * Allowing expire snapshots by Snapshot ID, by time range, by retention of 
last N snapshots and using table properties.
 * Set current snapshot using snapshot ID for an Iceberg table.
@@ -89,29 +68,14 @@ Hive supports the following additional features with Hive 
version 4.0.0 and abov
 
 ## Enabling Iceberg support in Hive
 
-Hive 4 comes with `hive-iceberg` that ships Iceberg, so no additional 
downloads or jars are needed. For older versions of Hive a runtime jar has to 
be added.
+Starting from 1.8.0 Iceberg doesn't release Hive runtime connector. For Hive 
query engine integration (specifically
+with Hive 2.x and 3.x) use Hive runtime connector coming with Iceberg 1.6.1, 
or use Hive 4.0.0 or later
+which is released with embedded Iceberg integration.
 
 ### Hive 4.0.x
 
 Hive 4.0.x comes with Iceberg 1.4.3 included.
 
-### Hive 2.3.x, Hive 3.1.x
-
-In order to use Hive 2.3.x or Hive 3.1.x, you must load the Iceberg-Hive 
runtime jar and enable Iceberg support, either globally or for an individual 
table using a table property.
-
-#### Loading runtime jar
-
-To enable Iceberg support in Hive, the `HiveIcebergStorageHandler` and 
supporting classes need to be made available on
-Hive's classpath. These are provided by the `iceberg-hive-runtime` jar file. 
For example, if using the Hive shell, this
-can be achieved by issuing a statement like so:
-
-```
-add jar /path/to/iceberg-hive-runtime.jar;
-```
-
-There are many others ways to achieve this including adding the jar file to 
Hive's auxiliary classpath so it is
-available by default. Please refer to Hive's documentation for more 
information.
-
 #### Enabling support
 
 If the Iceberg storage handler is not in Hive's classpath, then Hive cannot 
load or update the metadata for an Iceberg
@@ -126,9 +90,6 @@ To enable Hive support globally for an application, set 
`iceberg.engine.hive.ena
 For example, setting this in the `hive-site.xml` loaded by Spark will enable 
the storage handler for all tables created
 by Spark.
 
-!!! danger
-    Starting with Apache Iceberg `0.11.0`, when using Hive with Tez you also 
have to disable vectorization (`hive.vectorized.execution.enabled=false`).
-
 ##### Table property configuration
 
 Alternatively, the property `engine.hive.enabled` can be set to `true` and 
added to the table properties when creating
@@ -143,19 +104,12 @@ Catalog catalog=...;
 
 The table level configuration overrides the global Hadoop configuration.
 
-##### Hive on Tez configuration
-
-To use the Tez engine on Hive `3.1.2` or later, Tez needs to be upgraded to >= 
`0.10.1` which contains a necessary fix 
[TEZ-4248](https://issues.apache.org/jira/browse/TEZ-4248).
-
-To use the Tez engine on Hive `2.3.x`, you will need to manually build Tez 
from the `branch-0.9` branch due to a
-backwards incompatibility issue with Tez `0.10.1`.
-
-In both cases, you will also need to set the following property in the 
`tez-site.xml` configuration file: 
`tez.mrreader.config.update.properties=hive.io.file.readcolumn.names,hive.io.file.readcolumn.ids`.
-
 ## Catalog Management
 
 ### Global Hive catalog
 
+HiveCatalog integration supports Hive 2.3.10 or 3.1.3 or later.
+
 From the Hive engine's perspective, there is only one global data catalog that 
is defined in the Hadoop configuration in
 the runtime environment. In contrast, Iceberg supports multiple different data 
catalog types such as Hive, Hadoop, AWS
 Glue, or custom catalog implementations. Iceberg also allows loading a table 
directly based on its path in the file
@@ -213,12 +167,6 @@ SET iceberg.catalog.glue.lock.table=myGlueLockTable;
 
 ## DDL Commands
 
-Not all the features below are supported with Hive 2.3.x and Hive 3.1.x. 
Please refer to the
-[Feature support](#feature-support) paragraph for further details.
-
-One generally applicable difference is that Hive 4 provides the possibility to 
use
-`STORED BY ICEBERG` instead of the old `STORED BY 
'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'`
-
 ### CREATE TABLE
 
 #### Non partitioned tables
@@ -601,9 +549,7 @@ Here are the features highlights for Iceberg Hive read 
support:
 
 1. **Predicate pushdown**: Pushdown of the Hive SQL `WHERE` clause has been 
implemented so that these filters are used at the Iceberg `TableScan` level as 
well as by the Parquet and ORC Readers.
 2. **Column projection**: Columns from the Hive SQL `SELECT` clause are 
projected down to the Iceberg readers to reduce the number of columns read.
-3. **Hive query engines**:
-   - With Hive 2.3.x, 3.1.x, both the MapReduce and Tez query execution 
engines are supported.
-   - With Hive 4.x, the Tez query execution engine is supported.
+3. **Hive query engines**: With Hive 4.x, the Tez query execution engine is 
supported.
 
 Some of the advanced / little used optimizations are not yet implemented for 
Iceberg tables, so you should check your individual queries.
 Also currently the statistics stored in the MetaStore are used for query 
planning. This is something we are planning to improve in the future.
diff --git a/site/docs/multi-engine-support.md 
b/site/docs/multi-engine-support.md
index e791be4226..be3bc02a7c 100644
--- a/site/docs/multi-engine-support.md
+++ b/site/docs/multi-engine-support.md
@@ -103,10 +103,10 @@ Users should continuously upgrade their Flink version to 
stay up-to-date.
 
 <!-- markdown-link-check-disable -->
 
-| Version        | Recommended minor version | Lifecycle Stage   | Initial Iceberg Support | Latest Iceberg Support | Latest Runtime Jar |
-| -------------- | ------------------------- | ----------------- | ----------------------- | ---------------------- | ------------------ |
-| 2              | 2.3.8                     | Deprecated        | 0.8.0-incubating        | 1.7.2                  | [iceberg-hive-runtime](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-hive-runtime/1.7.2/iceberg-hive-runtime-1.7.2.jar) |
-| 3              | 3.1.2                     | Deprecated        | 0.10.0                  | 1.7.2                  | [iceberg-hive-runtime](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-hive-runtime/1.7.2/iceberg-hive-runtime-1.7.2.jar) |
+| Version        | Recommended minor version | Lifecycle Stage   | Initial Iceberg Support | Latest Iceberg Support | Latest Runtime Jar                                                                                                                                   |
+| -------------- | ------------------------- | ----------------- | ----------------------- |------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------|
+| 2              | 2.3.8                     | Deprecated        | 0.8.0-incubating        | 1.6.1                  | [iceberg-hive-runtime](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-hive-runtime/1.6.1/iceberg-hive-runtime-1.6.1.jar) |
+| 3              | 3.1.2                     | Deprecated        | 0.10.0                  | 1.6.1                  | [iceberg-hive-runtime](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-hive-runtime/1.6.1/iceberg-hive-runtime-1.6.1.jar) |
 
 <!-- markdown-link-check-enable -->
