This is an automated email from the ASF dual-hosted git repository.
adoroszlai pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/ozone.git
The following commit(s) were added to refs/heads/master by this push:
new 19c8136afe HDDS-11962. [Docs] Hive Integration (#7596)
19c8136afe is described below
commit 19c8136afe34683193f9cdeef0b369a54c1ad045
Author: Wei-Chiu Chuang <[email protected]>
AuthorDate: Sat Jan 11 09:27:03 2025 -0800
HDDS-11962. [Docs] Hive Integration (#7596)
---
hadoop-hdds/docs/content/integration/Hive.md | 169 +++++++++++++++++++++++++
hadoop-hdds/docs/content/integration/_index.md | 26 ++++
2 files changed, 195 insertions(+)
diff --git a/hadoop-hdds/docs/content/integration/Hive.md b/hadoop-hdds/docs/content/integration/Hive.md
new file mode 100644
index 0000000000..8b43236d56
--- /dev/null
+++ b/hadoop-hdds/docs/content/integration/Hive.md
@@ -0,0 +1,169 @@
+---
+title: Hive
+weight: 4
+menu:
+ main:
+ parent: "Application Integrations"
+---
+<!---
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+Apache Hive has supported Apache Ozone since Hive 4.0. To enable Hive to work with Ozone paths, ensure that the `ozone-filesystem-hadoop3` JAR is added to the Hive classpath.
+
+## Supported Access Protocols
+
+Hive supports the following protocols for accessing Ozone data:
+
+* ofs
+* o3fs
+* s3a
+
+## Supported Replication Types
+
+Hive is compatible with Ozone buckets configured with either:
+
+* RATIS (Replication)
+* Erasure Coding
+
+## Accessing Ozone Data in Hive
+
+Hive provides two methods to interact with data in Ozone:
+
+* Managed Tables
+* External Tables
+
+### Managed Tables
+#### Configuring the Hive Warehouse Directory in Ozone
+To store managed tables in Ozone, update the following properties in the `hive-site.xml` configuration file:
+
+```xml
+<property>
+ <name>hive.metastore.warehouse.dir</name>
+ <value>ofs://ozone1/vol1/bucket1/warehouse/</value>
+</property>
+```
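
The warehouse value above follows the usual `ofs://<OM service ID>/<volume>/<bucket>/<key path>` layout. As a rough illustration only, the Python sketch below splits such a URI into its parts, which can help when checking that the volume and bucket exist before pointing Hive at them. The helper name `split_ofs_path` is hypothetical and is not part of Ozone or Hive.

```python
from urllib.parse import urlsplit


def split_ofs_path(uri):
    """Split an ofs:// URI into (authority, volume, bucket, key).

    Hypothetical helper for illustration only. It assumes the layout
    ofs://<authority>/<volume>/<bucket>/<key...> and does not contact
    an Ozone cluster.
    """
    parts = urlsplit(uri)
    if parts.scheme != "ofs":
        raise ValueError(f"not an ofs URI: {uri}")
    # Drop empty segments produced by leading or trailing slashes.
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) < 2:
        raise ValueError(f"expected at least a volume and a bucket: {uri}")
    volume, bucket = segments[0], segments[1]
    key = "/".join(segments[2:])
    return parts.netloc, volume, bucket, key


print(split_ofs_path("ofs://ozone1/vol1/bucket1/warehouse/"))
```

With the example value from `hive-site.xml`, the authority is `ozone1`, the volume is `vol1`, the bucket is `bucket1`, and the remaining key path is `warehouse`.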
+
+#### Creating a Managed Table
+You can create a managed table with a standard `CREATE TABLE` statement:
+
+```sql
+CREATE TABLE myTable (
+ id INT,
+ name STRING
+);
+```
+
+#### Loading Data into a Managed Table
+Data can be loaded into a Hive table from an Ozone location:
+
+```sql
+LOAD DATA INPATH 'ofs://ozone1/vol1/bucket1/table.csv' INTO TABLE myTable;
+```
+
+#### Specifying a Custom Ozone Path
+You can define a custom Ozone path for a database using the `MANAGEDLOCATION` clause:
+
+```sql
+CREATE DATABASE d1 MANAGEDLOCATION 'ofs://ozone1/vol1/bucket1/data';
+```
+
+Tables created in the database d1 will be stored under the specified path:
+`ofs://ozone1/vol1/bucket1/data`
+
+#### Verifying the Ozone Path
+You can confirm that Hive references the correct Ozone path using:
+
+```sql
+SHOW CREATE DATABASE d1;
+```
+
+Output Example:
+
+```text
++----------------------------------------------------+
+| createdb_stmt |
++----------------------------------------------------+
+| CREATE DATABASE `d1` |
+| LOCATION |
+| 'ofs://ozone1/vol1/bucket1/external/d1.db' |
+| MANAGEDLOCATION |
+| 'ofs://ozone1/vol1/bucket1/data' |
++----------------------------------------------------+
+```
+
+### External Tables
+
+Hive allows the creation of external tables to query existing data stored in Ozone.
+
+#### Creating an External Table
+```sql
+CREATE EXTERNAL TABLE external_table (
+ id INT,
+ name STRING
+)
+LOCATION 'ofs://ozone1/vol1/bucket1/table1';
+```
+
+* With external tables, the data is expected to be created and managed by another tool.
+* Hive queries the data as-is.
+* Note: Dropping an external table in Hive does not delete the associated data.
+
+To set a default path for external tables, configure the following property in the `hive-site.xml` file:
+```xml
+<property>
+ <name>hive.metastore.warehouse.external.dir</name>
+ <value>ofs://ozone1/vol1/bucket1/external/</value>
+</property>
+```
+This property specifies the base directory for external tables when no explicit `LOCATION` is provided.
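
To make the role of this base directory concrete, the Python sketch below composes the path a table would be expected to land under when `LOCATION` is omitted. It assumes the common Hive convention of `<base>/<table>` for the `default` database and `<base>/<database>.db/<table>` otherwise, which matches the `external/d1.db` database location shown in the `SHOW CREATE DATABASE` output. The function `default_external_location` is hypothetical; confirm the real location with `SHOW CREATE TABLE`.

```python
def default_external_location(base_dir, database, table):
    """Guess the default location of an external table.

    Hypothetical helper for illustration only. Assumes the convention
    <base>/<table> for the default database and
    <base>/<database>.db/<table> for any other database, with
    lower-cased names. Hive remains the source of truth.
    """
    base = base_dir.rstrip("/")
    db = database.lower()
    tbl = table.lower()
    if db == "default":
        return f"{base}/{tbl}"
    return f"{base}/{db}.db/{tbl}"


print(default_external_location(
    "ofs://ozone1/vol1/bucket1/external/", "d1", "external_table"))
```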
+
+#### Verifying the External Table Path
+To confirm the table's metadata and location, use:
+
+```sql
+SHOW CREATE TABLE external_table;
+```
+Output Example:
+
+```text
++----------------------------------------------------+
+| createtab_stmt |
++----------------------------------------------------+
+| CREATE EXTERNAL TABLE `external_table`( |
+| `id` int, |
+| `name` string) |
+| ROW FORMAT SERDE |
+| 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' |
+| STORED AS INPUTFORMAT |
+| 'org.apache.hadoop.mapred.TextInputFormat' |
+| OUTPUTFORMAT |
+| 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' |
+| LOCATION |
+| 'ofs://ozone1/vol1/bucket1/table1' |
+| TBLPROPERTIES ( |
+| 'bucketing_version'='2', |
+| 'transient_lastDdlTime'='1734725573') |
++----------------------------------------------------+
+```
+
+## Using the S3A Protocol
+In addition to ofs, Hive can access Ozone using the S3 Gateway via the S3A file system.
+
+For more information, consult:
+
+* The [S3 Protocol]({{< ref "interface/S3.md">}})
+* The [Hadoop S3A](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html) documentation.
diff --git a/hadoop-hdds/docs/content/integration/_index.md b/hadoop-hdds/docs/content/integration/_index.md
new file mode 100644
index 0000000000..87f6a4825b
--- /dev/null
+++ b/hadoop-hdds/docs/content/integration/_index.md
@@ -0,0 +1,26 @@
+---
+title: "Application Integrations"
+menu:
+ main:
+ weight: 5
+---
+<!---
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+{{<jumbotron>}}
+Many applications can be integrated with Ozone through the Hadoop-compatible ofs interface or the S3 interface.
+{{</jumbotron>}}