This is an automated email from the ASF dual-hosted git repository.
blue pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/iceberg.git
The following commit(s) were added to refs/heads/master by this push:
new 953a7fd Docs: Update Hive read docs for HiveCatalog (#1748)
953a7fd is described below
commit 953a7fdd7f79ef86c7ad7d063efdbe32901da50e
Author: Adrian Woodhead <[email protected]>
AuthorDate: Fri Nov 20 01:10:36 2020 +0000
Docs: Update Hive read docs for HiveCatalog (#1748)
---
site/docs/hive.md | 34 ++++++++++++++++++++++++++++++++++
1 file changed, 34 insertions(+)
diff --git a/site/docs/hive.md b/site/docs/hive.md
index d43d33a..e7977b9 100644
--- a/site/docs/hive.md
+++ b/site/docs/hive.md
@@ -50,6 +50,40 @@ You should now be able to issue Hive SQL `SELECT` queries using the above table
SELECT * from table_a;
```
+#### Using Hive Catalog
+Iceberg tables created using `HiveCatalog` are automatically registered with Hive.
+
+##### Create an Iceberg table
+The first step is to create an Iceberg table using the Spark/Java/Python API and `HiveCatalog`. For the purposes of this documentation we will assume that the table is called `table_b` and that its location is `s3://some_path/table_b`. In order for Iceberg to correctly set up the Hive table for querying, some configuration values need to be set. The two options for this are described below; you can use either one depending on your use case.
+
+##### Hive Configuration
+The value `iceberg.engine.hive.enabled` needs to be set to `true` and added to the Hive configuration file on the classpath of the application creating the table. This can be done by modifying the relevant `hive-site.xml`. Alternatively, this can be done programmatically like so:
+```java
+Configuration hadoopConfiguration = spark.sparkContext().hadoopConfiguration();
+hadoopConfiguration.set(ConfigProperties.ENGINE_HIVE_ENABLED, "true"); // iceberg.engine.hive.enabled=true
+HiveCatalog catalog = new HiveCatalog(hadoopConfiguration);
+...
+catalog.createTable(tableId, schema, spec);
+```
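+
+For the file-based approach mentioned above, the corresponding `hive-site.xml` entry could be sketched as follows (an illustrative fragment; where it sits within your existing configuration file depends on your deployment):
+```xml
+<property>
+  <name>iceberg.engine.hive.enabled</name>
+  <value>true</value>
+</property>
+```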
+
+##### Table Property Configuration
+The property `engine.hive.enabled` needs to be set to `true` and added to the table properties when creating the Iceberg table. This can be done like so:
+```java
+Map<String, String> tableProperties = new HashMap<String, String>();
+tableProperties.put(TableProperties.ENGINE_HIVE_ENABLED, "true"); // engine.hive.enabled=true
+catalog.createTable(tableId, schema, spec, tableProperties);
+```
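+
+Note that the constant `TableProperties.ENGINE_HIVE_ENABLED` resolves to the literal key `engine.hive.enabled`, so the properties map can be built with the standard library alone (a standalone sketch; the class name is hypothetical):
+```java
+import java.util.HashMap;
+import java.util.Map;
+
+public class TablePropsExample {
+    public static void main(String[] args) {
+        // "engine.hive.enabled" is the literal value of TableProperties.ENGINE_HIVE_ENABLED
+        Map<String, String> tableProperties = new HashMap<>();
+        tableProperties.put("engine.hive.enabled", "true");
+        System.out.println(tableProperties.get("engine.hive.enabled")); // prints "true"
+    }
+}
+```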
+
+#### Query the Iceberg table via Hive
+To query a Hive table created by either of the `HiveCatalog` methods described above, you first need to set a Hive configuration value like so:
+```sql
+SET iceberg.mr.catalog=hive;
+```
+You should now be able to issue Hive SQL `SELECT` queries using the above table and see the results returned from the underlying Iceberg table. Both the MapReduce and Tez query execution engines are supported.
+```sql
+SELECT * from table_b;
+```
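+
+To confirm that the table was registered with Hive, you can also inspect its metadata (standard Hive SQL; the exact output depends on your metastore and deployment):
+```sql
+DESCRIBE FORMATTED table_b;
+```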
+
### Features
#### Predicate pushdown