This is an automated email from the ASF dual-hosted git repository.
rymurr pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/iceberg.git
The following commit(s) were added to refs/heads/master by this push:
new a6a88e6 Docs: Improve Nessie docs (#2637)
a6a88e6 is described below
commit a6a88e63198b79a1795701504b14f36f580320c4
Author: Eduard Tudenhöfner <[email protected]>
AuthorDate: Wed May 26 15:06:10 2021 +0200
Docs: Improve Nessie docs (#2637)
---
site/docs/nessie.md | 27 ++++++++++++++++++++-------
1 file changed, 20 insertions(+), 7 deletions(-)
diff --git a/site/docs/nessie.md b/site/docs/nessie.md
index c4bfb5b..494df1c 100644
--- a/site/docs/nessie.md
+++ b/site/docs/nessie.md
@@ -18,7 +18,7 @@
# Iceberg Nessie Integration
Iceberg provides integration with Nessie through the `iceberg-nessie` module.
-This section describes how to use Iceberg with Nessie. Nessie provides several key features on top of iceberg:
+This section describes how to use Iceberg with Nessie. Nessie provides several key features on top of Iceberg:
* multi-table transactions
* git-like operations (eg branches, tags, commits)
@@ -30,7 +30,7 @@ See [Project Nessie](https://projectnessie.org) for more information on Nessie.
## Enabling Nessie Catalog
The `iceberg-nessie` module is bundled with Spark and Flink runtimes for all versions from `0.11.0`. To get started
-with nessie and iceberg simply add the iceberg runtime to your process. Eg: `spark-sql --packages
+with Nessie and Iceberg simply add the Iceberg runtime to your process. Eg: `spark-sql --packages
org.apache.iceberg:iceberg-spark3-runtime:{{ versions.iceberg }}`.
## Nessie Catalog
@@ -64,11 +64,24 @@ conf.set("spark.sql.catalog.nessie.ref", "main")
conf.set("spark.sql.catalog.nessie.catalog-impl", "org.apache.iceberg.nessie.NessieCatalog")
conf.set("spark.sql.catalog.nessie", "org.apache.iceberg.spark.SparkCatalog")
```
+This is how it looks in Flink via the Python API (additional details can be found [here](flink.md)):
+```python
+import os
+from pyflink.datastream import StreamExecutionEnvironment
+from pyflink.table import StreamTableEnvironment
+
+env = StreamExecutionEnvironment.get_execution_environment()
+iceberg_flink_runtime_jar = os.path.join(os.getcwd(), "iceberg-flink-runtime-0.11.1.jar")
+env.add_jars("file://{}".format(iceberg_flink_runtime_jar))
+table_env = StreamTableEnvironment.create(env)
+
+table_env.execute_sql("CREATE CATALOG nessie_catalog WITH ('type'='iceberg', 'catalog-impl'='org.apache.iceberg.nessie.NessieCatalog', 'uri'='http://localhost:19120/api/v1', 'ref'='main', 'warehouse'='/path/to/warehouse')")
+```
There is nothing special above about the `nessie` name. A spark catalog can have any name, the important parts are the
settings for the `catalog-impl` and the required config to start Nessie correctly.
Once you have a Nessie catalog you have access to your entire Nessie repo. You can then perform create/delete/merge
-operations on branches and perform commits on branches. Each iceberg table in a Nessie Catalog is identified by an
+operations on branches and perform commits on branches. Each Iceberg table in a Nessie Catalog is identified by an
arbitrary length namespace and table name (eg `data.base.name.table`). These namespaces are implicit and don't need to
be created separately. Any transaction on a Nessie enabled Iceberg table is a single commit in Nessie. Nessie commits
can encompass an arbitrary number of actions on an arbitrary number of tables, however in Iceberg this will be limited
@@ -82,8 +95,8 @@ Nessie functionality.
## Nessie and Iceberg
For most cases Nessie acts just like any other Catalog for Iceberg: providing a logical organization of a set of tables
-and providing atomicity to transactions. However using Nessie opens up other interesting possibilities. When using Nessie with
-iceberg every iceberg transaction becomes a nessie commit. This history can be listed, merged or cherry-picked across branches.
+and providing atomicity to transactions. However, using Nessie opens up other interesting possibilities. When using Nessie with
+Iceberg every Iceberg transaction becomes a Nessie commit. This history can be listed, merged or cherry-picked across branches.
### Loosely coupled transactions
@@ -117,8 +130,8 @@ Nessie features.
## Example
-Please see [Nessie Iceberg Demo](https://github.com/projectnessie/nessie/blob/main/python/demo/nessie-iceberg-demo.ipynb)
-for a complete example of Nessie and Iceberg in action together.
+Please have a look at the [Nessie Demos repo](https://github.com/projectnessie/nessie-demos)
+for different examples of Nessie and Iceberg in action together.
## Future Improvements