This is an automated email from the ASF dual-hosted git repository.
sivabalan pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 8cac243d150 [DOCS] Add documentation around downgrading hudi table
(#9704)
8cac243d150 is described below
commit 8cac243d1502da2563230acec337c055fed88a95
Author: Lokesh Jain <[email protected]>
AuthorDate: Fri Sep 15 04:40:03 2023 +0530
[DOCS] Add documentation around downgrading hudi table (#9704)
* [DOCS] Add documentation around downgrading hudi table
* Address review comments
---
website/docs/deployment.md | 82 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 82 insertions(+)
diff --git a/website/docs/deployment.md b/website/docs/deployment.md
index 088f99834ad..52f502ffb79 100644
--- a/website/docs/deployment.md
+++ b/website/docs/deployment.md
@@ -11,6 +11,7 @@ Specifically, we will cover the following aspects.
 - [Deployment Model](#deploying) : How various Hudi components are deployed and managed.
 - [Upgrading Versions](#upgrading) : Picking up new releases of Hudi, guidelines and general best-practices.
+ - [Downgrading Versions](#downgrading) : Reverting back to an older version of Hudi.
 - [Migrating to Hudi](#migrating) : How to migrate your existing tables to Apache Hudi.
## Deploying
@@ -167,6 +168,87 @@ As general guidelines,
Note that release notes can override this information with specific instructions, applicable on case-by-case basis.
+## Downgrading
+
+Upgrade is automatic whenever a new Hudi version is used, whereas downgrade is a manual step. We need to use the Hudi
+CLI to downgrade a table from a higher version to a lower version. Let's consider an example where we create a table
+using 0.11.0, upgrade it to 0.13.1 and then downgrade it via the Hudi CLI.
+
+Launch the Spark shell with Hudi version 0.11.0.
+```shell
+spark-shell \
+ --packages org.apache.hudi:hudi-spark3.2-bundle_2.12:0.11.0 \
+ --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
+ --conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog' \
+ --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'
+```
+
+Create a Hudi table using the Scala script below.
+```scala
+import org.apache.hudi.QuickstartUtils._
+import scala.collection.JavaConversions._
+import org.apache.spark.sql.SaveMode._
+import org.apache.hudi.DataSourceReadOptions._
+import org.apache.hudi.DataSourceWriteOptions._
+import org.apache.hudi.config.HoodieWriteConfig._
+import org.apache.hudi.common.model.HoodieRecord
+import org.apache.hudi.common.table.timeline.HoodieTimeline
+import org.apache.hudi.common.fs.FSUtils
+import org.apache.hudi.HoodieDataSourceHelpers
+
+val dataGen = new DataGenerator
+val tableType = MOR_TABLE_TYPE_OPT_VAL
+val basePath = "file:///tmp/hudi_table"
+val tableName = "hudi_table"
+
+val inserts = convertToStringList(dataGen.generateInserts(100)).toList
+val insertDf = spark.read.json(spark.sparkContext.parallelize(inserts, 2))
+insertDf.write.format("hudi").
+ options(getQuickstartWriteConfigs).
+ option(PRECOMBINE_FIELD_OPT_KEY, "ts").
+ option(RECORDKEY_FIELD_OPT_KEY, "uuid").
+ option(PARTITIONPATH_FIELD_OPT_KEY, "partitionpath").
+ option(TABLE_NAME, tableName).
+ option(OPERATION.key(), INSERT_OPERATION_OPT_VAL).
+ mode(Append).
+ save(basePath)
+```
+
+You will see an entry for the table version in hoodie.properties, which states that the table version is 4.
+```shell
+bash$ cat /tmp/hudi_table/.hoodie/hoodie.properties | grep hoodie.table.version
+hoodie.table.version=4
+```
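+
+Alternatively, the table version can be inspected from the same Spark shell. This is only a sketch, assuming the `HoodieTableMetaClient` builder API available in these Hudi versions:
+```scala
+import org.apache.hudi.common.table.HoodieTableMetaClient
+
+// Build a meta client against the table's base path and print the configured table version
+val metaClient = HoodieTableMetaClient.builder()
+  .setConf(spark.sparkContext.hadoopConfiguration)
+  .setBasePath(basePath)
+  .build()
+println(metaClient.getTableConfig.getTableVersion)
+```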
+
+Launch a new Spark shell using version 0.13.1 and append to the same table using the script above. Note that the
+upgrade happens automatically with the new version.
+```shell
+spark-shell \
+ --packages org.apache.hudi:hudi-spark3.2-bundle_2.12:0.13.1 \
+ --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
+ --conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog' \
+ --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'
+```
+
+After upgrade, the table version is updated to 5.
+```shell
+bash$ cat /tmp/hudi_table/.hoodie/hoodie.properties | grep hoodie.table.version
+hoodie.table.version=5
+```
+
+Let's try downgrading the table back to version 4. For downgrading, we need to use the Hudi CLI and execute the
+downgrade command. For more details on downgrade, please refer to the documentation
+[here](cli#upgrade-and-downgrade-table).
+```shell
+connect --path /tmp/hudi_table
+downgrade table --toVersion 4
+```
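+
+Note that the `connect` and `downgrade` commands above are entered inside the Hudi CLI shell, not bash. As a rough sketch (the launcher script location is an assumption and depends on how Hudi is installed or built), a session could look like:
+```shell
+# Start the Hudi CLI (the script ships with the Hudi source/binary distribution)
+bash$ ./hudi-cli/hudi-cli.sh
+# Inside the CLI prompt, connect to the table and downgrade it
+hudi-> connect --path /tmp/hudi_table
+hudi-> downgrade table --toVersion 4
+```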
+
+After downgrade, the table version is updated to 4.
+```shell
+bash$ cat /tmp/hudi_table/.hoodie/hoodie.properties | grep hoodie.table.version
+hoodie.table.version=4
+```
+
## Migrating
Currently migrating to Hudi can be done using two approaches