This is an automated email from the ASF dual-hosted git repository.

yihua pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new fce7283372 [HUDI-4564] Update docs for Spark 3.3 support (#6359)
fce7283372 is described below

commit fce72833727b6e594657ff9a3fac432dffdad324
Author: Y Ethan Guo <[email protected]>
AuthorDate: Fri Aug 12 10:49:04 2022 -0700

    [HUDI-4564] Update docs for Spark 3.3 support (#6359)
---
 website/docs/quick-start-guide.md | 48 ++++++++++++++++++++++++++++++---------
 1 file changed, 37 insertions(+), 11 deletions(-)

diff --git a/website/docs/quick-start-guide.md b/website/docs/quick-start-guide.md
index acd51ff8cc..b33aeb59f9 100644
--- a/website/docs/quick-start-guide.md
+++ b/website/docs/quick-start-guide.md
@@ -20,6 +20,7 @@ Hudi works with Spark-2.4.3+ & Spark 3.x versions. You can follow instructions [
 
 | Hudi            | Supported Spark 3 version                       |
 |:----------------|:------------------------------------------------|
+| 0.12.x          | 3.3.x (default build), 3.2.x, 3.1.x             |
 | 0.11.x          | 3.2.x (default build, Spark bundle only), 3.1.x |
 | 0.10.x          | 3.1.x (default build), 3.0.x                    |
 | 0.7.0 - 0.9.0   | 3.0.x                                           |
@@ -28,6 +29,7 @@ Hudi works with Spark-2.4.3+ & Spark 3.x versions. You can follow instructions [
 The *default build* Spark version is the one used to build the `hudi-spark3-bundle`.
 
 :::note
+In 0.12.0, we introduce experimental support for Spark 3.3.0.
 In 0.11.0, there are changes to how Spark bundles are used; please refer
 to the [0.11.0 release notes](https://hudi.apache.org/releases/release-0.11.0/#spark-versions-and-bundles) for detailed
 instructions.
@@ -45,10 +47,18 @@ values={[
 
 From the extracted directory run spark-shell with Hudi:
 
+```shell
+# Spark 3.3
+spark-shell \
+  --packages org.apache.hudi:hudi-spark3.3-bundle_2.12:0.12.0 \
+  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
+  --conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog' \
+  --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'
+```
 ```shell
 # Spark 3.2
 spark-shell \
-  --packages org.apache.hudi:hudi-spark3.2-bundle_2.12:0.11.1 \
+  --packages org.apache.hudi:hudi-spark3.2-bundle_2.12:0.12.0 \
   --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
   --conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog' \
   --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'
@@ -56,13 +66,13 @@ spark-shell \
 ```shell
 # Spark 3.1
 spark-shell \
-  --packages org.apache.hudi:hudi-spark3.1-bundle_2.12:0.11.1 \
+  --packages org.apache.hudi:hudi-spark3.1-bundle_2.12:0.12.0 \
   --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'
 ```
 ```shell
 # Spark 2.4
 spark-shell \
-  --packages org.apache.hudi:hudi-spark2.4-bundle_2.11:0.11.1 \
+  --packages org.apache.hudi:hudi-spark2.4-bundle_2.11:0.12.0 \
   --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'
 ```
 </TabItem>
@@ -71,11 +81,20 @@ spark-shell \
 
 From the extracted directory run pyspark with Hudi:
 
+```shell
+# Spark 3.3
+export PYSPARK_PYTHON=$(which python3)
+pyspark \
+--packages org.apache.hudi:hudi-spark3.3-bundle_2.12:0.12.0 \
+--conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
+--conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog' \
+--conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'
+```
 ```shell
 # Spark 3.2
 export PYSPARK_PYTHON=$(which python3)
 pyspark \
---packages org.apache.hudi:hudi-spark3.2-bundle_2.12:0.11.1 \
+--packages org.apache.hudi:hudi-spark3.2-bundle_2.12:0.12.0 \
 --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
 --conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog' \
 --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'
@@ -84,14 +103,14 @@ pyspark \
 # Spark 3.1
 export PYSPARK_PYTHON=$(which python3)
 pyspark \
---packages org.apache.hudi:hudi-spark3.1-bundle_2.12:0.11.1 \
+--packages org.apache.hudi:hudi-spark3.1-bundle_2.12:0.12.0 \
 --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'
 ```
 ```shell
 # Spark 2.4
 export PYSPARK_PYTHON=$(which python3)
 pyspark \
---packages org.apache.hudi:hudi-spark2.4-bundle_2.11:0.11.1 \
+--packages org.apache.hudi:hudi-spark2.4-bundle_2.11:0.12.0 \
 --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'
 ```
 </TabItem>
@@ -101,22 +120,29 @@ pyspark \
 Hudi supports using Spark SQL to write and read data with the **HoodieSparkSessionExtension** SQL extension.
 From the extracted directory run Spark SQL with Hudi:
 
+```shell
+# Spark 3.3
+spark-sql --packages org.apache.hudi:hudi-spark3.3-bundle_2.12:0.12.0 \
+--conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
+--conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension' \
+--conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog'
+```
 ```shell
 # Spark 3.2
-spark-sql --packages org.apache.hudi:hudi-spark3.2-bundle_2.12:0.11.1 \
+spark-sql --packages org.apache.hudi:hudi-spark3.2-bundle_2.12:0.12.0 \
 --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
 --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension' \
 --conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog'
 ```
 ```shell
 # Spark 3.1
-spark-sql --packages org.apache.hudi:hudi-spark3.1-bundle_2.12:0.11.1 \
+spark-sql --packages org.apache.hudi:hudi-spark3.1-bundle_2.12:0.12.0 \
 --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
 --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'
 ```
 ```shell
 # Spark 2.4
-spark-sql --packages org.apache.hudi:hudi-spark2.4-bundle_2.11:0.11.1 \
+spark-sql --packages org.apache.hudi:hudi-spark2.4-bundle_2.11:0.12.0 \
 --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
 --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'
 ```
@@ -128,7 +154,7 @@ spark-sql --packages org.apache.hudi:hudi-spark2.4-bundle_2.11:0.11.1 \
 
 :::note Please note the following
 <ul>
-  <li> For Spark 3.2, the additional spark_catalog config is required:
+  <li> For Spark 3.2 and above, the additional spark_catalog config is required:
 --conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog' </li>
   <li> We have used hudi-spark-bundle built for scala 2.12 since the spark-avro module used can also depend on 2.12. </li>
 </ul>
@@ -1206,7 +1232,7 @@ more details please refer to [procedures](procedures).
 
 You can also do the quickstart by [building hudi yourself](https://github.com/apache/hudi#building-apache-hudi-from-source),
 and using `--jars <path to hudi_code>/packaging/hudi-spark-bundle/target/hudi-spark3.2-bundle_2.1?-*.*.*-SNAPSHOT.jar` in the spark-shell command above
-instead of `--packages org.apache.hudi:hudi-spark3.2-bundle_2.12:0.11.1`. Hudi also supports scala 2.12. Refer [build with scala 2.12](https://github.com/apache/hudi#build-with-different-spark-versions)
+instead of `--packages org.apache.hudi:hudi-spark3.2-bundle_2.12:0.12.0`. Hudi also supports scala 2.12. Refer [build with scala 2.12](https://github.com/apache/hudi#build-with-different-spark-versions)
 for more info.
 
 Also, we used Spark here to showcase the capabilities of Hudi. However, Hudi can support multiple table types/query types and
