This is an automated email from the ASF dual-hosted git repository.
yihua pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 8a236d6fd24 [MINOR][DOCS] Updating Hudi table services documentation (#13345)
8a236d6fd24 is described below
commit 8a236d6fd24ba6c0126a305d8b1b3b44d39d05f6
Author: deepakpanda93 <[email protected]>
AuthorDate: Sat May 24 07:24:12 2025 +0530
[MINOR][DOCS] Updating Hudi table services documentation (#13345)
Co-authored-by: Deepak Panda <[email protected]>
---
website/docs/cleaning.md | 12 ++++++------
website/docs/clustering.md | 8 ++++----
website/docs/compaction.md | 6 +++---
website/docs/metadata_indexing.md | 18 +++++++++---------
website/versioned_docs/version-1.0.2/cleaning.md | 12 ++++++------
website/versioned_docs/version-1.0.2/clustering.md | 8 ++++----
website/versioned_docs/version-1.0.2/compaction.md | 6 +++---
.../versioned_docs/version-1.0.2/metadata_indexing.md | 18 +++++++++---------
8 files changed, 44 insertions(+), 44 deletions(-)
diff --git a/website/docs/cleaning.md b/website/docs/cleaning.md
index 97b99b0ecce..65a5b7aa779 100644
--- a/website/docs/cleaning.md
+++ b/website/docs/cleaning.md
@@ -80,7 +80,7 @@ For Flink based writing, this is the default mode of cleaning. Please refer to [
 Hoodie Cleaner can also be run as a separate process. Following is the command for running the cleaner independently:
 ```
 spark-submit --master local \
-  --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.1,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.1 \
+  --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.2,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.2 \
   --class org.apache.hudi.utilities.HoodieCleaner `ls packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle-*.jar` --help
 Usage: <main class> [options]
 Options:
@@ -104,7 +104,7 @@ Some examples to run the cleaner.
 Keep the latest 10 commits
 ```
 spark-submit --master local \
-  --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.1,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.1 \
+  --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.2,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.2 \
   --class org.apache.hudi.utilities.HoodieCleaner `ls packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle-*.jar` \
   --target-base-path /path/to/hoodie_table \
   --hoodie-conf hoodie.cleaner.policy=KEEP_LATEST_COMMITS \
@@ -114,7 +114,7 @@ spark-submit --master local \
 Keep the latest 3 file versions
 ```
 spark-submit --master local \
-  --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.1,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.1 \
+  --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.2,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.2 \
   --class org.apache.hudi.utilities.HoodieCleaner `ls packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle-*.jar` \
   --hoodie-conf hoodie.cleaner.policy=KEEP_LATEST_FILE_VERSIONS \
   --hoodie-conf hoodie.cleaner.fileversions.retained=3 \
@@ -123,7 +123,7 @@ spark-submit --master local \
 Clean commits older than 24 hours
 ```
 spark-submit --master local \
-  --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.1,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.1 \
+  --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.2,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.2 \
   --class org.apache.hudi.utilities.HoodieCleaner `ls packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle-*.jar` \
   --target-base-path /path/to/hoodie_table \
   --hoodie-conf hoodie.cleaner.policy=KEEP_LATEST_BY_HOURS \
@@ -142,7 +142,7 @@ CLI provides the below commands for cleaner service:
 Example of cleaner keeping the latest 10 commits
 ```
-cleans run --sparkMaster local --hoodieConfigs hoodie.cleaner.policy=KEEP_LATEST_COMMITS hoodie.cleaner.commits.retained=3 hoodie.cleaner.parallelism=200
+cleans run --sparkMaster local --hoodieConfigs hoodie.cleaner.policy=KEEP_LATEST_COMMITS hoodie.cleaner.commits.retained=10 hoodie.cleaner.parallelism=200
 ```
 You can find more details and the relevant code for these commands in [`org.apache.hudi.cli.commands.CleansCommand`](https://github.com/apache/hudi/blob/master/hudi-cli/src/main/java/org/apache/hudi/cli/commands/CleansCommand.java) class.
@@ -156,4 +156,4 @@ You can find more details and the relevant code for these commands in [`org.apac
 * [Cleaner Service: Save up to 40% on data lake storage costs | Hudi Labs](https://youtu.be/mUvRhJDoO3w)
 * [Efficient Data Lake Management with Apache Hudi Cleaner: Benefits of Scheduling Data Cleaning #1](https://www.youtube.com/watch?v=CEzgFtmVjx4)
-* [Efficient Data Lake Management with Apache Hudi Cleaner: Benefits of Scheduling Data Cleaning #2](https://www.youtube.com/watch?v=RbBF9Ys2GqM)
\ No newline at end of file
+* [Efficient Data Lake Management with Apache Hudi Cleaner: Benefits of Scheduling Data Cleaning #2](https://www.youtube.com/watch?v=RbBF9Ys2GqM)
diff --git a/website/docs/clustering.md b/website/docs/clustering.md
index 75ecd2af3ba..5676289c6e4 100644
--- a/website/docs/clustering.md
+++ b/website/docs/clustering.md
@@ -243,9 +243,9 @@ A sample spark-submit command to setup HoodieClusteringJob is as below:
 ```bash
 spark-submit \
---jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.1.jar" \
+--jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.2.jar" \
 --class org.apache.hudi.utilities.HoodieClusteringJob \
-/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar \
+/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar \
 --props /path/to/config/clusteringjob.properties \
 --mode scheduleAndExecute \
 --base-path /path/to/hudi_table/basePath \
@@ -273,9 +273,9 @@ A sample spark-submit command to setup HoodieStreamer is as below:
 ```bash
 spark-submit \
---jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.1.jar" \
+--jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.2.jar" \
 --class org.apache.hudi.utilities.streamer.HoodieStreamer \
-/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar \
+/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar \
 --props /path/to/config/clustering_kafka.properties \
 --schemaprovider-class org.apache.hudi.utilities.schema.SchemaRegistryProvider \
 --source-class org.apache.hudi.utilities.sources.AvroKafkaSource \
diff --git a/website/docs/compaction.md b/website/docs/compaction.md
index 6025d89916b..6af8e6361c1 100644
--- a/website/docs/compaction.md
+++ b/website/docs/compaction.md
@@ -150,7 +150,7 @@ ingests data to Hudi table continuously from upstream sources. In this mode, Hud
 compactions. Here is an example snippet for running in continuous mode with async compactions
 ```properties
-spark-submit --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.1,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.1 \
+spark-submit --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.2,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.2 \
 --class org.apache.hudi.utilities.streamer.HoodieStreamer \
 --table-type MERGE_ON_READ \
 --target-base-path <hudi_base_path> \
@@ -187,7 +187,7 @@ The compactor utility allows to do scheduling and execution of compaction.
 Example:
 ```properties
-spark-submit --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.1,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.1 \
+spark-submit --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.2,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.2 \
 --class org.apache.hudi.utilities.HoodieCompactor \
 --base-path <base_path> \
 --table-name <table_name> \
@@ -231,4 +231,4 @@ Offline compaction needs to submit the Flink task on the command line. The progr
 <h3>Blogs</h3>
 [Apache Hudi Compaction](https://medium.com/@simpsons/apache-hudi-compaction-6e6383790234)
-[Standalone HoodieCompactor Utility](https://medium.com/@simpsons/standalone-hoodiecompactor-utility-890198e4c539)
\ No newline at end of file
+[Standalone HoodieCompactor Utility](https://medium.com/@simpsons/standalone-hoodiecompactor-utility-890198e4c539)
diff --git a/website/docs/metadata_indexing.md b/website/docs/metadata_indexing.md
index 5dc89b4cade..430c5c7be38 100644
--- a/website/docs/metadata_indexing.md
+++ b/website/docs/metadata_indexing.md
@@ -159,8 +159,8 @@ hoodie.write.lock.zookeeper.base_path=<zk_base_path>
 ```bash
 spark-submit \
---jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.1.jar" \
---class org.apache.hudi.utilities.streamer.HoodieStreamer `ls /Users/home/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar` \
+--jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.2.jar" \
+--class org.apache.hudi.utilities.streamer.HoodieStreamer `ls /Users/home/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar` \
 --props `ls /Users/home/path/to/write/config.properties` \
 --source-class org.apache.hudi.utilities.sources.ParquetDFSSource
 --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider \
 --source-ordering-field tpep_dropoff_datetime \
@@ -212,9 +212,9 @@ Now, we can schedule indexing using `HoodieIndexer` in `schedule` mode as follow
 ```
 spark-submit \
---jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.1.jar" \
+--jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.2.jar" \
 --class org.apache.hudi.utilities.HoodieIndexer \
-/Users/home/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar \
+/Users/home/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar \
 --props /Users/home/path/to/indexer.properties \
 --mode schedule \
 --base-path /tmp/hudi-ny-taxi \
@@ -232,9 +232,9 @@ To execute indexing, run the indexer in `execute` mode as below.
 ```
 spark-submit \
---jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.1.jar" \
+--jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.2.jar" \
 --class org.apache.hudi.utilities.HoodieIndexer \
-/Users/home/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar \
+/Users/home/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar \
 --props /Users/home/path/to/indexer.properties \
 --mode execute \
 --base-path /tmp/hudi-ny-taxi \
@@ -288,9 +288,9 @@ To drop an index, just run the index in `dropindex` mode.
 ```
 spark-submit \
---jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.1.jar" \
+--jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.2.jar" \
 --class org.apache.hudi.utilities.HoodieIndexer \
-/Users/home/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar \
+/Users/home/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar \
 --props /Users/home/path/to/indexer.properties \
 --mode dropindex \
 --base-path /tmp/hudi-ny-taxi \
@@ -315,4 +315,4 @@ follow [HUDI-2488](https://issues.apache.org/jira/browse/HUDI-2488) for developm
 ## Related Resources
 <h3>Videos</h3>
-* [Advantages of Metadata Indexing and Asynchronous Indexing in Hudi Hands on Lab](https://www.youtube.com/watch?v=TSphQCsY4pY)
\ No newline at end of file
+* [Advantages of Metadata Indexing and Asynchronous Indexing in Hudi Hands on Lab](https://www.youtube.com/watch?v=TSphQCsY4pY)
diff --git a/website/versioned_docs/version-1.0.2/cleaning.md b/website/versioned_docs/version-1.0.2/cleaning.md
index 97b99b0ecce..65a5b7aa779 100644
--- a/website/versioned_docs/version-1.0.2/cleaning.md
+++ b/website/versioned_docs/version-1.0.2/cleaning.md
@@ -80,7 +80,7 @@ For Flink based writing, this is the default mode of cleaning. Please refer to [
 Hoodie Cleaner can also be run as a separate process. Following is the command for running the cleaner independently:
 ```
 spark-submit --master local \
-  --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.1,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.1 \
+  --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.2,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.2 \
   --class org.apache.hudi.utilities.HoodieCleaner `ls packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle-*.jar` --help
 Usage: <main class> [options]
 Options:
@@ -104,7 +104,7 @@ Some examples to run the cleaner.
 Keep the latest 10 commits
 ```
 spark-submit --master local \
-  --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.1,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.1 \
+  --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.2,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.2 \
   --class org.apache.hudi.utilities.HoodieCleaner `ls packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle-*.jar` \
   --target-base-path /path/to/hoodie_table \
   --hoodie-conf hoodie.cleaner.policy=KEEP_LATEST_COMMITS \
@@ -114,7 +114,7 @@ spark-submit --master local \
 Keep the latest 3 file versions
 ```
 spark-submit --master local \
-  --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.1,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.1 \
+  --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.2,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.2 \
   --class org.apache.hudi.utilities.HoodieCleaner `ls packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle-*.jar` \
   --hoodie-conf hoodie.cleaner.policy=KEEP_LATEST_FILE_VERSIONS \
   --hoodie-conf hoodie.cleaner.fileversions.retained=3 \
@@ -123,7 +123,7 @@ spark-submit --master local \
 Clean commits older than 24 hours
 ```
 spark-submit --master local \
-  --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.1,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.1 \
+  --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.2,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.2 \
   --class org.apache.hudi.utilities.HoodieCleaner `ls packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle-*.jar` \
   --target-base-path /path/to/hoodie_table \
   --hoodie-conf hoodie.cleaner.policy=KEEP_LATEST_BY_HOURS \
@@ -142,7 +142,7 @@ CLI provides the below commands for cleaner service:
 Example of cleaner keeping the latest 10 commits
 ```
-cleans run --sparkMaster local --hoodieConfigs hoodie.cleaner.policy=KEEP_LATEST_COMMITS hoodie.cleaner.commits.retained=3 hoodie.cleaner.parallelism=200
+cleans run --sparkMaster local --hoodieConfigs hoodie.cleaner.policy=KEEP_LATEST_COMMITS hoodie.cleaner.commits.retained=10 hoodie.cleaner.parallelism=200
 ```
 You can find more details and the relevant code for these commands in [`org.apache.hudi.cli.commands.CleansCommand`](https://github.com/apache/hudi/blob/master/hudi-cli/src/main/java/org/apache/hudi/cli/commands/CleansCommand.java) class.
@@ -156,4 +156,4 @@ You can find more details and the relevant code for these commands in [`org.apac
 * [Cleaner Service: Save up to 40% on data lake storage costs | Hudi Labs](https://youtu.be/mUvRhJDoO3w)
 * [Efficient Data Lake Management with Apache Hudi Cleaner: Benefits of Scheduling Data Cleaning #1](https://www.youtube.com/watch?v=CEzgFtmVjx4)
-* [Efficient Data Lake Management with Apache Hudi Cleaner: Benefits of Scheduling Data Cleaning #2](https://www.youtube.com/watch?v=RbBF9Ys2GqM)
\ No newline at end of file
+* [Efficient Data Lake Management with Apache Hudi Cleaner: Benefits of Scheduling Data Cleaning #2](https://www.youtube.com/watch?v=RbBF9Ys2GqM)
diff --git a/website/versioned_docs/version-1.0.2/clustering.md b/website/versioned_docs/version-1.0.2/clustering.md
index 75ecd2af3ba..5676289c6e4 100644
--- a/website/versioned_docs/version-1.0.2/clustering.md
+++ b/website/versioned_docs/version-1.0.2/clustering.md
@@ -243,9 +243,9 @@ A sample spark-submit command to setup HoodieClusteringJob is as below:
 ```bash
 spark-submit \
---jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.1.jar" \
+--jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.2.jar" \
 --class org.apache.hudi.utilities.HoodieClusteringJob \
-/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar \
+/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar \
 --props /path/to/config/clusteringjob.properties \
 --mode scheduleAndExecute \
 --base-path /path/to/hudi_table/basePath \
@@ -273,9 +273,9 @@ A sample spark-submit command to setup HoodieStreamer is as below:
 ```bash
 spark-submit \
---jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.1.jar" \
+--jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.2.jar" \
 --class org.apache.hudi.utilities.streamer.HoodieStreamer \
-/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar \
+/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar \
 --props /path/to/config/clustering_kafka.properties \
 --schemaprovider-class org.apache.hudi.utilities.schema.SchemaRegistryProvider \
 --source-class org.apache.hudi.utilities.sources.AvroKafkaSource \
diff --git a/website/versioned_docs/version-1.0.2/compaction.md b/website/versioned_docs/version-1.0.2/compaction.md
index 6025d89916b..6af8e6361c1 100644
--- a/website/versioned_docs/version-1.0.2/compaction.md
+++ b/website/versioned_docs/version-1.0.2/compaction.md
@@ -150,7 +150,7 @@ ingests data to Hudi table continuously from upstream sources. In this mode, Hud
 compactions. Here is an example snippet for running in continuous mode with async compactions
 ```properties
-spark-submit --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.1,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.1 \
+spark-submit --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.2,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.2 \
 --class org.apache.hudi.utilities.streamer.HoodieStreamer \
 --table-type MERGE_ON_READ \
 --target-base-path <hudi_base_path> \
@@ -187,7 +187,7 @@ The compactor utility allows to do scheduling and execution of compaction.
 Example:
 ```properties
-spark-submit --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.1,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.1 \
+spark-submit --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:1.0.2,org.apache.hudi:hudi-spark3.5-bundle_2.12:1.0.2 \
 --class org.apache.hudi.utilities.HoodieCompactor \
 --base-path <base_path> \
 --table-name <table_name> \
@@ -231,4 +231,4 @@ Offline compaction needs to submit the Flink task on the command line. The progr
 <h3>Blogs</h3>
 [Apache Hudi Compaction](https://medium.com/@simpsons/apache-hudi-compaction-6e6383790234)
-[Standalone HoodieCompactor Utility](https://medium.com/@simpsons/standalone-hoodiecompactor-utility-890198e4c539)
\ No newline at end of file
+[Standalone HoodieCompactor Utility](https://medium.com/@simpsons/standalone-hoodiecompactor-utility-890198e4c539)
diff --git a/website/versioned_docs/version-1.0.2/metadata_indexing.md b/website/versioned_docs/version-1.0.2/metadata_indexing.md
index 5dc89b4cade..430c5c7be38 100644
--- a/website/versioned_docs/version-1.0.2/metadata_indexing.md
+++ b/website/versioned_docs/version-1.0.2/metadata_indexing.md
@@ -159,8 +159,8 @@ hoodie.write.lock.zookeeper.base_path=<zk_base_path>
 ```bash
 spark-submit \
---jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.1.jar" \
---class org.apache.hudi.utilities.streamer.HoodieStreamer `ls /Users/home/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar` \
+--jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.2.jar" \
+--class org.apache.hudi.utilities.streamer.HoodieStreamer `ls /Users/home/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar` \
 --props `ls /Users/home/path/to/write/config.properties` \
 --source-class org.apache.hudi.utilities.sources.ParquetDFSSource
 --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider \
 --source-ordering-field tpep_dropoff_datetime \
@@ -212,9 +212,9 @@ Now, we can schedule indexing using `HoodieIndexer` in `schedule` mode as follow
 ```
 spark-submit \
---jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.1.jar" \
+--jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.2.jar" \
 --class org.apache.hudi.utilities.HoodieIndexer \
-/Users/home/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar \
+/Users/home/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar \
 --props /Users/home/path/to/indexer.properties \
 --mode schedule \
 --base-path /tmp/hudi-ny-taxi \
@@ -232,9 +232,9 @@ To execute indexing, run the indexer in `execute` mode as below.
 ```
 spark-submit \
---jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.1.jar" \
+--jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.2.jar" \
 --class org.apache.hudi.utilities.HoodieIndexer \
-/Users/home/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar \
+/Users/home/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar \
 --props /Users/home/path/to/indexer.properties \
 --mode execute \
 --base-path /tmp/hudi-ny-taxi \
@@ -288,9 +288,9 @@ To drop an index, just run the index in `dropindex` mode.
 ```
 spark-submit \
---jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.1.jar" \
+--jars "packaging/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar,packaging/hudi-spark-bundle/target/hudi-spark3.5-bundle_2.12-1.0.2.jar" \
 --class org.apache.hudi.utilities.HoodieIndexer \
-/Users/home/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.1.jar \
+/Users/home/path/to/hudi-utilities-slim-bundle/target/hudi-utilities-slim-bundle_2.12-1.0.2.jar \
 --props /Users/home/path/to/indexer.properties \
 --mode dropindex \
 --base-path /tmp/hudi-ny-taxi \
@@ -315,4 +315,4 @@ follow [HUDI-2488](https://issues.apache.org/jira/browse/HUDI-2488) for developm
 ## Related Resources
 <h3>Videos</h3>
-* [Advantages of Metadata Indexing and Asynchronous Indexing in Hudi Hands on Lab](https://www.youtube.com/watch?v=TSphQCsY4pY)
\ No newline at end of file
+* [Advantages of Metadata Indexing and Asynchronous Indexing in Hudi Hands on Lab](https://www.youtube.com/watch?v=TSphQCsY4pY)
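The change above is a purely mechanical 1.0.1 -> 1.0.2 bump repeated across eight doc files. As a hedged sketch (not part of this commit; the file and its contents below are a stand-in for the real docs), such a bump could be scripted instead of hand-edited:

```shell
#!/bin/sh
# Assumption: a flat substitution of the old version string is safe, as in this
# commit. The scratch file stands in for e.g. website/docs/cleaning.md.
old='1.0.1'
new='1.0.2'
dir=$(mktemp -d)    # work in a scratch directory so the sketch is safe to run
printf 'org.apache.hudi:hudi-utilities-slim-bundle_2.12:%s\n' "$old" > "$dir/cleaning.md"
# Replace every occurrence of the old version with the new one.
sed "s/$old/$new/g" "$dir/cleaning.md" > "$dir/cleaning.md.tmp"
mv "$dir/cleaning.md.tmp" "$dir/cleaning.md"
result=$(cat "$dir/cleaning.md")
echo "$result"
```

In practice one would loop the `sed` step over `website/docs` and each `website/versioned_docs` directory and then review the diff before committing.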