yihua opened a new pull request, #7790: URL: https://github.com/apache/hudi/pull/7790
### Change Logs

The Hudi CLI commands which require launching Spark cannot be executed in the Hudi CLI shell with `hudi-cli-bundle`:

```
savepoint create --commit <latest-commit-timestamp> --sparkMaster local
savepoint delete --commit <latest-commit-timestamp> --sparkMaster local
savepoint create --commit <latest-commit-timestamp> --sparkMaster local
downgrade table --toVersion 3 --sparkMaster local
upgrade table --toVersion 5 --sparkMaster local
compaction schedule --hoodieConfigs hoodie.compact.inline.max.delta.commits=1
```

Sample error message:

```
30977 [Thread-4] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - Error: Failed to load org.apache.hudi.cli.commands.SparkMain: org/apache/hudi/common/engine/HoodieEngineContext
```

The root cause is that `hudi-cli-bundle` excludes classes that are already present in `hudi-spark*-bundle` (for example, those from the `hudi-common` module), but `hudi-spark*-bundle` is not added to the Spark launcher, so the launched Spark job fails because those classes cannot be found. This PR fixes the problem by adding the `hudi-spark*-bundle` jar, specified by the environment variable `SPARK_BUNDLE_JAR`, to the Spark launcher. Note that `SPARK_BUNDLE_JAR` is required when using `hudi-cli-bundle`.

### Impact

Ensures that Hudi CLI commands which require launching Spark can be executed with `hudi-cli-bundle`. The CLI commands above are verified to work locally with this fix.

### Risk level

Low.

### Documentation Update

N/A

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
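As a usage sketch of the workflow this fix enables (the jar paths and the launch script name below are illustrative assumptions, not taken from this PR; substitute the actual files from your Hudi distribution), the idea is to point `SPARK_BUNDLE_JAR` at the matching Spark bundle before starting the CLI shell:

```shell
# Illustrative paths; replace with the real bundle jars for your Hudi/Spark versions.
export CLI_BUNDLE_JAR=/opt/hudi/hudi-cli-bundle.jar      # hypothetical location
export SPARK_BUNDLE_JAR=/opt/hudi/hudi-spark-bundle.jar  # required when using hudi-cli-bundle

# Start the Hudi CLI shell (script name assumed). With this fix, Spark-backed
# commands (savepoint, upgrade/downgrade table, compaction schedule) work because
# the jar in SPARK_BUNDLE_JAR is added to the Spark launcher, supplying the
# classes that hudi-cli-bundle deliberately excludes.
./hudi-cli-with-bundle.sh
```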
