[
https://issues.apache.org/jira/browse/HUDI-8884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17914861#comment-17914861
]
Ranga Reddy edited comment on HUDI-8884 at 1/21/25 6:29 AM:
------------------------------------------------------------
By default, the Hudi CLI *cleans run* command launches its Spark application in
*YARN* mode. To run against a different master, pass the *--sparkMaster*
parameter.
*Local Mode:*
{code:java}
cleans run --sparkMaster local {code}
*Standalone Mode:*
{code:java}
cleans run --sparkMaster spark://spark-master:7077 {code}
> Hudi CLI - Cleans run command always running with YARN mode.
> -------------------------------------------------------------
>
> Key: HUDI-8884
> URL: https://issues.apache.org/jira/browse/HUDI-8884
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: Ranga Reddy
> Priority: Major
>
> *Problem Statement:*
> When running the {{cleans run}} command from the Hudi CLI, it always defaults
> to YARN mode and fails, as it cannot find the Resource Manager.
> *Steps to Reproduce:*
> # Go to Hudi CLI.
> # Connect to the desired table using the command:
> {{connect --path s3a:///<table_name>/}}
> # Run the *{{cleans run}}* command.
> *Expected Solution:*
> There should be a way to run the Hudi CLI {{cleans run}} command in any mode
> or automatically detect the Spark master from {{spark-defaults.conf}},
> ensuring the command runs without failure.
> *HUDI Issue:*
> For more details, refer to the following Hudi issue:
> https://github.com/apache/hudi/issues/8952
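The expected behaviour quoted above (picking up the Spark master from spark-defaults.conf) can be approximated outside the CLI. The following is a minimal sketch, not part of Hudi: the helper names read_spark_master and cleans_run_command are hypothetical, the "local" fallback is an assumption, and the parser only handles the simple "key value" and "key=value" line forms. It reads spark.master from the file contents and builds the matching cleans run command string to paste into the Hudi CLI.

```python
import re
from typing import Optional

# Matches "key value", "key=value" or "key: value"; skips blank and comment lines.
_LINE = re.compile(r"^\s*([^\s=:#!][^\s=:]*)\s*[=:]?\s*(.*?)\s*$")


def read_spark_master(conf_text: str) -> Optional[str]:
    """Return the spark.master value found in spark-defaults.conf text, if any."""
    master = None
    for line in conf_text.splitlines():
        match = _LINE.match(line)
        if match and match.group(1) == "spark.master":
            master = match.group(2)  # a later entry overrides an earlier one
    return master or None


def cleans_run_command(conf_text: str, fallback: str = "local") -> str:
    """Build a 'cleans run' command line (hypothetical helper).

    The fallback master is an assumption made for this sketch; it is not
    the Hudi CLI default, which is YARN according to the comment above.
    """
    master = read_spark_master(conf_text) or fallback
    return f"cleans run --sparkMaster {master}"


if __name__ == "__main__":
    sample = "# defaults\nspark.master spark://spark-master:7077\n"
    print(cleans_run_command(sample))
```

The sketch only produces the command text; it does not change how the Hudi CLI itself resolves the master.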
--
This message was sent by Atlassian Jira
(v8.20.10#820010)