[ 
https://issues.apache.org/jira/browse/HUDI-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16983739#comment-16983739
 ] 

Balaji Varadarajan commented on HUDI-118:
-----------------------------------------

[~Pratyaksh] 

The referenced PR, 
[https://github.com/apache/incubator-hudi/pull/691], adds a new command-line 
parameter, "--hoodie-conf". Its purpose is to override the properties read from 
the DFS properties file. Both the DFS properties file and the overridden 
properties are used to construct objects such as *HoodieWriteClient*. 
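
To make the override order concrete, here is a minimal sketch (not Hudi's actual implementation). The class and method names are hypothetical, an in-memory string stands in for the DFS properties file, and the property keys are only illustrative:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical sketch: names here are not taken from the Hudi code base.
final class PropsOverrideSketch {

  // Loads the base properties first, then applies each "key=value" override
  // on top, so an override wins whenever the same key appears in both.
  static Properties buildProps(String fileContents, List<String> overrides) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(fileContents));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    for (String override : overrides) {
      int eq = override.indexOf('=');
      if (eq <= 0) {
        throw new IllegalArgumentException("Expected key=value, got: " + override);
      }
      props.setProperty(override.substring(0, eq).trim(), override.substring(eq + 1).trim());
    }
    return props;
  }

  public static void main(String[] args) {
    // Stand-in for the contents of the file passed via --props.
    String file = "hoodie.upsert.shuffle.parallelism=2\n"
        + "hoodie.insert.shuffle.parallelism=2\n";
    // Stand-in for a single --hoodie-conf value.
    Properties props = buildProps(file, Arrays.asList("hoodie.upsert.shuffle.parallelism=8"));
    System.out.println(props.getProperty("hoodie.upsert.shuffle.parallelism"));
    System.out.println(props.getProperty("hoodie.insert.shuffle.parallelism"));
  }
}
```

Here the overridden key takes the value from the command line, while the other key keeps the value from the file.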

 

Some specific CLI commands (CleansCommand, CompactionCommand, 
HDFSParquetImporterCommand) invoke the corresponding utility scripts touched in 
[https://github.com/apache/incubator-hudi/pull/691].

Along with the "--hoodie-conf" argument, you might also have to add the 
"--props" argument, as both are missing from the hoodie-cli counterparts.

 

To your question on HDFSParquetImporterCommand: HDFSParquetImporter (in 
hudi-utilities) has the following command-line arguments:

```

@Parameter(names = {"--props"},
    description = "path to properties file on localfs or dfs, with configurations for "
        + "hoodie client for importing")
public String propsFilePath = null;

@Parameter(names = {"--hoodie-conf"},
    description = "Any configuration that can be set in the properties file "
        + "(using the CLI parameter \"--propsFilePath\") can also be passed command line using this parameter")
public List<String> configs = new ArrayList<>();

``` 

These arguments are used to construct the HoodieWriteClient:

 

```

HoodieWriteClient client =
    UtilHelpers.createHoodieClient(jsc, cfg.targetPath, schemaStr, cfg.parallelism, Option.empty(), props);

```

Both --props and --hoodie-conf would need to be passed through from 
HDFSParquetImportCommand.convert. 
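
One way the CLI side could forward these two options is sketched below. This is only an illustration under assumptions: the helper name, the class name and the placeholder base arguments are hypothetical and not taken from hudi-cli. It emits one "--hoodie-conf" flag per override, on the assumption that the List<String> parameter shown above collects repeated occurrences of the option.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: not the actual hudi-cli code.
final class ImporterArgsSketch {

  // Copies the arguments the command already builds, then appends
  // "--props <path>" when a path is given and one "--hoodie-conf <key=value>"
  // pair per override.
  static List<String> withPropsArgs(List<String> baseArgs, String propsFilePath, List<String> configs) {
    List<String> args = new ArrayList<>(baseArgs);
    if (propsFilePath != null && !propsFilePath.isEmpty()) {
      args.add("--props");
      args.add(propsFilePath);
    }
    for (String conf : configs) {
      args.add("--hoodie-conf");
      args.add(conf);
    }
    return args;
  }

  public static void main(String[] args) {
    // Placeholder base arguments; the real command passes its own set.
    List<String> base = Arrays.asList("existing-arg-1", "existing-arg-2");
    List<String> full = withPropsArgs(base, "/path/to/importer.properties",
        Arrays.asList("hoodie.insert.shuffle.parallelism=4"));
    System.out.println(String.join(" ", full));
  }
}
```

Leaving "--props" out when no path is supplied keeps the existing behaviour of the command unchanged for users who do not pass a properties file.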

 

 

 

 

> Hudi CLI : Provide options for passing properties to Compactor, Cleaner and 
> ParquetImporter 
> --------------------------------------------------------------------------------------------
>
>                 Key: HUDI-118
>                 URL: https://issues.apache.org/jira/browse/HUDI-118
>             Project: Apache Hudi (incubating)
>          Issue Type: Improvement
>          Components: CLI, newbie
>            Reporter: Balaji Varadarajan
>            Assignee: Pratyaksh Sharma
>            Priority: Minor
>
> For non-trivial CLI operations, we have a standalone script in hudi-utilities 
> that users can call directly using spark-submit (usually). We also have 
> commands in hudi-cli to invoke the commands directly from hudi-cli shell.
> There was an earlier effort to allow users to pass properties directly to the 
> scripts in hudi-utilities but we still need to give the same functionality to 
> the corresponding commands in hudi-cli.
> In hudi-cli, the Compaction (schedule/compact), Cleaner and HDFSParquetImporter 
> commands do not have an option to pass a DFS properties file. This is a followup 
> to PR [https://github.com/apache/incubator-hudi/pull/691]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
