Re: How i can pass Hoodie configurations to spark job

Vinoth Chandar Thu, 07 Mar 2019 10:36:24 -0800

Hi Rahul,

Can you please subscribe to the mailing list? Otherwise, each reply
requires a moderator to approve before it can show up :)


Thanks
Vinoth

On Thu, Mar 7, 2019 at 9:02 AM Balaji Varadarajan
<[email protected]> wrote:

> It depends on which mechanism you use:
> 1. For the Spark DataSource route, you can use the "options" API
> of DataFrameWriter to pass in these configs. Here is an example from
> http://hudi.apache.org/incremental_processing.html
> inputDF.write()
>        .format("com.uber.hoodie")
>        .options(clientOpts) // any of the Hudi client opts can be passed in as well
>        .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY(), "_row_key")
>        .option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY(), "partition")
>        .option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY(), "timestamp")
>        .option(HoodieWriteConfig.TABLE_NAME, tableName)
>        .mode(SaveMode.Append)
>        .save(basePath);
> 2. If you use HoodieWriteClient directly, you can simply construct a
> HoodieWriteConfig object with the configs from the link you mentioned.
> 3. When using the HoodieDeltaStreamer tool to ingest, you can set the
> configs in a properties file and pass that file via the command-line
> argument "--props".
>
> All file-size configs must be specified in bytes.
> Balaji.V
>
>
>     On Thursday, March 7, 2019, 7:00:55 AM PST, [email protected] <
> [email protected]> wrote:
>
> Hi All,
> I found the Hoodie-related configurations at
> http://hudi.apache.org/configurations.html. Please tell me how we can pass
> these configurations to the Spark job. Also, for the file-size related
> configs, in which unit should I give the value: MB, GB, or bytes?
>
> Thanks & Regards
> Rahul P
>
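
For anyone finding this thread later: since the file-size configs take
bytes, one way to keep MB/GB values readable is to convert them before
building the map that goes into .options(clientOpts). The following is only
a minimal sketch, not code from Hudi itself. The class HoodieSizeOpts and
its helpers mb(), gb() and fileSizeOpts() are hypothetical names, and the
two config keys are assumed to match the ones listed on the configurations
page linked above; please verify them against the version you run. It also
assumes 1 MB = 1024 * 1024 bytes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: not part of Hudi. Converts human-friendly sizes to
// bytes and collects them into a map usable with DataFrameWriter.options().
final class HoodieSizeOpts {

    private HoodieSizeOpts() {
        // static helpers only
    }

    // Megabytes to bytes, assuming 1 MB = 1024 * 1024 bytes.
    static long mb(long megabytes) {
        return megabytes * 1024L * 1024L;
    }

    // Gigabytes to bytes, assuming 1 GB = 1024 MB.
    static long gb(long gigabytes) {
        return mb(gigabytes * 1024L);
    }

    // Builds the options map. The keys below are assumed config names;
    // check them against the configurations page before relying on them.
    static Map<String, String> fileSizeOpts(long maxFileSizeMb, long smallFileLimitMb) {
        Map<String, String> opts = new HashMap<>();
        opts.put("hoodie.parquet.max.file.size", Long.toString(mb(maxFileSizeMb)));
        opts.put("hoodie.parquet.small.file.limit", Long.toString(mb(smallFileLimitMb)));
        return opts;
    }
}
```

With this, something like
Map<String, String> clientOpts = HoodieSizeOpts.fileSizeOpts(120, 100);
could be passed to .options(clientOpts) in the DataSource example quoted
above. The same byte values can be written as key=value lines in the
properties file given to "--props".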
