On 2019/03/07 18:35:54, Vinoth Chandar <[email protected]> wrote:
> Hi Rahul,
>
> Can you please subscribe to the mailing list? Otherwise, each reply
> requires a moderator to approve before it can show up :)
>
> Thanks
> Vinoth
>
> On Thu, Mar 7, 2019 at 9:02 AM Balaji Varadarajan
> <[email protected]> wrote:
>
> > It depends on which mechanism you use:
> > 1. For the Spark DataSource route, you can use the "options" API
> > of DataFrameWriter to pass in these configs. Here is an example from
> > http://hudi.apache.org/incremental_processing.html
> >   inputDF.write()
> >     .format("com.uber.hoodie")
> >     // any of the Hudi client opts can be passed in as well
> >     .options(clientOpts)
> >     .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY(), "_row_key")
> >     .option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY(), "partition")
> >     .option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY(), "timestamp")
> >     .option(HoodieWriteConfig.TABLE_NAME, tableName)
> >     .mode(SaveMode.Append)
> >     .save(basePath);
> > 2. If you use HoodieWriteClient directly, you can simply construct a
> > HoodieWriteConfig object with the configs from the link you
> > mentioned.
> > 3. When using the HoodieDeltaStreamer tool to ingest, you can set the configs
> > in a properties file and pass the file via the command-line argument "--props".
> >
> > All the file-size configs must be specified in bytes.
> > Balaji.V
> >
> >
> > On Thursday, March 7, 2019, 7:00:55 AM PST, [email protected] <
> > [email protected]> wrote:
> >
> > Hi All
> > I have found Hoodie-related configurations at
> > http://hudi.apache.org/configurations.html. Please tell me how we can pass
> > these configurations to the Spark job. Also, for the file-size-related
> > configs, in which unit (MB/GB/bytes) do I need to give the values?
> >
> > Thanks & Regards
> > Rahul P
> >
>
I tried testing the properties with the --props option. I added some of the
properties to the kafka.source.properties file and passed it with --props,
but the properties are reported as not valid. I think they may not apply to the
deltastreamer. Can you please give the proper properties for the deltastreamer?
My intention is to enable inline compaction.
WARN VerifiableProperties: Property OPERATION_OPT_KEY is not valid
INFO VerifiableProperties: Property auto.offset.reset is overridden to smallest
WARN VerifiableProperties: Property compactionSmallFileSize is not valid
WARN VerifiableProperties: Property hoodie.compact.inline is not valid
Thanks & Regards
Rahul P
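
For reference, here is a minimal sketch of how the byte-denominated file-size
settings discussed in this thread could be generated for a --props file. Only
hoodie.compact.inline appears in the thread itself. The other two keys
(hoodie.parquet.small.file.limit and hoodie.parquet.max.file.size) are
assumptions taken from the configurations page and should be checked against
the Hudi version in use, and the 100 MB / 120 MB values are purely illustrative.
The class name HoodieSizeProps and its methods are hypothetical helpers, not
part of Hudi. Note also that OPERATION_OPT_KEY and compactionSmallFileSize in
the log above appear to be Java-side names rather than property keys, which is
why the sketch uses dotted key names throughout.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.util.Properties;

// Hypothetical helper (not part of Hudi): builds a properties set whose
// file-size values are expressed in bytes, as the thread requires.
public class HoodieSizeProps {

    // Convert a size given in megabytes (1 MB = 1024 * 1024 bytes) to bytes.
    public static long mbToBytes(long mb) {
        return mb * 1024L * 1024L;
    }

    // Assemble the properties. Key names other than hoodie.compact.inline
    // are assumptions; verify them against the configurations page.
    public static Properties buildProps() {
        Properties props = new Properties();
        props.setProperty("hoodie.compact.inline", "true");
        props.setProperty("hoodie.parquet.small.file.limit",
                Long.toString(mbToBytes(100)));   // illustrative value
        props.setProperty("hoodie.parquet.max.file.size",
                Long.toString(mbToBytes(120)));   // illustrative value
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Render in java.util.Properties format; the text could be saved
        // to a file and handed to the tool through --props.
        StringWriter out = new StringWriter();
        buildProps().store(out, "Hudi write configs (sizes in bytes)");
        System.out.print(out);
    }
}
```

Writing the values through a small conversion helper avoids hand-typing long
byte counts, which is where unit mistakes tend to creep in.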