baunz commented on issue #10319:
URL: https://github.com/apache/hudi/issues/10319#issuecomment-1853695531
Hi @ad1happy2go,
this leads to the same error; the CLI arguments for HoodieStreamer take
precedence. In the end that's okay and not a "bug", but I would rather have a
single properties file per table in which each property is specified *once*.
Currently I have to specify all the properties twice, once at bulk load time
and once at Streamer invocation:
```
"--target-table", "'"${TABLE}"'",
"--table-type", "MERGE_ON_READ",
"--target-base-path", "'"${WAREHOUSE}/${TABLE}"'",
"--enable-sync",
"--sync-tool-classes",
"org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool",
"--source-class",
"org.apache.hudi.utilities.sources.AvroKafkaSource",
"--payload-class",
"org.apache.hudi.common.model.DefaultHoodieRecordPayload",
"--source-ordering-field", "LAST_UPDATE",
"--schemaprovider-class",
"org.apache.hudi.utilities.schema.SchemaRegistryProvider",
"--props", "'"${TABLE_PROPS}"'"
```
If this is just the way it is, this issue can be closed. I was wondering
whether there is a smarter way to reuse settings already configured in property
files. This would also make it easier to reuse HoodieStreamer wrapper scripts
when loading many tables that differ in their config (different source
ordering field, different payload class).
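
To illustrate the kind of reuse I have in mind, here is a rough sketch of a
wrapper-side helper that derives the per-table CLI flags from the properties
file itself. It is only a sketch under assumptions: the `wrapper.cli.` key
prefix and both function names are hypothetical and not part of Hudi, and I
have not checked how HoodieStreamer treats such extra keys if they live in the
same file that is passed via `--props`.

```python
from typing import Dict, List

# Hypothetical prefix (not a Hudi convention): keys starting with it are
# turned into HoodieStreamer CLI flags by the wrapper.
CLI_PREFIX = "wrapper.cli."


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("!"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # ignore lines without '='
        props[key.strip()] = value.strip()
    return props


def streamer_args(props: Dict[str, str], props_path: str) -> List[str]:
    """Build the CLI argument list from the prefixed keys, then add --props."""
    args: List[str] = []
    for key in sorted(props):
        if key.startswith(CLI_PREFIX):
            args.append("--" + key[len(CLI_PREFIX):])
            if props[key]:  # an empty value means a bare flag such as --enable-sync
                args.append(props[key])
    args += ["--props", props_path]
    return args
```

With something like this, a table's file could carry lines such as
`wrapper.cli.source-ordering-field=LAST_UPDATE` next to the regular Hudi
properties, and the same wrapper script could be reused for every table.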