baunz commented on issue #10319:
URL: https://github.com/apache/hudi/issues/10319#issuecomment-1853695531

   Hi @ad1happy2go,
   
   This leads to the same error: the CLI arguments for HoodieStreamer take 
precedence. In the end that's okay and not a "bug", but I would rather have a 
single properties file per table in which each property is specified *once*. 
Currently I have to specify all the properties twice, once at bulk load time 
and again at Streamer invocation:
   
   ```
                     "--target-table", "'"${TABLE}"'",
                     "--table-type", "MERGE_ON_READ",
                     "--target-base-path", "'"${WAREHOUSE}/${TABLE}"'",
                     "--enable-sync",
                     "--sync-tool-classes", "org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool",
                     "--source-class", "org.apache.hudi.utilities.sources.AvroKafkaSource",
                     "--payload-class", "org.apache.hudi.common.model.DefaultHoodieRecordPayload",
                     "--source-ordering-field", "LAST_UPDATE",
                     "--schemaprovider-class", "org.apache.hudi.utilities.schema.SchemaRegistryProvider",
                     "--props", "'"${TABLE_PROPS}"'"
   ```
   
   If this is just the way it is, this issue can be closed. I was wondering 
whether there is a smarter way to reuse settings already configured in property 
files. That would also make HoodieStreamer wrapper scripts easier to reuse when 
one wants to load many tables that differ in their config (different source 
ordering field, different payload class).
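
   To illustrate the kind of reuse meant here, the following is a rough sketch 
of a wrapper helper that reads the table's properties file and derives the 
overlapping CLI arguments from it, so each value lives in one place. It is not 
an existing Hudi feature. The function names and the pairing of property keys 
with flags in `PROP_TO_FLAG` are assumptions for illustration; the flags 
themselves are the ones from the invocation above.

```python
# Hypothetical mapping from property keys to HoodieStreamer flags.
# The flags come from the invocation above; pairing them with these
# keys is an assumption and should be checked against the Hudi docs.
PROP_TO_FLAG = {
    "hoodie.table.name": "--target-table",
    "hoodie.datasource.write.table.type": "--table-type",
    "hoodie.datasource.write.precombine.field": "--source-ordering-field",
    "hoodie.datasource.write.payload.class": "--payload-class",
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def streamer_args(props, props_path):
    """Derive CLI arguments from the properties, then pass the file via --props."""
    args = []
    for key, flag in PROP_TO_FLAG.items():
        if key in props:
            args += [flag, props[key]]
    args += ["--props", props_path]
    return args
```

   A wrapper script could call `streamer_args(parse_properties(text), path)` 
for each table and append the result to the arguments that are the same for 
all tables. The parser only handles plain `key=value` lines, not the full 
Java properties syntax (line continuations, `:` separators, escapes).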
   
   


