xushiyan commented on issue #3831:
URL: https://github.com/apache/hudi/issues/3831#issuecomment-948120800


   1. deltastreamer is a Spark application that is supposed to run on a 
cluster, so I'm not sure how it fits into a CLI utility. If you just want a CLI 
command that triggers the job submission, then yes, nothing stops you from 
doing that.
   2. Most deltastreamer configs translate to Hudi options internally; for 
example, --source-ordering-field maps to the precombine field option. I'd 
suggest finding all the Hudi configs your application needs based on 
deltastreamer's params, creating a map of Hudi write options from scratch, and 
passing it to the datasource writer. The extra work is that you may need to do 
some orchestration around your datasource writer, such as scheduling it 
periodically and triggering compaction in a separate process. Not all 
deltastreamer params correspond to Hudi write options: --continuous, for 
instance, selects an orchestration mode and is not a writer option. So 
deltastreamer sits at a higher level than datasource writing, and you can't 
simply swap one for the other.
   3. see 2)
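
   To make 2) a bit more concrete, here is a minimal sketch of building the 
options map by hand and passing it to the datasource writer. The helper names 
(build_hudi_write_options, write_batch), the example table and field names, and 
the choice of which deltastreamer params to mirror are assumptions for 
illustration only; check the Hudi configuration docs for the keys your version 
actually needs.

```python
from typing import Dict, Optional


def build_hudi_write_options(
    table_name: str,                       # roughly deltastreamer's --target-table
    record_key_field: str,                 # usually supplied via --hoodie-conf / props
    ordering_field: str,                   # roughly --source-ordering-field
    partition_path_field: Optional[str] = None,
    table_type: str = "COPY_ON_WRITE",     # roughly --table-type
    operation: str = "upsert",             # roughly --op
) -> Dict[str, str]:
    """Hypothetical helper: build Hudi datasource write options from scratch."""
    options = {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key_field,
        "hoodie.datasource.write.precombine.field": ordering_field,
        "hoodie.datasource.write.table.type": table_type,
        "hoodie.datasource.write.operation": operation,
    }
    if partition_path_field is not None:
        options["hoodie.datasource.write.partitionpath.field"] = partition_path_field
    return options


def write_batch(df, base_path: str, options: Dict[str, str]) -> None:
    """Hypothetical helper: write one batch of a Spark DataFrame as a Hudi table.

    base_path plays the role of deltastreamer's --target-base-path.
    Needs a Spark session with the Hudi bundle on the classpath.
    """
    df.write.format("hudi").options(**options).mode("append").save(base_path)


if __name__ == "__main__":
    # Example values are made up; no Spark is needed just to inspect the map.
    opts = build_hudi_write_options(
        table_name="trips",
        record_key_field="uuid",
        ordering_field="ts",
        partition_path_field="region",
    )
    for key in sorted(opts):
        print(f"{key}={opts[key]}")
```

   Note there is no entry for --continuous here: running write_batch on a 
schedule and triggering compaction would still be up to your own 
orchestration.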
   Hope this helps.



