satishkotha commented on issue #1866:
URL: https://github.com/apache/hudi/issues/1866#issuecomment-663683323


   > Is there a possibility that commits get archived before clean job is 
resulting in a noop. I will continue to monitor.
   Clean and archival are somewhat independent, so a noop should not happen.
   
   > Also can you confirm If I can run a clean job in a separate spark job 
concurrently while streaming write is happening, guess it should be fine as 
compaction runs have that ability
   Why are you considering a separate Spark job for clean? Are you seeing clean 
take a lot of time? You can consider running clean concurrently with writes by 
setting 'hoodie.clean.async' to true. (This runs clean in the same job, but 
concurrently with the write.) 
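
   As a rough sketch of the async clean setting mentioned above, the writer 
options could be assembled like this. The helper name and the table name are 
hypothetical, and 'hoodie.clean.automatic' is included on the assumption that 
automatic cleaning should stay enabled; please verify both keys against the 
config docs for your Hudi version.

```python
def hudi_async_clean_options(table_name):
    """Build Hudi writer options that request async cleaning.

    Hypothetical helper for illustration only; it just returns a dict
    of option strings and does not touch Spark or Hudi.
    """
    return {
        "hoodie.table.name": table_name,
        # Assumption: keep automatic cleaning after commits enabled.
        "hoodie.clean.automatic": "true",
        # The setting from this comment: run clean in the same job,
        # concurrently with the write.
        "hoodie.clean.async": "true",
    }


opts = hudi_async_clean_options("my_table")  # "my_table" is a placeholder

# Possible use with a Spark DataFrame writer (not executed here):
#   df.write.format("hudi").options(**opts).mode("append").save(base_path)
```

   The options are passed to the existing writer, so no separate Spark job is 
needed for clean.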
   
   I don't know of anyone using a separate Spark job to run clean. 
Theoretically, I think it is possible, but you may have to do some testing 
because it isn't used like this, afaik.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

