satishkotha edited a comment on issue #1866: URL: https://github.com/apache/hudi/issues/1866#issuecomment-663683323
> Is there a possibility that commits get archived before the clean job runs, resulting in a noop? I will continue to monitor.

Clean and archival are somewhat independent, so a noop should not happen.

> Also, can you confirm if I can run a clean job in a separate Spark job concurrently while the streaming write is happening? Guess it should be fine, as compaction runs have that ability.

Why are you considering a separate Spark job for clean? Are you seeing clean take a lot of time? You can run clean concurrently with the write by setting 'hoodie.clean.async' to true. (This runs clean in the same job, but concurrently with the write.) I don't know of anyone using a separate Spark job to run clean. In theory I think it is possible, but you would need to do some testing, because as far as I know it isn't used this way.
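As a rough illustration of the async clean setting mentioned above, here is a minimal sketch that builds the writer options as a plain Python dict. Only 'hoodie.clean.async' comes from the comment. The helper name `hudi_async_clean_options`, the table name, and the extra keys 'hoodie.table.name' and 'hoodie.clean.automatic' are assumptions added for illustration, so check them against the docs for your Hudi version.

```python
# Sketch only: collects the options discussed above in a plain dict.
# 'hoodie.clean.async' is the setting named in the comment; the helper
# name, the table name, and the other keys are illustrative assumptions.

def hudi_async_clean_options(table_name, async_clean=True):
    """Return write options that run clean alongside the write."""
    return {
        "hoodie.table.name": table_name,
        # Assumption: automatic cleaning stays enabled so cleans are triggered.
        "hoodie.clean.automatic": "true",
        # Run clean in the same job, concurrently with the write.
        "hoodie.clean.async": "true" if async_clean else "false",
    }


opts = hudi_async_clean_options("my_table")
print(opts["hoodie.clean.async"])
```

With PySpark, a dict like this would typically be passed to the writer via `.options(**opts)` together with the rest of your existing write configuration (record key, precombine field, path, and so on), which is not shown here.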