zhouyejoe commented on pull request #29392:
URL: https://github.com/apache/spark/pull/29392#issuecomment-672428047


   @yanxiaole I don't think there can be a race condition between the
checkForLogs() and cleanLogs() threads. Both are launched from the same
pool, which is defined as a single-thread pool, so they cannot run
simultaneously. With the latest trunk, however, I do see a race condition
between the replay tasks and checkForLogs(). In some earlier versions of
Spark, checkForLogs() waited for all replay tasks to finish before it
exited. Now replay tasks can run concurrently with the checkForLogs()
thread: while checkForLogs() iterates over listing.ldb to scan for stale
entries, a replay task can trigger listing.delete if the log file is older
than the maximum retention time.
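
To illustrate the single-thread pool argument, here is a minimal sketch in
plain Java. It is not Spark's actual code: the class name PoolOverlapSketch
and the method tasksOverlap are hypothetical and exist only for this example.
It submits two tasks to a pool and uses a barrier to check whether both were
ever running at the same time.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of Spark.
public class PoolOverlapSketch {

    // Returns true only if two tasks submitted to `pool` were running at the
    // same time. Each task waits at a two-party barrier; the barrier trips
    // only when both tasks are inside await() concurrently.
    static boolean tasksOverlap(ExecutorService pool) throws Exception {
        CyclicBarrier barrier = new CyclicBarrier(2);
        Callable<Boolean> task = () -> {
            try {
                barrier.await(300, TimeUnit.MILLISECONDS);
                return true;  // the other task reached the barrier too
            } catch (Exception e) {
                // Timed out waiting, or the barrier was already broken by
                // the other task's timeout: the tasks did not overlap.
                return false;
            }
        };
        try {
            Future<Boolean> first = pool.submit(task);
            Future<Boolean> second = pool.submit(task);
            return first.get() && second.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for checkForLogs() and cleanLogs() sharing one thread.
        System.out.println("single-thread pool overlap: "
            + tasksOverlap(Executors.newSingleThreadExecutor()));
        // Stand-in for work that runs on a pool with more than one thread.
        System.out.println("two-thread pool overlap: "
            + tasksOverlap(Executors.newFixedThreadPool(2)));
    }
}
```

On a single-thread pool the second task cannot start until the first one has
finished, so the barrier never trips and no overlap is reported. With a
second thread available the two tasks can meet at the barrier, which is the
situation the replay tasks and checkForLogs() are in now.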




