tanvn commented on PR #36424: URL: https://github.com/apache/spark/pull/36424#issuecomment-1119230516
@gengliangwang Thank you for taking the time to look at this. Regarding the `isProcessing` method, consider the scenario I described in the PR summary: by the next run of `checkForLogs`, the AAA application has finished, so the log `viewfs://iu/log/spark3/AAA_1.inprogress` has been deleted and a new log `viewfs://iu/log/spark3/AAA_1` has been created. In this case, `viewfs://iu/log/spark3/AAA_1` will be marked as `processing`: https://github.com/apache/spark/blob/v3.2.1/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala#L1391

However, `stale` takes its data from the `listing` map: https://github.com/apache/spark/blob/v3.2.1/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala#L586-L592

This means `stale` will contain `viewfs://iu/log/spark3/AAA_1.inprogress` (it was added to `listing` during the previous execution of `checkForLogs`). `isProcessing` will return false for `viewfs://iu/log/spark3/AAA_1.inprogress`, so `cleanAppData` will be executed for it (which I think is not a problem).
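
To make the scenario concrete, here is a minimal sketch of the state at the start of the second `checkForLogs` run. It is a simplified model, not the actual `FsHistoryProvider` code: the `LogInfo` case class, the plain `Map`/`Set` standing in for `listing` and `processing`, the timestamp values, and the `toClean` name are all assumptions made for illustration.

```scala
// Simplified model of the scenario; NOT the real FsHistoryProvider implementation.
// LogInfo is a hypothetical stand-in carrying only the fields this example needs.
final case class LogInfo(logPath: String, lastProcessed: Long)

object StaleLogSketch {
  val inProgress = "viewfs://iu/log/spark3/AAA_1.inprogress"
  val finished   = "viewfs://iu/log/spark3/AAA_1"

  // State left by the previous checkForLogs run: only the .inprogress log is listed.
  // The timestamps 100 and 200 are arbitrary illustrative values.
  val listing: Map[String, LogInfo] =
    Map(inProgress -> LogInfo(inProgress, lastProcessed = 100L))

  // Current run: the finished log was found and marked as processing.
  val processing: Set[String] = Set(finished)
  val newLastScanTime: Long = 200L

  def isProcessing(info: LogInfo): Boolean = processing.contains(info.logPath)

  // Entries in listing that the current scan did not touch are considered stale.
  val stale: Seq[LogInfo] =
    listing.values.filter(_.lastProcessed < newLastScanTime).toSeq

  // Stale entries that are not being processed would be passed to cleanAppData.
  val toClean: Seq[String] = stale.filterNot(isProcessing).map(_.logPath)
}
```

Under these assumptions, `StaleLogSketch.toClean` contains only the `.inprogress` path: the `processing` entry for `AAA_1` does not shield it, because the two paths differ.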
