Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22444#discussion_r218279175

    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
    @@ -465,20 +475,31 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
                }
              } catch {
                case _: NoSuchElementException =>
    -            // If the file is currently not being tracked by the SHS, add an entry for it and try
    -            // to parse it. This will allow the cleaner code to detect the file as stale later on
    -            // if it was not possible to parse it.
    -            listing.write(LogInfo(entry.getPath().toString(), newLastScanTime, None, None,
    -              entry.getLen()))
    --- End diff --

    If you don't do this here for all entries, I think the cleaning around line 522 isn't going to work.
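To illustrate the reviewer's concern, here is a minimal standalone sketch (not actual Spark code; `ListingSketch`, `recordUnparsed`, and `cleanStale` are hypothetical names) of why an unparseable log file must still be written to the listing: the cleaner can only remove entries it can see, so a file that was never recorded is invisible to it.

```scala
// Minimal model of the SHS listing behavior under discussion. All names
// here are illustrative, not the real FsHistoryProvider API.
case class LogInfo(
    path: String,
    lastScanTime: Long,
    appId: Option[String],     // None => file was seen but never parsed
    attemptId: Option[String],
    fileSize: Long)

object ListingSketch {
  // In-memory stand-in for the SHS KVStore-backed listing.
  val listing = scala.collection.mutable.Map[String, LogInfo]()

  // On parse failure, still record the entry so the cleaner can find it.
  def recordUnparsed(path: String, scanTime: Long, len: Long): Unit =
    listing(path) = LogInfo(path, scanTime, None, None, len)

  // Hypothetical cleaner: drops never-parsed entries last scanned before
  // `cutoff`. Files that were never written to `listing` cannot be cleaned.
  def cleanStale(cutoff: Long): Seq[String] = {
    val stale = listing.values
      .filter(info => info.appId.isEmpty && info.lastScanTime < cutoff)
      .map(_.path)
      .toSeq
    stale.foreach(listing.remove)
    stale
  }
}
```

Under this model, skipping `recordUnparsed` for some entries means those files never age out of the scan results, which matches the reviewer's point that the cleaning pass would not work for them.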