Github user gengliangwang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22752#discussion_r225845782

    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
    @@ -449,7 +450,7 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
              listing.write(info.copy(lastProcessed = newLastScanTime, fileSize = entry.getLen()))
            }

    -       if (info.fileSize < entry.getLen()) {
    +       if (info.fileSize < entry.getLen() || checkAbsoluteLength(info, entry)) {
    --- End diff --

    I think we can create a function to get the length of a given file:

    1. If the new conf is enabled and the input is a `DFSInputStream`, use `getFileLength` (or `max(getFileLength, entry.getLen())`).
    2. Otherwise, use `entry.getLen()`.

    That would make the logic simpler.
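The suggestion above can be sketched as a single helper that the `if` condition would then call. This is only an illustration of the reviewer's idea, not code from the PR: the stand-in types below (`EntryLen`, `WrappedStream`, `DfsStream`) are hypothetical substitutes for Hadoop's `FileStatus` and `DFSInputStream`, and `effectiveLength` is an invented name.

```scala
// Stand-in for FileStatus.getLen(): the length recorded in the file listing.
case class EntryLen(len: Long)

// Stand-in for a generic InputStream.
sealed trait WrappedStream

// Stand-in for DFSInputStream, whose getFileLength can see bytes
// of an in-progress file that the NameNode listing does not yet report.
case class DfsStream(visibleLen: Long) extends WrappedStream

// Case 1: conf enabled and the stream is a DFSInputStream ->
//         take the max of the visible length and the listed length.
// Case 2: otherwise -> fall back to the listed length.
def effectiveLength(stream: WrappedStream, entry: EntryLen, confEnabled: Boolean): Long =
  stream match {
    case DfsStream(visible) if confEnabled => math.max(visible, entry.len)
    case _                                 => entry.len
  }
```

With such a helper, the guard in `FsHistoryProvider` could collapse back to a single comparison along the lines of `info.fileSize < effectiveLength(...)`, instead of `or`-ing a second predicate onto the existing check.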