Github user gengliangwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/22752#discussion_r225845782
--- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -449,7 +450,7 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
          listing.write(info.copy(lastProcessed = newLastScanTime, fileSize = entry.getLen()))
        }
 -      if (info.fileSize < entry.getLen()) {
 +      if (info.fileSize < entry.getLen() || checkAbsoluteLength(info, entry)) {
--- End diff ---
I think we can create a function that returns the length of a given file:
1. If the new conf is enabled and the input stream is a `DFSInputStream`, use `getFileLength` (or `max(getFileLength, entry.getLen())`).
2. Otherwise, use `entry.getLen()`.
That would make the logic simpler; see the sketch below.
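
A minimal sketch of such a helper, assuming hypothetical names (`effectiveLength` for the helper and `useDfsFileLength` standing in for the new config flag; the actual names in the PR may differ):

```scala
import org.apache.hadoop.fs.{FileStatus, FileSystem}
import org.apache.hadoop.hdfs.DFSInputStream

// Sketch only: `effectiveLength` and `useDfsFileLength` are illustrative
// names, not taken from the PR itself.
private def effectiveLength(
    fs: FileSystem,
    entry: FileStatus,
    useDfsFileLength: Boolean): Long = {
  if (useDfsFileLength) {
    val in = fs.open(entry.getPath)
    try {
      in.getWrappedStream match {
        // For a file still being written, DFSInputStream reports the
        // visible length, which can be larger than the length cached
        // by the NameNode and returned in the FileStatus.
        case dfsIn: DFSInputStream => math.max(dfsIn.getFileLength, entry.getLen())
        case _ => entry.getLen()
      }
    } finally {
      in.close()
    }
  } else {
    entry.getLen()
  }
}
```

The update check would then collapse to a single comparison, e.g. `if (info.fileSize < effectiveLength(fs, entry, useDfsFileLength))`, instead of combining two separate conditions.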
---