Github user steveloughran commented on the pull request:
https://github.com/apache/spark/pull/6935#issuecomment-161779559
This is the next iteration. If you look at the intermediate patches, I was
trying to track the time the file size changed, but it (a) made the code complex
and (b) didn't actually work.
Now there's an atomic long generation counter: every time any attempt is
updated, the counter is incremented (i.e. it is a single counter across all
apps; we could make it per-attempt but I don't see what that would gain right now).
# There is a scan in the update loop for changes in file size: this
triggers an update of the generation counter *but no re-read of the
application*. This means that the updates are low-cost to track.
# As before, the check for app updates on modtime does trigger a re-read; as
the `EventLoggingListener` always attempts to set the modtime in its `stop()`
method, the mod time is always updated on any filesystem which supports
`setTimes()`, even if `rename()` doesn't. That includes POSIX, but still omits
the object stores. The fact that rename is copy+delete there should handle that
implicitly.
# I worry things are overcomplex; I'll need to diff with master after the
update and see if I can trim it a bit.
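
For reviewers, the scheme described above might be sketched roughly as follows. This is not the code in the patch: `AttemptTracker`, `Action`, `Seen` and `scan()` are hypothetical names, and the sketch assumes that a first sighting and a modtime change both bump the counter and request a re-read, while a size-only change bumps the counter without one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch only; none of these names come from the pull request.
final class AttemptTracker {
    /** What one scan of an attempt's event log decided. */
    enum Action { NONE, BUMP_ONLY, REREAD }

    /** Last observed size and modification time of one attempt's log. */
    private static final class Seen {
        long size;
        long modTime;
        Seen(long size, long modTime) {
            this.size = size;
            this.modTime = modTime;
        }
    }

    // One counter shared by every application and attempt.
    private final AtomicLong generation = new AtomicLong(0);
    private final Map<String, Seen> seen = new HashMap<>();

    long generation() {
        return generation.get();
    }

    /** Called from the update loop with the file status of one attempt. */
    synchronized Action scan(String attemptId, long size, long modTime) {
        Seen prev = seen.get(attemptId);
        if (prev == null) {
            // Assumption: an attempt seen for the first time must be loaded.
            seen.put(attemptId, new Seen(size, modTime));
            generation.incrementAndGet();
            return Action.REREAD;
        }
        if (modTime != prev.modTime) {
            // Modtime changed: the attempt is re-read, as before.
            prev.size = size;
            prev.modTime = modTime;
            generation.incrementAndGet();
            return Action.REREAD;
        }
        if (size != prev.size) {
            // Size changed only: cheap path, bump the counter, no re-read.
            prev.size = size;
            generation.incrementAndGet();
            return Action.BUMP_ONLY;
        }
        return Action.NONE;
    }
}
```

A caller could remember the value of `generation()` from when it last loaded an application and treat any later, larger value as a sign that something may be out of date.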
Comments welcome! It should now be ready for initial reviews.