Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/10446#issuecomment-167971026
I'm pushing an update with those changes. I found a similar one in
`FileBasedWriteAheadLogWriter` and one more in `FsHistoryProvider` regarding
checking for safe mode.
BTW, here are the additional cleanups that could be made if Hadoop 2.6+ were
assumed:
- SparkHadoopUtil: `getFileSystemThreadStatistics`,
`getFSBytesReadOnThreadCallback` and `getFSBytesWrittenOnThreadCallback` only
actually work on Hadoop 2.5+
- Hadoop RDDs could always directly access `SplitLocationInfo` and
`InputSplitWithLocationInfo`, introduced in 2.5
- YARN integration tests could directly access tag-related APIs like
ApplicationContext getApplicationTags from 2.4 (?)
- AM could cleanly handle `ApplicationAttemptNotFoundException`
Taken together those don't add up to much advantage, but it's not nothing.
The bigger reason would be less JAR hell to deal with and being able to access
new APIs more freely.
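
For context, here is a minimal sketch of the kind of reflective lookup that these cleanups would make unnecessary. It is not Spark's actual code (which is Scala); `ReflectiveAccess`, `callIfPresent`, `NewSplit` and `OldSplit` are hypothetical names used only for illustration, and the stand-in classes merely mimic a type that gained a method in a newer library version.

```java
import java.lang.reflect.Method;
import java.util.Optional;

public class ReflectiveAccess {
    // Invoke a public zero-argument method by name if the runtime class has it.
    // Returns Optional.empty() when the method is missing, as it would be
    // when running against an older version of the library.
    static Optional<Object> callIfPresent(Object target, String methodName) {
        try {
            Method m = target.getClass().getMethod(methodName);
            return Optional.ofNullable(m.invoke(target));
        } catch (NoSuchMethodException e) {
            return Optional.empty();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical stand-in for a split type from a newer library version.
    public static class NewSplit {
        public String[] getLocationInfo() {
            return new String[] {"host1"};
        }
    }

    // Hypothetical stand-in for the same type from an older version,
    // which lacks the method entirely.
    public static class OldSplit {
    }
}
```

With a newer minimum version assumed, the caller could invoke the method directly and drop both the lookup and the fallback branch.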