Github user cmccabe commented on the pull request:
https://github.com/apache/spark/pull/850#issuecomment-44374139
So as I commented on the JIRA, using the public API here fixes
Spark-on-YARN for CDH 5.1.0. That, plus the fact that this fixes the trunk
(Hadoop 3.0.0) build, is probably a good enough reason to do this, even if we
didn't care about the deprecation issues. If we don't have this in Spark 1.0,
it's just not going to work against CDH 5.1.0.
Also, all the YARN people I talked to were upset that we were using
YarnClientImpl: that class is not supposed to be used by outsiders at all. It
would be package-private if it could be, but Hadoop runs into the old Java
limitation that classes from a single project spread across multiple packages
can't share package-private visibility, so the class ends up public even
though it's internal.
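For reference, the supported alternative is the public factory on
org.apache.hadoop.yarn.client.api.YarnClient rather than constructing
YarnClientImpl directly. A minimal sketch (illustrative only, not the actual
diff in this PR; it assumes Hadoop on the classpath and a reachable
ResourceManager for the submission step):

```scala
import org.apache.hadoop.yarn.client.api.YarnClient
import org.apache.hadoop.yarn.conf.YarnConfiguration

// Obtain a client through the stable public API instead of
// `new YarnClientImpl()`, which is an internal class.
val yarnClient = YarnClient.createYarnClient()
yarnClient.init(new YarnConfiguration())
yarnClient.start()

// From here the public API exposes the same operations the internal
// class was being used for, e.g. creating a new application:
// val newApp = yarnClient.createApplication()
```

Because createYarnClient() returns the abstract YarnClient type, callers are
insulated from whatever YarnClientImpl does internally, which is exactly the
compatibility boundary the YARN folks want outside projects to stay behind.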
I still have not managed to figure out why this change is required to get
CDH 5.1.0 going. I was pursuing a theory today that turned out to be wrong. I
also verified that both Hadoop 2.4 and a pre-release version of Hadoop 2.5 that
I built work without this fix.
Tomorrow I'm going to do a closer comparison of the working versus
non-working versions. CDH 5.1.0 should be pretty close to the pre-release 2.5,
so that might be a good place for me to look for differences. Maybe turning up
log4j verbosity on YARN will reveal something.