Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/15377#discussion_r83340292
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2474,25 +2478,42 @@ private[spark] class CallerContext(
val context = "SPARK_" + from + appIdStr + appAttemptIdStr +
jobIdStr + stageIdStr + stageAttemptIdStr + taskIdStr +
taskAttemptNumberStr
+ lazy val conf = new Configuration
+
/**
* Set up the caller context [[context]] by invoking Hadoop CallerContext API of
* [[org.apache.hadoop.ipc.CallerContext]], which was added in hadoop 2.8.
*/
def setCurrentContext(): Boolean = {
- var succeed = false
- try {
- // scalastyle:off classforname
- val callerContext = Class.forName("org.apache.hadoop.ipc.CallerContext")
- val Builder = Class.forName("org.apache.hadoop.ipc.CallerContext$Builder")
- // scalastyle:on classforname
- val builderInst = Builder.getConstructor(classOf[String]).newInstance(context)
- val hdfsContext = Builder.getMethod("build").invoke(builderInst)
- callerContext.getMethod("setCurrent", callerContext).invoke(null, hdfsContext)
- succeed = true
- } catch {
- case NonFatal(e) => logInfo("Fail to set Spark caller context", e)
+ if (!CallerContext.callerContextSupported) {
--- End diff --
If this flag cannot fully prevent the code from being re-executed, and in the
worst case all threads will execute the same logic again, is it really
necessary to add it? To me this is a kind of nondeterminism that will confuse
users (some tasks print the log while others do not).
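
One possible way to avoid the nondeterminism, sketched here under assumptions and not taken from the pull request: put the check in a `lazy val`, which Scala initializes at most once under a lock, so the probe runs once and the failure is logged once no matter how many threads ask. The names `CallerContextSupport`, `supported`, and `probeCount` are hypothetical, and `println` stands in for the logging call.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.util.control.NonFatal

// Hypothetical stand-in for the companion object referenced in the diff.
object CallerContextSupport {
  // Counts how often the probe body runs; for illustration only.
  val probeCount = new AtomicInteger(0)

  // A lazy val is initialized at most once, under a lock, so concurrent
  // callers all see the same result and the failure is reported once.
  lazy val supported: Boolean = {
    probeCount.incrementAndGet()
    try {
      Class.forName("org.apache.hadoop.ipc.CallerContext")
      Class.forName("org.apache.hadoop.ipc.CallerContext$Builder")
      true
    } catch {
      case NonFatal(e) =>
        // Stand-in for logInfo; printed by the single initializing thread.
        println(s"Hadoop CallerContext API is not available: $e")
        false
    }
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    // Several threads query the flag concurrently.
    val threads = (1 to 8).map { _ =>
      new Thread(new Runnable {
        def run(): Unit = { CallerContextSupport.supported; () }
      })
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
    println(s"probe ran ${CallerContextSupport.probeCount.get} time(s)")
  }
}
```

With this shape every task sees the same answer, and either one log line is printed for the whole JVM or none is.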