[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-28 Thread Sherry302
Github user Sherry302 commented on the issue:

https://github.com/apache/spark/pull/14659
  
Hi, @tgravescs 
[SPARK-17714](https://issues.apache.org/jira/browse/SPARK-17714) has been 
created for further investigation.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-28 Thread Sherry302
Github user Sherry302 commented on the issue:

https://github.com/apache/spark/pull/14659
  
Thanks @tgravescs yes. I have created a PR 
[15286](https://github.com/apache/spark/pull/15286). 





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-28 Thread tgravescs
Github user tgravescs commented on the issue:

https://github.com/apache/spark/pull/14659
  
It's there to make sure everyone is using the same classloader and to handle 
the case where classloaders are chained. I'm not really familiar with all the 
scenarios of the REPL. I see the Suite itself is getting the classloader and 
loading some things based on URI.

For now, can you put up a patch that switches to Class.forName? We can file a 
follow-up JIRA to investigate further.





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-28 Thread Sherry302
Github user Sherry302 commented on the issue:

https://github.com/apache/spark/pull/14659
  
@tgravescs @srowen Thanks. Using `Class.forName`, which by default uses 
`this.getClass().getClassLoader()`, makes all the tests pass (both sbt and 
Maven). However, there must be some reason we prefer `Utils.classForName` 
instead. Do you have any suggestions?
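
For readers following the classloader question, here is a minimal sketch of the two lookup styles being compared. It assumes that `Utils.classForName` resolves classes through the thread's context class loader with a fallback; that is an assumption about Spark's helper, and `ClassLoadingSketch` and its method names are hypothetical, not Spark code.

```scala
object ClassLoadingSketch {
  // Hypothetical stand-in for a context-aware lookup: prefer the thread's
  // context class loader, fall back to the loader that defined this object.
  def contextOrDefaultLoader: ClassLoader =
    Option(Thread.currentThread().getContextClassLoader)
      .getOrElse(getClass.getClassLoader)

  // Resolve through the context loader, which may be a chained or REPL loader.
  def forNameWithContextLoader(name: String): Class[_] =
    Class.forName(name, true, contextOrDefaultLoader)

  // Resolve through the loader that defined the calling class.
  def forNameWithCallerLoader(name: String): Class[_] =
    Class.forName(name)
}
```

For a JDK class both calls return the same `Class` object; they can differ only when the context loader and the caller's loader see different class definitions, which is the situation suspected in the REPL tests.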





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-28 Thread tgravescs
Github user tgravescs commented on the issue:

https://github.com/apache/spark/pull/14659
  
Yeah, it's definitely caused by the call in Task to 
Utils.CallerContext.setCurrent(). It seems to come from using the 
ExecutorClassLoader to check for the class on the executor, I guess because 
ExecutorClassLoader overrides findClass, which is called from loadClass, 
which is called from forName. There must be something slightly different in 
the MessageMatcher, which I guess is generated on the fly by netty. I guess 
loading the remote one must mismatch with the local one.

@srowen do you think we should revert this while we figure it out?
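
The `forName` -> `loadClass` -> `findClass` chain described above can be observed with a small sketch. `RecordingLoader` is a hypothetical loader written for illustration; it is not Spark's ExecutorClassLoader and does not fetch anything remotely. It only records which names reach `findClass` after parent delegation fails.

```scala
import scala.collection.mutable.ListBuffer

object FindClassSketch {
  // Hypothetical loader that records every class name reaching findClass.
  final class RecordingLoader(parent: ClassLoader) extends ClassLoader(parent) {
    val requested = ListBuffer.empty[String]

    // loadClass calls this only after the parent loader could not find the class.
    override protected def findClass(name: String): Class[_] = {
      requested += name
      throw new ClassNotFoundException(name)
    }
  }

  // Returns whether the class was found and which names were passed to findClass.
  def lookup(name: String): (Boolean, List[String]) = {
    val loader = new RecordingLoader(getClass.getClassLoader)
    val found =
      try { Class.forName(name, false, loader); true }
      catch { case _: ClassNotFoundException => false }
    (found, loader.requested.toList)
  }
}
```

A class the parent can resolve never reaches `findClass`; a name the parent cannot resolve does, which is the point at which a loader like the one discussed here would go looking elsewhere.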





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-28 Thread Sherry302
Github user Sherry302 commented on the issue:

https://github.com/apache/spark/pull/14659
  
Hi, @tgravescs @srowen 
Here is an intermediate update: if we use `Class.forName` instead of 
`Utils.classForName`, the Maven build and all of the tests pass.

```
  def setCurrentContext(): Boolean = {
    var succeed = false
    try {
      // scalastyle:off classforname
      val callerContext = Class.forName("org.apache.hadoop.ipc.CallerContext")
      val Builder = Class.forName("org.apache.hadoop.ipc.CallerContext$Builder")
      // scalastyle:on classforname
      val builderInst = Builder.getConstructor(classOf[String]).newInstance(context)
      val hdfsContext = Builder.getMethod("build").invoke(builderInst)
      callerContext.getMethod("setCurrent", callerContext).invoke(null, hdfsContext)
      succeed = true
    } catch {
      case NonFatal(e) => logInfo("Fail to set Spark caller context", e)
    }
    succeed
  }
```
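
The same reflective builder pattern can be tried without Hadoop on the classpath. The following is only a sketch: `ReflectiveBuilderSketch` and `buildViaReflection` are hypothetical names, and `java.lang.StringBuilder` stands in for `CallerContext$Builder` purely because it also has a public `(String)` constructor and a no-argument method that returns the built value.

```scala
import scala.util.control.NonFatal

object ReflectiveBuilderSketch {
  // Look up a builder class by name, construct it from a String, and invoke
  // its no-arg build method. Returns None if any reflective step fails.
  def buildViaReflection(
      builderClassName: String,
      arg: String,
      buildMethod: String): Option[AnyRef] =
    try {
      val builderClass = Class.forName(builderClassName)
      val builder =
        builderClass.getConstructor(classOf[String]).newInstance(arg).asInstanceOf[AnyRef]
      Option(builderClass.getMethod(buildMethod).invoke(builder))
    } catch {
      // NonFatal matches ClassNotFoundException and NoSuchMethodException,
      // but not LinkageError subclasses such as ClassCircularityError.
      case NonFatal(_) => None
    }
}
```

One thing to keep in mind: because `NonFatal` does not match `LinkageError`, a `ClassCircularityError` raised during class loading would not be absorbed by a `case NonFatal(e)` handler like the one in the snippet above.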





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-28 Thread Sherry302
Github user Sherry302 commented on the issue:

https://github.com/apache/spark/pull/14659
  
@tgravescs @srowen Sorry for the failure. I am looking into it.





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-28 Thread tgravescs
Github user tgravescs commented on the issue:

https://github.com/apache/spark/pull/14659
  
This is a new feature, so we generally don't put it into point releases, 
which are for bug fixes.

@srowen thanks for pointing that out, I will take a look.





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-28 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14659
  
@Sherry302 @tgravescs oops it looks like this causes master Maven builds to 
fail:

https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/

https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.3/2000/consoleFull

But not the SBT ones, weird. And it fails REPL tests, with an odd error. I 
don't know what's actually going on there.

```
Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.ClassCircularityError: io/netty/util/internal/__matchers__/org/apache/spark/network/protocol/MessageMatcher
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:62)
    at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:54)
    at io.netty.util.internal.TypeParameterMatcher.get(TypeParameterMatcher.java:42)
    at io.netty.util.internal.TypeParameterMatcher.find(TypeParameterMatcher.java:78)
    at io.netty.handler.codec.MessageToMessageEncoder.<init>(MessageToMessageEncoder.java:59)
    at org.apache.spark.network.protocol.MessageEncoder.<init>(MessageEncoder.java:34)
    at org.apache.spark.network.TransportContext.<init>(TransportContext.java:78)
    at org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:354)
    at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:324)






[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-27 Thread Sherry302
Github user Sherry302 commented on the issue:

https://github.com/apache/spark/pull/14659
  
Hi, @tgravescs Should we also commit this PR to Branch-2? Thanks.





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-27 Thread Sherry302
Github user Sherry302 commented on the issue:

https://github.com/apache/spark/pull/14659
  
Thanks a lot for the review. @tgravescs @cnauroth @steveloughran @srowen 





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-27 Thread tgravescs
Github user tgravescs commented on the issue:

https://github.com/apache/spark/pull/14659
  
thanks @Sherry302 





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-27 Thread tgravescs
Github user tgravescs commented on the issue:

https://github.com/apache/spark/pull/14659
  
+1





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14659
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14659
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65934/
Test PASSed.





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14659
  
**[Test build #65934 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65934/consoleFull)**
 for PR 14659 at commit 
[`dbcabfc`](https://github.com/apache/spark/commit/dbcabfc3ff0d14c1a0a77daddc7751a77ec6d241).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14659
  
**[Test build #65934 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65934/consoleFull)**
 for PR 14659 at commit 
[`dbcabfc`](https://github.com/apache/spark/commit/dbcabfc3ff0d14c1a0a77daddc7751a77ec6d241).





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-26 Thread Sherry302
Github user Sherry302 commented on the issue:

https://github.com/apache/spark/pull/14659
  
Hi, @tgravescs Thanks a lot for the comments. I have updated the PR to 
rename local vals and remove the `@since` in `Utils.scala`.





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-26 Thread tgravescs
Github user tgravescs commented on the issue:

https://github.com/apache/spark/pull/14659
  
A couple of minor things; otherwise looks good.





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14659
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14659
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65799/
Test PASSed.





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14659
  
**[Test build #65799 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65799/consoleFull)**
 for PR 14659 at commit 
[`47de8a2`](https://github.com/apache/spark/commit/47de8a2a9e1640e0ea942d1a689150d7b7a66c10).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-22 Thread Sherry302
Github user Sherry302 commented on the issue:

https://github.com/apache/spark/pull/14659
  
Hi, @tgravescs Thank you very much. Yes. I have updated the PR to make the 
string of the caller context shorter. 





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14659
  
**[Test build #65799 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65799/consoleFull)**
 for PR 14659 at commit 
[`47de8a2`](https://github.com/apache/spark/commit/47de8a2a9e1640e0ea942d1a689150d7b7a66c10).





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14659
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14659
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65677/
Test PASSed.





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14659
  
**[Test build #65677 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65677/consoleFull)**
 for PR 14659 at commit 
[`10dbc6f`](https://github.com/apache/spark/commit/10dbc6f26ac7d224803b721f32a9a0b4306e1f47).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-20 Thread Sherry302
Github user Sherry302 commented on the issue:

https://github.com/apache/spark/pull/14659
  
Hi, @tgravescs Thank you so much for the review. I have updated the PR to 
address each of your comments.

The only question left is this one (in `Task`): "are these params all 
optional just to make it easier for different task types?" I have replied to 
it. Could you check it again and give your opinion?

To make the caller context more readable, at commit 
[10dbc6f](https://github.com/apache/spark/commit/10dbc6f26ac7d224803b721f32a9a0b4306e1f47),
 I added back the static strings `AttemptId` (for stage, task, and app), which 
had been deleted at commit 
[748e7a9](https://github.com/apache/spark/commit/748e7a9b6f6fe928df9e49f8e020d02126123be8).

Yes, this PR sets up the caller context for both HDFS and YARN. At the very 
beginning, to make the review easier, I created two different JIRAs to set up 
caller contexts for HDFS (SPARK-16757) and YARN (SPARK-16758), although the 
code is the same. I have updated the JIRAs, the title of this PR, and the 
description of this PR. In the "How was this patch tested" section of the PR's 
description, you can see what shows up in the HDFS hdfs-audit.log and the YARN 
RM audit log.

When the Hadoop CallerContext API is invoked in the YARN client, the caller 
context (`SPARK_CLIENT` with the AppId only) is written to both the HDFS audit 
log and the YARN RM audit log.
In hdfs-audit.log:
```
2016-09-20 11:54:24,116 INFO FSNamesystem.audit: allowed=true ugi=wyang (auth:SIMPLE) ip=/127.0.0.1 cmd=open src=/lr_big.txt dst=null perm=null proto=rpc callerContext=SPARK_CLIENT_AppId_application_1474394339641_0005
```
In Yarn RM log:
```
2016-09-20 11:59:24,050 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=wyang IP=127.0.0.1 OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1474394339641_0006 CALLERCONTEXT=SPARK_CLIENT_AppId_application_1474394339641_0006
```
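
As an illustration of how the strings in these logs are laid out, here is a minimal sketch that assembles a caller context from optional fields. The object name, method name, and parameter names are hypothetical and are not taken from the PR; only the field labels (`AppId`, `AttemptId`, `JobId`, `StageId`, `TaskId`, `AttemptNum`) and their order are copied from the log lines in this comment.

```scala
object CallerContextStringSketch {
  // Every field except the origin is optional, so a client can supply only
  // an application id while a task supplies the full set of identifiers.
  def build(
      from: String,
      appId: Option[String] = None,
      appAttemptId: Option[String] = None,
      jobId: Option[Int] = None,
      stageId: Option[Int] = None,
      stageAttemptId: Option[Int] = None,
      taskId: Option[Long] = None,
      taskAttemptNumber: Option[Int] = None): String = {
    val parts: Seq[Option[String]] = Seq(
      appId.map("_AppId_" + _),
      appAttemptId.map("_AttemptId_" + _),
      jobId.map("_JobId_" + _),
      stageId.map("_StageId_" + _),
      stageAttemptId.map("_AttemptId_" + _),
      taskId.map("_TaskId_" + _),
      taskAttemptNumber.map("_AttemptNum_" + _)
    )
    // Absent fields are skipped, so the string stays short for clients.
    "SPARK_" + from + parts.flatten.mkString
  }
}
```

With only an application id, `build("CLIENT", Some("application_1474394339641_0005"))` yields the same `SPARK_CLIENT_AppId_...` value that appears in the hdfs-audit.log line above.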
Also, I have tested this with multiple tasks running in the same executor. 
Take `application_1474394339641_0006` as an example.

The command line I used to run the test is as follows:
```
./bin/spark-submit --verbose --executor-cores 3 --num-executors 1 --master yarn \
  --deploy-mode client --class org.apache.spark.examples.SparkKMeans \
  examples/target/original-spark-examples_2.11-2.1.0-SNAPSHOT.jar \
  hdfs://localhost:9000/lr_big.txt 2 5
```
On the Spark History application page, you can see two executors (one is the 
driver); the executor ran 46 tasks:
https://cloud.githubusercontent.com/assets/8546874/18686920/a2617e70-7f32-11e6-947e-dfe83c4185e3.png
The HDFS audit log has 46 task records:
```
2016-09-20 11:59:33,868 INFO FSNamesystem.audit: allowed=true ugi=wyang (auth:SIMPLE) ip=/127.0.0.1 cmd=mkdirs src=/private/tmp/hadoop-wyang/nm-local-dir/usercache/wyang/appcache/application_1474394339641_0006/container_1474394339641_0006_01_01/spark-warehouse dst=null perm=wyang:supergroup:rwxr-xr-x proto=rpc callerContext=SPARK_APPLICATION_MASTER_AppId_application_1474394339641_0006_AttemptId_1
2016-09-20 11:59:37,214 INFO FSNamesystem.audit: allowed=true ugi=wyang (auth:SIMPLE) ip=/127.0.0.1 cmd=open src=/lr_big.txt dst=null perm=null proto=rpc callerContext=SPARK_TASK_AppId_application_1474394339641_0006_AttemptId_1_JobId_0_StageId_0_AttemptId_0_TaskId_1_AttemptNum_0
2016-09-20 11:59:37,215 INFO FSNamesystem.audit: allowed=true ugi=wyang (auth:SIMPLE) ip=/127.0.0.1 cmd=open src=/lr_big.txt dst=null perm=null proto=rpc callerContext=SPARK_TASK_AppId_application_1474394339641_0006_AttemptId_1_JobId_0_StageId_0_AttemptId_0_TaskId_2_AttemptNum_0
2016-09-20 11:59:37,215 INFO FSNamesystem.audit: allowed=true ugi=wyang (auth:SIMPLE) ip=/127.0.0.1 cmd=open src=/lr_big.txt dst=null perm=null proto=rpc callerContext=SPARK_TASK_AppId_application_1474394339641_0006_AttemptId_1_JobId_0_StageId_0_AttemptId_0_TaskId_0_AttemptNum_0
2016-09-20 11:59:42,391 INFO FSNamesystem.audit: allowed=true ugi=wyang (auth:SIMPLE) ip=/127.0.0.1 cmd=open src=/lr_big.txt dst=null perm=null proto=rpc callerContext=SPARK_TASK_AppId_application_1474394339641_0006_AttemptId_1_JobId_0_StageId_0_AttemptId_0_TaskId_3_AttemptNum_0
2016-09-20 11:59:42,432 INFO FSNamesystem.audit: allowed=true ugi=wyang (auth:SIMPLE) ip=/127.0.0.1 cmd=open src=/lr_big.txt dst=null perm=null proto=rpc callerContext=SPARK_TASK_AppId_application_1474394339641_0006_AttemptId_1_JobId_0_StageId_0_AttemptId_0_TaskId_4_AttemptNum_0
2016-09-20 11:59:42,445 INFO FSNamesystem.audit: allowed=true ugi=wyang (auth:SIMPLE)
```

[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14659
  
**[Test build #65677 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65677/consoleFull)**
 for PR 14659 at commit 
[`10dbc6f`](https://github.com/apache/spark/commit/10dbc6f26ac7d224803b721f32a9a0b4306e1f47).

