[jira] [Created] (SPARK-20172) Event log without read permission should be filtered out before actually reading it

2017-03-31 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-20172:
---

 Summary: Event log without read permission should be filtered out 
before actually reading it
 Key: SPARK-20172
 URL: https://issues.apache.org/jira/browse/SPARK-20172
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 2.2.0
Reporter: Saisai Shao
Priority: Minor


In the current Spark HistoryServer, we expect to check file permissions when 
listing all the files and to filter out the files with no read permission. That 
does not work because we never actually check the access permission, so the 
permission check is deferred until the files are read. That is unnecessary, and 
the resulting exception is printed every 10 seconds by default.

To avoid this problem we should add an access check to the file-listing logic.
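
A minimal sketch of that idea (assuming the Hadoop {{FileSystem.access}} API; this 
is an illustration, not necessarily the actual patch): skip unreadable event logs 
while listing, instead of failing later when they are read.

{code}
import org.apache.hadoop.fs.{FileStatus, FileSystem, Path}
import org.apache.hadoop.fs.permission.FsAction
import org.apache.hadoop.security.AccessControlException

// Return only the event log files the history server user is allowed to read.
def readableLogs(fs: FileSystem, logDir: Path): Seq[FileStatus] = {
  fs.listStatus(logDir).toSeq.filter { status =>
    try {
      // Throws AccessControlException when the current user lacks read permission.
      fs.access(status.getPath, FsAction.READ)
      true
    } catch {
      case _: AccessControlException => false // filtered out, never opened
    }
  }
}
{code}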



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-20128) MetricsSystem not always killed in SparkContext.stop()

2017-03-28 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15946493#comment-15946493
 ] 

Saisai Shao commented on SPARK-20128:
-

Sorry, I cannot access the logs. What I can see from the link provided above is:

{noformat}
[info] - internal accumulators in multiple stages (185 milliseconds)
3/24/17 2:02:19 PM =

-- Gauges --
master.aliveWorkers
3/24/17 2:22:19 PM =

-- Gauges --
master.aliveWorkers
3/24/17 2:42:19 PM =

-- Gauges --
master.aliveWorkers
3/24/17 3:02:19 PM =

-- Gauges --
master.aliveWorkers
3/24/17 3:22:19 PM =

-- Gauges --
master.aliveWorkers
3/24/17 3:42:19 PM =

-- Gauges --
master.aliveWorkers
3/24/17 4:02:19 PM =

-- Gauges --
master.aliveWorkers
3/24/17 4:22:19 PM =

-- Gauges --
master.aliveWorkers
3/24/17 4:42:19 PM =

-- Gauges --
master.aliveWorkers
3/24/17 5:02:19 PM =

-- Gauges --
master.aliveWorkers
3/24/17 5:22:19 PM =

-- Gauges --
{noformat}

From the console output, what I can see is that after the {{internal 
accumulators in multiple stages}} unit test finishes, the whole test run hangs 
and just keeps printing some metrics information.

> MetricsSystem not always killed in SparkContext.stop()
> --
>
> Key: SPARK-20128
> URL: https://issues.apache.org/jira/browse/SPARK-20128
> Project: Spark
>  Issue Type: Test
>  Components: Spark Core, Tests
>Affects Versions: 2.2.0
>Reporter: Imran Rashid
>  Labels: flaky-test
>
> One Jenkins run failed due to the MetricsSystem never getting killed after a 
> failed test, which led that test to hang and the tests to timeout:
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75176
> {noformat}
> 17/03/24 13:44:19.537 dag-scheduler-event-loop ERROR 
> DAGSchedulerEventProcessLoop: DAGSchedulerEventProcessLoop failed; shutting 
> down SparkContext
> java.lang.ArrayIndexOutOfBoundsException: -1
> at 
> org.apache.spark.MapOutputTrackerMaster$$anonfun$getEpochForMapOutput$1.apply(MapOutputTracker.scala:431)
> at 
> org.apache.spark.MapOutputTrackerMaster$$anonfun$getEpochForMapOutput$1.apply(MapOutputTracker.scala:430)
> at scala.Option.flatMap(Option.scala:171)
> at 
> org.apache.spark.MapOutputTrackerMaster.getEpochForMapOutput(MapOutputTracker.scala:430)
> at 
> org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1298)
> at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1731)
> at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1689)
> at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1678)
> at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> 17/03/24 13:44:19.540 dispatcher-event-loop-11 INFO 
> MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
> 17/03/24 13:44:19.546 stop-spark-context INFO MemoryStore: MemoryStore cleared
> 17/03/24 13:44:19.546 stop-spark-context INFO BlockManager: BlockManager 
> stopped
> 17/03/24 13:44:19.546 stop-spark-context INFO BlockManagerMaster: 
> BlockManagerMaster stopped
> 17/03/24 13:44:19.546 dispatcher-event-loop-16 INFO 
> OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: 
> OutputCommitCoordinator stopped!
> 17/03/24 13:44:19.547 stop-spark-context INFO SparkContext: Successfully 
> 

[jira] [Comment Edited] (SPARK-20128) MetricsSystem not always killed in SparkContext.stop()

2017-03-28 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15946457#comment-15946457
 ] 

Saisai Shao edited comment on SPARK-20128 at 3/29/17 3:20 AM:
--

Here the exception is from MasterSource, which only exists in the Standalone 
Master, so I think it should not be related to SparkContext; maybe the Master is 
not cleanly stopped. --Also, as I remember, by default we do not enable 
ConsoleReporter, so I am not sure how this could happen.--

Looks like we have a metrics properties file in the test resources; that is why 
the console sink is enabled in the unit tests.


was (Author: jerryshao):
Here the exception is from MasterSource, which only exists in Standalone 
Master, I think it should not be related to SparkContext, may be the Master is 
not cleanly stopped. Also as I remembered by default we will not enable 
ConsoleReporter, not sure how this could be happened.

> MetricsSystem not always killed in SparkContext.stop()
> --
>
> Key: SPARK-20128
> URL: https://issues.apache.org/jira/browse/SPARK-20128
> Project: Spark
>  Issue Type: Test
>  Components: Spark Core, Tests
>Affects Versions: 2.2.0
>Reporter: Imran Rashid
>  Labels: flaky-test
>
> One Jenkins run failed due to the MetricsSystem never getting killed after a 
> failed test, which led that test to hang and the tests to timeout:
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75176
> {noformat}
> 17/03/24 13:44:19.537 dag-scheduler-event-loop ERROR 
> DAGSchedulerEventProcessLoop: DAGSchedulerEventProcessLoop failed; shutting 
> down SparkContext
> java.lang.ArrayIndexOutOfBoundsException: -1
> at 
> org.apache.spark.MapOutputTrackerMaster$$anonfun$getEpochForMapOutput$1.apply(MapOutputTracker.scala:431)
> at 
> org.apache.spark.MapOutputTrackerMaster$$anonfun$getEpochForMapOutput$1.apply(MapOutputTracker.scala:430)
> at scala.Option.flatMap(Option.scala:171)
> at 
> org.apache.spark.MapOutputTrackerMaster.getEpochForMapOutput(MapOutputTracker.scala:430)
> at 
> org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1298)
> at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1731)
> at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1689)
> at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1678)
> at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> 17/03/24 13:44:19.540 dispatcher-event-loop-11 INFO 
> MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
> 17/03/24 13:44:19.546 stop-spark-context INFO MemoryStore: MemoryStore cleared
> 17/03/24 13:44:19.546 stop-spark-context INFO BlockManager: BlockManager 
> stopped
> 17/03/24 13:44:19.546 stop-spark-context INFO BlockManagerMaster: 
> BlockManagerMaster stopped
> 17/03/24 13:44:19.546 dispatcher-event-loop-16 INFO 
> OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: 
> OutputCommitCoordinator stopped!
> 17/03/24 13:44:19.547 stop-spark-context INFO SparkContext: Successfully 
> stopped SparkContext
> 17/03/24 14:02:19.934 metrics-console-reporter-1-thread-1 ERROR 
> ScheduledReporter: RuntimeException thrown from ConsoleReporter#report. 
> Exception was suppressed.
> java.lang.NullPointerException
> at 
> org.apache.spark.deploy.master.MasterSource$$anon$2.getValue(MasterSource.scala:35)
> at 
> org.apache.spark.deploy.master.MasterSource$$anon$2.getValue(MasterSource.scala:34)
> at 
> com.codahale.metrics.ConsoleReporter.printGauge(ConsoleReporter.java:239)
> ...
> {noformat}
> unfortunately I didn't save the entire test logs, but what happens is the 
> initial IndexOutOfBoundsException is a real bug, which causes the 
> SparkContext to stop, and the test to fail.  However, the MetricsSystem 
> somehow stays alive, and since its not a daemon thread, it just hangs, and 
> every 20 mins we get that NPE from within the metrics system as it tries to 
> report.
> I am totally perplexed at how this can happen, it looks like the metric 
> system should always get stopped by the time we see
> {noformat}
> 17/03/24 13:44:19.547 stop-spark-context INFO SparkContext: Successfully 
> stopped SparkContext
> {noformat}
> I don't think I've ever seen this in a real spark use, but it doesn't look 
> like something which is limited to tests, whatever the cause.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-20128) MetricsSystem not always killed in SparkContext.stop()

2017-03-28 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15946457#comment-15946457
 ] 

Saisai Shao commented on SPARK-20128:
-

Here the exception is from MasterSource, which only exists in the Standalone 
Master, so I think it should not be related to SparkContext; maybe the Master is 
not cleanly stopped. Also, as I remember, by default we do not enable 
ConsoleReporter, so I am not sure how this could happen.

> MetricsSystem not always killed in SparkContext.stop()
> --
>
> Key: SPARK-20128
> URL: https://issues.apache.org/jira/browse/SPARK-20128
> Project: Spark
>  Issue Type: Test
>  Components: Spark Core, Tests
>Affects Versions: 2.2.0
>Reporter: Imran Rashid
>  Labels: flaky-test
>
> One Jenkins run failed due to the MetricsSystem never getting killed after a 
> failed test, which led that test to hang and the tests to timeout:
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75176
> {noformat}
> 17/03/24 13:44:19.537 dag-scheduler-event-loop ERROR 
> DAGSchedulerEventProcessLoop: DAGSchedulerEventProcessLoop failed; shutting 
> down SparkContext
> java.lang.ArrayIndexOutOfBoundsException: -1
> at 
> org.apache.spark.MapOutputTrackerMaster$$anonfun$getEpochForMapOutput$1.apply(MapOutputTracker.scala:431)
> at 
> org.apache.spark.MapOutputTrackerMaster$$anonfun$getEpochForMapOutput$1.apply(MapOutputTracker.scala:430)
> at scala.Option.flatMap(Option.scala:171)
> at 
> org.apache.spark.MapOutputTrackerMaster.getEpochForMapOutput(MapOutputTracker.scala:430)
> at 
> org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1298)
> at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1731)
> at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1689)
> at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1678)
> at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> 17/03/24 13:44:19.540 dispatcher-event-loop-11 INFO 
> MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
> 17/03/24 13:44:19.546 stop-spark-context INFO MemoryStore: MemoryStore cleared
> 17/03/24 13:44:19.546 stop-spark-context INFO BlockManager: BlockManager 
> stopped
> 17/03/24 13:44:19.546 stop-spark-context INFO BlockManagerMaster: 
> BlockManagerMaster stopped
> 17/03/24 13:44:19.546 dispatcher-event-loop-16 INFO 
> OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: 
> OutputCommitCoordinator stopped!
> 17/03/24 13:44:19.547 stop-spark-context INFO SparkContext: Successfully 
> stopped SparkContext
> 17/03/24 14:02:19.934 metrics-console-reporter-1-thread-1 ERROR 
> ScheduledReporter: RuntimeException thrown from ConsoleReporter#report. 
> Exception was suppressed.
> java.lang.NullPointerException
> at 
> org.apache.spark.deploy.master.MasterSource$$anon$2.getValue(MasterSource.scala:35)
> at 
> org.apache.spark.deploy.master.MasterSource$$anon$2.getValue(MasterSource.scala:34)
> at 
> com.codahale.metrics.ConsoleReporter.printGauge(ConsoleReporter.java:239)
> ...
> {noformat}
> unfortunately I didn't save the entire test logs, but what happens is the 
> initial IndexOutOfBoundsException is a real bug, which causes the 
> SparkContext to stop, and the test to fail.  However, the MetricsSystem 
> somehow stays alive, and since its not a daemon thread, it just hangs, and 
> every 20 mins we get that NPE from within the metrics system as it tries to 
> report.
> I am totally perplexed at how this can happen, it looks like the metric 
> system should always get stopped by the time we see
> {noformat}
> 17/03/24 13:44:19.547 stop-spark-context INFO SparkContext: Successfully 
> stopped SparkContext
> {noformat}
> I don't think I've ever seen this in a real spark use, but it doesn't look 
> like something which is limited to tests, whatever the cause.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-20079) Re registration of AM hangs spark cluster in yarn-client mode

2017-03-28 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15944629#comment-15944629
 ] 

Saisai Shao commented on SPARK-20079:
-

What is the specific symptom you hit? I believe there are a bunch of corner cases 
regarding the RPC back and forth in the yarn-client + AM reattempt scenario, and 
these scenarios can be quite hard to fix, so I would usually suggest setting the 
max attempts to 1 in yarn-client mode.
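
For illustration, a minimal sketch of that suggestion using the standard 
{{spark.yarn.maxAppAttempts}} config (it can equally be passed via {{--conf}} on 
spark-submit):

{code}
import org.apache.spark.{SparkConf, SparkContext}

// Fail fast instead of letting YARN re-register a second AM attempt in yarn-client mode.
val conf = new SparkConf()
  .setAppName("yarn-client-app")
  .set("spark.yarn.maxAppAttempts", "1")
val sc = new SparkContext(conf)
{code}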

> Re registration of AM hangs spark cluster in yarn-client mode
> -
>
> Key: SPARK-20079
> URL: https://issues.apache.org/jira/browse/SPARK-20079
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.1.0
>Reporter: Guoqiang Li
>
> 1. Start cluster
> echo -e "sc.parallelize(1 to 2000).foreach(_ => Thread.sleep(1000))" | 
> ./bin/spark-shell  --master yarn-client --executor-cores 1 --conf 
> spark.shuffle.service.enabled=true --conf 
> spark.dynamicAllocation.enabled=true --conf 
> spark.dynamicAllocation.maxExecutors=2 
> 2.  Kill the AM process when a stage is scheduled. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19143) API in Spark for distributing new delegation tokens (Improve delegation token handling in secure clusters)

2017-03-27 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15943289#comment-15943289
 ] 

Saisai Shao commented on SPARK-19143:
-

Thanks [~tgraves], let me see how to propose a SPIP.

> API in Spark for distributing new delegation tokens (Improve delegation token 
> handling in secure clusters)
> --
>
> Key: SPARK-19143
> URL: https://issues.apache.org/jira/browse/SPARK-19143
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core, YARN
>Affects Versions: 2.0.2, 2.1.0
>Reporter: Ruslan Dautkhanov
>
> Spin off from SPARK-14743 and comments chain in [recent comments| 
> https://issues.apache.org/jira/browse/SPARK-5493?focusedCommentId=15802179=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15802179]
>  in SPARK-5493.
> Spark currently doesn't have a way for distribution new delegation tokens. 
> Quoting [~vanzin] from SPARK-5493 
> {quote}
> IIRC Livy doesn't yet support delegation token renewal. Once it reaches the 
> TTL, the session is unusable.
> There might be ways to hack support for that without changes in Spark, but 
> I'd like to see a proper API in Spark for distributing new delegation tokens. 
> I mentioned that in SPARK-14743, but although that bug is closed, that 
> particular feature hasn't been implemented yet.
> {quote}
> Other thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19143) API in Spark for distributing new delegation tokens (Improve delegation token handling in secure clusters)

2017-03-27 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15943303#comment-15943303
 ] 

Saisai Shao commented on SPARK-19143:
-

[~tgraves], can I add you as a *SPIP Shepherd*? :)

> API in Spark for distributing new delegation tokens (Improve delegation token 
> handling in secure clusters)
> --
>
> Key: SPARK-19143
> URL: https://issues.apache.org/jira/browse/SPARK-19143
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core, YARN
>Affects Versions: 2.0.2, 2.1.0
>Reporter: Ruslan Dautkhanov
>
> Spin off from SPARK-14743 and comments chain in [recent comments| 
> https://issues.apache.org/jira/browse/SPARK-5493?focusedCommentId=15802179=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15802179]
>  in SPARK-5493.
> Spark currently doesn't have a way for distribution new delegation tokens. 
> Quoting [~vanzin] from SPARK-5493 
> {quote}
> IIRC Livy doesn't yet support delegation token renewal. Once it reaches the 
> TTL, the session is unusable.
> There might be ways to hack support for that without changes in Spark, but 
> I'd like to see a proper API in Spark for distributing new delegation tokens. 
> I mentioned that in SPARK-14743, but although that bug is closed, that 
> particular feature hasn't been implemented yet.
> {quote}
> Other thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19143) API in Spark for distributing new delegation tokens (Improve delegation token handling in secure clusters)

2017-03-27 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15943243#comment-15943243
 ] 

Saisai Shao commented on SPARK-19143:
-

Attached a WIP branch 
(https://github.com/jerryshao/apache-spark/tree/SPARK-19143).

> API in Spark for distributing new delegation tokens (Improve delegation token 
> handling in secure clusters)
> --
>
> Key: SPARK-19143
> URL: https://issues.apache.org/jira/browse/SPARK-19143
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core, YARN
>Affects Versions: 2.0.2, 2.1.0
>Reporter: Ruslan Dautkhanov
>
> Spin off from SPARK-14743 and comments chain in [recent comments| 
> https://issues.apache.org/jira/browse/SPARK-5493?focusedCommentId=15802179=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15802179]
>  in SPARK-5493.
> Spark currently doesn't have a way for distribution new delegation tokens. 
> Quoting [~vanzin] from SPARK-5493 
> {quote}
> IIRC Livy doesn't yet support delegation token renewal. Once it reaches the 
> TTL, the session is unusable.
> There might be ways to hack support for that without changes in Spark, but 
> I'd like to see a proper API in Spark for distributing new delegation tokens. 
> I mentioned that in SPARK-14743, but although that bug is closed, that 
> particular feature hasn't been implemented yet.
> {quote}
> Other thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-20059) HbaseCredentialProvider uses wrong classloader

2017-03-27 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15942976#comment-15942976
 ] 

Saisai Shao edited comment on SPARK-20059 at 3/27/17 10:02 AM:
---

[~sowen], it is probably not the same issue.

In SPARK-20019, jars are added through a SQL command into Hive's SessionState, 
so the code path is different.

And in SPARK-11421, AFAIK, the target is to add jars to the current classloader 
at runtime.

Here the issue is that during startup, jars specified through {{--jars}} should 
be added to Spark's child classloader in yarn cluster mode.

They are probably all classloader-related issues, but I think they target 
different areas and touch different code paths.



was (Author: jerryshao):
[~sowen], it probably is not the same issue.

In SPARK-20019, jars are added through SQL command in to Hive's SessionState, 
the code path is different.

And in SPARK-11421, AFAIK,  the target is to add jars to the current 
classloader in the runtime.

Here the issue is during start, jars specified through {{--jars}} should be 
added to Spark's child classloader in yarn cluster mode. 

They're all probably all classloader related issues, but I think they're 
targeted to the different area, and touched the different code path.


> HbaseCredentialProvider uses wrong classloader
> --
>
> Key: SPARK-20059
> URL: https://issues.apache.org/jira/browse/SPARK-20059
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Saisai Shao
>
> {{HBaseCredentialProvider}} uses system classloader instead of child 
> classloader, which will make HBase jars specified with {{--jars}} fail to 
> work, so here we should use the right class loader.
> Besides in yarn cluster mode jars specified with {{--jars}} is not added into 
> client's class path, which will make it fail to load HBase jars and issue 
> tokens in our scenario. Also some customized credential provider cannot be 
> registered into client.
> So here I will fix this two issues.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-20059) HbaseCredentialProvider uses wrong classloader

2017-03-27 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15942976#comment-15942976
 ] 

Saisai Shao commented on SPARK-20059:
-

[~sowen], it is probably not the same issue.

In SPARK-20019, jars are added through a SQL command into Hive's SessionState, 
so the code path is different.

And in SPARK-11421, AFAIK, the target is to add jars to the current classloader 
at runtime.

Here the issue is that during startup, jars specified through {{--jars}} should 
be added to Spark's child classloader in yarn cluster mode.

They are probably all classloader-related issues, but I think they target 
different areas and touch different code paths.


> HbaseCredentialProvider uses wrong classloader
> --
>
> Key: SPARK-20059
> URL: https://issues.apache.org/jira/browse/SPARK-20059
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Saisai Shao
>
> {{HBaseCredentialProvider}} uses system classloader instead of child 
> classloader, which will make HBase jars specified with {{--jars}} fail to 
> work, so here we should use the right class loader.
> Besides in yarn cluster mode jars specified with {{--jars}} is not added into 
> client's class path, which will make it fail to load HBase jars and issue 
> tokens in our scenario. Also some customized credential provider cannot be 
> registered into client.
> So here I will fix this two issues.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-20059) HbaseCredentialProvider uses wrong classloader

2017-03-27 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated SPARK-20059:

Description: 
{{HBaseCredentialProvider}} uses the system classloader instead of the child 
classloader, which makes HBase jars specified with {{--jars}} fail to work, so we 
should use the right classloader.

Besides, in yarn cluster mode the jars specified with {{--jars}} are not added to 
the client's classpath, which makes it fail to load the HBase jars and issue 
tokens in our scenario. Also, some customized credential providers cannot be 
registered in the client.

So here I will fix these two issues.



  was:
{{HBaseCredentialProvider}} uses system classloader instead of child 
classloader, which will make HBase jars specified with {{--jars}} fail to work, 
so here we should use the right class loader.

Besides in yarn client mode jars specified with {{--jars}} is not added into 
client's class path, which will make it fail to load HBase jars and issue 
tokens in our scenario. Also some customized credential provider cannot be 
registered into client.

So here I will fix this two issues.




> HbaseCredentialProvider uses wrong classloader
> --
>
> Key: SPARK-20059
> URL: https://issues.apache.org/jira/browse/SPARK-20059
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Saisai Shao
>
> {{HBaseCredentialProvider}} uses system classloader instead of child 
> classloader, which will make HBase jars specified with {{--jars}} fail to 
> work, so here we should use the right class loader.
> Besides in yarn cluster mode jars specified with {{--jars}} is not added into 
> client's class path, which will make it fail to load HBase jars and issue 
> tokens in our scenario. Also some customized credential provider cannot be 
> registered into client.
> So here I will fix this two issues.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19992) spark-submit on deployment-mode cluster

2017-03-23 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15939867#comment-15939867
 ] 

Saisai Shao commented on SPARK-19992:
-

Sorry, I cannot give you valid suggestions without knowing your actual 
environment. Basically, to run Spark on YARN you don't have to configure anything 
except HADOOP_CONF_DIR in spark-env.sh; other than that, the default 
configuration should be enough.

You could also send your problem to the user mailing list; I think there will be 
more users there who have met the same problem before. JIRA is mainly used to 
track Spark development work, not for questions.

> spark-submit on deployment-mode cluster
> ---
>
> Key: SPARK-19992
> URL: https://issues.apache.org/jira/browse/SPARK-19992
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.0.2
> Environment: spark version 2.0.2
> hadoop version 2.6.0
>Reporter: narendra maru
>
> spark version 2.0.2
> hadoop version 2.6.0
> spark -submit command
> "spark-submit --class spark.mongohadoop.testing3 --master yarn --deploy-mode 
> cluster --jars /home/ec2-user/jars/hgmongonew.jar, 
> /home/ec2-user/jars/mongo-hadoop-spark-2.0.1.jar"
> after adding following in
> 1 Spark-default.conf
> spark.executor.extraJavaOptions -Dconfig.fuction.conf 
> spark.yarn.jars=local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/*
> spark.eventLog.dir=hdfs://localhost:9000/user/spark/applicationHistory
> spark.eventLog.enabled=true
> 2yarn-site.xml
> 
> yarn.application.classpath
> 
> /usr/local/hadoop-2.6.0/etc/hadoop,
> /usr/local/hadoop-2.6.0/,
> /usr/local/hadoop-2.6.0/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/common/,
> /usr/local/hadoop-2.6.0/share/hadoop/common/lib/
> /usr/local/hadoop-2.6.0/share/hadoop/hdfs/,
> /usr/local/hadoop-2.6.0/share/hadoop/hdfs/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/,
> /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/tools/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/yarn/,
> /usr/local/hadoop-2.6.0/share/hadoop/yarn/lib/*,
> /usr/local/spark-2.0.2-bin-hadoop2.6/jars/spark-yarn_2.11-2.0.2.jar
> 
> 
> Error on log:-
> Error: Could not find or load main class 
> org.apache.spark.deploy.yarn.ApplicationMaster
> Error on terminal:-
> diagnostics: Application application_1489673977198_0002 failed 2 times due to 
> AM Container for appattempt_1489673977198_0002_02 exited with exitCode: 1 
> For more detailed output, check application tracking 
> page:http://bdg-hdp-sparkmaster:8088/proxy/application_1489673977198_0002/Then,
>  click on links to logs of each attempt.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-20050) Kafka 0.10 DirectStream doesn't commit last processed batch's offset when graceful shutdown

2017-03-23 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937845#comment-15937845
 ] 

Saisai Shao commented on SPARK-20050:
-

I think you could register a commit callback in {{commitAsync}}; the callback is 
invoked once the offsets are committed to Kafka, so you could use it to know when 
it is safe to stop the application.
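
A minimal sketch of that suggestion (assuming {{kafkaStream}} is the direct 
stream created as in the description below; the two-argument {{commitAsync}} 
overload takes Kafka's {{OffsetCommitCallback}}):

{code}
import java.util.{Map => JMap}
import org.apache.kafka.clients.consumer.{OffsetAndMetadata, OffsetCommitCallback}
import org.apache.kafka.common.TopicPartition
import org.apache.spark.streaming.kafka010.{CanCommitOffsets, HasOffsetRanges}

kafkaStream.foreachRDD { rdd =>
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  kafkaStream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges,
    new OffsetCommitCallback {
      override def onComplete(offsets: JMap[TopicPartition, OffsetAndMetadata],
                              exception: Exception): Unit = {
        if (exception == null) {
          // Offsets of this batch are now in Kafka; safe to trigger a graceful shutdown here.
        } else {
          // Commit failed; log it and decide whether to retry or stop anyway.
        }
      }
    })
}
{code}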

> Kafka 0.10 DirectStream doesn't commit last processed batch's offset when 
> graceful shutdown
> ---
>
> Key: SPARK-20050
> URL: https://issues.apache.org/jira/browse/SPARK-20050
> Project: Spark
>  Issue Type: Bug
>  Components: DStreams
>Affects Versions: 2.2.0
>Reporter: Sasaki Toru
>
> I use Kafka 0.10 DirectStream with properties 'enable.auto.commit=false' and 
> call 'DirectKafkaInputDStream#commitAsync' finally in each batches,  such 
> below
> {code}
> val kafkaStream = KafkaUtils.createDirectStream[String, String](...)
> kafkaStream.map { input =>
>   "key: " + input.key.toString + " value: " + input.value.toString +
>     " offset: " + input.offset.toString
> }.foreachRDD { rdd =>
>   rdd.foreach { input =>
>     println(input)
>   }
> }
> kafkaStream.foreachRDD { rdd =>
>   val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
>   kafkaStream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
> }
> {code}
> Some records which processed in the last batch before Streaming graceful 
> shutdown reprocess in the first batch after Spark Streaming restart.
> It may cause offsets specified in commitAsync will commit in the head of next 
> batch.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-20058) the running application status changed from running to waiting when a master is down and it change to another standy by master

2017-03-23 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937841#comment-15937841
 ] 

Saisai Shao commented on SPARK-20058:
-

I made a PR for this bug very long ago 
(https://github.com/apache/spark/pull/10506).

I've already brought it up to date with the latest master, but still no one is 
reviewing it. I have seen this issue brought up on the mailing list and in JIRA 
several times. [~srowen], could you or someone else help to review it?

> the running application status changed  from running to waiting when a master 
> is down and it change to another standy by master
> ---
>
> Key: SPARK-20058
> URL: https://issues.apache.org/jira/browse/SPARK-20058
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.0.2
>Reporter: wangweidong
>Priority: Minor
>
> 1.I deployed the spark with cluster mode, test186 is master, test171 is a 
> backup  master, workers is test137, test155 and test138.
> 2, Start spark with command sbin/start-all.sh
> 3, submit my task with command bin/spark-submit --supervise --deployed-mode 
> cluster --master spark://test186:7077 etc.
> 4, view web ui ,enter test186:8080 , I can see my application is running 
> normally.
> 5, Stop the master of test186, after a period of time, view web ui with 
> test171(standby master), I see my application is waiting and it can not 
> change to run, but type one application and enter the detail page, i can see 
> it is actually runing.
> Is it an bug ? or i start spark with incorrect setting?
> Help!!!



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-20058) the running application status changed from running to waiting when a master is down and it change to another standy by master

2017-03-23 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937841#comment-15937841
 ] 

Saisai Shao edited comment on SPARK-20058 at 3/23/17 7:02 AM:
--

I made a PR for this bug very long ago 
(https://github.com/apache/spark/pull/10506).

I've already brought it up to date with the latest master, but still no one is 
reviewing it. I have seen this issue brought up on the mailing list and in JIRA 
several times. [~srowen], could you or someone else help to review it?


was (Author: jerryshao):
I have a PR for this bug very very long ago 
(https://github.com/apache/spark/pull/10506).

I've already bring out to latest, but still not one is reviewing it. I have 
seen this issue is brought up either in mail list or in jira several times. 
[~srowen] Can you and someone else could help to review it?

> the running application status changed  from running to waiting when a master 
> is down and it change to another standy by master
> ---
>
> Key: SPARK-20058
> URL: https://issues.apache.org/jira/browse/SPARK-20058
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.0.2
>Reporter: wangweidong
>Priority: Minor
>
> 1.I deployed the spark with cluster mode, test186 is master, test171 is a 
> backup  master, workers is test137, test155 and test138.
> 2, Start spark with command sbin/start-all.sh
> 3, submit my task with command bin/spark-submit --supervise --deployed-mode 
> cluster --master spark://test186:7077 etc.
> 4, view web ui ,enter test186:8080 , I can see my application is running 
> normally.
> 5, Stop the master of test186, after a period of time, view web ui with 
> test171(standby master), I see my application is waiting and it can not 
> change to run, but type one application and enter the detail page, i can see 
> it is actually runing.
> Is it an bug ? or i start spark with incorrect setting?
> Help!!!



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-20058) the running application status changed from running to waiting when a master is down and it change to another standy by master

2017-03-22 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15936255#comment-15936255
 ] 

Saisai Shao commented on SPARK-20058:
-

Please subscribe to the Spark user mailing list and send the question there.

> the running application status changed  from running to waiting when a master 
> is down and it change to another standy by master
> ---
>
> Key: SPARK-20058
> URL: https://issues.apache.org/jira/browse/SPARK-20058
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.0.2
>Reporter: wangweidong
>Priority: Minor
>
> 1.I deployed the spark with cluster mode, test186 is master, test171 is a 
> backup  master, workers is test137, test155 and test138.
> 2, Start spark with command sbin/start-all.sh
> 3, submit my task with command bin/spark-submit --supervise --deployed-mode 
> cluster --master spark://test186:7077 etc.
> 4, view web ui ,enter test186:8080 , I can see my application is running 
> normally.
> 5, Stop the master of test186, after a period of time, view web ui with 
> test171(standby master), I see my application is waiting and it can not 
> change to run, but type one application and enter the detail page, i can see 
> it is actually runing.
> Is it an bug ? or i start spark with incorrect setting?
> Help!!!



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-20058) the running application status changed from running to waiting when a master is down and it change to another standy by master

2017-03-22 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15936255#comment-15936255
 ] 

Saisai Shao edited comment on SPARK-20058 at 3/22/17 1:03 PM:
--

Please subscribe to the Spark user mailing list and send the question there.


was (Author: jerryshao):
Please subscribe this spark user mail list and set the question to this mail 
list.

> the running application status changed  from running to waiting when a master 
> is down and it change to another standy by master
> ---
>
> Key: SPARK-20058
> URL: https://issues.apache.org/jira/browse/SPARK-20058
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.0.2
>Reporter: wangweidong
>Priority: Minor
>
> 1.I deployed the spark with cluster mode, test186 is master, test171 is a 
> backup  master, workers is test137, test155 and test138.
> 2, Start spark with command sbin/start-all.sh
> 3, submit my task with command bin/spark-submit --supervise --deployed-mode 
> cluster --master spark://test186:7077 etc.
> 4, view web ui ,enter test186:8080 , I can see my application is running 
> normally.
> 5, Stop the master of test186, after a period of time, view web ui with 
> test171(standby master), I see my application is waiting and it can not 
> change to run, but type one application and enter the detail page, i can see 
> it is actually runing.
> Is it an bug ? or i start spark with incorrect setting?
> Help!!!



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-20059) HbaseCredentialProvider uses wrong classloader

2017-03-22 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-20059:
---

 Summary: HbaseCredentialProvider uses wrong classloader
 Key: SPARK-20059
 URL: https://issues.apache.org/jira/browse/SPARK-20059
 Project: Spark
  Issue Type: Bug
  Components: YARN
Affects Versions: 2.1.0, 2.2.0
Reporter: Saisai Shao


{{HBaseCredentialProvider}} uses the system classloader instead of the child 
classloader, which makes HBase jars specified with {{--jars}} fail to work, so we 
should use the right classloader.

Besides, in yarn client mode the jars specified with {{--jars}} are not added to 
the client's classpath, which makes it fail to load the HBase jars and issue 
tokens in our scenario. Also, some customized credential providers cannot be 
registered in the client.

So here I will fix these two issues.
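
A minimal sketch of the classloader point (an illustration only, not the actual 
patch): resolve HBase classes through the thread context classloader, which can 
see jars added via {{--jars}}, instead of the system classloader, which cannot.

{code}
object HBaseClassLoading {
  // Load an HBase class using the context classloader when available, falling
  // back to this class's own loader; Class.forName with the default loader
  // would miss jars distributed via --jars.
  def hbaseConfClass(): Class[_] = {
    val loader = Option(Thread.currentThread().getContextClassLoader)
      .getOrElse(getClass.getClassLoader)
    Class.forName("org.apache.hadoop.hbase.HBaseConfiguration", true, loader)
  }
}
{code}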





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-19992) spark-submit on deployment-mode cluster

2017-03-22 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15936241#comment-15936241
 ] 

Saisai Shao edited comment on SPARK-19992 at 3/22/17 12:51 PM:
---

Oh, I see. Checking the code again, it looks like "/*" does not work with the 
"local" scheme; only Hadoop-supported schemes like hdfs and file support glob 
paths.

I agree this is probably just a setup/environment problem.


was (Author: jerryshao):
Oh, I see. Check the code again, looks like "/*" cannot be worked with "local" 
schema, only hadoop support schema like hdfs, file could support glob path.

> spark-submit on deployment-mode cluster
> ---
>
> Key: SPARK-19992
> URL: https://issues.apache.org/jira/browse/SPARK-19992
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.0.2
> Environment: spark version 2.0.2
> hadoop version 2.6.0
>Reporter: narendra maru
>
> spark version 2.0.2
> hadoop version 2.6.0
> spark -submit command
> "spark-submit --class spark.mongohadoop.testing3 --master yarn --deploy-mode 
> cluster --jars /home/ec2-user/jars/hgmongonew.jar, 
> /home/ec2-user/jars/mongo-hadoop-spark-2.0.1.jar"
> after adding following in
> 1 Spark-default.conf
> spark.executor.extraJavaOptions -Dconfig.fuction.conf 
> spark.yarn.jars=local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/*
> spark.eventLog.dir=hdfs://localhost:9000/user/spark/applicationHistory
> spark.eventLog.enabled=true
> 2yarn-site.xml
> 
> yarn.application.classpath
> 
> /usr/local/hadoop-2.6.0/etc/hadoop,
> /usr/local/hadoop-2.6.0/,
> /usr/local/hadoop-2.6.0/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/common/,
> /usr/local/hadoop-2.6.0/share/hadoop/common/lib/
> /usr/local/hadoop-2.6.0/share/hadoop/hdfs/,
> /usr/local/hadoop-2.6.0/share/hadoop/hdfs/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/,
> /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/tools/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/yarn/,
> /usr/local/hadoop-2.6.0/share/hadoop/yarn/lib/*,
> /usr/local/spark-2.0.2-bin-hadoop2.6/jars/spark-yarn_2.11-2.0.2.jar
> 
> 
> Error on log:-
> Error: Could not find or load main class 
> org.apache.spark.deploy.yarn.ApplicationMaster
> Error on terminal:-
> diagnostics: Application application_1489673977198_0002 failed 2 times due to 
> AM Container for appattempt_1489673977198_0002_02 exited with exitCode: 1 
> For more detailed output, check application tracking 
> page:http://bdg-hdp-sparkmaster:8088/proxy/application_1489673977198_0002/Then,
>  click on links to logs of each attempt.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19992) spark-submit on deployment-mode cluster

2017-03-22 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15936241#comment-15936241
 ] 

Saisai Shao commented on SPARK-19992:
-

Oh, I see. Checking the code again, it looks like "/*" does not work with the 
"local" scheme; only Hadoop-supported schemes like hdfs and file support glob 
paths.

> spark-submit on deployment-mode cluster
> ---
>
> Key: SPARK-19992
> URL: https://issues.apache.org/jira/browse/SPARK-19992
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.0.2
> Environment: spark version 2.0.2
> hadoop version 2.6.0
>Reporter: narendra maru
>
> spark version 2.0.2
> hadoop version 2.6.0
> spark -submit command
> "spark-submit --class spark.mongohadoop.testing3 --master yarn --deploy-mode 
> cluster --jars /home/ec2-user/jars/hgmongonew.jar, 
> /home/ec2-user/jars/mongo-hadoop-spark-2.0.1.jar"
> after adding following in
> 1 Spark-default.conf
> spark.executor.extraJavaOptions -Dconfig.fuction.conf 
> spark.yarn.jars=local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/*
> spark.eventLog.dir=hdfs://localhost:9000/user/spark/applicationHistory
> spark.eventLog.enabled=true
> 2yarn-site.xml
> 
> yarn.application.classpath
> 
> /usr/local/hadoop-2.6.0/etc/hadoop,
> /usr/local/hadoop-2.6.0/,
> /usr/local/hadoop-2.6.0/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/common/,
> /usr/local/hadoop-2.6.0/share/hadoop/common/lib/
> /usr/local/hadoop-2.6.0/share/hadoop/hdfs/,
> /usr/local/hadoop-2.6.0/share/hadoop/hdfs/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/,
> /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/tools/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/yarn/,
> /usr/local/hadoop-2.6.0/share/hadoop/yarn/lib/*,
> /usr/local/spark-2.0.2-bin-hadoop2.6/jars/spark-yarn_2.11-2.0.2.jar
> 
> 
> Error on log:-
> Error: Could not find or load main class 
> org.apache.spark.deploy.yarn.ApplicationMaster
> Error on terminal:-
> diagnostics: Application application_1489673977198_0002 failed 2 times due to 
> AM Container for appattempt_1489673977198_0002_02 exited with exitCode: 1 
> For more detailed output, check application tracking 
> page:http://bdg-hdp-sparkmaster:8088/proxy/application_1489673977198_0002/Then,
>  click on links to logs of each attempt.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19941) Spark should not schedule tasks on executors on decommissioning YARN nodes

2017-03-19 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932088#comment-15932088
 ] 

Saisai Shao commented on SPARK-19941:
-

I think this scenario is quite similar to container preemption. In the container 
preemption scenario, the AM is informed by the RM which containers will be 
preempted in the next 15 seconds (by default), and the AM can react based on that 
information.

I made a similar PR to avoid scheduling tasks on executors that are about to be 
preempted. It was ultimately rejected, mainly because letting to-be-preempted 
executors sit idle for 15 seconds is too long and wastes resources. In your 
description the executors will be idle for 60 seconds before decommissioning, so 
this would really waste resources if most of the work could be finished on those 
executors within that minute.

Also, I'm not sure why the job would hang as you mentioned before; I think the 
failed tasks will simply be rerun.

So IMHO it is better not to handle this scenario unless we hit some real 
problems. Sometimes the cost of rerunning tasks is smaller than the cost of 
wasting resources.

> Spark should not schedule tasks on executors on decommissioning YARN nodes
> --
>
> Key: SPARK-19941
> URL: https://issues.apache.org/jira/browse/SPARK-19941
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler, YARN
>Affects Versions: 2.1.0
> Environment: Hadoop 2.8.0-rc1
>Reporter: Karthik Palaniappan
>
> Hadoop 2.8 added a mechanism to gracefully decommission Node Managers in 
> YARN: https://issues.apache.org/jira/browse/YARN-914
> Essentially you can mark nodes to be decommissioned, and let them a) finish 
> work in progress and b) finish serving shuffle data. But no new work will be 
> scheduled on the node.
> Spark should respect when NMs are set to decommissioned, and similarly 
> decommission executors on those nodes by not scheduling any more tasks on 
> them.
> It looks like in the future YARN may inform the app master when containers 
> will be killed: https://issues.apache.org/jira/browse/YARN-3784. However, I 
> don't think Spark should schedule based on a timeout. We should gracefully 
> decommission the executor as fast as possible (which is the spirit of 
> YARN-914). The app master can query the RM for NM statuses (if it doesn't 
> already have them) and stop scheduling on executors on NMs that are 
> decommissioning.
> Stretch feature: The timeout may be useful in determining whether running 
> further tasks on the executor is even helpful. Spark may be able to tell that 
> shuffle data will not be consumed by the time the node is decommissioned, so 
> it is not worth computing. The executor can be killed immediately.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-19995) Using real user to connect HiveMetastore in HiveClientImpl

2017-03-17 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-19995:
---

 Summary: Using real user to connect HiveMetastore in HiveClientImpl
 Key: SPARK-19995
 URL: https://issues.apache.org/jira/browse/SPARK-19995
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.2.0
Reporter: Saisai Shao


If the user specifies {{--proxy-user}} in a kerberized environment with the Hive 
catalog implementation, HiveClientImpl will try to connect to the Hive metastore 
as the current (proxy) user. Since we use the real user to do kinit, this makes 
the connection fail. We should do what we did before in the YARN code and use the 
real user.
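
A minimal sketch of that idea (assuming the Hadoop UserGroupInformation API; not 
the actual patch), with the resulting failure shown in the trace below:

{code}
import java.security.PrivilegedExceptionAction
import org.apache.hadoop.security.UserGroupInformation

// Run the metastore connection setup as the real (kinit'ed) user rather than
// the proxy user, so the Kerberos TGT is visible to the SASL handshake.
def withRealUser[T](body: => T): T = {
  val current = UserGroupInformation.getCurrentUser
  val real = Option(current.getRealUser).getOrElse(current) // proxy users carry a real user
  real.doAs(new PrivilegedExceptionAction[T] {
    override def run(): T = body
  })
}
{code}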

{noformat}
ERROR TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]
at 
com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
at 
org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
at 
org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
at 
org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
at 
org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
at 
org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at 
org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:420)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:236)
at 
org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.(SessionHiveMetaStoreClient.java:74)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:86)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
at 
org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
at 
org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1234)
at 
org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174)
at org.apache.hadoop.hive.ql.metadata.Hive.(Hive.java:166)
at 
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
at 
org.apache.spark.sql.hive.client.HiveClientImpl.(HiveClientImpl.scala:188)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
at 
org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:366)
at 
org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:270)
at 
org.apache.spark.sql.hive.HiveExternalCatalog.(HiveExternalCatalog.scala:65)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:173)
at 
org.apache.spark.sql.internal.SharedState.(SharedState.scala:86)
at 
org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)

[jira] [Comment Edited] (SPARK-19992) spark-submit on deployment-mode cluster

2017-03-17 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929547#comment-15929547
 ] 

Saisai Shao edited comment on SPARK-19992 at 3/17/17 7:48 AM:
--

Looks like I guessed wrong from the URL you provided.

If you're trying to use 
"spark.yarn.jars=local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/*", make sure 
the Spark-related jars exist under the same path on every node.


was (Author: jerryshao):
Looks like I guess wrong from the url you provided.

If you're trying to use 
"spark.yarn.jars=local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/*", make sure 
these jars existed in every node.

> spark-submit on deployment-mode cluster
> ---
>
> Key: SPARK-19992
> URL: https://issues.apache.org/jira/browse/SPARK-19992
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.0.2
> Environment: spark version 2.0.2
> hadoop version 2.6.0
>Reporter: narendra maru
>
> spark version 2.0.2
> hadoop version 2.6.0
> spark -submit command
> "spark-submit --class spark.mongohadoop.testing3 --master yarn --deploy-mode 
> cluster --jars /home/ec2-user/jars/hgmongonew.jar, 
> /home/ec2-user/jars/mongo-hadoop-spark-2.0.1.jar"
> after adding following in
> 1 Spark-default.conf
> spark.executor.extraJavaOptions -Dconfig.fuction.conf 
> spark.yarn.jars=local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/*
> spark.eventLog.dir=hdfs://localhost:9000/user/spark/applicationHistory
> spark.eventLog.enabled=true
> 2yarn-site.xml
> 
> yarn.application.classpath
> 
> /usr/local/hadoop-2.6.0/etc/hadoop,
> /usr/local/hadoop-2.6.0/,
> /usr/local/hadoop-2.6.0/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/common/,
> /usr/local/hadoop-2.6.0/share/hadoop/common/lib/
> /usr/local/hadoop-2.6.0/share/hadoop/hdfs/,
> /usr/local/hadoop-2.6.0/share/hadoop/hdfs/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/,
> /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/tools/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/yarn/,
> /usr/local/hadoop-2.6.0/share/hadoop/yarn/lib/*,
> /usr/local/spark-2.0.2-bin-hadoop2.6/jars/spark-yarn_2.11-2.0.2.jar
> 
> 
> Error on log:-
> Error: Could not find or load main class 
> org.apache.spark.deploy.yarn.ApplicationMaster
> Error on terminal:-
> diagnostics: Application application_1489673977198_0002 failed 2 times due to 
> AM Container for appattempt_1489673977198_0002_02 exited with exitCode: 1 
> For more detailed output, check application tracking 
> page:http://bdg-hdp-sparkmaster:8088/proxy/application_1489673977198_0002/Then,
>  click on links to logs of each attempt.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19992) spark-submit on deployment-mode cluster

2017-03-17 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929547#comment-15929547
 ] 

Saisai Shao commented on SPARK-19992:
-

Looks like my guess from the URL you provided was wrong.

If you're trying to use 
"spark.yarn.jars=local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/*", make sure 
these jars exist on every node.
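For illustration, a minimal sketch of how this setting looks in application
code. The path below is the one from this report and is only an assumption;
with the local: scheme YARN does not upload anything, so the Spark jars
(including spark-yarn) must already be present at that path on every node.

{code}
// Sketch only: spark.yarn.jars with the local: scheme means "already on every node".
// If the jars (including spark-yarn) are missing from this path on a NodeManager,
// the AM class org.apache.spark.deploy.yarn.ApplicationMaster cannot be found.
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("yarn-cluster-app")
  .set("spark.yarn.jars", "local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/*")
{code}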

> spark-submit on deployment-mode cluster
> ---
>
> Key: SPARK-19992
> URL: https://issues.apache.org/jira/browse/SPARK-19992
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.0.2
> Environment: spark version 2.0.2
> hadoop version 2.6.0
>Reporter: narendra maru
>
> spark version 2.0.2
> hadoop version 2.6.0
> spark -submit command
> "spark-submit --class spark.mongohadoop.testing3 --master yarn --deploy-mode 
> cluster --jars /home/ec2-user/jars/hgmongonew.jar, 
> /home/ec2-user/jars/mongo-hadoop-spark-2.0.1.jar"
> after adding following in
> 1 Spark-default.conf
> spark.executor.extraJavaOptions -Dconfig.fuction.conf 
> spark.yarn.jars=local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/*
> spark.eventLog.dir=hdfs://localhost:9000/user/spark/applicationHistory
> spark.eventLog.enabled=true
> 2yarn-site.xml
> 
> yarn.application.classpath
> 
> /usr/local/hadoop-2.6.0/etc/hadoop,
> /usr/local/hadoop-2.6.0/,
> /usr/local/hadoop-2.6.0/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/common/,
> /usr/local/hadoop-2.6.0/share/hadoop/common/lib/
> /usr/local/hadoop-2.6.0/share/hadoop/hdfs/,
> /usr/local/hadoop-2.6.0/share/hadoop/hdfs/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/,
> /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/tools/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/yarn/,
> /usr/local/hadoop-2.6.0/share/hadoop/yarn/lib/*,
> /usr/local/spark-2.0.2-bin-hadoop2.6/jars/spark-yarn_2.11-2.0.2.jar
> 
> 
> Error on log:-
> Error: Could not find or load main class 
> org.apache.spark.deploy.yarn.ApplicationMaster
> Error on terminal:-
> diagnostics: Application application_1489673977198_0002 failed 2 times due to 
> AM Container for appattempt_1489673977198_0002_02 exited with exitCode: 1 
> For more detailed output, check application tracking 
> page:http://bdg-hdp-sparkmaster:8088/proxy/application_1489673977198_0002/Then,
>  click on links to logs of each attempt.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19992) spark-submit on deployment-mode cluster

2017-03-17 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929544#comment-15929544
 ] 

Saisai Shao commented on SPARK-19992:
-

Are you using an HDP environment? If so, I guess you need to configure 
hdp.version in Spark; you could google it.
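For reference, a hedged sketch of what "configure hdp.version" usually means;
the version string below is a placeholder, and whether this is needed at all
depends on the HDP setup.

{code}
// Sketch only; the hdp.version value is a placeholder. HDP's YARN/MapReduce
// classpath entries reference ${hdp.version}, so it has to be supplied as a
// JVM system property to both the AM and the driver.
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.driver.extraJavaOptions", "-Dhdp.version=2.5.0.0-1245")
  .set("spark.yarn.am.extraJavaOptions", "-Dhdp.version=2.5.0.0-1245")
{code}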

> spark-submit on deployment-mode cluster
> ---
>
> Key: SPARK-19992
> URL: https://issues.apache.org/jira/browse/SPARK-19992
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.0.2
> Environment: spark version 2.0.2
> hadoop version 2.6.0
>Reporter: narendra maru
>
> spark version 2.0.2
> hadoop version 2.6.0
> spark -submit command
> "spark-submit --class spark.mongohadoop.testing3 --master yarn --deploy-mode 
> cluster --jars /home/ec2-user/jars/hgmongonew.jar, 
> /home/ec2-user/jars/mongo-hadoop-spark-2.0.1.jar"
> after adding following in
> 1 Spark-default.conf
> spark.executor.extraJavaOptions -Dconfig.fuction.conf 
> spark.yarn.jars=local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/*
> spark.eventLog.dir=hdfs://localhost:9000/user/spark/applicationHistory
> spark.eventLog.enabled=true
> 2yarn-site.xml
> 
> yarn.application.classpath
> 
> /usr/local/hadoop-2.6.0/etc/hadoop,
> /usr/local/hadoop-2.6.0/,
> /usr/local/hadoop-2.6.0/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/common/,
> /usr/local/hadoop-2.6.0/share/hadoop/common/lib/
> /usr/local/hadoop-2.6.0/share/hadoop/hdfs/,
> /usr/local/hadoop-2.6.0/share/hadoop/hdfs/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/,
> /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/tools/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/yarn/,
> /usr/local/hadoop-2.6.0/share/hadoop/yarn/lib/*,
> /usr/local/spark-2.0.2-bin-hadoop2.6/jars/spark-yarn_2.11-2.0.2.jar
> 
> 
> Error on log:-
> Error: Could not find or load main class 
> org.apache.spark.deploy.yarn.ApplicationMaster
> Error on terminal:-
> diagnostics: Application application_1489673977198_0002 failed 2 times due to 
> AM Container for appattempt_1489673977198_0002_02 exited with exitCode: 1 
> For more detailed output, check application tracking 
> page:http://bdg-hdp-sparkmaster:8088/proxy/application_1489673977198_0002/Then,
>  click on links to logs of each attempt.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19143) API in Spark for distributing new delegation tokens (Improve delegation token handling in secure clusters)

2017-03-09 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904497#comment-15904497
 ] 

Saisai Shao commented on SPARK-19143:
-

Hi all, I wrote a rough design doc based on the comments above; here is the 
link 
(https://docs.google.com/document/d/1DFWGHu4_GJapbbfXGWsot_z_W9Wka_39DFNmg9r9SAI/edit?usp=sharing).

[~tgraves] [~vanzin] [~mridulm80], please review and comment; I would greatly 
appreciate your suggestions.

> API in Spark for distributing new delegation tokens (Improve delegation token 
> handling in secure clusters)
> --
>
> Key: SPARK-19143
> URL: https://issues.apache.org/jira/browse/SPARK-19143
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core, YARN
>Affects Versions: 2.0.2, 2.1.0
>Reporter: Ruslan Dautkhanov
>
> Spin off from SPARK-14743 and comments chain in [recent comments| 
> https://issues.apache.org/jira/browse/SPARK-5493?focusedCommentId=15802179=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15802179]
>  in SPARK-5493.
> Spark currently doesn't have a way to distribute new delegation tokens. 
> Quoting [~vanzin] from SPARK-5493 
> {quote}
> IIRC Livy doesn't yet support delegation token renewal. Once it reaches the 
> TTL, the session is unusable.
> There might be ways to hack support for that without changes in Spark, but 
> I'd like to see a proper API in Spark for distributing new delegation tokens. 
> I mentioned that in SPARK-14743, but although that bug is closed, that 
> particular feature hasn't been implemented yet.
> {quote}
> Other thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-19812) YARN shuffle service fails to relocate recovery DB directories

2017-03-07 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900821#comment-15900821
 ] 

Saisai Shao edited comment on SPARK-19812 at 3/8/17 7:42 AM:
-

[~tgraves], I'm not quite sure what you mean here.

bq. The tests are using files rather then directories so it didn't catch. We 
need to fix the test also.

From my understanding, this issue happens when the dest dir is not empty and we 
try to move with REPLACE_EXISTING. It can also happen when the rename call 
fails and the source dir is not an empty directory.

But I cannot imagine how that happened; from the log it looks more like a 
rename failure, since the path in the exception points to the source dir.



was (Author: jerryshao):
[~tgraves], I'm not quite sure what you mean here.

bq. The tests are using files rather then directories so it didn't catch. We 
need to fix the test also.

From my understanding, this issue happens when the dest dir is not empty and we 
try to move with REPLACE_EXISTING. It can also happen when the rename call 
fails and the source dir is not an empty directory.

But I cannot imagine how this happened, because if the dest dir is not empty, 
we should have returned earlier and never gone on to check the old NM local 
dirs.



> YARN shuffle service fails to relocate recovery DB directories
> --
>
> Key: SPARK-19812
> URL: https://issues.apache.org/jira/browse/SPARK-19812
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.0.1
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>
> The yarn shuffle service tries to switch from the yarn local directories to 
> the real recovery directory but can fail to move the existing recovery db's.  
> It fails due to Files.move not doing directories that have contents.
> 2017-03-03 14:57:19,558 [main] ERROR yarn.YarnShuffleService: Failed to move 
> recovery file sparkShuffleRecovery.ldb to the path 
> /mapred/yarn-nodemanager/nm-aux-services/spark_shuffle
> java.nio.file.DirectoryNotEmptyException:/yarn-local/sparkShuffleRecovery.ldb
> at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:498)
> at 
> sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
> at java.nio.file.Files.move(Files.java:1395)
> at 
> org.apache.spark.network.yarn.YarnShuffleService.initRecoveryDb(YarnShuffleService.java:369)
> at 
> org.apache.spark.network.yarn.YarnShuffleService.createSecretManager(YarnShuffleService.java:200)
> at 
> org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:174)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:143)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:262)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:357)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:636)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:684)
> This used to use f.renameTo and we switched it in the pr due to review 
> comments and it looks like didn't do a final real test. The tests are using 
> files rather then directories so it didn't catch. We need to fix the test 
> also.
> history: 
> https://github.com/apache/spark/pull/14999/commits/65de8531ccb91287f5a8a749c7819e99533b9440



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-19812) YARN shuffle service fails to relocate recovery DB directories

2017-03-07 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900821#comment-15900821
 ] 

Saisai Shao edited comment on SPARK-19812 at 3/8/17 7:40 AM:
-

[~tgraves], I'm not quite sure what you mean here.

bq. The tests are using files rather then directories so it didn't catch. We 
need to fix the test also.

From my understanding, this issue happens when the dest dir is not empty and we 
try to move with REPLACE_EXISTING. It can also happen when the rename call 
fails and the source dir is not an empty directory.

But I cannot imagine how this happened, because if the dest dir is not empty, 
we should have returned earlier and never gone on to check the old NM local 
dirs.




was (Author: jerryshao):
[~tgraves], I'm not quite sure what you mean here.

bq. The tests are using files rather then directories so it didn't catch. We 
need to fix the test also.

From my understanding, this issue happens when the dest dir is not empty and we 
try to move with REPLACE_EXISTING, but I cannot imagine how this happened, 
because if the dest dir is not empty, we should have returned earlier and never 
gone on to check the old NM local dirs.

> YARN shuffle service fails to relocate recovery DB directories
> --
>
> Key: SPARK-19812
> URL: https://issues.apache.org/jira/browse/SPARK-19812
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.0.1
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>
> The yarn shuffle service tries to switch from the yarn local directories to 
> the real recovery directory but can fail to move the existing recovery db's.  
> It fails due to Files.move not doing directories that have contents.
> 2017-03-03 14:57:19,558 [main] ERROR yarn.YarnShuffleService: Failed to move 
> recovery file sparkShuffleRecovery.ldb to the path 
> /mapred/yarn-nodemanager/nm-aux-services/spark_shuffle
> java.nio.file.DirectoryNotEmptyException:/yarn-local/sparkShuffleRecovery.ldb
> at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:498)
> at 
> sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
> at java.nio.file.Files.move(Files.java:1395)
> at 
> org.apache.spark.network.yarn.YarnShuffleService.initRecoveryDb(YarnShuffleService.java:369)
> at 
> org.apache.spark.network.yarn.YarnShuffleService.createSecretManager(YarnShuffleService.java:200)
> at 
> org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:174)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:143)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:262)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:357)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:636)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:684)
> This used to use f.renameTo and we switched it in the pr due to review 
> comments and it looks like didn't do a final real test. The tests are using 
> files rather then directories so it didn't catch. We need to fix the test 
> also.
> history: 
> https://github.com/apache/spark/pull/14999/commits/65de8531ccb91287f5a8a749c7819e99533b9440



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19812) YARN shuffle service fails to relocate recovery DB directories

2017-03-07 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900821#comment-15900821
 ] 

Saisai Shao commented on SPARK-19812:
-

[~tgraves], I'm not quite sure what you mean here.

bq. The tests are using files rather then directories so it didn't catch. We 
need to fix the test also.

From my understanding, this issue happens when the dest dir is not empty and we 
try to move with REPLACE_EXISTING, but I cannot imagine how this happened, 
because if the dest dir is not empty, we should have returned earlier and never 
gone on to check the old NM local dirs.
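As a side note, a small sketch (not the actual {{YarnShuffleService}} code, and
the paths are made up) of the {{Files.move}} behaviour being discussed: with
REPLACE_EXISTING the target may only be replaced if it is an empty directory,
and a move that degrades to copy-and-delete fails if the source directory is
not empty, which matches the two scenarios above.

{code}
// Sketch only; hypothetical paths. Files.move(REPLACE_EXISTING) throws
// DirectoryNotEmptyException when the existing target directory is not empty,
// and can also fail when the move falls back to copy+delete and the source
// directory is not empty -- unlike the old File.renameTo-based code path.
import java.nio.file.{DirectoryNotEmptyException, Files, Paths, StandardCopyOption}

val src = Paths.get("/tmp/yarn-local/sparkShuffleRecovery.ldb")
val dst = Paths.get("/tmp/recovery/sparkShuffleRecovery.ldb")

try {
  Files.move(src, dst, StandardCopyOption.REPLACE_EXISTING)
} catch {
  case e: DirectoryNotEmptyException =>
    // one possible fallback: walk the tree, copy entries, then delete the source
    println(s"move failed, a recursive copy is needed instead: $e")
}
{code}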

> YARN shuffle service fails to relocate recovery DB directories
> --
>
> Key: SPARK-19812
> URL: https://issues.apache.org/jira/browse/SPARK-19812
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.0.1
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>
> The yarn shuffle service tries to switch from the yarn local directories to 
> the real recovery directory but can fail to move the existing recovery db's.  
> It fails due to Files.move not doing directories that have contents.
> 2017-03-03 14:57:19,558 [main] ERROR yarn.YarnShuffleService: Failed to move 
> recovery file sparkShuffleRecovery.ldb to the path 
> /mapred/yarn-nodemanager/nm-aux-services/spark_shuffle
> java.nio.file.DirectoryNotEmptyException:/yarn-local/sparkShuffleRecovery.ldb
> at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:498)
> at 
> sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
> at java.nio.file.Files.move(Files.java:1395)
> at 
> org.apache.spark.network.yarn.YarnShuffleService.initRecoveryDb(YarnShuffleService.java:369)
> at 
> org.apache.spark.network.yarn.YarnShuffleService.createSecretManager(YarnShuffleService.java:200)
> at 
> org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:174)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:143)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:262)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:357)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:636)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:684)
> This used to use f.renameTo and we switched it in the pr due to review 
> comments and it looks like didn't do a final real test. The tests are using 
> files rather then directories so it didn't catch. We need to fix the test 
> also.
> history: 
> https://github.com/apache/spark/pull/14999/commits/65de8531ccb91287f5a8a749c7819e99533b9440



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19802) Remote History Server

2017-03-02 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893580#comment-15893580
 ] 

Saisai Shao commented on SPARK-19802:
-

Spark's {{ApplicationHistoryProvider}} is pluggable; users can implement their 
own provider and plug it into Spark's history server. So you could implement 
the {{HistoryProvider}} you want outside of Spark.

From your description, this sounds more like the Hadoop ATS (Hadoop application 
timeline server). We have an implementation of a Timeline-based history 
provider for Spark's history server. Its main feature is what you mentioned: 
query through TCP, get the events, and display them on the UI.

> Remote History Server
> -
>
> Key: SPARK-19802
> URL: https://issues.apache.org/jira/browse/SPARK-19802
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.1.0
>Reporter: Ben Barnard
>
> Currently the history server expects to find history in a filesystem 
> somewhere. It would be nice to have a history server that listens for 
> application events on a TCP port, and have a EventLoggingListener that sends 
> events to the listening history server instead of writing to a file. This 
> would allow the history server to show up-to-date history for past and 
> running jobs in a cluster environment that lacks a shared filesystem.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19750) Spark UI http -> https redirect error

2017-02-27 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15887204#comment-15887204
 ] 

Saisai Shao commented on SPARK-19750:
-

This issue was found by [~yeshavora]; credit to her.

> Spark UI http -> https redirect error
> -
>
> Key: SPARK-19750
> URL: https://issues.apache.org/jira/browse/SPARK-19750
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.0.2, 2.1.0
>Reporter: Saisai Shao
>
> Spark's HTTP redirect uses port 0 as the secure port when the port is not 
> set; this introduces {{ java.net.NoRouteToHostException: Can't assign 
> requested address }}, so the fix is to use the actually bound port for the 
> redirect.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-19750) Spark UI http -> https redirect error

2017-02-27 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-19750:
---

 Summary: Spark UI http -> https redirect error
 Key: SPARK-19750
 URL: https://issues.apache.org/jira/browse/SPARK-19750
 Project: Spark
  Issue Type: Bug
  Components: Web UI
Affects Versions: 2.1.0, 2.0.2
Reporter: Saisai Shao


Spark's HTTP redirect uses port 0 as the secure port when the port is not set; 
this introduces {{ java.net.NoRouteToHostException: Can't assign requested 
address }}, so the fix is to use the actually bound port for the redirect.
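As an illustration only (not the actual Jetty wiring in Spark's UI code), the
gist of the fix is to derive the redirect target from the port the secure
connector actually bound to, rather than from a configured value that may be 0:

{code}
// Sketch: a configured port of 0 means "let the OS pick"; redirecting to port 0
// produces an unusable URL, so the redirect must use the real bound port.
import java.net.URI

def httpsRedirect(host: String, configuredPort: Int, boundPort: Int, path: String): URI = {
  val port = if (configuredPort > 0) configuredPort else boundPort
  new URI("https", null, host, port, path, null, null)
}

// e.g. httpsRedirect("history-host", 0, 48480, "/jobs/") -> https://history-host:48480/jobs/
{code}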



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19688) Spark on Yarn Credentials File set to different application directory

2017-02-23 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882027#comment-15882027
 ] 

Saisai Shao commented on SPARK-19688:
-

According to my test, "spark.yarn.credentials.file" will be overwritten in 
yarn-client to point to the correct path when launching the application 
(https://github.com/apache/spark/blob/master/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L737).
So even if the Spark Streaming checkpoint still keeps the old configuration, it 
will be overwritten when the new application is started. So I don't see an 
issue here beyond this odd-looking setting.

> Spark on Yarn Credentials File set to different application directory
> -
>
> Key: SPARK-19688
> URL: https://issues.apache.org/jira/browse/SPARK-19688
> Project: Spark
>  Issue Type: Bug
>  Components: DStreams, YARN
>Affects Versions: 1.6.3
>Reporter: Devaraj Jonnadula
>Priority: Minor
>
> spark.yarn.credentials.file property is set to different application Id 
> instead of actual Application Id 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-19707) Improve the invalid path check for sc.addJar

2017-02-23 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-19707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated SPARK-19707:

Summary: Improve the invalid path check for sc.addJar  (was: Improve the 
invalid path handling for sc.addJar)

> Improve the invalid path check for sc.addJar
> 
>
> Key: SPARK-19707
> URL: https://issues.apache.org/jira/browse/SPARK-19707
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: Saisai Shao
>
> Currently in Spark there are two issues when we add jars with an invalid path:
> * If the jar path is an empty string {--jar ",dummy.jar"}, Spark will resolve 
> it to the current directory path and add that to the classpath / file server, 
> which is unwanted.
> * If the jar path is invalid (the file doesn't exist), the file server 
> doesn't check this and still adds it; the exception is only thrown once the 
> job is running. This local path could be checked immediately, with no need to 
> wait until a task runs. We have a similar check in {{addFile}}, but lack one 
> in {{addJar}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-19707) Improve the invalid path handling for sc.addJar

2017-02-23 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-19707:
---

 Summary: Improve the invalid path handling for sc.addJar
 Key: SPARK-19707
 URL: https://issues.apache.org/jira/browse/SPARK-19707
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 2.2.0
Reporter: Saisai Shao


Currently in Spark there are two issues when we add jars with an invalid path:

* If the jar path is an empty string {--jar ",dummy.jar"}, Spark will resolve 
it to the current directory path and add that to the classpath / file server, 
which is unwanted.
* If the jar path is invalid (the file doesn't exist), the file server doesn't 
check this and still adds it; the exception is only thrown once the job is 
running. This local path could be checked immediately, with no need to wait 
until a task runs. We have a similar check in {{addFile}}, but lack one in 
{{addJar}}.
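A minimal sketch of the kind of eager check described above (this is not the
actual patch; the helper name and its exact behaviour are illustrative only):

{code}
// Sketch: reject empty or missing local jar paths up front, instead of letting
// the file server accept them and failing only when a task runs.
import java.io.File
import java.net.URI

def validateLocalJar(path: String): Unit = {
  require(path != null && path.trim.nonEmpty, "jar path is an empty string")
  val uri = new URI(path.trim)
  val scheme = uri.getScheme
  if (scheme == null || scheme == "file" || scheme == "local") {
    val file = new File(Option(uri.getPath).filter(_.nonEmpty).getOrElse(path))
    require(file.isFile, s"Jar file not found: $path")
  }
}

// validateLocalJar("")                 // the empty entry from --jars ",dummy.jar" fails fast
// validateLocalJar("/tmp/missing.jar") // fails at addJar time, not at task run time
{code}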



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19688) Spark on Yarn Credentials File set to different application directory

2017-02-22 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879893#comment-15879893
 ] 

Saisai Shao commented on SPARK-19688:
-

I see. So what issue did you encounter when you restarted the application 
manually, or did you just notice the unexpected credential configuration?

From my understanding, this credential configuration will be overwritten when 
you restart the application, so it should be fine.

> Spark on Yarn Credentials File set to different application directory
> -
>
> Key: SPARK-19688
> URL: https://issues.apache.org/jira/browse/SPARK-19688
> Project: Spark
>  Issue Type: Bug
>  Components: DStreams, YARN
>Affects Versions: 1.6.3
>Reporter: Devaraj Jonnadula
>Priority: Minor
>
> spark.yarn.credentials.file property is set to different application Id 
> instead of actual Application Id 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19688) Spark on Yarn Credentials File set to different application directory

2017-02-22 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879883#comment-15879883
 ] 

Saisai Shao commented on SPARK-19688:
-

[~j.devaraj], when you say the Spark application is restarted, are you 
referring to YARN's reattempt mechanism, or do you restart the application 
manually?

> Spark on Yarn Credentials File set to different application directory
> -
>
> Key: SPARK-19688
> URL: https://issues.apache.org/jira/browse/SPARK-19688
> Project: Spark
>  Issue Type: Bug
>  Components: DStreams, YARN
>Affects Versions: 1.6.3
>Reporter: Devaraj Jonnadula
>Priority: Minor
>
> spark.yarn.credentials.file property is set to different application Id 
> instead of actual Application Id 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19688) Spark on Yarn Credentials File set to different application directory

2017-02-22 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879564#comment-15879564
 ] 

Saisai Shao commented on SPARK-19688:
-

I see, so we should exclude this configuration from the checkpoint and have it 
re-configured after restart.
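To illustrate the idea (this is not the actual change; the key set and helper
are made up for the example), restoring a conf from a checkpoint would keep
everything except launch-specific internal keys, which are taken from the
freshly launched application instead:

{code}
// Sketch: launch-specific internal settings such as spark.yarn.credentials.file
// point into the old application's staging directory, so they should come from
// the new launch rather than from the checkpointed configuration.
val launchSpecificKeys = Set("spark.yarn.credentials.file")

def restoreConf(checkpointed: Map[String, String],
                current: Map[String, String]): Map[String, String] =
  (checkpointed -- launchSpecificKeys) ++ current.filter {
    case (k, _) => launchSpecificKeys.contains(k)
  }
{code}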

> Spark on Yarn Credentials File set to different application directory
> -
>
> Key: SPARK-19688
> URL: https://issues.apache.org/jira/browse/SPARK-19688
> Project: Spark
>  Issue Type: Bug
>  Components: DStreams, YARN
>Affects Versions: 1.6.3
>Reporter: Devaraj Jonnadula
>Priority: Minor
>
> spark.yarn.credentials.file property is set to different application Id 
> instead of actual Application Id 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-19688) Spark on Yarn Credentials File set to different application directory

2017-02-22 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-19688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated SPARK-19688:

Component/s: DStreams

> Spark on Yarn Credentials File set to different application directory
> -
>
> Key: SPARK-19688
> URL: https://issues.apache.org/jira/browse/SPARK-19688
> Project: Spark
>  Issue Type: Bug
>  Components: DStreams, YARN
>Affects Versions: 1.6.3
>Reporter: Devaraj Jonnadula
>Priority: Minor
>
> spark.yarn.credentials.file property is set to different application Id 
> instead of actual Application Id 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19688) Spark on Yarn Credentials File set to different application directory

2017-02-21 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877587#comment-15877587
 ] 

Saisai Shao commented on SPARK-19688:
-

Can you please elaborate on the problem you met? Otherwise it is hard for 
others to identify.

Also, "spark.yarn.credentials.file" is an internal configuration; users usually 
should not set it themselves.

> Spark on Yarn Credentials File set to different application directory
> -
>
> Key: SPARK-19688
> URL: https://issues.apache.org/jira/browse/SPARK-19688
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 1.6.3
>Reporter: Devaraj Jonnadula
>Priority: Minor
>
> spark.yarn.credentials.file property is set to different application Id 
> instead of actual Application Id 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19649) Spark YARN client throws exception if job succeeds and max-completed-applications=0

2017-02-19 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15874101#comment-15874101
 ] 

Saisai Shao commented on SPARK-19649:
-

MapReduce can delegate to its history server to query the state again; that 
approach may not apply to Spark.

> Spark YARN client throws exception if job succeeds and 
> max-completed-applications=0
> ---
>
> Key: SPARK-19649
> URL: https://issues.apache.org/jira/browse/SPARK-19649
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 1.6.3
> Environment: EMR release label 4.8.x
>Reporter: Joshua Caplan
>Priority: Minor
>
> I believe the patch in SPARK-3877 created a new race condition between YARN 
> and the Spark client.
> I typically configure YARN not to keep *any* recent jobs in memory, as some 
> of my jobs get pretty large.
> {code}
> yarn-site yarn.resourcemanager.max-completed-applications 0
> {code}
> The once-per-second call to getApplicationReport may thus encounter a RUNNING 
> application followed by a not found application, and report a false negative.
> (typical) Executor log:
> {code}
> 17/01/09 19:31:23 INFO ApplicationMaster: Final app status: SUCCEEDED, 
> exitCode: 0
> 17/01/09 19:31:23 INFO SparkContext: Invoking stop() from shutdown hook
> 17/01/09 19:31:24 INFO SparkUI: Stopped Spark web UI at 
> http://10.0.0.168:37046
> 17/01/09 19:31:24 INFO YarnClusterSchedulerBackend: Shutting down all 
> executors
> 17/01/09 19:31:24 INFO YarnClusterSchedulerBackend: Asking each executor to 
> shut down
> 17/01/09 19:31:24 INFO MapOutputTrackerMasterEndpoint: 
> MapOutputTrackerMasterEndpoint stopped!
> 17/01/09 19:31:24 INFO MemoryStore: MemoryStore cleared
> 17/01/09 19:31:24 INFO BlockManager: BlockManager stopped
> 17/01/09 19:31:24 INFO BlockManagerMaster: BlockManagerMaster stopped
> 17/01/09 19:31:24 INFO 
> OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: 
> OutputCommitCoordinator stopped!
> 17/01/09 19:31:24 INFO SparkContext: Successfully stopped SparkContext
> 17/01/09 19:31:24 INFO ApplicationMaster: Unregistering ApplicationMaster 
> with SUCCEEDED
> 17/01/09 19:31:24 INFO RemoteActorRefProvider$RemotingTerminator: Shutting 
> down remote daemon.
> 17/01/09 19:31:24 INFO RemoteActorRefProvider$RemotingTerminator: Remote 
> daemon shut down; proceeding with flushing remote transports.
> 17/01/09 19:31:24 INFO AMRMClientImpl: Waiting for application to be 
> successfully unregistered.
> 17/01/09 19:31:24 INFO RemoteActorRefProvider$RemotingTerminator: Remoting 
> shut down.
> {code}
> Client log:
> {code}
> 17/01/09 19:31:23 INFO Client: Application report for 
> application_1483983939941_0056 (state: RUNNING)
> 17/01/09 19:31:24 ERROR Client: Application application_1483983939941_0056 
> not found.
> Exception in thread "main" org.apache.spark.SparkException: Application 
> application_1483983939941_0056 is killed
>   at org.apache.spark.deploy.yarn.Client.run(Client.scala:1038)
>   at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081)
>   at org.apache.spark.deploy.yarn.Client.main(Client.scala)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
>   at 
> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
>   at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19649) Spark YARN client throws exception if job succeeds and max-completed-applications=0

2017-02-19 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15874098#comment-15874098
 ] 

Saisai Shao commented on SPARK-19649:
-

From my understanding, this exception is more or less expected: since you 
configured {{max-completed-applications}} to 0, there is a good chance the 
finished state of the application is never observed in the RM, because Spark 
queries the state from the RM by polling. This doesn't look like a race 
condition issue; unless the RM could actively push the finished state to Spark, 
it is hard for the Spark side to catch it.
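A toy sketch of that polling behaviour (purely illustrative, not Spark's
{{Client}} code) may make the failure mode clearer: once the RM drops the
completed application immediately, the poller can only ever see RUNNING
followed by "not found".

{code}
// Sketch: with max-completed-applications=0 the RM forgets the app as soon as
// it finishes, so a once-per-second poll may never observe SUCCEEDED and
// interprets "not found" as a failure -- a false negative for a successful job.
sealed trait PollResult
case object Running   extends PollResult
case object Succeeded extends PollResult
case object NotFound  extends PollResult  // already evicted from the RM

def waitForCompletion(poll: () => PollResult): Boolean =
  Iterator.continually(poll()).collectFirst {
    case Succeeded => true
    case NotFound  => false  // reported as "application ... is killed" in the client log
  }.get
{code}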

> Spark YARN client throws exception if job succeeds and 
> max-completed-applications=0
> ---
>
> Key: SPARK-19649
> URL: https://issues.apache.org/jira/browse/SPARK-19649
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 1.6.3
> Environment: EMR release label 4.8.x
>Reporter: Joshua Caplan
>Priority: Minor
>
> I believe the patch in SPARK-3877 created a new race condition between YARN 
> and the Spark client.
> I typically configure YARN not to keep *any* recent jobs in memory, as some 
> of my jobs get pretty large.
> {code}
> yarn-site yarn.resourcemanager.max-completed-applications 0
> {code}
> The once-per-second call to getApplicationReport may thus encounter a RUNNING 
> application followed by a not found application, and report a false negative.
> (typical) Executor log:
> {code}
> 17/01/09 19:31:23 INFO ApplicationMaster: Final app status: SUCCEEDED, 
> exitCode: 0
> 17/01/09 19:31:23 INFO SparkContext: Invoking stop() from shutdown hook
> 17/01/09 19:31:24 INFO SparkUI: Stopped Spark web UI at 
> http://10.0.0.168:37046
> 17/01/09 19:31:24 INFO YarnClusterSchedulerBackend: Shutting down all 
> executors
> 17/01/09 19:31:24 INFO YarnClusterSchedulerBackend: Asking each executor to 
> shut down
> 17/01/09 19:31:24 INFO MapOutputTrackerMasterEndpoint: 
> MapOutputTrackerMasterEndpoint stopped!
> 17/01/09 19:31:24 INFO MemoryStore: MemoryStore cleared
> 17/01/09 19:31:24 INFO BlockManager: BlockManager stopped
> 17/01/09 19:31:24 INFO BlockManagerMaster: BlockManagerMaster stopped
> 17/01/09 19:31:24 INFO 
> OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: 
> OutputCommitCoordinator stopped!
> 17/01/09 19:31:24 INFO SparkContext: Successfully stopped SparkContext
> 17/01/09 19:31:24 INFO ApplicationMaster: Unregistering ApplicationMaster 
> with SUCCEEDED
> 17/01/09 19:31:24 INFO RemoteActorRefProvider$RemotingTerminator: Shutting 
> down remote daemon.
> 17/01/09 19:31:24 INFO RemoteActorRefProvider$RemotingTerminator: Remote 
> daemon shut down; proceeding with flushing remote transports.
> 17/01/09 19:31:24 INFO AMRMClientImpl: Waiting for application to be 
> successfully unregistered.
> 17/01/09 19:31:24 INFO RemoteActorRefProvider$RemotingTerminator: Remoting 
> shut down.
> {code}
> Client log:
> {code}
> 17/01/09 19:31:23 INFO Client: Application report for 
> application_1483983939941_0056 (state: RUNNING)
> 17/01/09 19:31:24 ERROR Client: Application application_1483983939941_0056 
> not found.
> Exception in thread "main" org.apache.spark.SparkException: Application 
> application_1483983939941_0056 is killed
>   at org.apache.spark.deploy.yarn.Client.run(Client.scala:1038)
>   at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081)
>   at org.apache.spark.deploy.yarn.Client.main(Client.scala)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
>   at 
> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
>   at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19588) Allow putting keytab file to HDFS location specified in spark.yarn.keytab

2017-02-14 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865434#comment-15865434
 ] 

Saisai Shao commented on SPARK-19588:
-

Putting the keytab on HDFS still requires downloading it to local disk for the 
driver/yarn#client, since the driver/yarn#client is not under YARN's control, 
so there's no real difference between putting it locally or on HDFS.
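For illustration (a sketch with made-up paths, not something Spark does for you
today), reading a keytab from HDFS would still boil down to materialising a
local copy before spark-submit can point {{spark.yarn.keytab}} at it:

{code}
// Sketch only, hypothetical paths: fetch the keytab from HDFS to the local disk
// of the machine running spark-submit / the driver, because that process is not
// managed by YARN and cannot be handed the file by a NodeManager.
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val fs = FileSystem.get(new java.net.URI("hdfs:///"), new Configuration())
fs.copyToLocalFile(new Path("/user/svc_odiprd/app.keytab"), new Path("/tmp/app.keytab"))
// then: spark-submit --keytab /tmp/app.keytab --principal <principal> ...
{code}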



> Allow putting keytab file to HDFS location specified in spark.yarn.keytab
> -
>
> Key: SPARK-19588
> URL: https://issues.apache.org/jira/browse/SPARK-19588
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core, Spark Submit
>Affects Versions: 2.0.2, 2.1.0
> Environment: kerberized cluster, Spark 2
>Reporter: Ruslan Dautkhanov
>  Labels: authentication, kerberos, security, yarn-client
>
> As a workaround for SPARK-19038 tried putting keytab in user's home directory 
> in HDFS but this fails with 
> {noformat}
> Exception in thread "main" org.apache.spark.SparkException: Keytab file: 
> hdfs:///user/svc_odiprd/.kt does not exist
> at 
> org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:555)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:158)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> {noformat}
> This is yarn-client mode, so driver probably can't see HDFS while submitting 
> a job; although I suspect it doesn't not only with yarn-client.
> Would be great to support reading keytab for kerberos ticket renewals 
> directly from HDFS.
> We think that in some scenarios it's more secure than referencing a keytab 
> from a local fs on a client machine that does a spark-submit.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19579) spark-submit fails to run Kafka Stream python script

2017-02-14 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865360#comment-15865360
 ] 

Saisai Shao commented on SPARK-19579:
-

Only this specific part is not supported; all the other Python APIs are still 
supported.

> spark-submit fails to run Kafka Stream python script
> 
>
> Key: SPARK-19579
> URL: https://issues.apache.org/jira/browse/SPARK-19579
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.1.0
> Environment: Linux Ubuntu 16.10 64bit
>Reporter: Piotr Nestorow
>
> Kafka Stream python script is executed but it fails with:
> TypeError: 'JavaPackage' object is not callable
> The Spark Kafka streaming jar is provided:
> spark-streaming-kafka-0-10_2.11-2.1.0.jar
> Kafka version: kafka_2.11-0.10.1.1
> In the conf/spark-defaults.conf:
> spark.jars.packages org.apache.spark:spark-streaming-kafka-0-10_2.11:2.1.0
> Also at runtime:
> bin/spark-submit --jars 
> /usr/local/spark/jars/spark-streaming-kafka-0-10_2.11-2.1.0.jar  
> --driver-class-path 
> /usr/local/spark/jars/spark-streaming-kafka-0-10_2.11-2.1.0.jar 
> /home/sysveradmin/work/Programs/ApacheSpark/ex_kafka_stream.py  
> localhost:2181 example_topic
> Also the Spark Kafka streaming example:
> 'examples/src/main/python/streaming/kafka_wordcount.py'
> can be used to check this problem.
> Nothing helps...
> Detailed output:
> _
>   Spark Streaming's Kafka libraries not found in class path. Try one of the 
> following.
>   1. Include the Kafka library and its dependencies with in the
>  spark-submit command as
>  $ bin/spark-submit --packages 
> org.apache.spark:spark-streaming-kafka-0-8:2.1.0 ...
>   2. Download the JAR of the artifact from Maven Central 
> http://search.maven.org/,
>  Group Id = org.apache.spark, Artifact Id = 
> spark-streaming-kafka-0-8-assembly, Version = 2.1.0.
>  Then, include the jar in the spark-submit command as
>  $ bin/spark-submit --jars  ...
> 
> Traceback (most recent call last):
>   File "/home/sysveradmin/work/Programs/ApacheSpark/ex_kafka_stream.py", line 
> 18, in 
> kvs = KafkaUtils.createStream(ssc, zkQuorum, "spark-streaming-consumer", 
> {topic: 1})
>   File "/usr/local/spark/python/lib/pyspark.zip/pyspark/streaming/kafka.py", 
> line 69, in createStream
>   File "/usr/local/spark/python/lib/pyspark.zip/pyspark/streaming/kafka.py", 
> line 195, in _get_helper
> TypeError: 'JavaPackage' object is not callable



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Closed] (SPARK-19579) spark-submit fails to run Kafka Stream python script

2017-02-14 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-19579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao closed SPARK-19579.
---
Resolution: Won't Fix

> spark-submit fails to run Kafka Stream python script
> 
>
> Key: SPARK-19579
> URL: https://issues.apache.org/jira/browse/SPARK-19579
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.1.0
> Environment: Linux Ubuntu 16.10 64bit
>Reporter: Piotr Nestorow
>
> Kafka Stream python script is executed but it fails with:
> TypeError: 'JavaPackage' object is not callable
> The Spark Kafka streaming jar is provided:
> spark-streaming-kafka-0-10_2.11-2.1.0.jar
> Kafka version: kafka_2.11-0.10.1.1
> In the conf/spark-defaults.conf:
> spark.jars.packages org.apache.spark:spark-streaming-kafka-0-10_2.11:2.1.0
> Also at runtime:
> bin/spark-submit --jars 
> /usr/local/spark/jars/spark-streaming-kafka-0-10_2.11-2.1.0.jar  
> --driver-class-path 
> /usr/local/spark/jars/spark-streaming-kafka-0-10_2.11-2.1.0.jar 
> /home/sysveradmin/work/Programs/ApacheSpark/ex_kafka_stream.py  
> localhost:2181 example_topic
> Also the Spark Kafka streaming example:
> 'examples/src/main/python/streaming/kafka_wordcount.py'
> can be used to check this problem.
> Nothing helps...
> Detailed output:
> _
>   Spark Streaming's Kafka libraries not found in class path. Try one of the 
> following.
>   1. Include the Kafka library and its dependencies with in the
>  spark-submit command as
>  $ bin/spark-submit --packages 
> org.apache.spark:spark-streaming-kafka-0-8:2.1.0 ...
>   2. Download the JAR of the artifact from Maven Central 
> http://search.maven.org/,
>  Group Id = org.apache.spark, Artifact Id = 
> spark-streaming-kafka-0-8-assembly, Version = 2.1.0.
>  Then, include the jar in the spark-submit command as
>  $ bin/spark-submit --jars  ...
> 
> Traceback (most recent call last):
>   File "/home/sysveradmin/work/Programs/ApacheSpark/ex_kafka_stream.py", line 
> 18, in 
> kvs = KafkaUtils.createStream(ssc, zkQuorum, "spark-streaming-consumer", 
> {topic: 1})
>   File "/usr/local/spark/python/lib/pyspark.zip/pyspark/streaming/kafka.py", 
> line 69, in createStream
>   File "/usr/local/spark/python/lib/pyspark.zip/pyspark/streaming/kafka.py", 
> line 195, in _get_helper
> TypeError: 'JavaPackage' object is not callable



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-19579) spark-submit fails to run Kafka Stream python script

2017-02-14 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865360#comment-15865360
 ] 

Saisai Shao edited comment on SPARK-19579 at 2/14/17 8:33 AM:
--

Only this specific part is not supported; all the other Python APIs are still 
supported.


was (Author: jerryshao):
Only this specific part is not supported, all the that Python APIs are still 
supported.

> spark-submit fails to run Kafka Stream python script
> 
>
> Key: SPARK-19579
> URL: https://issues.apache.org/jira/browse/SPARK-19579
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.1.0
> Environment: Linux Ubuntu 16.10 64bit
>Reporter: Piotr Nestorow
>
> Kafka Stream python script is executed but it fails with:
> TypeError: 'JavaPackage' object is not callable
> The Spark Kafka streaming jar is provided:
> spark-streaming-kafka-0-10_2.11-2.1.0.jar
> Kafka version: kafka_2.11-0.10.1.1
> In the conf/spark-defaults.conf:
> spark.jars.packages org.apache.spark:spark-streaming-kafka-0-10_2.11:2.1.0
> Also at runtime:
> bin/spark-submit --jars 
> /usr/local/spark/jars/spark-streaming-kafka-0-10_2.11-2.1.0.jar  
> --driver-class-path 
> /usr/local/spark/jars/spark-streaming-kafka-0-10_2.11-2.1.0.jar 
> /home/sysveradmin/work/Programs/ApacheSpark/ex_kafka_stream.py  
> localhost:2181 example_topic
> Also the Spark Kafka streaming example:
> 'examples/src/main/python/streaming/kafka_wordcount.py'
> can be used to check this problem.
> Nothing helps...
> Detailed output:
> _
>   Spark Streaming's Kafka libraries not found in class path. Try one of the 
> following.
>   1. Include the Kafka library and its dependencies with in the
>  spark-submit command as
>  $ bin/spark-submit --packages 
> org.apache.spark:spark-streaming-kafka-0-8:2.1.0 ...
>   2. Download the JAR of the artifact from Maven Central 
> http://search.maven.org/,
>  Group Id = org.apache.spark, Artifact Id = 
> spark-streaming-kafka-0-8-assembly, Version = 2.1.0.
>  Then, include the jar in the spark-submit command as
>  $ bin/spark-submit --jars  ...
> 
> Traceback (most recent call last):
>   File "/home/sysveradmin/work/Programs/ApacheSpark/ex_kafka_stream.py", line 
> 18, in 
> kvs = KafkaUtils.createStream(ssc, zkQuorum, "spark-streaming-consumer", 
> {topic: 1})
>   File "/usr/local/spark/python/lib/pyspark.zip/pyspark/streaming/kafka.py", 
> line 69, in createStream
>   File "/usr/local/spark/python/lib/pyspark.zip/pyspark/streaming/kafka.py", 
> line 195, in _get_helper
> TypeError: 'JavaPackage' object is not callable



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19579) spark-submit fails to run Kafka Stream python script

2017-02-14 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865346#comment-15865346
 ] 

Saisai Shao commented on SPARK-19579:
-

Support for the Python API was rejected by the community, so there is no 
roadmap for it.

> spark-submit fails to run Kafka Stream python script
> 
>
> Key: SPARK-19579
> URL: https://issues.apache.org/jira/browse/SPARK-19579
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.1.0
> Environment: Linux Ubuntu 16.10 64bit
>Reporter: Piotr Nestorow
>
> Kafka Stream python script is executed but it fails with:
> TypeError: 'JavaPackage' object is not callable
> The Spark Kafka streaming jar is provided:
> spark-streaming-kafka-0-10_2.11-2.1.0.jar
> Kafka version: kafka_2.11-0.10.1.1
> In the conf/spark-defaults.conf:
> spark.jars.packages org.apache.spark:spark-streaming-kafka-0-10_2.11:2.1.0
> Also at runtime:
> bin/spark-submit --jars 
> /usr/local/spark/jars/spark-streaming-kafka-0-10_2.11-2.1.0.jar  
> --driver-class-path 
> /usr/local/spark/jars/spark-streaming-kafka-0-10_2.11-2.1.0.jar 
> /home/sysveradmin/work/Programs/ApacheSpark/ex_kafka_stream.py  
> localhost:2181 example_topic
> Also the Spark Kafka streaming example:
> 'examples/src/main/python/streaming/kafka_wordcount.py'
> can be used to check this problem.
> Nothing helps...
> Detailed output:
> _
>   Spark Streaming's Kafka libraries not found in class path. Try one of the 
> following.
>   1. Include the Kafka library and its dependencies with in the
>  spark-submit command as
>  $ bin/spark-submit --packages 
> org.apache.spark:spark-streaming-kafka-0-8:2.1.0 ...
>   2. Download the JAR of the artifact from Maven Central 
> http://search.maven.org/,
>  Group Id = org.apache.spark, Artifact Id = 
> spark-streaming-kafka-0-8-assembly, Version = 2.1.0.
>  Then, include the jar in the spark-submit command as
>  $ bin/spark-submit --jars  ...
> 
> Traceback (most recent call last):
>   File "/home/sysveradmin/work/Programs/ApacheSpark/ex_kafka_stream.py", line 
> 18, in 
> kvs = KafkaUtils.createStream(ssc, zkQuorum, "spark-streaming-consumer", 
> {topic: 1})
>   File "/usr/local/spark/python/lib/pyspark.zip/pyspark/streaming/kafka.py", 
> line 69, in createStream
>   File "/usr/local/spark/python/lib/pyspark.zip/pyspark/streaming/kafka.py", 
> line 195, in _get_helper
> TypeError: 'JavaPackage' object is not callable



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19579) spark-submit fails to run Kafka Stream python script

2017-02-14 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865330#comment-15865330
 ] 

Saisai Shao commented on SPARK-19579:
-

If you're using the Python API to write the Streaming application, then it is 
not supported.
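For what it's worth, a hedged Scala sketch of the supported route for the
kafka-0-10 integration; the broker address, group id and topic below are
placeholders, and it assumes the spark-streaming-kafka-0-10 artifact is on the
classpath.

{code}
// Sketch: the Scala/Java API of the kafka-0-10 integration, which is the
// supported path; broker, group id and topic are placeholders.
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010._

val ssc = new StreamingContext(new SparkConf().setAppName("kafka-0-10-wordcount"), Seconds(5))
val kafkaParams = Map[String, Object](
  "bootstrap.servers"  -> "localhost:9092",
  "key.deserializer"   -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id"           -> "example-group")
val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  LocationStrategies.PreferConsistent,
  ConsumerStrategies.Subscribe[String, String](Seq("example_topic"), kafkaParams))
stream.map(_.value).flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).print()
ssc.start()
ssc.awaitTermination()
{code}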

> spark-submit fails to run Kafka Stream python script
> 
>
> Key: SPARK-19579
> URL: https://issues.apache.org/jira/browse/SPARK-19579
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.1.0
> Environment: Linux Ubuntu 16.10 64bit
>Reporter: Piotr Nestorow
>
> Kafka Stream python script is executed but it fails with:
> TypeError: 'JavaPackage' object is not callable
> The Spark Kafka streaming jar is provided:
> spark-streaming-kafka-0-10_2.11-2.1.0.jar
> Kafka version: kafka_2.11-0.10.1.1
> In the conf/spark-defaults.conf:
> spark.jars.packages org.apache.spark:spark-streaming-kafka-0-10_2.11:2.1.0
> Also at runtime:
> bin/spark-submit --jars 
> /usr/local/spark/jars/spark-streaming-kafka-0-10_2.11-2.1.0.jar  
> --driver-class-path 
> /usr/local/spark/jars/spark-streaming-kafka-0-10_2.11-2.1.0.jar 
> /home/sysveradmin/work/Programs/ApacheSpark/ex_kafka_stream.py  
> localhost:2181 example_topic
> Also the Spark Kafka streaming example:
> 'examples/src/main/python/streaming/kafka_wordcount.py'
> can be used to check this problem.
> Nothing helps...
> Detailed output:
> _
>   Spark Streaming's Kafka libraries not found in class path. Try one of the 
> following.
>   1. Include the Kafka library and its dependencies with in the
>  spark-submit command as
>  $ bin/spark-submit --packages 
> org.apache.spark:spark-streaming-kafka-0-8:2.1.0 ...
>   2. Download the JAR of the artifact from Maven Central 
> http://search.maven.org/,
>  Group Id = org.apache.spark, Artifact Id = 
> spark-streaming-kafka-0-8-assembly, Version = 2.1.0.
>  Then, include the jar in the spark-submit command as
>  $ bin/spark-submit --jars  ...
> 
> Traceback (most recent call last):
>   File "/home/sysveradmin/work/Programs/ApacheSpark/ex_kafka_stream.py", line 
> 18, in 
> kvs = KafkaUtils.createStream(ssc, zkQuorum, "spark-streaming-consumer", 
> {topic: 1})
>   File "/usr/local/spark/python/lib/pyspark.zip/pyspark/streaming/kafka.py", 
> line 69, in createStream
>   File "/usr/local/spark/python/lib/pyspark.zip/pyspark/streaming/kafka.py", 
> line 195, in _get_helper
> TypeError: 'JavaPackage' object is not callable



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19038) Can't find keytab file when using Hive catalog

2017-02-13 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15864983#comment-15864983
 ] 

Saisai Shao commented on SPARK-19038:
-

I think the issue you met is the same as the one this JIRA mentions, but PR 
16482 tries to solve the problem in a different way, which is not correct for 
current Spark on YARN. Let me figure out a decent solution to fix this.

> Can't find keytab file when using Hive catalog
> --
>
> Key: SPARK-19038
> URL: https://issues.apache.org/jira/browse/SPARK-19038
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.0.2
> Environment: Hadoop / YARN 2.6, pyspark, yarn-client mode
>Reporter: Peter Parente
>  Labels: hive, kerberos, pyspark
>
> h2. Stack Trace
> {noformat}
> Py4JJavaErrorTraceback (most recent call last)
>  in ()
> > 1 sdf = sql.createDataFrame(df)
> /opt/spark2/python/pyspark/sql/context.py in createDataFrame(self, data, 
> schema, samplingRatio, verifySchema)
> 307 Py4JJavaError: ...
> 308 """
> --> 309 return self.sparkSession.createDataFrame(data, schema, 
> samplingRatio, verifySchema)
> 310 
> 311 @since(1.3)
> /opt/spark2/python/pyspark/sql/session.py in createDataFrame(self, data, 
> schema, samplingRatio, verifySchema)
> 524 rdd, schema = self._createFromLocal(map(prepare, data), 
> schema)
> 525 jrdd = 
> self._jvm.SerDeUtil.toJavaArray(rdd._to_java_object_rdd())
> --> 526 jdf = self._jsparkSession.applySchemaToPythonRDD(jrdd.rdd(), 
> schema.json())
> 527 df = DataFrame(jdf, self._wrapped)
> 528 df._schema = schema
> /opt/spark2/python/lib/py4j-0.10.3-src.zip/py4j/java_gateway.py in 
> __call__(self, *args)
>1131 answer = self.gateway_client.send_command(command)
>1132 return_value = get_return_value(
> -> 1133 answer, self.gateway_client, self.target_id, self.name)
>1134 
>1135 for temp_arg in temp_args:
> /opt/spark2/python/pyspark/sql/utils.py in deco(*a, **kw)
>  61 def deco(*a, **kw):
>  62 try:
> ---> 63 return f(*a, **kw)
>  64 except py4j.protocol.Py4JJavaError as e:
>  65 s = e.java_exception.toString()
> /opt/spark2/python/lib/py4j-0.10.3-src.zip/py4j/protocol.py in 
> get_return_value(answer, gateway_client, target_id, name)
> 317 raise Py4JJavaError(
> 318 "An error occurred while calling {0}{1}{2}.\n".
> --> 319 format(target_id, ".", name), value)
> 320 else:
> 321 raise Py4JError(
> Py4JJavaError: An error occurred while calling o47.applySchemaToPythonRDD.
> : org.apache.spark.SparkException: Keytab file: 
> .keytab-f0b9b814-460e-4fa8-8e7d-029186b696c4 specified in spark.yarn.keytab 
> does not exist
>   at 
> org.apache.spark.sql.hive.client.HiveClientImpl.(HiveClientImpl.scala:113)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:258)
>   at 
> org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:359)
>   at 
> org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:263)
>   at 
> org.apache.spark.sql.hive.HiveSharedState.metadataHive$lzycompute(HiveSharedState.scala:39)
>   at 
> org.apache.spark.sql.hive.HiveSharedState.metadataHive(HiveSharedState.scala:38)
>   at 
> org.apache.spark.sql.hive.HiveSharedState.externalCatalog$lzycompute(HiveSharedState.scala:46)
>   at 
> org.apache.spark.sql.hive.HiveSharedState.externalCatalog(HiveSharedState.scala:45)
>   at 
> org.apache.spark.sql.hive.HiveSessionState.catalog$lzycompute(HiveSessionState.scala:50)
>   at 
> org.apache.spark.sql.hive.HiveSessionState.catalog(HiveSessionState.scala:48)
>   at 
> org.apache.spark.sql.hive.HiveSessionState$$anon$1.(HiveSessionState.scala:63)
>   at 
> org.apache.spark.sql.hive.HiveSessionState.analyzer$lzycompute(HiveSessionState.scala:63)
>   at 
> org.apache.spark.sql.hive.HiveSessionState.analyzer(HiveSessionState.scala:62)
>   at 
> org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
>   at 
> 

[jira] [Commented] (SPARK-19579) spark-submit fails to run Kafka Stream python script

2017-02-13 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15864811#comment-15864811
 ] 

Saisai Shao commented on SPARK-19579:
-

The Spark Streaming Kafka Python API doesn't support the Kafka 0.10 module. You can only use it against Kafka 0.8.x.

> spark-submit fails to run Kafka Stream python script
> 
>
> Key: SPARK-19579
> URL: https://issues.apache.org/jira/browse/SPARK-19579
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.1.0
> Environment: Linux Ubuntu 16.10 64bit
>Reporter: Piotr Nestorow
>
> Kafka Stream python script is executed but it fails with:
> TypeError: 'JavaPackage' object is not callable
> The Spark Kafka streaming jar is provided:
> spark-streaming-kafka-0-10_2.11-2.1.0.jar
> Kafka version: kafka_2.11-0.10.1.1
> In the conf/spark-defaults.conf:
> spark.jars.packages org.apache.spark:spark-streaming-kafka-0-10_2.11:2.1.0
> Also at runtime:
> bin/spark-submit --jars 
> /usr/local/spark/jars/spark-streaming-kafka-0-10_2.11-2.1.0.jar  
> --driver-class-path 
> /usr/local/spark/jars/spark-streaming-kafka-0-10_2.11-2.1.0.jar 
> /home/sysveradmin/work/Programs/ApacheSpark/ex_kafka_stream.py  
> localhost:2181 example_topic
> Also the Spark Kafka streaming example:
> 'examples/src/main/python/streaming/kafka_wordcount.py'
> can be used to check this problem.
> Nothing helps...
> Detailed output:
> _
>   Spark Streaming's Kafka libraries not found in class path. Try one of the 
> following.
>   1. Include the Kafka library and its dependencies with in the
>  spark-submit command as
>  $ bin/spark-submit --packages 
> org.apache.spark:spark-streaming-kafka-0-8:2.1.0 ...
>   2. Download the JAR of the artifact from Maven Central 
> http://search.maven.org/,
>  Group Id = org.apache.spark, Artifact Id = 
> spark-streaming-kafka-0-8-assembly, Version = 2.1.0.
>  Then, include the jar in the spark-submit command as
>  $ bin/spark-submit --jars  ...
> 
> Traceback (most recent call last):
>   File "/home/sysveradmin/work/Programs/ApacheSpark/ex_kafka_stream.py", line 
> 18, in 
> kvs = KafkaUtils.createStream(ssc, zkQuorum, "spark-streaming-consumer", 
> {topic: 1})
>   File "/usr/local/spark/python/lib/pyspark.zip/pyspark/streaming/kafka.py", 
> line 69, in createStream
>   File "/usr/local/spark/python/lib/pyspark.zip/pyspark/streaming/kafka.py", 
> line 195, in _get_helper
> TypeError: 'JavaPackage' object is not callable



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-16742) Kerberos support for Spark on Mesos

2017-02-13 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15863371#comment-15863371
 ] 

Saisai Shao edited comment on SPARK-16742 at 2/13/17 9:34 AM:
--

The proposed solution is quite different from what exists in Spark on YARN. IIUC this solution doesn't honor delegation tokens and wraps every HDFS operation with {{executeSecure}}; I doubt this approach, since it requires other components, like SQL and Streaming, to also know about such APIs and wrap their calls. If newly added code ignores this wrapper, it will lead to errors. From my understanding it is quite intrusive.

Also, how do you handle the principal and keytab for the driver/executors? Do you need to ship the keytab to every node, and who is responsible for that?

And it looks like your PR mainly focuses on user impersonation, which is slightly different from what this JIRA describes; your main requirement is dynamically changing the proxy user. I would suggest using another JIRA to track that, since it is a little different from supporting Kerberos on Mesos.
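
For contrast, here is a minimal sketch of how the delegation-token approach used on YARN obtains HDFS tokens up front on the driver side, so executors never need the keytab or per-call wrappers. The principal, keytab path and renewer are placeholders:

{code}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.security.{Credentials, UserGroupInformation}

// Log in once from the keytab (placeholder principal/keytab values)
UserGroupInformation.loginUserFromKeytab("spark/host@EXAMPLE.COM", "/etc/security/spark.keytab")

val hadoopConf = new Configuration()
val creds = new Credentials()

// Obtain delegation tokens for the default file system; the tokens (not the
// keytab) are what get shipped to executors and renewed centrally.
// "yarn" is the token renewer principal (placeholder).
FileSystem.get(hadoopConf).addDelegationTokens("yarn", creds)

// creds would then be serialized into the container launch context instead of
// wrapping each individual HDFS call
{code}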


was (Author: jerryshao):
The proposed solution is quite different from what exists in Spark on YARN. IIUC this solution doesn't honor delegation tokens and wraps every HDFS operation with {{executeSecure}}; I doubt this approach, since it requires other components, like SQL and Streaming, to also know about such APIs and wrap their calls. If newly added code ignores this wrapper, it will lead to errors. From my understanding it is quite intrusive.

> Kerberos support for Spark on Mesos
> ---
>
> Key: SPARK-16742
> URL: https://issues.apache.org/jira/browse/SPARK-16742
> Project: Spark
>  Issue Type: New Feature
>  Components: Mesos
>Reporter: Michael Gummelt
>
> We at Mesosphere have written Kerberos support for Spark on Mesos.  We'll be 
> contributing it to Apache Spark soon.
> Mesosphere design doc: 
> https://docs.google.com/document/d/1xyzICg7SIaugCEcB4w1vBWp24UDkyJ1Pyt2jtnREFqc/edit#heading=h.tdnq7wilqrj6
> Mesosphere code: 
> https://github.com/mesosphere/spark/commit/73ba2ab8d97510d5475ef9a48c673ce34f7173fa



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-02-13 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15863371#comment-15863371
 ] 

Saisai Shao commented on SPARK-16742:
-

The proposed solution is quite different from what exists in Spark on YARN. IIUC this solution doesn't honor delegation tokens and wraps every HDFS operation with {{executeSecure}}; I doubt this approach, since it requires other components, like SQL and Streaming, to also know about such APIs and wrap their calls. If newly added code ignores this wrapper, it will lead to errors. From my understanding it is quite intrusive.

> Kerberos support for Spark on Mesos
> ---
>
> Key: SPARK-16742
> URL: https://issues.apache.org/jira/browse/SPARK-16742
> Project: Spark
>  Issue Type: New Feature
>  Components: Mesos
>Reporter: Michael Gummelt
>
> We at Mesosphere have written Kerberos support for Spark on Mesos.  We'll be 
> contributing it to Apache Spark soon.
> Mesosphere design doc: 
> https://docs.google.com/document/d/1xyzICg7SIaugCEcB4w1vBWp24UDkyJ1Pyt2jtnREFqc/edit#heading=h.tdnq7wilqrj6
> Mesosphere code: 
> https://github.com/mesosphere/spark/commit/73ba2ab8d97510d5475ef9a48c673ce34f7173fa



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-19545) Compilation error with method not found when building against Hadoop 2.6.0.

2017-02-10 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-19545:
---

 Summary: Compilation error with method not found when building against Hadoop 2.6.0.
 Key: SPARK-19545
 URL: https://issues.apache.org/jira/browse/SPARK-19545
 Project: Spark
  Issue Type: Bug
  Components: YARN
Affects Versions: 2.2.0
Reporter: Saisai Shao


{code}
./build/sbt -Phadoop-2.6 -Pyarn -Dhadoop.version=2.6.0
{code}

{code}
[error] 
/Users/sshao/projects/apache-spark/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:249:
 value setRolledLogsIncludePattern is not a member of 
org.apache.hadoop.yarn.api.records.LogAggregationContext
[error]   logAggregationContext.setRolledLogsIncludePattern(includePattern)
[error] ^
[error] 
/Users/sshao/projects/apache-spark/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:251:
 value setRolledLogsExcludePattern is not a member of 
org.apache.hadoop.yarn.api.records.LogAggregationContext
[error] 
logAggregationContext.setRolledLogsExcludePattern(excludePattern)
[error]   ^
[error] two errors found
{code}
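
One hedged way to avoid depending on these setters at compile time (they only exist in newer Hadoop releases) is to call them reflectively; this is an illustrative sketch only, not necessarily the fix that will be adopted. It assumes logAggregationContext and includePattern are in scope as in Client.scala, which extends Logging:

{code}
// Invoke setRolledLogsIncludePattern only if the running Hadoop version provides it
try {
  val method = logAggregationContext.getClass
    .getMethod("setRolledLogsIncludePattern", classOf[String])
  method.invoke(logAggregationContext, includePattern)
} catch {
  case _: NoSuchMethodException =>
    logWarning("Rolled log aggregation patterns are not supported by this Hadoop version.")
}
{code}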



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-19306) Fix inconsistent state in DiskBlockObjectWriter when exception occurred

2017-01-20 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-19306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated SPARK-19306:

Summary: Fix inconsistent state in DiskBlockObjectWriter when exception 
occurred  (was: Fix inconsistent state in DiskBlockObjectWrite when exception 
occurred)

> Fix inconsistent state in DiskBlockObjectWriter when exception occurred
> ---
>
> Key: SPARK-19306
> URL: https://issues.apache.org/jira/browse/SPARK-19306
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.0
>Reporter: Saisai Shao
>Priority: Minor
>
> In {{DiskBlockObjectWriter}}, when some errors happened during writing, it 
> will call {{revertPartialWritesAndClose}}, if this method again failed due to 
> some hardware issues like out of disk, it will throw exception without 
> resetting the state of this writer, also skipping the revert. So here propose 
> to fix this issue to offer user a chance to recover from such issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-19306) Fix inconsistent state in DiskBlockObjectWrite when exception occurred

2017-01-19 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-19306:
---

 Summary: Fix inconsistent state in DiskBlockObjectWrite when 
exception occurred
 Key: SPARK-19306
 URL: https://issues.apache.org/jira/browse/SPARK-19306
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 2.1.0
Reporter: Saisai Shao
Priority: Minor


In {{DiskBlockObjectWriter}}, when an error happens during writing, it calls {{revertPartialWritesAndClose}}; if this method fails again due to a hardware issue such as running out of disk space, it throws an exception without resetting the state of the writer and skips the revert. So here I propose to fix this issue to give the user a chance to recover from such failures.
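
A minimal sketch of the intended error-handling pattern (names simplified; this is not the actual DiskBlockObjectWriter code): even when the revert itself fails, the writer's internal state should be reset before the exception propagates, so the caller can still retry or clean up:

{code}
def revertPartialWritesAndClose(): Unit = {
  try {
    // truncate the file back to the last committed position and close the streams
    truncateToCommittedPosition()   // hypothetical helper standing in for the real revert logic
  } finally {
    // even if truncation fails (e.g. the disk is full), reset the writer's internal
    // state so the caller is left with a consistent object plus the original exception
    resetWriterState()              // hypothetical helper: clears buffers, marks streams closed
  }
}
{code}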



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19179) spark.yarn.access.namenodes description is wrong

2017-01-11 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15820371#comment-15820371
 ] 

Saisai Shao commented on SPARK-19179:
-

Thanks [~tgraves] for pointing out the remaining part; let me handle it.

> spark.yarn.access.namenodes description is wrong
> 
>
> Key: SPARK-19179
> URL: https://issues.apache.org/jira/browse/SPARK-19179
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.0.2
>Reporter: Thomas Graves
>Priority: Minor
>
> The description and name of spark.yarn.access.namenodesis off.  It 
> says this is for HDFS namenodes when really this is to specify any hadoop 
> filesystems.  It gets the credentials for those filesystems.
> We should at least update the description on it to be more generic.  We could 
> change the name on it but we would have to deprecated it and keep around 
> current name as many people use it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15814001#comment-15814001
 ] 

Saisai Shao commented on SPARK-19090:
-

Are you using the SparkConf API to set the configuration at application runtime? From the code I can see that you did. That won't work, at least for yarn-cluster mode.

> Dynamic Resource Allocation not respecting spark.executor.cores
> ---
>
> Key: SPARK-19090
> URL: https://issues.apache.org/jira/browse/SPARK-19090
> Project: Spark
>  Issue Type: Bug
>Affects Versions: 1.5.2, 1.6.1, 2.0.1
>Reporter: nirav patel
>
> When enabling dynamic scheduling with yarn I see that all executors are using 
> only 1 core even if I specify "spark.executor.cores" to 6. If dynamic 
> scheduling is disabled then each executors will have 6 cores. i.e. it 
> respects  "spark.executor.cores". I have tested this against spark 1.5 . I 
> think it will be the same behavior with 2.x as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813973#comment-15813973
 ] 

Saisai Shao commented on SPARK-19090:
-

Spark shell is a real Spark *application*. The underlying SparkSubmit logic is the same...

> Dynamic Resource Allocation not respecting spark.executor.cores
> ---
>
> Key: SPARK-19090
> URL: https://issues.apache.org/jira/browse/SPARK-19090
> Project: Spark
>  Issue Type: Bug
>Affects Versions: 1.5.2, 1.6.1, 2.0.1
>Reporter: nirav patel
>
> When enabling dynamic scheduling with yarn I see that all executors are using 
> only 1 core even if I specify "spark.executor.cores" to 6. If dynamic 
> scheduling is disabled then each executors will have 6 cores. i.e. it 
> respects  "spark.executor.cores". I have tested this against spark 1.5 . I 
> think it will be the same behavior with 2.x as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813941#comment-15813941
 ] 

Saisai Shao commented on SPARK-19090:
-

{code}
./bin/spark-shell --master yarn-client --conf spark.executor.cores=2
{code}

Please be aware that setting the executor number (--num-executors/spark.executor.instances) and dynamic allocation cannot coexist; otherwise dynamic allocation will be turned off implicitly. In your case you also set the executor number, which means dynamic allocation is not actually on.
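
For example, a submission that actually exercises dynamic allocation would look roughly like the following (the external shuffle service is assumed to be enabled on the NodeManagers), with no --num-executors or spark.executor.instances set:

{code}
./bin/spark-shell --master yarn-client \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.executor.cores=6
{code}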

> Dynamic Resource Allocation not respecting spark.executor.cores
> ---
>
> Key: SPARK-19090
> URL: https://issues.apache.org/jira/browse/SPARK-19090
> Project: Spark
>  Issue Type: Bug
>Affects Versions: 1.5.2, 1.6.1, 2.0.1
>Reporter: nirav patel
>
> When enabling dynamic scheduling with yarn I see that all executors are using 
> only 1 core even if I specify "spark.executor.cores" to 6. If dynamic 
> scheduling is disabled then each executors will have 6 cores. i.e. it 
> respects  "spark.executor.cores". I have tested this against spark 1.5 . I 
> think it will be the same behavior with 2.x as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813801#comment-15813801
 ] 

Saisai Shao commented on SPARK-19090:
-

I also tested with Spark 1.5.0 and I don't see an issue here; the core number is still what I set:

{noformat}
17/01/10 12:00:31 INFO yarn.YarnRMClient: Registering the ApplicationMaster
17/01/10 12:00:31 INFO yarn.YarnAllocator: Will request 1 executor containers, 
each with 2 cores and 1408 MB memory including 384 MB overhead
17/01/10 12:00:31 INFO yarn.YarnAllocator: Container request (host: Any, 
capability: )
17/01/10 12:00:31 INFO yarn.ApplicationMaster: Started progress reporter thread 
with (heartbeat : 3000, initial allocation : 200) intervals
{noformat}

Can you please tell how you run the application?




> Dynamic Resource Allocation not respecting spark.executor.cores
> ---
>
> Key: SPARK-19090
> URL: https://issues.apache.org/jira/browse/SPARK-19090
> Project: Spark
>  Issue Type: Bug
>Affects Versions: 1.5.2, 1.6.1, 2.0.1
>Reporter: nirav patel
>
> When enabling dynamic scheduling with yarn I see that all executors are using 
> only 1 core even if I specify "spark.executor.cores" to 6. If dynamic 
> scheduling is disabled then each executors will have 6 cores. i.e. it 
> respects  "spark.executor.cores". I have tested this against spark 1.5 . I 
> think it will be the same behavior with 2.x as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813782#comment-15813782
 ] 

Saisai Shao commented on SPARK-19090:
-

I tested with Spark 2.0 and the latest master (2.2.0-SNAPSHOT); the behavior is correct. If there is an issue, it is a 1.x issue. I'm not sure whether we are still maintaining the 1.5 branch to fix old bugs.

> Dynamic Resource Allocation not respecting spark.executor.cores
> ---
>
> Key: SPARK-19090
> URL: https://issues.apache.org/jira/browse/SPARK-19090
> Project: Spark
>  Issue Type: Bug
>Affects Versions: 1.5.2, 1.6.1, 2.0.1
>Reporter: nirav patel
>
> When enabling dynamic scheduling with yarn I see that all executors are using 
> only 1 core even if I specify "spark.executor.cores" to 6. If dynamic 
> scheduling is disabled then each executors will have 6 cores. i.e. it 
> respects  "spark.executor.cores". I have tested this against spark 1.5 . I 
> think it will be the same behavior with 2.x as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-09 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813704#comment-15813704
 ] 

Saisai Shao commented on SPARK-19090:
-

Thanks for the elaboration; would you please tell which version of Spark you ran the test against?

> Dynamic Resource Allocation not respecting spark.executor.cores
> ---
>
> Key: SPARK-19090
> URL: https://issues.apache.org/jira/browse/SPARK-19090
> Project: Spark
>  Issue Type: Bug
>Affects Versions: 1.5.2, 1.6.1, 2.0.1
>Reporter: nirav patel
>
> When enabling dynamic scheduling with yarn I see that all executors are using 
> only 1 core even if I specify "spark.executor.cores" to 6. If dynamic 
> scheduling is disabled then each executors will have 6 cores. i.e. it 
> respects  "spark.executor.cores". I have tested this against spark 1.5 . I 
> think it will be the same behavior with 2.x as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19090) Dynamic Resource Allocation not respecting spark.executor.cores

2017-01-05 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15803754#comment-15803754
 ] 

Saisai Shao commented on SPARK-19090:
-

"spark.executor.cores" has nothing to do with dynamic allocation, dynamic 
allocation schedules executors not cores. I guess the problem you saw, core 
number is always 1, is due to Dominant Resource Fairness in yarn, please enable 
this if you want cpu scheduling. Otherwise the core number will always be 1 
seen from yarn side.
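
For reference, assuming the CapacityScheduler, CPU scheduling is enabled by switching the resource calculator in capacity-scheduler.xml; only then does the YARN side report the requested vcores instead of 1:

{code}
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>
{code}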

> Dynamic Resource Allocation not respecting spark.executor.cores
> ---
>
> Key: SPARK-19090
> URL: https://issues.apache.org/jira/browse/SPARK-19090
> Project: Spark
>  Issue Type: Bug
>Affects Versions: 1.5.2, 1.6.1, 2.0.1
>Reporter: nirav patel
>
> When enabling dynamic scheduling with yarn I see that all executors are using 
> only 1 core even if I specify "spark.executor.cores" to 6. If dynamic 
> scheduling is disabled then each executors will have 6 cores. i.e. it 
> respects  "spark.executor.cores". I have tested this against spark 1.5 . I 
> think it will be the same behavior with 2.x as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org




[jira] [Commented] (SPARK-19038) Can't find keytab file when using Hive catalog

2017-01-05 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15803722#comment-15803722
 ] 

Saisai Shao commented on SPARK-19038:
-

Please see the comment I made on GitHub (https://github.com/apache/spark/pull/16482); from my understanding the behavior is expected.

> Can't find keytab file when using Hive catalog
> --
>
> Key: SPARK-19038
> URL: https://issues.apache.org/jira/browse/SPARK-19038
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.0.2
> Environment: Hadoop / YARN 2.6, pyspark, yarn-client mode
>Reporter: Peter Parente
>  Labels: hive, kerberos, pyspark
>
> h2. Stack Trace
> {noformat}
> Py4JJavaErrorTraceback (most recent call last)
>  in ()
> > 1 sdf = sql.createDataFrame(df)
> /opt/spark2/python/pyspark/sql/context.py in createDataFrame(self, data, 
> schema, samplingRatio, verifySchema)
> 307 Py4JJavaError: ...
> 308 """
> --> 309 return self.sparkSession.createDataFrame(data, schema, 
> samplingRatio, verifySchema)
> 310 
> 311 @since(1.3)
> /opt/spark2/python/pyspark/sql/session.py in createDataFrame(self, data, 
> schema, samplingRatio, verifySchema)
> 524 rdd, schema = self._createFromLocal(map(prepare, data), 
> schema)
> 525 jrdd = 
> self._jvm.SerDeUtil.toJavaArray(rdd._to_java_object_rdd())
> --> 526 jdf = self._jsparkSession.applySchemaToPythonRDD(jrdd.rdd(), 
> schema.json())
> 527 df = DataFrame(jdf, self._wrapped)
> 528 df._schema = schema
> /opt/spark2/python/lib/py4j-0.10.3-src.zip/py4j/java_gateway.py in 
> __call__(self, *args)
>1131 answer = self.gateway_client.send_command(command)
>1132 return_value = get_return_value(
> -> 1133 answer, self.gateway_client, self.target_id, self.name)
>1134 
>1135 for temp_arg in temp_args:
> /opt/spark2/python/pyspark/sql/utils.py in deco(*a, **kw)
>  61 def deco(*a, **kw):
>  62 try:
> ---> 63 return f(*a, **kw)
>  64 except py4j.protocol.Py4JJavaError as e:
>  65 s = e.java_exception.toString()
> /opt/spark2/python/lib/py4j-0.10.3-src.zip/py4j/protocol.py in 
> get_return_value(answer, gateway_client, target_id, name)
> 317 raise Py4JJavaError(
> 318 "An error occurred while calling {0}{1}{2}.\n".
> --> 319 format(target_id, ".", name), value)
> 320 else:
> 321 raise Py4JError(
> Py4JJavaError: An error occurred while calling o47.applySchemaToPythonRDD.
> : org.apache.spark.SparkException: Keytab file: 
> .keytab-f0b9b814-460e-4fa8-8e7d-029186b696c4 specified in spark.yarn.keytab 
> does not exist
>   at 
> org.apache.spark.sql.hive.client.HiveClientImpl.(HiveClientImpl.scala:113)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:258)
>   at 
> org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:359)
>   at 
> org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:263)
>   at 
> org.apache.spark.sql.hive.HiveSharedState.metadataHive$lzycompute(HiveSharedState.scala:39)
>   at 
> org.apache.spark.sql.hive.HiveSharedState.metadataHive(HiveSharedState.scala:38)
>   at 
> org.apache.spark.sql.hive.HiveSharedState.externalCatalog$lzycompute(HiveSharedState.scala:46)
>   at 
> org.apache.spark.sql.hive.HiveSharedState.externalCatalog(HiveSharedState.scala:45)
>   at 
> org.apache.spark.sql.hive.HiveSessionState.catalog$lzycompute(HiveSessionState.scala:50)
>   at 
> org.apache.spark.sql.hive.HiveSessionState.catalog(HiveSessionState.scala:48)
>   at 
> org.apache.spark.sql.hive.HiveSessionState$$anon$1.(HiveSessionState.scala:63)
>   at 
> org.apache.spark.sql.hive.HiveSessionState.analyzer$lzycompute(HiveSessionState.scala:63)
>   at 
> org.apache.spark.sql.hive.HiveSessionState.analyzer(HiveSessionState.scala:62)
>   at 
> org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
>   at 
> org.apache.spark.sql.SparkSession.applySchemaToPythonRDD(SparkSession.scala:666)
>   at 
> 

[jira] [Commented] (SPARK-19033) HistoryServer still uses old ACLs even if ACLs are updated

2017-01-03 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15796834#comment-15796834
 ] 

Saisai Shao commented on SPARK-19033:
-

Thanks a lot [~tgraves], I will think about how to better address this issue.

> HistoryServer still uses old ACLs even if ACLs are updated
> --
>
> Key: SPARK-19033
> URL: https://issues.apache.org/jira/browse/SPARK-19033
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.0
>Reporter: Saisai Shao
>Priority: Minor
>
> In the current implementation of HistoryServer, Application ACLs is picked 
> from event log rather than configuration:
> {code}
> val uiAclsEnabled = 
> conf.getBoolean("spark.history.ui.acls.enable", false)
> ui.getSecurityManager.setAcls(uiAclsEnabled)
> // make sure to set admin acls before view acls so they are 
> properly picked up
> 
> ui.getSecurityManager.setAdminAcls(appListener.adminAcls.getOrElse(""))
> ui.getSecurityManager.setViewAcls(attempt.sparkUser,
>   appListener.viewAcls.getOrElse(""))
> 
> ui.getSecurityManager.setAdminAclsGroups(appListener.adminAclsGroups.getOrElse(""))
> 
> ui.getSecurityManager.setViewAclsGroups(appListener.viewAclsGroups.getOrElse(""))
> {code}
> This will become a problem when ACLs is updated (newly added admin), only the 
> new application can be effected, the old applications were still using the 
> old ACLs. So these new admin still cannot check the logs of old applications.
> It is hard to say this is a bug, but in our scenario this is not the expected 
> behavior we wanted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19033) HistoryServer still uses old ACLs even if ACLs are updated

2016-12-29 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15787031#comment-15787031
 ] 

Saisai Shao commented on SPARK-19033:
-

Ping [~vanzin]. I found that you made this change; would you mind explaining the purpose of doing so? Thanks very much.

> HistoryServer still uses old ACLs even if ACLs are updated
> --
>
> Key: SPARK-19033
> URL: https://issues.apache.org/jira/browse/SPARK-19033
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.0
>Reporter: Saisai Shao
>Priority: Minor
>
> In the current implementation of HistoryServer, Application ACLs is picked 
> from event log rather than configuration:
> {code}
> val uiAclsEnabled = 
> conf.getBoolean("spark.history.ui.acls.enable", false)
> ui.getSecurityManager.setAcls(uiAclsEnabled)
> // make sure to set admin acls before view acls so they are 
> properly picked up
> 
> ui.getSecurityManager.setAdminAcls(appListener.adminAcls.getOrElse(""))
> ui.getSecurityManager.setViewAcls(attempt.sparkUser,
>   appListener.viewAcls.getOrElse(""))
> 
> ui.getSecurityManager.setAdminAclsGroups(appListener.adminAclsGroups.getOrElse(""))
> 
> ui.getSecurityManager.setViewAclsGroups(appListener.viewAclsGroups.getOrElse(""))
> {code}
> This will become a problem when ACLs is updated (newly added admin), only the 
> new application can be effected, the old applications were still using the 
> old ACLs. So these new admin still cannot check the logs of old applications.
> It is hard to say this is a bug, but in our scenario this is not the expected 
> behavior we wanted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-19033) HistoryServer still uses old ACLs even if ACLs are updated

2016-12-29 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-19033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated SPARK-19033:

Summary: HistoryServer still uses old ACLs even if ACLs are updated  (was: 
HistoryServer will honor old ACLs even if ACLs are updated)

> HistoryServer still uses old ACLs even if ACLs are updated
> --
>
> Key: SPARK-19033
> URL: https://issues.apache.org/jira/browse/SPARK-19033
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.0
>Reporter: Saisai Shao
>Priority: Minor
>
> In the current implementation of HistoryServer, Application ACLs is picked 
> from event log rather than configuration:
> {code}
> val uiAclsEnabled = 
> conf.getBoolean("spark.history.ui.acls.enable", false)
> ui.getSecurityManager.setAcls(uiAclsEnabled)
> // make sure to set admin acls before view acls so they are 
> properly picked up
> 
> ui.getSecurityManager.setAdminAcls(appListener.adminAcls.getOrElse(""))
> ui.getSecurityManager.setViewAcls(attempt.sparkUser,
>   appListener.viewAcls.getOrElse(""))
> 
> ui.getSecurityManager.setAdminAclsGroups(appListener.adminAclsGroups.getOrElse(""))
> 
> ui.getSecurityManager.setViewAclsGroups(appListener.viewAclsGroups.getOrElse(""))
> {code}
> This will become a problem when ACLs is updated (newly added admin), only the 
> new application can be effected, the old applications were still using the 
> old ACLs. So these new admin still cannot check the logs of old applications.
> It is hard to say this is a bug, but in our scenario this is not the expected 
> behavior we wanted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-19033) HistoryServer will honor old ACLs even if ACLs are updated

2016-12-29 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-19033:
---

 Summary: HistoryServer will honor old ACLs even if ACLs are updated
 Key: SPARK-19033
 URL: https://issues.apache.org/jira/browse/SPARK-19033
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 2.1.0
Reporter: Saisai Shao
Priority: Minor


In the current implementation of the HistoryServer, the application ACLs are picked up from the event log rather than from configuration:

{code}
val uiAclsEnabled = conf.getBoolean("spark.history.ui.acls.enable", 
false)
ui.getSecurityManager.setAcls(uiAclsEnabled)
// make sure to set admin acls before view acls so they are 
properly picked up

ui.getSecurityManager.setAdminAcls(appListener.adminAcls.getOrElse(""))
ui.getSecurityManager.setViewAcls(attempt.sparkUser,
  appListener.viewAcls.getOrElse(""))

ui.getSecurityManager.setAdminAclsGroups(appListener.adminAclsGroups.getOrElse(""))

ui.getSecurityManager.setViewAclsGroups(appListener.viewAclsGroups.getOrElse(""))
{code}

This becomes a problem when the ACLs are updated (for example, a newly added admin): only new applications are affected, while old applications still use the old ACLs, so the new admins still cannot check the logs of old applications.

It is hard to say this is a bug, but in our scenario it is not the behavior we expected.
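
A minimal sketch of one possible direction (not an agreed fix): merge the admin ACLs from the HistoryServer's own configuration with the ones replayed from the event log, so newly added admins can also view old applications. The config key spark.history.ui.admin.acls is an assumption here:

{code}
// conf is the HistoryServer's SparkConf; appListener, ui and attempt are the
// same values used in the snippet above
val adminAclsFromConf = conf.get("spark.history.ui.admin.acls", "")
val adminAclsFromLog  = appListener.adminAcls.getOrElse("")

val mergedAdminAcls = (adminAclsFromConf.split(",") ++ adminAclsFromLog.split(","))
  .map(_.trim).filter(_.nonEmpty).distinct.mkString(",")

ui.getSecurityManager.setAdminAcls(mergedAdminAcls)
ui.getSecurityManager.setViewAcls(attempt.sparkUser, appListener.viewAcls.getOrElse(""))
{code}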



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-19021) Generailize HDFSCredentialProvider to support non HDFS security FS

2016-12-28 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-19021:
---

 Summary: Generailize HDFSCredentialProvider to support non HDFS 
security FS
 Key: SPARK-19021
 URL: https://issues.apache.org/jira/browse/SPARK-19021
 Project: Spark
  Issue Type: Improvement
  Components: YARN
Affects Versions: 2.1.0
Reporter: Saisai Shao


Currently Spark can only get the token renewal interval from secure HDFS (hdfs://). If Spark runs against other secure file systems such as webHDFS (webhdfs://), WASB (wasb://), or ADLS, it ignores those tokens and does not get renewal intervals from them, which makes Spark unable to work with those secure clusters. So instead of only checking the HDFS token, we should generalize the logic to support different {{DelegationTokenIdentifier}} types.

This is a follow-up to SPARK-18840.
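
A rough sketch of the generalization under discussion, assuming that the tokens issued by webHDFS, WASB and ADLS all carry an AbstractDelegationTokenIdentifier and that creds and hadoopConf are in scope as in the YARN Client:

{code}
import scala.collection.JavaConverters._
import scala.util.Try
import org.apache.hadoop.security.token.delegation.AbstractDelegationTokenIdentifier

// Instead of looking only for the HDFS token kind, derive the renewal interval
// from any delegation token whose identifier exposes an issue date
val renewIntervals = creds.getAllTokens.asScala.flatMap { token =>
  token.decodeIdentifier() match {
    case id: AbstractDelegationTokenIdentifier =>
      // renew() returns the new expiry time; the difference to the issue date
      // approximates the renewal interval for that file system
      Try(token.renew(hadoopConf) - id.getIssueDate).toOption
    case _ => None
  }
}

val tokenRenewalInterval = if (renewIntervals.isEmpty) None else Some(renewIntervals.min)
{code}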



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-19021) Generailize HDFSCredentialProvider to support non HDFS security FS

2016-12-28 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-19021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated SPARK-19021:

Priority: Minor  (was: Major)

> Generailize HDFSCredentialProvider to support non HDFS security FS
> --
>
> Key: SPARK-19021
> URL: https://issues.apache.org/jira/browse/SPARK-19021
> Project: Spark
>  Issue Type: Improvement
>  Components: YARN
>Affects Versions: 2.1.0
>Reporter: Saisai Shao
>Priority: Minor
>
> Currently Spark can only get token renewal interval from security HDFS 
> (hdfs://), if Spark runs with other security file systems like webHDFS 
> (webhdfs://), wasb (wasb://), ADLS, it will ignore these tokens and not get 
> token renewal intervals from these tokens. These will make Spark unable to 
> work with these security clusters. So instead of only checking HDFS token, we 
> should generalize to support different {{DelegationTokenIdentifier}}.
> This is a follow-up work of SPARK-18840.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-18975) Add an API to remove SparkListener from SparkContext

2016-12-21 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-18975:
---

 Summary: Add an API to remove SparkListener from SparkContext 
 Key: SPARK-18975
 URL: https://issues.apache.org/jira/browse/SPARK-18975
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Reporter: Saisai Shao
Priority: Minor


In current Spark we can add a customized {{SparkListener}} through the {{SparkContext#addListener}} API, but there is no API to remove a registered listener. In our scenario a SparkListener is added repeatedly as the environment changes; without the ability to remove listeners, we can end up with a pile of registered listeners, which is unnecessary and can potentially affect performance. So here I propose to add an API to remove a registered listener.
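
A small usage sketch of what this would enable; the removal method name removeSparkListener is assumed here, mirroring the existing addSparkListener:

{code}
import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationEnd}

val listener = new SparkListener {
  override def onApplicationEnd(end: SparkListenerApplicationEnd): Unit = {
    println(s"application ended at ${end.time}")
  }
}

sc.addSparkListener(listener)      // existing registration API

// ... later, when the environment changes, unregister instead of piling up listeners
sc.removeSparkListener(listener)   // proposed API; name assumed in this sketch
{code}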



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-18840) HDFSCredentialProvider throws exception in non-HDFS security environment

2016-12-13 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746877#comment-15746877
 ] 

Saisai Shao commented on SPARK-18840:
-

[~vanzin], is it necessary to fix this in the older versions (2.0/1.6)? It seems this problem seldom happens.

> HDFSCredentialProvider throws exception in non-HDFS security environment
> 
>
> Key: SPARK-18840
> URL: https://issues.apache.org/jira/browse/SPARK-18840
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 1.6.3, 2.1.0
>Reporter: Saisai Shao
>Assignee: Saisai Shao
>Priority: Minor
> Fix For: 2.1.1, 2.2.0
>
>
> Current in {{HDFSCredentialProvider}}, the code logic assumes HDFS delegation 
> token should be existed, this is ok for HDFS environment, but for some cloud 
> environment like Azure, HDFS is not required, so it will throw exception:
> {code}
> java.util.NoSuchElementException: head of empty list
> at scala.collection.immutable.Nil$.head(List.scala:337)
> at scala.collection.immutable.Nil$.head(List.scala:334)
> at 
> org.apache.spark.deploy.yarn.Client.getTokenRenewalInterval(Client.scala:627)
> {code}
> We should also consider this situation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-18840) HDFSCredentialProvider throws exception in non-HDFS security environment

2016-12-13 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15744370#comment-15744370
 ] 

Saisai Shao edited comment on SPARK-18840 at 12/13/16 9:30 AM:
---

This problem also exists in branch 1.6, but the fix is a little different compared to master.


was (Author: jerryshao):
This problem also existed in branch 1.6, but the fix is a little complicated 
compared to master.

> HDFSCredentialProvider throws exception in non-HDFS security environment
> 
>
> Key: SPARK-18840
> URL: https://issues.apache.org/jira/browse/SPARK-18840
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 1.6.3, 2.1.0
>Reporter: Saisai Shao
>Priority: Minor
>
> Current in {{HDFSCredentialProvider}}, the code logic assumes HDFS delegation 
> token should be existed, this is ok for HDFS environment, but for some cloud 
> environment like Azure, HDFS is not required, so it will throw exception:
> {code}
> java.util.NoSuchElementException: head of empty list
> at scala.collection.immutable.Nil$.head(List.scala:337)
> at scala.collection.immutable.Nil$.head(List.scala:334)
> at 
> org.apache.spark.deploy.yarn.Client.getTokenRenewalInterval(Client.scala:627)
> {code}
> We should also consider this situation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-18840) HDFSCredentialProvider throws exception in non-HDFS security environment

2016-12-12 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-18840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated SPARK-18840:

Priority: Minor  (was: Major)

> HDFSCredentialProvider throws exception in non-HDFS security environment
> 
>
> Key: SPARK-18840
> URL: https://issues.apache.org/jira/browse/SPARK-18840
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 1.6.3, 2.1.0
>Reporter: Saisai Shao
>Priority: Minor
>
> Current in {{HDFSCredentialProvider}}, the code logic assumes HDFS delegation 
> token should be existed, this is ok for HDFS environment, but for some cloud 
> environment like Azure, HDFS is not required, so it will throw exception:
> {code}
> java.util.NoSuchElementException: head of empty list
> at scala.collection.immutable.Nil$.head(List.scala:337)
> at scala.collection.immutable.Nil$.head(List.scala:334)
> at 
> org.apache.spark.deploy.yarn.Client.getTokenRenewalInterval(Client.scala:627)
> {code}
> We should also consider this situation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-18840) HDFSCredentialProvider throws exception in non-HDFS security environment

2016-12-12 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15744370#comment-15744370
 ] 

Saisai Shao commented on SPARK-18840:
-

This problem also existed in branch 1.6, but the fix is a little complicated 
compared to master.

> HDFSCredentialProvider throws exception in non-HDFS security environment
> 
>
> Key: SPARK-18840
> URL: https://issues.apache.org/jira/browse/SPARK-18840
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 1.6.3, 2.1.0
>Reporter: Saisai Shao
>
> Current in {{HDFSCredentialProvider}}, the code logic assumes HDFS delegation 
> token should be existed, this is ok for HDFS environment, but for some cloud 
> environment like Azure, HDFS is not required, so it will throw exception:
> {code}
> java.util.NoSuchElementException: head of empty list
> at scala.collection.immutable.Nil$.head(List.scala:337)
> at scala.collection.immutable.Nil$.head(List.scala:334)
> at 
> org.apache.spark.deploy.yarn.Client.getTokenRenewalInterval(Client.scala:627)
> {code}
> We should also consider this situation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-18840) HDFSCredentialProvider throws exception in non-HDFS security environment

2016-12-12 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-18840:
---

 Summary: HDFSCredentialProvider throws exception in non-HDFS 
security environment
 Key: SPARK-18840
 URL: https://issues.apache.org/jira/browse/SPARK-18840
 Project: Spark
  Issue Type: Bug
  Components: YARN
Affects Versions: 1.6.3, 2.1.0
Reporter: Saisai Shao


Currently, the code logic in {{HDFSCredentialProvider}} assumes that an HDFS delegation token exists. This is fine for an HDFS environment, but in some cloud environments, such as Azure, HDFS is not required, so it throws an exception:

{code}
java.util.NoSuchElementException: head of empty list
at scala.collection.immutable.Nil$.head(List.scala:337)
at scala.collection.immutable.Nil$.head(List.scala:334)
at 
org.apache.spark.deploy.yarn.Client.getTokenRenewalInterval(Client.scala:627)
{code}

We should also consider this situation.
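
A minimal sketch of the kind of guard needed in getTokenRenewalInterval (simplified; hdfsTokens stands for the filtered HDFS delegation tokens and getIssueDate is a hypothetical helper):

{code}
// Calling .head on an empty list throws NoSuchElementException when no HDFS token
// exists (e.g. an Azure-only cluster); headOption tolerates that case
val tokenRenewalInterval: Option[Long] = hdfsTokens.headOption.map { token =>
  token.renew(hadoopConf) - getIssueDate(token)
}
{code}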



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13955) Spark in yarn mode fails

2016-12-09 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736957#comment-15736957
 ] 

Saisai Shao commented on SPARK-13955:
-

Yes, I forgot to mention that this zip file doesn't support nested directories.

> Spark in yarn mode fails
> 
>
> Key: SPARK-13955
> URL: https://issues.apache.org/jira/browse/SPARK-13955
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.0.0
>Reporter: Jeff Zhang
>Assignee: Marcelo Vanzin
> Fix For: 2.0.0
>
>
> I ran spark-shell in yarn client, but from the logs seems the spark assembly 
> jar is not uploaded to HDFS. This may be known issue in the process of 
> SPARK-11157, create this ticket to track this issue. [~vanzin]
> {noformat}
> 16/03/17 17:57:48 INFO Client: Will allocate AM container, with 896 MB memory 
> including 384 MB overhead
> 16/03/17 17:57:48 INFO Client: Setting up container launch context for our AM
> 16/03/17 17:57:48 INFO Client: Setting up the launch environment for our AM 
> container
> 16/03/17 17:57:48 INFO Client: Preparing resources for our AM container
> 16/03/17 17:57:48 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive 
> is set, falling back to uploading libraries under SPARK_HOME.
> 16/03/17 17:57:48 INFO Client: Uploading resource 
> file:/Users/jzhang/github/spark/lib/apache-rat-0.10.jar -> 
> hdfs://localhost:9000/user/jzhang/.sparkStaging/application_1458187008455_0006/apache-rat-0.10.jar
> 16/03/17 17:57:49 INFO Client: Uploading resource 
> file:/Users/jzhang/github/spark/lib/apache-rat-0.11.jar -> 
> hdfs://localhost:9000/user/jzhang/.sparkStaging/application_1458187008455_0006/apache-rat-0.11.jar
> 16/03/17 17:57:49 INFO Client: Uploading resource 
> file:/private/var/folders/dp/hmchg5dd3vbcvds26q91spdwgp/T/spark-abed04bf-6ac2-448b-91a9-dcc1c401a18f/__spark_conf__4163776487351314654.zip
>  -> 
> hdfs://localhost:9000/user/jzhang/.sparkStaging/application_1458187008455_0006/__spark_conf__4163776487351314654.zip
> 16/03/17 17:57:49 INFO SecurityManager: Changing view acls to: jzhang
> 16/03/17 17:57:49 INFO SecurityManager: Changing modify acls to: jzhang
> 16/03/17 17:57:49 INFO SecurityManager: SecurityManager: authentication 
> disabled; ui acls disabled; users with view permissions: Set(jzhang); users 
> with modify permissions: Set(jzhang)
> 16/03/17 17:57:49 INFO Client: Submitting application 6 to ResourceManager
> {noformat}
> message in AM container
> {noformat}
> Error: Could not find or load main class 
> org.apache.spark.deploy.yarn.ExecutorLauncher
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13955) Spark in yarn mode fails

2016-12-08 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15734207#comment-15734207
 ] 

Saisai Shao commented on SPARK-13955:
-

Can you please check the runtime environment of the launched container? It should lie under the NM's local dir:

{code}
${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/application_${appid}/container_${contid}
{code}

When the NM brings up a container, it creates a container-specific folder and puts all the dependencies, files, etc. into that folder, including the launch script. You could check whether the classpath is correct, or whether the archive is found there.

> Spark in yarn mode fails
> 
>
> Key: SPARK-13955
> URL: https://issues.apache.org/jira/browse/SPARK-13955
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.0.0
>Reporter: Jeff Zhang
>Assignee: Marcelo Vanzin
> Fix For: 2.0.0
>
>
> I ran spark-shell in yarn client, but from the logs seems the spark assembly 
> jar is not uploaded to HDFS. This may be known issue in the process of 
> SPARK-11157, create this ticket to track this issue. [~vanzin]
> {noformat}
> 16/03/17 17:57:48 INFO Client: Will allocate AM container, with 896 MB memory 
> including 384 MB overhead
> 16/03/17 17:57:48 INFO Client: Setting up container launch context for our AM
> 16/03/17 17:57:48 INFO Client: Setting up the launch environment for our AM 
> container
> 16/03/17 17:57:48 INFO Client: Preparing resources for our AM container
> 16/03/17 17:57:48 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive 
> is set, falling back to uploading libraries under SPARK_HOME.
> 16/03/17 17:57:48 INFO Client: Uploading resource 
> file:/Users/jzhang/github/spark/lib/apache-rat-0.10.jar -> 
> hdfs://localhost:9000/user/jzhang/.sparkStaging/application_1458187008455_0006/apache-rat-0.10.jar
> 16/03/17 17:57:49 INFO Client: Uploading resource 
> file:/Users/jzhang/github/spark/lib/apache-rat-0.11.jar -> 
> hdfs://localhost:9000/user/jzhang/.sparkStaging/application_1458187008455_0006/apache-rat-0.11.jar
> 16/03/17 17:57:49 INFO Client: Uploading resource 
> file:/private/var/folders/dp/hmchg5dd3vbcvds26q91spdwgp/T/spark-abed04bf-6ac2-448b-91a9-dcc1c401a18f/__spark_conf__4163776487351314654.zip
>  -> 
> hdfs://localhost:9000/user/jzhang/.sparkStaging/application_1458187008455_0006/__spark_conf__4163776487351314654.zip
> 16/03/17 17:57:49 INFO SecurityManager: Changing view acls to: jzhang
> 16/03/17 17:57:49 INFO SecurityManager: Changing modify acls to: jzhang
> 16/03/17 17:57:49 INFO SecurityManager: SecurityManager: authentication 
> disabled; ui acls disabled; users with view permissions: Set(jzhang); users 
> with modify permissions: Set(jzhang)
> 16/03/17 17:57:49 INFO Client: Submitting application 6 to ResourceManager
> {noformat}
> message in AM container
> {noformat}
> Error: Could not find or load main class 
> org.apache.spark.deploy.yarn.ExecutorLauncher
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13955) Spark in yarn mode fails

2016-12-08 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15734184#comment-15734184
 ] 

Saisai Shao commented on SPARK-13955:
-

Do you have the spark-yarn_2.11 jar in your archive?

> Spark in yarn mode fails
> 
>
> Key: SPARK-13955
> URL: https://issues.apache.org/jira/browse/SPARK-13955
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.0.0
>Reporter: Jeff Zhang
>Assignee: Marcelo Vanzin
> Fix For: 2.0.0
>
>
> I ran spark-shell in yarn client, but from the logs seems the spark assembly 
> jar is not uploaded to HDFS. This may be known issue in the process of 
> SPARK-11157, create this ticket to track this issue. [~vanzin]
> {noformat}
> 16/03/17 17:57:48 INFO Client: Will allocate AM container, with 896 MB memory 
> including 384 MB overhead
> 16/03/17 17:57:48 INFO Client: Setting up container launch context for our AM
> 16/03/17 17:57:48 INFO Client: Setting up the launch environment for our AM 
> container
> 16/03/17 17:57:48 INFO Client: Preparing resources for our AM container
> 16/03/17 17:57:48 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive 
> is set, falling back to uploading libraries under SPARK_HOME.
> 16/03/17 17:57:48 INFO Client: Uploading resource 
> file:/Users/jzhang/github/spark/lib/apache-rat-0.10.jar -> 
> hdfs://localhost:9000/user/jzhang/.sparkStaging/application_1458187008455_0006/apache-rat-0.10.jar
> 16/03/17 17:57:49 INFO Client: Uploading resource 
> file:/Users/jzhang/github/spark/lib/apache-rat-0.11.jar -> 
> hdfs://localhost:9000/user/jzhang/.sparkStaging/application_1458187008455_0006/apache-rat-0.11.jar
> 16/03/17 17:57:49 INFO Client: Uploading resource 
> file:/private/var/folders/dp/hmchg5dd3vbcvds26q91spdwgp/T/spark-abed04bf-6ac2-448b-91a9-dcc1c401a18f/__spark_conf__4163776487351314654.zip
>  -> 
> hdfs://localhost:9000/user/jzhang/.sparkStaging/application_1458187008455_0006/__spark_conf__4163776487351314654.zip
> 16/03/17 17:57:49 INFO SecurityManager: Changing view acls to: jzhang
> 16/03/17 17:57:49 INFO SecurityManager: Changing modify acls to: jzhang
> 16/03/17 17:57:49 INFO SecurityManager: SecurityManager: authentication 
> disabled; ui acls disabled; users with view permissions: Set(jzhang); users 
> with modify permissions: Set(jzhang)
> 16/03/17 17:57:49 INFO Client: Submitting application 6 to ResourceManager
> {noformat}
> message in AM container
> {noformat}
> Error: Could not find or load main class 
> org.apache.spark.deploy.yarn.ExecutorLauncher
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13955) Spark in yarn mode fails

2016-12-08 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15734071#comment-15734071
 ] 

Saisai Shao commented on SPARK-13955:
-

IIRC {{spark.yarn.archive}} should work; I tried it personally on my local 
machine, and our HDP distribution configures it by default.

If you want to use "spark.yarn.archive", you should zip all the Spark run-time 
required jars, put this archive either locally or on HDFS, and configure its 
path in "spark.yarn.archive". YARN will then add it to the distributed cache.

Could you please share how you configured it and the error you met?

> Spark in yarn mode fails
> 
>
> Key: SPARK-13955
> URL: https://issues.apache.org/jira/browse/SPARK-13955
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.0.0
>Reporter: Jeff Zhang
>Assignee: Marcelo Vanzin
> Fix For: 2.0.0
>
>
> I ran spark-shell in yarn-client mode, but from the logs it seems the spark assembly 
> jar is not uploaded to HDFS. This may be a known issue in the process of 
> SPARK-11157; creating this ticket to track it. [~vanzin]
> {noformat}
> 16/03/17 17:57:48 INFO Client: Will allocate AM container, with 896 MB memory 
> including 384 MB overhead
> 16/03/17 17:57:48 INFO Client: Setting up container launch context for our AM
> 16/03/17 17:57:48 INFO Client: Setting up the launch environment for our AM 
> container
> 16/03/17 17:57:48 INFO Client: Preparing resources for our AM container
> 16/03/17 17:57:48 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive 
> is set, falling back to uploading libraries under SPARK_HOME.
> 16/03/17 17:57:48 INFO Client: Uploading resource 
> file:/Users/jzhang/github/spark/lib/apache-rat-0.10.jar -> 
> hdfs://localhost:9000/user/jzhang/.sparkStaging/application_1458187008455_0006/apache-rat-0.10.jar
> 16/03/17 17:57:49 INFO Client: Uploading resource 
> file:/Users/jzhang/github/spark/lib/apache-rat-0.11.jar -> 
> hdfs://localhost:9000/user/jzhang/.sparkStaging/application_1458187008455_0006/apache-rat-0.11.jar
> 16/03/17 17:57:49 INFO Client: Uploading resource 
> file:/private/var/folders/dp/hmchg5dd3vbcvds26q91spdwgp/T/spark-abed04bf-6ac2-448b-91a9-dcc1c401a18f/__spark_conf__4163776487351314654.zip
>  -> 
> hdfs://localhost:9000/user/jzhang/.sparkStaging/application_1458187008455_0006/__spark_conf__4163776487351314654.zip
> 16/03/17 17:57:49 INFO SecurityManager: Changing view acls to: jzhang
> 16/03/17 17:57:49 INFO SecurityManager: Changing modify acls to: jzhang
> 16/03/17 17:57:49 INFO SecurityManager: SecurityManager: authentication 
> disabled; ui acls disabled; users with view permissions: Set(jzhang); users 
> with modify permissions: Set(jzhang)
> 16/03/17 17:57:49 INFO Client: Submitting application 6 to ResourceManager
> {noformat}
> message in AM container
> {noformat}
> Error: Could not find or load main class 
> org.apache.spark.deploy.yarn.ExecutorLauncher
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-18743) StreamingContext.textFileStream(directory) has no events shown in Web UI

2016-12-07 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15728710#comment-15728710
 ] 

Saisai Shao commented on SPARK-18743:
-

From my understanding, {{FileInputDStream}} is not a record-based streaming 
connector, it is a file-based connector. How to define an event for a file-based 
connector is an open question, so in the Spark Streaming implementation this 
event number is deliberately set to 0 for this input DStream.

> StreamingContext.textFileStream(directory) has no events shown in Web UI
> 
>
> Key: SPARK-18743
> URL: https://issues.apache.org/jira/browse/SPARK-18743
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 1.6.0
> Environment: Cloudera
>Reporter: Viktor Vojnovski
>Priority: Minor
> Attachments: screenshot-1.png
>
>
> StreamingContext.textFileStream input is not reflected in the Web UI, i.e. the 
> Input rate stays at 0 events/sec (see attached screenshot).
> Please find below a reproduction scenario, and a link to the same issue being 
> reported on the spark user/developer lists.
> http://mail-archives.apache.org/mod_mbox/spark-user/201604.mbox/%3CCAEko17iCNeeOzEbwqH9vGAkgXEpH3Rw=bwmkdfoozcx30zj...@mail.gmail.com%3E
> http://apache-spark-developers-list.1001551.n3.nabble.com/Re-Streaming-textFileStream-has-no-events-shown-in-web-UI-td17101.html
> [vvojnovski@machine:~] % cat a.py
> from __future__ import print_function
> from pyspark import SparkContext, SparkConf
> from pyspark.streaming import StreamingContext
> SparkContext.setSystemProperty('spark.executor.instances', '3')
> conf = (SparkConf()
> .setMaster("yarn-client")
> .setAppName("My app")
> .set("spark.executor.memory", "1g"))
> sc = SparkContext(conf=conf)
> ssc = StreamingContext(sc, 5)
> lines = ssc.textFileStream("testin")
> counts = lines.flatMap(lambda line: line.split(" "))\
>   .map(lambda x: (x, 1))\
>   .reduceByKey(lambda a, b: a+b)
> counts.pprint()
> ssc.start()
> ssc.awaitTermination()
> [vvojnovski@machine:~] % cat testin.input 
> 1 2
> 3 4
> 5 6
> 7 8
> 9 10
> 11 12
> [vvojnovski@machine:~] % hdfs dfs -mkdir testin
> [vvojnovski@machine:~] % spark-submit a.py &
> [vvojnovski@machine:~] % hdfs dfs -put testin.input testin/testin.input.1
> [vvojnovski@machine:~] % hdfs dfs -put testin.input testin/testin.input.2
> [vvojnovski@machine:~] % hdfs dfs -put testin.input testin/testin.input.3



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-18554) leader master lost the leadership, when the slave become master, the previous app's state display as waiting

2016-11-25 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15695947#comment-15695947
 ] 

Saisai Shao commented on SPARK-18554:
-

Nothing is blocking it actually; just no one has reviewed that PR. Also it is not a big 
issue (only the state is not shown correctly).

> leader master lost the leadership, when the slave become master, the 
> previous app's state display as waiting
> --
>
> Key: SPARK-18554
> URL: https://issues.apache.org/jira/browse/SPARK-18554
> Project: Spark
>  Issue Type: Bug
>  Components: Deploy, Web UI
>Affects Versions: 1.6.1
> Environment: java1.8
>Reporter: liujianhui
>Priority: Minor
>
> when the leader master loses the leadership and the slave becomes master, the 
> state of the app in the web UI will display WAITING; the code is as follows
>  case MasterChangeAcknowledged(appId) => {
>   idToApp.get(appId) match {
> case Some(app) =>
>   logInfo("Application has been re-registered: " + appId)
>   app.state = ApplicationState.WAITING
> case None =>
>   logWarning("Master change ack from unknown app: " + appId)
>   }
>   if (canCompleteRecovery) { completeRecovery() }
> the state of the app should be RUNNING instead of WAITING



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-18554) leader master lost the leadership, when the slave become master, the previous app's state display as waiting

2016-11-24 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15694563#comment-15694563
 ] 

Saisai Shao commented on SPARK-18554:
-

I think this is a known issue; I posted a fix for it long ago 
(https://github.com/apache/spark/pull/10506), but it is still pending review.

> leader master lost the leadership, when the slave become master, the 
> previous app's state display as waiting
> --
>
> Key: SPARK-18554
> URL: https://issues.apache.org/jira/browse/SPARK-18554
> Project: Spark
>  Issue Type: Bug
>  Components: Deploy, Web UI
>Affects Versions: 1.6.1
> Environment: java1.8
>Reporter: liujianhui
>Priority: Minor
>
> when the leader master loses the leadership and the slave becomes master, the 
> state of the app in the web UI will display WAITING; the code is as follows
>  case MasterChangeAcknowledged(appId) => {
>   idToApp.get(appId) match {
> case Some(app) =>
>   logInfo("Application has been re-registered: " + appId)
>   app.state = ApplicationState.WAITING
> case None =>
>   logWarning("Master change ack from unknown app: " + appId)
>   }
>   if (canCompleteRecovery) { completeRecovery() }
> the state of the app should be RUNNING instead of WAITING



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-18225) job will miss when driver removed by master in spark streaming

2016-11-04 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15635846#comment-15635846
 ] 

Saisai Shao edited comment on SPARK-18225 at 11/4/16 10:01 AM:
---

I think the problem [~liujianhui] is trying to address is that when the web UI's 
kill function is used to kill the streaming application, even though the SparkContext 
is aborted, other Streaming-specific machinery such as checkpointing keeps working, 
so when the streaming application is recovered from the last checkpoint, some jobs 
are missing (they were checkpointed but never actually run).

The problem here is that the web UI (core part) is not aware of Streaming-specific 
things; all it can do is stop SparkContext-related state, but it cannot gracefully 
stop things outside of Spark core, like streaming. 

I'm not sure it is necessary to fix this issue, since we don't encourage users to 
stop a streaming app in this way. It may also not be easy to fix (for the reason 
mentioned above: the core part cannot be aware of Streaming-specific things), and 
the problem is not Streaming-specific; other contexts like SQL may hit the same 
issue if they have their own context-specific state outside of core that should be 
stopped gracefully.


was (Author: jerryshao):
I think the problem [~liujianhui] is trying to address is that when the web UI's 
kill function is used to kill the streaming application, even though the SparkContext 
is aborted, other Streaming-specific machinery such as checkpointing keeps working, 
so when the streaming application is recovered from the last checkpoint, some jobs 
are missing (they were checkpointed but never actually run).

The problem here is that the web UI (core part) is not aware of Streaming-specific 
things; all it can do is stop SparkContext-related state, but it cannot fully stop 
things outside of Spark core, like streaming. 

I'm not sure it is necessary to fix this issue, since we don't encourage users to 
stop a streaming app in this way. It may also not be easy to fix (for the reason 
mentioned above: the core part cannot be aware of Streaming-specific things), and 
the problem is not Streaming-specific; other contexts like SQL may hit the same 
issue if they have their own context-specific state outside of core.

> job will miss when driver removed by master in spark streaming 
> ---
>
> Key: SPARK-18225
> URL: https://issues.apache.org/jira/browse/SPARK-18225
> Project: Spark
>  Issue Type: Bug
>  Components: DStreams, Scheduler
>Affects Versions: 1.6.1, 1.6.2
>Reporter: liujianhui
>
> When the application is killed from the Spark UI, the master sends an ApplicationRemoved 
> to the driver, the driver aborts all pending jobs, and the jobs finish with the 
> exception "Master removed our application: Killed". The JobScheduler then removes 
> the jobs from the job sets, but the JobGenerator still performs a checkpoint without 
> the jobs removed before, and then the driver stops; when recovering from the 
> checkpoint file, all the aborted jobs are missed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-18225) job will miss when driver removed by master in spark streaming

2016-11-04 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15635846#comment-15635846
 ] 

Saisai Shao commented on SPARK-18225:
-

I think the problem [~liujianhui] is trying to address is that when the web UI's 
kill function is used to kill the streaming application, even though the SparkContext 
is aborted, other Streaming-specific machinery such as checkpointing keeps working, 
so when the streaming application is recovered from the last checkpoint, some jobs 
are missing (they were checkpointed but never actually run).

The problem here is that the web UI (core part) is not aware of Streaming-specific 
things; all it can do is stop SparkContext-related state, but it cannot fully stop 
things outside of Spark core, like streaming. 

I'm not sure it is necessary to fix this issue, since we don't encourage users to 
stop a streaming app in this way. It may also not be easy to fix (for the reason 
mentioned above: the core part cannot be aware of Streaming-specific things), and 
the problem is not Streaming-specific; other contexts like SQL may hit the same 
issue if they have their own context-specific state outside of core.

> job will miss when driver removed by master in spark streaming 
> ---
>
> Key: SPARK-18225
> URL: https://issues.apache.org/jira/browse/SPARK-18225
> Project: Spark
>  Issue Type: Bug
>  Components: DStreams, Scheduler
>Affects Versions: 1.6.1, 1.6.2
>Reporter: liujianhui
>
> When the application is killed from the Spark UI, the master sends an ApplicationRemoved 
> to the driver, the driver aborts all pending jobs, and the jobs finish with the 
> exception "Master removed our application: Killed". The JobScheduler then removes 
> the jobs from the job sets, but the JobGenerator still performs a checkpoint without 
> the jobs removed before, and then the driver stops; when recovering from the 
> checkpoint file, all the aborted jobs are missed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-18225) job will miss when driver removed by master in spark streaming

2016-11-03 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15631915#comment-15631915
 ] 

Saisai Shao commented on SPARK-18225:
-

I suspect that the kill button provided on the web UI does not give a clean stop 
for Spark Streaming jobs. If you want a graceful stop, it would be better to call 
StreamingContext#stop instead.
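For example, a minimal sketch of a graceful stop triggered from inside the application (the {{ssc}} name is just a placeholder for the application's StreamingContext):

{code}
// Stop the streaming computation gracefully (let in-flight batches finish)
// and also stop the underlying SparkContext.
ssc.stop(stopSparkContext = true, stopGracefully = true)
{code}

A common pattern is to trigger such a stop from a shutdown hook or an external marker file rather than from the UI kill button.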

> job will miss when driver removed by master in spark streaming 
> ---
>
> Key: SPARK-18225
> URL: https://issues.apache.org/jira/browse/SPARK-18225
> Project: Spark
>  Issue Type: Bug
>  Components: DStreams, Scheduler
>Affects Versions: 1.6.1, 1.6.2
>Reporter: liujianhui
>
> kill the application on spark ui, the master will send an ApplicationRemoved 
> to driver, driver will abort the all pending job,and then the job finish with 
> exception "Master removed our application:Killed",and then Jobscheduler will 
> remove the job from jobsets, but the jobgenerator still docheckpoint without 
> the job which removed before, and then driver stop;when recover  from the 
> check point file,it miss all jobs which aborted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-17999) Add getPreferredLocations for KafkaSourceRDD

2016-10-18 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-17999:
---

 Summary: Add getPreferredLocations for KafkaSourceRDD
 Key: SPARK-17999
 URL: https://issues.apache.org/jira/browse/SPARK-17999
 Project: Spark
  Issue Type: Improvement
  Components: SQL, Streaming
Reporter: Saisai Shao
Priority: Minor


The newly implemented Structured Streaming KafkaSource already calculates the 
preferred locations for each topic partition, but doesn't expose this information 
through the RDD's {{getPreferredLocations}} method. So here I propose to add this 
method to {{KafkaSourceRDD}}.
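For illustration, the general shape of such an override (a simplified sketch; the partition class name {{KafkaSourceRDDPartition}} and the {{preferredLoc}} field are assumptions, not the exact code):

{code}
// Simplified sketch: expose the per-partition preferred host through the
// standard RDD hook so the scheduler can try to place the task there.
override def getPreferredLocations(split: Partition): Seq[String] = {
  val part = split.asInstanceOf[KafkaSourceRDDPartition]
  // Assumes the preferred executor/host was computed when the partition was built.
  part.preferredLoc.map(Seq(_)).getOrElse(Seq.empty)
}
{code}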



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17984) Add support for numa aware feature

2016-10-18 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587459#comment-15587459
 ] 

Saisai Shao commented on SPARK-17984:
-

NUMA should be supported by most commodity servers as well as HPC systems, but 
{{numactl}} may not be installed by default in most OSes. Also, other systems 
like Windows or macOS may not have equivalent tools; please take that into account.
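For reference, the kind of binding {{numactl}} provides on Linux (an illustrative command only; the jar and class names are placeholders):

{noformat}
# pin a JVM's CPUs and memory allocations to NUMA node 0
numactl --cpunodebind=0 --membind=0 java -Xmx4g -cp app.jar com.example.Main
{noformat}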

> Add support for numa aware feature
> --
>
> Key: SPARK-17984
> URL: https://issues.apache.org/jira/browse/SPARK-17984
> Project: Spark
>  Issue Type: New Feature
>  Components: Deploy, Mesos, YARN
>Affects Versions: 2.0.1
> Environment: Cluster Topo: 1 Master + 4 Slaves
> CPU: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz(72 Cores)
> Memory: 128GB(2 NUMA Nodes)
> SW Version: Hadoop-5.7.0 + Spark-2.0.0
>Reporter: quanfuwang
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This Jira targets adding a NUMA-aware feature, which can help improve 
> performance by making cores access local memory rather than remote memory. 
> A patch is being developed, see https://github.com/apache/spark/pull/15524.
> And the whole task includes 3 subtasks and will be developed iteratively:
> Numa aware support for Yarn based deployment mode
> Numa aware support for Mesos based deployment mode
> Numa aware support for Standalone based deployment mode



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17787) spark submit throws error while using kafka Appender log4j:ERROR Could not instantiate class [kafka.producer.KafkaLog4jAppender]. java.lang.ClassNotFoundException: kaf

2016-10-12 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15568020#comment-15568020
 ] 

Saisai Shao commented on SPARK-17787:
-

I guess you don't have the jar at path 
"/home/hos/KafkaApp/kafka_2.10-0.8.0.jar" on the nodes where the AM/executors launch.

> spark submit throws error while using kafka Appender log4j:ERROR Could not 
> instantiate class [kafka.producer.KafkaLog4jAppender]. 
> java.lang.ClassNotFoundException: kafka.producer.KafkaLog4jAppender
> -
>
> Key: SPARK-17787
> URL: https://issues.apache.org/jira/browse/SPARK-17787
> Project: Spark
>  Issue Type: Bug
>Affects Versions: 1.6.2
>Reporter: Taukir
>
> While trying to push Spark app logs to Kafka, I am trying to use the 
> KafkaLog4jAppender. Please find the command below. It throws the following error:
> spark-submit --class com.SampleApp --conf spark.ui.port=8081 --master 
> yarn-cluster
> --files 
> files/home/hos/KafkaApp/log4j.properties#log4j.properties,/home/hos/KafkaApp/kafka_2.10-0.8.0.jar
> --conf 
> spark.driver.extraJavaOptions='-Dlog4j.configuration=file:/home/hos/KafkaApp/log4j.properties'
> --conf "spark.driver.extraClassPath=/home/hos/KafkaApp/kafka_2.10-0.8.0.jar"
> --conf 
> spark.executor.extraJavaOptions='-Dlog4j.configuration=file:/home/hos/KafkaApp/log4j.properties'
> --conf 
> "spark.executor.extraClassPath=/home/hos/KafkaApp/kafka_2.10-0.8.0.jar" 
> Kafka-App-assembly-1.0.jar
> It throws a java.lang.ClassNotFoundException: 
> kafka.producer.KafkaLog4jAppender exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Closed] (SPARK-17814) spark submit arguments are truncated in yarn-cluster mode

2016-10-12 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-17814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao closed SPARK-17814.
---
Resolution: Won't Fix

> spark submit arguments are truncated in yarn-cluster mode
> -
>
> Key: SPARK-17814
> URL: https://issues.apache.org/jira/browse/SPARK-17814
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit, YARN
>Affects Versions: 1.6.1
>Reporter: shreyas subramanya
>Priority: Minor
>
> {noformat}
> One of our customers is trying to pass in json through spark-submit as 
> follows:
> spark-submit --verbose --class SimpleClass --master yarn-cluster ./simple.jar 
> "{\"mode\":\"wf\", \"arrays\":{\"array\":[1]}}"
> The application reports the passed arguments as: {"mode":"wf", 
> "arrays":{"array":[1]
> If the same application is submitted in yarn-client mode, as follows:
> spark-submit --verbose --class SimpleClass --master yarn-client ./simple.jar 
> "{\"mode\":\"wf\", \"arrays\":{\"array\":[1]}}"
> The application reports the passed args as: {"mode":"wf", 
> "arrays":{"array":[1]}}
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-17814) spark submit arguments are truncated in yarn-cluster mode

2016-10-12 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567964#comment-15567964
 ] 

Saisai Shao edited comment on SPARK-17814 at 10/12/16 7:57 AM:
---

Looking at the YARN code, I think you ran into a YARN trap: YARN replaces 
environment references wrapped in braces with $XXX, so in your case the last 
braces are replaced with {{""}}; that's why the last two braces are truncated. 
Please see the YARN code here:

{code}
  @VisibleForTesting
  public static String expandEnvironment(String var,
  Path containerLogDir) {
var = var.replace(ApplicationConstants.LOG_DIR_EXPANSION_VAR,
  containerLogDir.toString());
var =  var.replace(ApplicationConstants.CLASS_PATH_SEPARATOR,
  File.pathSeparator);

// replace parameter expansion marker. e.g. {{VAR}} on Windows is replaced
// as %VAR% and on Linux replaced as "$VAR"
if (Shell.WINDOWS) {
  var = var.replaceAll("(\\{\\{)|(\\}\\})", "%");
} else {
  var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_LEFT, "$");
  var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_RIGHT, "");
}
return var;
  }
{code}

Note especially the line {{var = 
var.replace(ApplicationConstants.PARAMETER_EXPANSION_RIGHT, "");}}: the right 
braces are replaced with an empty string.

So this is actually not a Spark issue, and it may not be easy to fix on the YARN 
side. But there are several workarounds to avoid passing a JSON string through the arguments.


was (Author: jerryshao):
Looking at the YARN code, I think you ran into a YARN trap: YARN replaces 
environment references wrapped in braces with $JAVA, so in your case the last 
braces are replaced with {{""}}; that's why the last two braces are truncated. 
Please see the YARN code here:

{code}
  @VisibleForTesting
  public static String expandEnvironment(String var,
  Path containerLogDir) {
var = var.replace(ApplicationConstants.LOG_DIR_EXPANSION_VAR,
  containerLogDir.toString());
var =  var.replace(ApplicationConstants.CLASS_PATH_SEPARATOR,
  File.pathSeparator);

// replace parameter expansion marker. e.g. {{VAR}} on Windows is replaced
// as %VAR% and on Linux replaced as "$VAR"
if (Shell.WINDOWS) {
  var = var.replaceAll("(\\{\\{)|(\\}\\})", "%");
} else {
  var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_LEFT, "$");
  var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_RIGHT, "");
}
return var;
  }
{code}

Note especially the line {{var = 
var.replace(ApplicationConstants.PARAMETER_EXPANSION_RIGHT, "");}}: the right 
braces are replaced with an empty string.

So this is actually not a Spark issue, and it may not be easy to fix on the YARN 
side. But there are several workarounds to avoid passing a JSON string through the arguments.

> spark submit arguments are truncated in yarn-cluster mode
> -
>
> Key: SPARK-17814
> URL: https://issues.apache.org/jira/browse/SPARK-17814
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit, YARN
>Affects Versions: 1.6.1
>Reporter: shreyas subramanya
>Priority: Minor
>
> {noformat}
> One of our customers is trying to pass in json through spark-submit as 
> follows:
> spark-submit --verbose --class SimpleClass --master yarn-cluster ./simple.jar 
> "{\"mode\":\"wf\", \"arrays\":{\"array\":[1]}}"
> The application reports the passed arguments as: {"mode":"wf", 
> "arrays":{"array":[1]
> If the same application is submitted in yarn-client mode, as follows:
> spark-submit --verbose --class SimpleClass --master yarn-client ./simple.jar 
> "{\"mode\":\"wf\", \"arrays\":{\"array\":[1]}}"
> The application reports the passed args as: {"mode":"wf", 
> "arrays":{"array":[1]}}
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-17814) spark submit arguments are truncated in yarn-cluster mode

2016-10-12 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567964#comment-15567964
 ] 

Saisai Shao edited comment on SPARK-17814 at 10/12/16 7:56 AM:
---

Looking at the YARN code, I think you ran into a YARN trap: YARN replaces 
environment references wrapped in braces with $JAVA, so in your case the last 
braces are replaced with {{""}}; that's why the last two braces are truncated. 
Please see the YARN code here:

{code}
  @VisibleForTesting
  public static String expandEnvironment(String var,
  Path containerLogDir) {
var = var.replace(ApplicationConstants.LOG_DIR_EXPANSION_VAR,
  containerLogDir.toString());
var =  var.replace(ApplicationConstants.CLASS_PATH_SEPARATOR,
  File.pathSeparator);

// replace parameter expansion marker. e.g. {{VAR}} on Windows is replaced
// as %VAR% and on Linux replaced as "$VAR"
if (Shell.WINDOWS) {
  var = var.replaceAll("(\\{\\{)|(\\}\\})", "%");
} else {
  var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_LEFT, "$");
  var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_RIGHT, "");
}
return var;
  }
{code}

Note especially the line {{var = 
var.replace(ApplicationConstants.PARAMETER_EXPANSION_RIGHT, "");}}: the right 
braces are replaced with an empty string.

So this is actually not a Spark issue, and it may not be easy to fix on the YARN 
side. But there are several workarounds to avoid passing a JSON string through the arguments.


was (Author: jerryshao):
Looking at the YARN code, I think you ran into a YARN trap: YARN rewrites 
environment references in the command from {{"{{JAVA}}"}} to {{"$JAVA"}}, so in 
your case the last braces are replaced with {{""}}. Please see the YARN code 
here:

{code}
  @VisibleForTesting
  public static String expandEnvironment(String var,
  Path containerLogDir) {
var = var.replace(ApplicationConstants.LOG_DIR_EXPANSION_VAR,
  containerLogDir.toString());
var =  var.replace(ApplicationConstants.CLASS_PATH_SEPARATOR,
  File.pathSeparator);

// replace parameter expansion marker. e.g. {{VAR}} on Windows is replaced
// as %VAR% and on Linux replaced as "$VAR"
if (Shell.WINDOWS) {
  var = var.replaceAll("(\\{\\{)|(\\}\\})", "%");
} else {
  var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_LEFT, "$");
  var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_RIGHT, "");
}
return var;
  }
{code}

Note especially the line {{var = 
var.replace(ApplicationConstants.PARAMETER_EXPANSION_RIGHT, "");}}: the right 
braces are replaced with an empty string.

So this is actually not a Spark issue, and it may not be easy to fix on the YARN 
side. But there are several workarounds to avoid passing a JSON string through the arguments.

> spark submit arguments are truncated in yarn-cluster mode
> -
>
> Key: SPARK-17814
> URL: https://issues.apache.org/jira/browse/SPARK-17814
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit, YARN
>Affects Versions: 1.6.1
>Reporter: shreyas subramanya
>Priority: Minor
>
> {noformat}
> One of our customers is trying to pass in json through spark-submit as 
> follows:
> spark-submit --verbose --class SimpleClass --master yarn-cluster ./simple.jar 
> "{\"mode\":\"wf\", \"arrays\":{\"array\":[1]}}"
> The application reports the passed arguments as: {"mode":"wf", 
> "arrays":{"array":[1]
> If the same application is submitted in yarn-client mode, as follows:
> spark-submit --verbose --class SimpleClass --master yarn-client ./simple.jar 
> "{\"mode\":\"wf\", \"arrays\":{\"array\":[1]}}"
> The application reports the passed args as: {"mode":"wf", 
> "arrays":{"array":[1]}}
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17814) spark submit arguments are truncated in yarn-cluster mode

2016-10-12 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567964#comment-15567964
 ] 

Saisai Shao commented on SPARK-17814:
-

Looking at the YARN code, I think you ran into a YARN trap: YARN rewrites 
environment references in the command from {{"{{JAVA}}"}} to {{"$JAVA"}}, so in 
your case the last braces are replaced with {{""}}. Please see the YARN code 
here:

{code}
  @VisibleForTesting
  public static String expandEnvironment(String var,
  Path containerLogDir) {
var = var.replace(ApplicationConstants.LOG_DIR_EXPANSION_VAR,
  containerLogDir.toString());
var =  var.replace(ApplicationConstants.CLASS_PATH_SEPARATOR,
  File.pathSeparator);

// replace parameter expansion marker. e.g. {{VAR}} on Windows is replaced
// as %VAR% and on Linux replaced as "$VAR"
if (Shell.WINDOWS) {
  var = var.replaceAll("(\\{\\{)|(\\}\\})", "%");
} else {
  var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_LEFT, "$");
  var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_RIGHT, "");
}
return var;
  }
{code}

Note especially the line {{var = 
var.replace(ApplicationConstants.PARAMETER_EXPANSION_RIGHT, "");}}: the right 
braces are replaced with an empty string.

So this is actually not a Spark issue, and it may not be easy to fix on the YARN 
side. But there are several workarounds to avoid passing a JSON string through the arguments.
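For illustration, one such workaround (paths and names below are only examples) is to keep the braces off the command line entirely by shipping the JSON as a file and passing the file name instead:

{noformat}
# write the JSON to a local file and ship it with the application
echo '{"mode":"wf", "arrays":{"array":[1]}}' > args.json

spark-submit --verbose --class SimpleClass --master yarn-cluster \
  --files args.json ./simple.jar args.json

# inside the application, treat the first argument as a file name and read the
# JSON from that file (it is localized into the container's working directory).
{noformat}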

> spark submit arguments are truncated in yarn-cluster mode
> -
>
> Key: SPARK-17814
> URL: https://issues.apache.org/jira/browse/SPARK-17814
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit, YARN
>Affects Versions: 1.6.1
>Reporter: shreyas subramanya
>Priority: Minor
>
> {noformat}
> One of our customers is trying to pass in json through spark-submit as 
> follows:
> spark-submit --verbose --class SimpleClass --master yarn-cluster ./simple.jar 
> "{\"mode\":\"wf\", \"arrays\":{\"array\":[1]}}"
> The application reports the passed arguments as: {"mode":"wf", 
> "arrays":{"array":[1]
> If the same application is submitted in yarn-client mode, as follows:
> spark-submit --verbose --class SimpleClass --master yarn-client ./simple.jar 
> "{\"mode\":\"wf\", \"arrays\":{\"array\":[1]}}"
> The application reports the passed args as: {"mode":"wf", 
> "arrays":{"array":[1]}}
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17814) spark submit arguments are truncated in yarn-cluster mode

2016-10-12 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567933#comment-15567933
 ] 

Saisai Shao commented on SPARK-17814:
-

Which version of YARN do you use?

> spark submit arguments are truncated in yarn-cluster mode
> -
>
> Key: SPARK-17814
> URL: https://issues.apache.org/jira/browse/SPARK-17814
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit, YARN
>Affects Versions: 1.6.1
>Reporter: shreyas subramanya
>Priority: Minor
>
> {noformat}
> One of our customers is trying to pass in json through spark-submit as 
> follows:
> spark-submit --verbose --class SimpleClass --master yarn-cluster ./simple.jar 
> "{\"mode\":\"wf\", \"arrays\":{\"array\":[1]}}"
> The application reports the passed arguments as: {"mode":"wf", 
> "arrays":{"array":[1]
> If the same application is submitted in yarn-client mode, as follows:
> spark-submit --verbose --class SimpleClass --master yarn-client ./simple.jar 
> "{\"mode\":\"wf\", \"arrays\":{\"array\":[1]}}"
> The application reports the passed args as: {"mode":"wf", 
> "arrays":{"array":[1]}}
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17814) spark submit arguments are truncated in yarn-cluster mode

2016-10-12 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567931#comment-15567931
 ] 

Saisai Shao commented on SPARK-17814:
-

This looks like a YARN problem. After tracking the code path, I found that the 
command arguments are still correct on the Spark side:

{code}
16/10/12 15:26:50 INFO Client: command:
16/10/12 15:26:50 INFO Client: {{JAVA_HOME}}/bin/java -server -Xmx1024m 
-Djava.io.tmpdir={{PWD}}/tmp -Dspark.yarn.app.container.log.dir= 
org.apache.spark.deploy.yarn.ApplicationMaster --class 
'org.apache.spark.examples.SparkPi' --arg '{\"mode\":\"wf\", 
\"arrays\":{\"array\":[1]}}' --properties-file 
{{PWD}}/__spark_conf__/__spark_conf__.properties 1> /stdout 2> 
/stderr
{code}

But in the Yarn {{launch_container.sh}}, the command changes to:

{code}
exec /bin/bash -c "$JAVA_HOME/bin/java -server -Xmx1024m 
-Djava.io.tmpdir=$PWD/tmp 
-Dspark.yarn.app.container.log.dir=/Users/sshao/projects/hadoop-2.7.1/logs/userlogs/application_1476255818919_0004/container_1476255818919_0004_01_01
 org.apache.spark.deploy.yarn.ApplicationMaster --class 
'org.apache.spark.examples.SparkPi' --arg '{\"mode\":\"wf\", 
\"arrays\":{\"array\":[1]' --properties-file 
$PWD/__spark_conf__/__spark_conf__.properties 1> 
/Users/sshao/projects/hadoop-2.7.1/logs/userlogs/application_1476255818919_0004/container_1476255818919_0004_01_01/stdout
 2> 
/Users/sshao/projects/hadoop-2.7.1/logs/userlogs/application_1476255818919_0004/container_1476255818919_0004_01_01/stderr"
hadoop_shell_errorcode=$?
{code}

Here you can see that "}}" is truncated. So I guess it might be a YARN issue, 
because YARN replaces tokens like {{PWD}}.


> spark submit arguments are truncated in yarn-cluster mode
> -
>
> Key: SPARK-17814
> URL: https://issues.apache.org/jira/browse/SPARK-17814
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit, YARN
>Affects Versions: 1.6.1
>Reporter: shreyas subramanya
>Priority: Minor
>
> {noformat}
> One of our customers is trying to pass in json through spark-submit as 
> follows:
> spark-submit --verbose --class SimpleClass --master yarn-cluster ./simple.jar 
> "{\"mode\":\"wf\", \"arrays\":{\"array\":[1]}}"
> The application reports the passed arguments as: {"mode":"wf", 
> "arrays":{"array":[1]
> If the same application is submitted in yarn-client mode, as follows:
> spark-submit --verbose --class SimpleClass --master yarn-client ./simple.jar 
> "{\"mode\":\"wf\", \"arrays\":{\"array\":[1]}}"
> The application reports the passed args as: {"mode":"wf", 
> "arrays":{"array":[1]}}
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17846) A bad state of Running Applications with spark standalone HA

2016-10-11 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567662#comment-15567662
 ] 

Saisai Shao commented on SPARK-17846:
-

I think this issue should be the same as SPARK-14262.

> A bad state of Running Applications with spark standalone HA 
> -
>
> Key: SPARK-17846
> URL: https://issues.apache.org/jira/browse/SPARK-17846
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 1.6.0
>Reporter: dylanzhou
>Priority: Critical
> Attachments: Problem screenshots.jpg
>
>
> I am using standalone mode. When I use HA (configured in two ways), I found the 
> applications' state was "WAITING". Is this a bug?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-17686) Propose to print Scala version in "spark-submit --version" command

2016-09-27 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-17686:
---

 Summary: Propose to print Scala version in "spark-submit 
--version" command
 Key: SPARK-17686
 URL: https://issues.apache.org/jira/browse/SPARK-17686
 Project: Spark
  Issue Type: Improvement
  Components: Spark Submit
Reporter: Saisai Shao
Priority: Minor


Currently we have a use case that needs to upload different jars to Spark 
according to the Scala version. For now, only after launching a Spark application 
can we know which version of Scala it depends on. This makes it hard for services 
that need to support different Scala + Spark version combinations to pick the 
right jars. 

So here I propose to print out the Scala version in "spark-submit --version", so 
that users can leverage this output to make the choice without needing to launch 
an application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-17678) Spark 1.6 Scala-2.11 repl doesn't honor "spark.replClassServer.port"

2016-09-26 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-17678:
---

 Summary: Spark 1.6 Scala-2.11 repl doesn't honor 
"spark.replClassServer.port"
 Key: SPARK-17678
 URL: https://issues.apache.org/jira/browse/SPARK-17678
 Project: Spark
  Issue Type: Bug
  Components: Spark Shell
Affects Versions: 1.6.3
Reporter: Saisai Shao


The Spark 1.6 Scala-2.11 REPL doesn't honor the "spark.replClassServer.port" 
configuration, so users cannot set a fixed port number through 
"spark.replClassServer.port".

There's no issue in Spark 2.0+, since this class has been removed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17637) Packed scheduling for Spark tasks across executors

2016-09-23 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15515836#comment-15515836
 ] 

Saisai Shao commented on SPARK-17637:
-

[~zhanzhang] would you mind sharing more details about your scheduling 
mechanism? Also, is it specifically tied to dynamic allocation? 

> Packed scheduling for Spark tasks across executors
> --
>
> Key: SPARK-17637
> URL: https://issues.apache.org/jira/browse/SPARK-17637
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler
>Reporter: Zhan Zhang
>Priority: Minor
>
> Currently the Spark scheduler assigns tasks to executors in a round-robin fashion, 
> which is great as it distributes the load evenly across the cluster, but this 
> leads to significant resource waste in some cases, especially when dynamic 
> allocation is enabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-17624) Flaky test? StateStoreSuite maintenance

2016-09-21 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15512237#comment-15512237
 ] 

Saisai Shao edited comment on SPARK-17624 at 9/22/16 5:36 AM:
--

I cannot reproduce this locally on my Mac laptop. Maybe your test machine is not 
powerful enough to handle this unit test?


was (Author: jerryshao):
I cannot reproduce locally on my 

> Flaky test? StateStoreSuite maintenance
> ---
>
> Key: SPARK-17624
> URL: https://issues.apache.org/jira/browse/SPARK-17624
> Project: Spark
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 2.0.1
>Reporter: Adam Roberts
>Priority: Minor
>
> I've noticed this test failing consistently (25x in a row) on a two-core 
> machine but not on an eight-core machine.
> If we increase the spark.rpc.numRetries value used in the test from 1 to 2 (3 
> being the default in Spark), the test reliably passes; we can also gain 
> reliability by setting the master to be anything other than just local.
> Is there a reason spark.rpc.numRetries is set to be 1?
> I see this failure is also mentioned here so it's been flaky for a while 
> http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-2-0-0-RC5-td18367.html
> If we run without the "quietly" code so we get debug info:
> {code}
> 16:26:15.213 WARN org.apache.spark.rpc.netty.NettyRpcEndpointRef: Error 
> sending message [message = 
> VerifyIfInstanceActive(StateStoreId(/home/aroberts/Spark-DK/sql/core/target/tmp/spark-cc44f5fa-b675-426f-9440-76785c365507/ૺꎖ鮎衲넅-28e9196f-8b2d-43ba-8421-44a5c5e98ceb,0,0),driver)]
>  in 1 attempts
> org.apache.spark.SparkException: Exception thrown in awaitResult
> at 
> org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
> at 
> org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
> at 
> scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
> at 
> org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
> at 
> org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
> at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
> at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
> at 
> org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
> at 
> org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:78)
> at 
> org.apache.spark.sql.execution.streaming.state.StateStoreCoordinatorRef.verifyIfInstanceActive(StateStoreCoordinator.scala:91)
> at 
> org.apache.spark.sql.execution.streaming.state.StateStore$$anonfun$3.apply(StateStore.scala:227)
> at 
> org.apache.spark.sql.execution.streaming.state.StateStore$$anonfun$3.apply(StateStore.scala:227)
> at scala.Option.map(Option.scala:146)
> at 
> org.apache.spark.sql.execution.streaming.state.StateStore$.org$apache$spark$sql$execution$streaming$state$StateStore$$verifyIfStoreInstanceActive(StateStore.scala:227)
> at 
> org.apache.spark.sql.execution.streaming.state.StateStore$$anonfun$org$apache$spark$sql$execution$streaming$state$StateStore$$doMaintenance$2.apply(StateStore.scala:199)
> at 
> org.apache.spark.sql.execution.streaming.state.StateStore$$anonfun$org$apache$spark$sql$execution$streaming$state$StateStore$$doMaintenance$2.apply(StateStore.scala:197)
> at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
> at 
> org.apache.spark.sql.execution.streaming.state.StateStore$.org$apache$spark$sql$execution$streaming$state$StateStore$$doMaintenance(StateStore.scala:197)
> at 
> org.apache.spark.sql.execution.streaming.state.StateStore$$anon$1.run(StateStore.scala:180)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:522)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:319)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:191)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1153)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.lang.Thread.run(Thread.java:785)
> Caused by: org.apache.spark.SparkException: Could not find 
> StateStoreCoordinator.
> at 
> org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:154)
> at 
> 
