[jira] [Updated] (SPARK-23995) Initial job has not accepted any resources and executors keep exiting

2018-04-16 Thread Cong Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cong Shen updated SPARK-23995:
--
Environment: 
Spark version: 2.3.0

JDK version: 1.8.0_131

System: CentOS v7

{{export JAVA_HOME=/usr/java/jdk1.8.0_144}}

{{export SPARK_MASTER_IP=IP }}

{{export PYSPARK_PYTHON=/opt/anaconda3/bin/python }}

{{export SPARK_WORKER_MEMORY=2g }}

{{export SPARK_WORK_INSTANCES=1 }}

{{export SPARK_WORkER_CORES=4 }}

{{export SPARK_EXECUTOR_MEMORY=1g}}

 

The firewalls are stopped.
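Note that two variable names in the configuration above do not match the names Spark's standalone scripts actually read: `SPARK_WORK_INSTANCES` should be `SPARK_WORKER_INSTANCES`, and `SPARK_WORkER_CORES` contains a lowercase `k` (environment variables are case-sensitive, so Spark would fall back to its defaults for both). A corrected `spark-env.sh` sketch, with the values copied from the report (whether this alone resolves the executor exits is untested):

```shell
# spark-env.sh -- corrected variable names; values taken from the report above.
export JAVA_HOME=/usr/java/jdk1.8.0_144
export SPARK_MASTER_IP=IP                 # placeholder address, as in the report
export PYSPARK_PYTHON=/opt/anaconda3/bin/python
export SPARK_WORKER_MEMORY=2g
export SPARK_WORKER_INSTANCES=1           # was: SPARK_WORK_INSTANCES
export SPARK_WORKER_CORES=4               # was: SPARK_WORkER_CORES (lowercase k)
export SPARK_EXECUTOR_MEMORY=1g
```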

  was:
Spark version:2.3.0

JDK version: 1.8.0_131

System: CentOS v7

{{export JAVA_HOME=/usr/java/jdk1.8.0_144 }}

{{export SPARK_MASTER_IP=IP export PYSPARK_PYTHON=/opt/anaconda3/bin/python 
export SPARK_WORKER_MEMORY=2g export SPARK_WORK_INSTANCES=1 export 
SPARK_WORkER_CORES=4 export SPARK_EXECUTOR_MEMORY=1g}}

 

The firewalls are stopped.


> Initial job has not accepted any resources and executors keep exiting
> ---
>
> Key: SPARK-23995
> URL: https://issues.apache.org/jira/browse/SPARK-23995
> Project: Spark
>  Issue Type: Bug
>  Components: Deploy
>Affects Versions: 2.3.0
> Environment: Spark version: 2.3.0
> JDK version: 1.8.0_131
> System: CentOS v7
> {{export JAVA_HOME=/usr/java/jdk1.8.0_144}}
> {{export SPARK_MASTER_IP=IP }}
> {{export PYSPARK_PYTHON=/opt/anaconda3/bin/python }}
> {{export SPARK_WORKER_MEMORY=2g }}
> {{export SPARK_WORK_INSTANCES=1 }}
> {{export SPARK_WORkER_CORES=4 }}
> {{export SPARK_EXECUTOR_MEMORY=1g}}
>  
> The firewalls are stopped.
>Reporter: Cong Shen
>Priority: Major
>  Labels: executor, standalone,
>
> I have a Spark cluster on cloud resources across two instances: one as master 
> and one as worker. The total resources are 4 cores and 10 GB of RAM. I can 
> start the shell, and the worker registers successfully. But when I run simple 
> code, the job fails.
> The error from shell is:
> TaskSchedulerImpl:66 - Initial job has not accepted any resources.
> master log:
>  
> {code:java}
> 2018-04-12 13:09:14 INFO Master:54 - Registering app Spark shell
> 2018-04-12 13:09:14 INFO Master:54 - Registered app Spark shell with ID app-20180412130914-
> 2018-04-12 13:09:14 INFO Master:54 - Launching executor app-20180412130914-/0 on worker worker-20180411144020-192.**.**.**-44986
> 2018-04-12 13:11:15 INFO Master:54 - Removing executor app-20180412130914-/0 because it is EXITED
> 2018-04-12 13:11:15 INFO Master:54 - Launching executor app-20180412130914-/1 on worker worker-20180411144020-192.**.**.**-44986
> 2018-04-12 13:13:16 INFO Master:54 - Removing executor app-20180412130914-/1 because it is EXITED
> 2018-04-12 13:13:16 INFO Master:54 - Launching executor app-20180412130914-/2 on worker worker-20180411144020-192.**.**.**-44986
> 2018-04-12 13:15:17 INFO Master:54 - Removing executor app-20180412130914-/2 because it is EXITED
> 2018-04-12 13:15:17 INFO Master:54 - Launching executor app-20180412130914-/3 on worker worker-20180411144020-192.**.**.**-44986
> 2018-04-12 13:16:15 INFO Master:54 - Removing app app-20180412130914-
> 2018-04-12 13:16:15 INFO Master:54 - 192.**.**.**:39766 got disassociated, removing it.
> 2018-04-12 13:16:15 INFO Master:54 - IP:39928 got disassociated, removing it.
> 2018-04-12 13:16:15 WARN Master:66 - Got status update for unknown executor app-20180412130914-/3
> {code}
> Worker log:
>  
> {code:java}
> 2018-04-12 13:09:12 INFO  Worker:54 - Asked to launch executor
> app-20180412130914-/0 for Spark shell
> 2018-04-12 13:09:12 INFO  SecurityManager:54 - Changing view acls to: root
> 2018-04-12 13:09:12 INFO  SecurityManager:54 - Changing modify acls to: root
> 2018-04-12 13:09:12 INFO  SecurityManager:54 - Changing view acls groups to: 
> 2018-04-12 13:09:12 INFO  SecurityManager:54 - Changing modify acls groups  
> to: 
> 2018-04-12 13:09:12 INFO  SecurityManager:54 - SecurityManager: 
> authentication disabled; ui acls disabled; users  with view permissions:
> Set(root); groups with view permissions: Set(); users  with modify 
> permissions: Set(root); groups with modify permissions: Set()
> 2018-04-12 13:09:12 INFO  ExecutorRunner:54 - Launch command: 
> "/usr/java/jdk1.8.0_144/bin/java" "-cp" 
> "/opt/spark-2.3.0-bin-hadoop2.7/conf/:/opt/spark-2.3.0-bin-hadoop2.7/jars/*" 
> "-Xmx1024M" "-Dspark.driver.port=39928" 
> "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" 
> "spark://CoarseGrainedScheduler@IP:39928" "--executor-id" "0" "--hostname" 
> "192.**.**.**" "--cores" "4" "--app-id" "app-20180412130914-" 
> "--worker-url" "spark://Worker@192.**.**.**:44986"
> 2018-04-12 13:11:13 INFO  Worker:54 - Executor app-20180412130914-/0 
> finished with state EXITED message Command exited with code 1 exitStatus 1
> {code}

[jira] [Created] (SPARK-23995) Initial job has not accepted any resources and executors keep exiting

2018-04-16 Thread Cong Shen (JIRA)
Cong Shen created SPARK-23995:
-

 Summary: Initial job has not accepted any resources and executors keep exiting
 Key: SPARK-23995
 URL: https://issues.apache.org/jira/browse/SPARK-23995
 Project: Spark
  Issue Type: Bug
  Components: Deploy
Affects Versions: 2.3.0
 Environment: Spark version: 2.3.0

JDK version: 1.8.0_131

System: CentOS v7

{{export JAVA_HOME=/usr/java/jdk1.8.0_144 }}

{{export SPARK_MASTER_IP=IP }}

{{export PYSPARK_PYTHON=/opt/anaconda3/bin/python }}

{{export SPARK_WORKER_MEMORY=2g }}

{{export SPARK_WORK_INSTANCES=1 }}

{{export SPARK_WORkER_CORES=4 }}

{{export SPARK_EXECUTOR_MEMORY=1g}}

 

The firewalls are stopped.
Reporter: Cong Shen


I have a Spark cluster on cloud resources across two instances: one as master 
and one as worker. The total resources are 4 cores and 10 GB of RAM. I can 
start the shell, and the worker registers successfully. But when I run simple 
code, the job fails.

The error from shell is:

TaskSchedulerImpl:66 - Initial job has not accepted any resources.
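This scheduler warning appears when no registered worker can satisfy the application's resource request, or when executors exit before they manage to register (as the master log below shows happening repeatedly). A common first check is to launch the shell with an explicit request safely below the worker's advertised 2 GB / 4 cores. A hedged sketch, not runnable without a cluster; `<master-host>` is a placeholder, and this assumes the standalone master listens on the default port 7077:

```shell
# Hypothetical invocation: cap the resource request below the worker's offer.
# The flags are standard spark-shell/spark-submit options.
spark-shell \
  --master spark://<master-host>:7077 \
  --executor-memory 1g \
  --total-executor-cores 2
```

If the warning persists even with a request this small, the cause is usually connectivity between worker and driver rather than capacity.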

master log:

 
{code:java}
2018-04-12 13:09:14 INFO Master:54 - Registering app Spark shell
2018-04-12 13:09:14 INFO Master:54 - Registered app Spark shell with ID app-20180412130914-
2018-04-12 13:09:14 INFO Master:54 - Launching executor app-20180412130914-/0 on worker worker-20180411144020-192.**.**.**-44986
2018-04-12 13:11:15 INFO Master:54 - Removing executor app-20180412130914-/0 because it is EXITED
2018-04-12 13:11:15 INFO Master:54 - Launching executor app-20180412130914-/1 on worker worker-20180411144020-192.**.**.**-44986
2018-04-12 13:13:16 INFO Master:54 - Removing executor app-20180412130914-/1 because it is EXITED
2018-04-12 13:13:16 INFO Master:54 - Launching executor app-20180412130914-/2 on worker worker-20180411144020-192.**.**.**-44986
2018-04-12 13:15:17 INFO Master:54 - Removing executor app-20180412130914-/2 because it is EXITED
2018-04-12 13:15:17 INFO Master:54 - Launching executor app-20180412130914-/3 on worker worker-20180411144020-192.**.**.**-44986
2018-04-12 13:16:15 INFO Master:54 - Removing app app-20180412130914-
2018-04-12 13:16:15 INFO Master:54 - 192.**.**.**:39766 got disassociated, removing it.
2018-04-12 13:16:15 INFO Master:54 - IP:39928 got disassociated, removing it.
2018-04-12 13:16:15 WARN Master:66 - Got status update for unknown executor app-20180412130914-/3

{code}
Worker log:

 
{code:java}
2018-04-12 13:09:12 INFO  Worker:54 - Asked to launch executor
app-20180412130914-/0 for Spark shell
2018-04-12 13:09:12 INFO  SecurityManager:54 - Changing view acls to: root
2018-04-12 13:09:12 INFO  SecurityManager:54 - Changing modify acls to: root
2018-04-12 13:09:12 INFO  SecurityManager:54 - Changing view acls groups to: 
2018-04-12 13:09:12 INFO  SecurityManager:54 - Changing modify acls groups  to: 
2018-04-12 13:09:12 INFO  SecurityManager:54 - SecurityManager: authentication 
disabled; ui acls disabled; users  with view permissions:Set(root); groups 
with view permissions: Set(); users  with modify permissions: Set(root); groups 
with modify permissions: Set()
2018-04-12 13:09:12 INFO  ExecutorRunner:54 - Launch command: 
"/usr/java/jdk1.8.0_144/bin/java" "-cp" 
"/opt/spark-2.3.0-bin-hadoop2.7/conf/:/opt/spark-2.3.0-bin-hadoop2.7/jars/*" 
"-Xmx1024M" "-Dspark.driver.port=39928" 
"org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" 
"spark://CoarseGrainedScheduler@IP:39928" "--executor-id" "0" "--hostname" 
"192.**.**.**" "--cores" "4" "--app-id" "app-20180412130914-" 
"--worker-url" "spark://Worker@192.**.**.**:44986"
2018-04-12 13:11:13 INFO  Worker:54 - Executor app-20180412130914-/0 
finished with state EXITED message Command exited with code 1 exitStatus 1
2018-04-12 13:11:13 INFO  Worker:54 - Asked to launch executor 
app-20180412130914-/1 for Spark shell
2018-04-12 13:11:13 INFO  SecurityManager:54 - Changing view acls to: root
2018-04-12 13:11:13 INFO  SecurityManager:54 - Changing modify acls to: root
2018-04-12 13:11:13 INFO  SecurityManager:54 - Changing view acls groups to: 
2018-04-12 13:11:13 INFO  SecurityManager:54 - Changing modify acls groups to: 
2018-04-12 13:11:13 INFO  SecurityManager:54 - SecurityManager: authentication 
disabled; ui acls disabled; users  with view permissions: Set(root); groups 
with view permissions: Set(); users  with modify permissions: Set(root); groups 
with modify permissions: Set()
2018-04-12 13:11:13 INFO  ExecutorRunner:54 - Launch command: 
"/usr/java/jdk1.8.0_144/bin/java" "-cp" 
"/opt/spark-2.3.0-bin-hadoop2.7/conf/:/opt/spark-2.3.0-bin-hadoop2.7/jars/*" 
"-Xmx1024M" "-Dspark.driver.port=39928" 
"org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" 
"spark://CoarseGrainedScheduler@spark-master.novalocal:39928" "--executor-id" 
"1" "--hostname" "192.**.**.**" "--cores" "4" "--app-id"