[jira] [Updated] (SPARK-23995) initial job has not accept any resources and executor keep exit
[ https://issues.apache.org/jira/browse/SPARK-23995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cong Shen updated SPARK-23995:
--
    Environment:
Spark version: 2.3.0
JDK version: 1.8.0_131
System: CentOS v7

{{export JAVA_HOME=/usr/java/jdk1.8.0_144 }}
{{export SPARK_MASTER_IP=IP }}
{{export PYSPARK_PYTHON=/opt/anaconda3/bin/python }}
{{export SPARK_WORKER_MEMORY=2g }}
{{export SPARK_WORK_INSTANCES=1 }}
{{export SPARK_WORkER_CORES=4 export SPARK_EXECUTOR_MEMORY=1g}}

The firewalls are stopped.

  was:
Spark version: 2.3.0
JDK version: 1.8.0_131
System: CentOS v7

{{export JAVA_HOME=/usr/java/jdk1.8.0_144 }}
{{export SPARK_MASTER_IP=IP export PYSPARK_PYTHON=/opt/anaconda3/bin/python export SPARK_WORKER_MEMORY=2g export SPARK_WORK_INSTANCES=1 export SPARK_WORkER_CORES=4 export SPARK_EXECUTOR_MEMORY=1g}}

The firewalls are stopped.

> initial job has not accept any resources and executor keep exit
> ---
>
>                 Key: SPARK-23995
>                 URL: https://issues.apache.org/jira/browse/SPARK-23995
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy
>    Affects Versions: 2.3.0
>         Environment: Spark version: 2.3.0
> JDK version: 1.8.0_131
> System: CentOS v7
> {{export JAVA_HOME=/usr/java/jdk1.8.0_144 }}
> {{export SPARK_MASTER_IP=IP }}
> {{export PYSPARK_PYTHON=/opt/anaconda3/bin/python }}
> {{export SPARK_WORKER_MEMORY=2g }}
> {{export SPARK_WORK_INSTANCES=1 }}
> {{export SPARK_WORkER_CORES=4 export SPARK_EXECUTOR_MEMORY=1g}}
>
> The firewalls are stopped.
>            Reporter: Cong Shen
>            Priority: Major
>              Labels: executor, standalone,
>
> I have a Spark cluster running on two cloud instances, one as master and one
> as worker. The total resources are 4 cores and 10 GB of RAM. I can start the
> shell, and the worker registers successfully. But when I run simple code, the
> shell reports this error:
> TaskSchedulerImpl:66 - Initial job has not accept any resources.
>
> Master log:
>
> {code:java}
> // code placeholder
> 2018-04-12 13:09:14 INFO Master:54 - Registering app Spark shell
> 2018-04-12 13:09:14 INFO Master:54 - Registered app Spark shell with ID app-20180412130914-
> 2018-04-12 13:09:14 INFO Master:54 - Launching executor app-20180412130914-/0 on worker worker-20180411144020-192.**.**.**-44986
> 2018-04-12 13:11:15 INFO Master:54 - Removing executor app-20180412130914-/0 because it is EXITED
> 2018-04-12 13:11:15 INFO Master:54 - Launching executor app-20180412130914-/1 on worker worker-20180411144020-192.**.**.**-44986
> 2018-04-12 13:13:16 INFO Master:54 - Removing executor app-20180412130914-/1 because it is EXITED
> 2018-04-12 13:13:16 INFO Master:54 - Launching executor app-20180412130914-/2 on worker worker-20180411144020-192.**.**.**-44986
> 2018-04-12 13:15:17 INFO Master:54 - Removing executor app-20180412130914-/2 because it is EXITED
> 2018-04-12 13:15:17 INFO Master:54 - Launching executor app-20180412130914-/3 on worker worker-20180411144020-192.**.**.**-44986
> 2018-04-12 13:16:15 INFO Master:54 - Removing app app-20180412130914-
> 2018-04-12 13:16:15 INFO Master:54 - 192.**.**.**:39766 got disassociated, removing it.
> 2018-04-12 13:16:15 INFO Master:54 - IP:39928 got disassociated, removing it.
> 2018-04-12 13:16:15 WARN Master:66 - Got status update for unknown executor app-20180412130914-/3
> {code}
>
> Worker log:
>
> {code:java}
> // code placeholder
> 2018-04-12 13:09:12 INFO Worker:54 - Asked to launch executor app-20180412130914-/0 for Spark shell
> 2018-04-12 13:09:12 INFO SecurityManager:54 - Changing view acls to: root
> 2018-04-12 13:09:12 INFO SecurityManager:54 - Changing modify acls to: root
> 2018-04-12 13:09:12 INFO SecurityManager:54 - Changing view acls groups to:
> 2018-04-12 13:09:12 INFO SecurityManager:54 - Changing modify acls groups to:
> 2018-04-12 13:09:12 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
> 2018-04-12 13:09:12 INFO ExecutorRunner:54 - Launch command: "/usr/java/jdk1.8.0_144/bin/java" "-cp" "/opt/spark-2.3.0-bin-hadoop2.7/conf/:/opt/spark-2.3.0-bin-hadoop2.7/jars/*" "-Xmx1024M" "-Dspark.driver.port=39928" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@IP:39928" "--executor-id" "0" "--hostname" "192.**.**.**" "--cores" "4" "--app-id" "app-20180412130914-" "--worker-url" "spark://Worker@192.**.**.**:44986"
> 2018-04-12 13:11:13 INFO Worker:54 - Executor app-20180412130914-/0 finished with state EXITED message Command exited with code 1 exitStatus 1
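
In the master log above, each executor is removed shortly after the next one is launched, so it helps to measure how long each one actually lived. The following is a minimal sketch, assuming the log line format quoted in this report; `MASTER_LOG`, `EVENT`, and `executor_lifetimes` are illustrative names and not part of Spark. It pairs each "Launching executor" line with the matching "Removing executor" line and reports the elapsed seconds.

```python
import re
from datetime import datetime

# Launch/removal lines copied from the master log in this report.
# The app ID suffix is elided in the source and is left as-is here.
MASTER_LOG = """\
2018-04-12 13:09:14 INFO Master:54 - Launching executor app-20180412130914-/0 on worker worker-20180411144020-192.**.**.**-44986
2018-04-12 13:11:15 INFO Master:54 - Removing executor app-20180412130914-/0 because it is EXITED
2018-04-12 13:11:15 INFO Master:54 - Launching executor app-20180412130914-/1 on worker worker-20180411144020-192.**.**.**-44986
2018-04-12 13:13:16 INFO Master:54 - Removing executor app-20180412130914-/1 because it is EXITED
2018-04-12 13:13:16 INFO Master:54 - Launching executor app-20180412130914-/2 on worker worker-20180411144020-192.**.**.**-44986
2018-04-12 13:15:17 INFO Master:54 - Removing executor app-20180412130914-/2 because it is EXITED
2018-04-12 13:15:17 INFO Master:54 - Launching executor app-20180412130914-/3 on worker worker-20180411144020-192.**.**.**-44986
2018-04-12 13:16:15 INFO Master:54 - Removing app app-20180412130914-
"""

# Matches only executor launch/removal events; "Removing app ..." is ignored.
EVENT = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) INFO Master:\d+ - "
    r"(Launching|Removing) executor (\S+)"
)


def executor_lifetimes(log_text):
    """Return {executor_id: seconds between its launch and its removal}."""
    launched = {}
    lifetimes = {}
    for line in log_text.splitlines():
        match = EVENT.match(line)
        if not match:
            continue
        timestamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        action, executor = match.group(2), match.group(3)
        if action == "Launching":
            launched[executor] = timestamp
        elif executor in launched:
            lifetimes[executor] = (timestamp - launched.pop(executor)).total_seconds()
    return lifetimes


if __name__ == "__main__":
    for executor, seconds in executor_lifetimes(MASTER_LOG).items():
        print(f"{executor}: exited after {seconds:.0f}s")
```

For the log in this report, executors 0, 1, and 2 each lived 121 seconds; executor 3 has no removal line because the application was removed first. A near-constant lifetime like this is consistent with a timeout rather than a resource shortage, though the report does not confirm the cause.
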
[jira] [Created] (SPARK-23995) initial job has not accept any resources and executor keep exit
Cong Shen created SPARK-23995:
-
             Summary: initial job has not accept any resources and executor keep exit
                 Key: SPARK-23995
                 URL: https://issues.apache.org/jira/browse/SPARK-23995
             Project: Spark
          Issue Type: Bug
          Components: Deploy
    Affects Versions: 2.3.0
         Environment: Spark version: 2.3.0
JDK version: 1.8.0_131
System: CentOS v7

{{export JAVA_HOME=/usr/java/jdk1.8.0_144 }}
{{export SPARK_MASTER_IP=IP export PYSPARK_PYTHON=/opt/anaconda3/bin/python export SPARK_WORKER_MEMORY=2g export SPARK_WORK_INSTANCES=1 export SPARK_WORkER_CORES=4 export SPARK_EXECUTOR_MEMORY=1g}}

The firewalls are stopped.
            Reporter: Cong Shen

I have a Spark cluster running on two cloud instances, one as master and one as worker. The total resources are 4 cores and 10 GB of RAM. I can start the shell, and the worker registers successfully. But when I run simple code, the shell reports this error:
TaskSchedulerImpl:66 - Initial job has not accept any resources.

Master log:

{code:java}
// code placeholder
2018-04-12 13:09:14 INFO Master:54 - Registering app Spark shell
2018-04-12 13:09:14 INFO Master:54 - Registered app Spark shell with ID app-20180412130914-
2018-04-12 13:09:14 INFO Master:54 - Launching executor app-20180412130914-/0 on worker worker-20180411144020-192.**.**.**-44986
2018-04-12 13:11:15 INFO Master:54 - Removing executor app-20180412130914-/0 because it is EXITED
2018-04-12 13:11:15 INFO Master:54 - Launching executor app-20180412130914-/1 on worker worker-20180411144020-192.**.**.**-44986
2018-04-12 13:13:16 INFO Master:54 - Removing executor app-20180412130914-/1 because it is EXITED
2018-04-12 13:13:16 INFO Master:54 - Launching executor app-20180412130914-/2 on worker worker-20180411144020-192.**.**.**-44986
2018-04-12 13:15:17 INFO Master:54 - Removing executor app-20180412130914-/2 because it is EXITED
2018-04-12 13:15:17 INFO Master:54 - Launching executor app-20180412130914-/3 on worker worker-20180411144020-192.**.**.**-44986
2018-04-12 13:16:15 INFO Master:54 - Removing app app-20180412130914-
2018-04-12 13:16:15 INFO Master:54 - 192.**.**.**:39766 got disassociated, removing it.
2018-04-12 13:16:15 INFO Master:54 - IP:39928 got disassociated, removing it.
2018-04-12 13:16:15 WARN Master:66 - Got status update for unknown executor app-20180412130914-/3
{code}

Worker log:

{code:java}
// code placeholder
2018-04-12 13:09:12 INFO Worker:54 - Asked to launch executor app-20180412130914-/0 for Spark shell
2018-04-12 13:09:12 INFO SecurityManager:54 - Changing view acls to: root
2018-04-12 13:09:12 INFO SecurityManager:54 - Changing modify acls to: root
2018-04-12 13:09:12 INFO SecurityManager:54 - Changing view acls groups to:
2018-04-12 13:09:12 INFO SecurityManager:54 - Changing modify acls groups to:
2018-04-12 13:09:12 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
2018-04-12 13:09:12 INFO ExecutorRunner:54 - Launch command: "/usr/java/jdk1.8.0_144/bin/java" "-cp" "/opt/spark-2.3.0-bin-hadoop2.7/conf/:/opt/spark-2.3.0-bin-hadoop2.7/jars/*" "-Xmx1024M" "-Dspark.driver.port=39928" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@IP:39928" "--executor-id" "0" "--hostname" "192.**.**.**" "--cores" "4" "--app-id" "app-20180412130914-" "--worker-url" "spark://Worker@192.**.**.**:44986"
2018-04-12 13:11:13 INFO Worker:54 - Executor app-20180412130914-/0 finished with state EXITED message Command exited with code 1 exitStatus 1
2018-04-12 13:11:13 INFO Worker:54 - Asked to launch executor app-20180412130914-/1 for Spark shell
2018-04-12 13:11:13 INFO SecurityManager:54 - Changing view acls to: root
2018-04-12 13:11:13 INFO SecurityManager:54 - Changing modify acls to: root
2018-04-12 13:11:13 INFO SecurityManager:54 - Changing view acls groups to:
2018-04-12 13:11:13 INFO SecurityManager:54 - Changing modify acls groups to:
2018-04-12 13:11:13 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
2018-04-12 13:11:13 INFO ExecutorRunner:54 - Launch command: "/usr/java/jdk1.8.0_144/bin/java" "-cp" "/opt/spark-2.3.0-bin-hadoop2.7/conf/:/opt/spark-2.3.0-bin-hadoop2.7/jars/*" "-Xmx1024M" "-Dspark.driver.port=39928" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@spark-master.novalocal:39928" "--executor-id" "1" "--hostname" "192.**.**.**" "--cores" "4" "--app-id"