Re: Re: A Problem About Running Spark 1.5 on YARN with Dynamic Allocation
OK, yarn.scheduler.maximum-allocation-mb is 16384.

I have run it again; the command to run it is:
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --driver-memory 4g --executor-memory 8g lib/spark-examples*.jar 200

15/11/24 16:15:56 INFO yarn.ApplicationMaster: Registered signal handlers for [TERM, HUP, INT]
15/11/24 16:15:57 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1447834709734_0120_01
15/11/24 16:15:58 INFO spark.SecurityManager: Changing view acls to: hdfs-test
15/11/24 16:15:58 INFO spark.SecurityManager: Changing modify acls to: hdfs-test
15/11/24 16:15:58 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hdfs-test); users with modify permissions: Set(hdfs-test)
15/11/24 16:15:58 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread
15/11/24 16:15:58 INFO yarn.ApplicationMaster: Waiting for spark context initialization
15/11/24 16:15:58 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
15/11/24 16:15:58 INFO spark.SparkContext: Running Spark version 1.5.0
15/11/24 16:15:58 INFO spark.SecurityManager: Changing view acls to: hdfs-test
15/11/24 16:15:58 INFO spark.SecurityManager: Changing modify acls to: hdfs-test
15/11/24 16:15:58 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hdfs-test); users with modify permissions: Set(hdfs-test)
15/11/24 16:15:58 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/11/24 16:15:59 INFO Remoting: Starting remoting
15/11/24 16:15:59 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@X.X.X.X]
15/11/24 16:15:59 INFO util.Utils: Successfully started service 'sparkDriver' on port 61904.
15/11/24 16:15:59 INFO spark.SparkEnv: Registering MapOutputTracker
15/11/24 16:15:59 INFO spark.SparkEnv: Registering BlockManagerMaster
15/11/24 16:15:59 INFO storage.DiskBlockManager: Created local directory at /data1/hadoop/nm-local-dir/usercache/hdfs-test/appcache/application_1447834709734_0120/blockmgr-33fbe6c4-5138-4eff-83b4-fb0c886667b7
15/11/24 16:15:59 INFO storage.MemoryStore: MemoryStore started with capacity 1966.1 MB
15/11/24 16:15:59 INFO spark.HttpFileServer: HTTP File server directory is /data1/hadoop/nm-local-dir/usercache/hdfs-test/appcache/application_1447834709734_0120/spark-fbbfa2bd-6d30-421e-a634-4546134b3b5f/httpd-e31d7b8e-ca8f-400e-8b4b-d2993fb6f1d1
15/11/24 16:15:59 INFO spark.HttpServer: Starting HTTP Server
15/11/24 16:15:59 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/11/24 16:15:59 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:14692
15/11/24 16:15:59 INFO util.Utils: Successfully started service 'HTTP file server' on port 14692.
15/11/24 16:15:59 INFO spark.SparkEnv: Registering OutputCommitCoordinator
15/11/24 16:15:59 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
15/11/24 16:15:59 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/11/24 16:15:59 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:15948
15/11/24 16:15:59 INFO util.Utils: Successfully started service 'SparkUI' on port 15948.
15/11/24 16:15:59 INFO ui.SparkUI: Started SparkUI at X.X.X.X
15/11/24 16:15:59 INFO cluster.YarnClusterScheduler: Created YarnClusterScheduler
15/11/24 16:15:59 WARN metrics.MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
15/11/24 16:15:59 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 41830.
15/11/24 16:15:59 INFO netty.NettyBlockTransferService: Server created on 41830
15/11/24 16:15:59 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/11/24 16:15:59 INFO storage.BlockManagerMasterEndpoint: Registering block manager X.X.X.X:41830 with 1966.1 MB RAM, BlockManagerId(driver, 10.12.30.2, 41830)
15/11/24 16:15:59 INFO storage.BlockManagerMaster: Registered BlockManager
15/11/24 16:16:00 INFO scheduler.EventLoggingListener: Logging events to hdfs:///tmp/latest-spark-events/application_1447834709734_0120_1
15/11/24 16:16:00 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as AkkaRpcEndpointRef(Actor[akka://sparkDriver/user/YarnAM#293602859])
15/11/24 16:16:00 INFO client.RMProxy: Connecting to ResourceManager at X.X.X.X
15/11/24 16:16:00 INFO yarn.YarnRMClient: Registering the ApplicationMaster
15/11/24 16:16:00 INFO yarn.ApplicationMaster: Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals
15/11/24 16:16:29 INFO cluster.YarnClusterSchedulerBackend: SchedulerBackend is
Re: Re: A Problem About Running Spark 1.5 on YARN with Dynamic Allocation
If YARN has only 50 cores, then it can support at most 49 executors plus 1 ApplicationMaster for the driver.

Regards
Sab

On 24-Nov-2015 1:58 pm, "谢廷稳" wrote:
> OK, yarn.scheduler.maximum-allocation-mb is 16384.
>
> I have run it again; the command to run it is:
> ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --driver-memory 4g --executor-memory 8g lib/spark-examples*.jar 200
> [...]
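Sab's headcount works out as simple arithmetic: the ApplicationMaster container consumes resources too, so it is subtracted before any executors fit. A minimal illustration only (assuming YARN schedules purely on vcores with one core per container, which is not necessarily how this cluster is configured):

```python
def max_executors_on_cores(total_cores, cores_per_executor=1, am_cores=1):
    """Illustrative model: cores left after the AM container, divided by
    cores per executor. Not how YARN actually schedules (memory and
    scheduler configuration matter as well)."""
    return (total_cores - am_cores) // cores_per_executor

# On a hypothetical 50-core queue, only 49 single-core executors fit
# alongside the AM; on the 300-core cluster described later in the
# thread, 50 executors should be no problem.
print(max_executors_on_cores(50))
print(max_executors_on_cores(300))
```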
Re: Re: A Problem About Running Spark 1.5 on YARN with Dynamic Allocation
Did you set the configuration "spark.dynamicAllocation.initialExecutors"? You can set spark.dynamicAllocation.initialExecutors to 50 and try again.

I guess you might be hitting this issue since you're running 1.5.0: https://issues.apache.org/jira/browse/SPARK-9092. But it still does not explain why 49 executors worked.

On Tue, Nov 24, 2015 at 4:42 PM, Sabarish Sasidharan <sabarish.sasidha...@manthan.com> wrote:
> If yarn has only 50 cores then it can support max 49 executors plus 1 driver application master.
>
> Regards
> Sab
>
> On 24-Nov-2015 1:58 pm, "谢廷稳" wrote:
>> OK, yarn.scheduler.maximum-allocation-mb is 16384.
>> [...]
Re: Re: A Problem About Running Spark 1.5 on YARN with Dynamic Allocation
@Sab Thank you for your reply, but the cluster has 6 nodes with 300 cores in total, and the Spark application did not request resources from YARN.

@SaiSai I ran it successfully with "spark.dynamicAllocation.initialExecutors" set to 50, but http://spark.apache.org/docs/latest/configuration.html#dynamic-allocation says that "spark.dynamicAllocation.initialExecutors" defaults to "spark.dynamicAllocation.minExecutors". So I think something is wrong, isn't it?

Thanks.

2015-11-24 16:47 GMT+08:00 Saisai Shao:
> Did you set this configuration "spark.dynamicAllocation.initialExecutors"?
>
> You can set spark.dynamicAllocation.initialExecutors 50 to try again.
>
> I guess you might be hitting this issue since you're running 1.5.0, https://issues.apache.org/jira/browse/SPARK-9092. But it still cannot explain why 49 executors worked.
> [...]
Re: Re: A Problem About Running Spark 1.5 on YARN with Dynamic Allocation
The document is right. Because of a bug introduced in https://issues.apache.org/jira/browse/SPARK-9092, this configuration fails to work.

It is fixed in https://issues.apache.org/jira/browse/SPARK-10790; you could upgrade to a newer version of Spark.

On Tue, Nov 24, 2015 at 5:12 PM, 谢廷稳 wrote:
> @Sab Thank you for your reply, but the cluster has 6 nodes with 300 cores in total, and the Spark application did not request resources from YARN.
>
> @SaiSai I ran it successfully with "spark.dynamicAllocation.initialExecutors" set to 50, but http://spark.apache.org/docs/latest/configuration.html#dynamic-allocation says that "spark.dynamicAllocation.initialExecutors" defaults to "spark.dynamicAllocation.minExecutors". So I think something is wrong, isn't it?
>
> Thanks.
> [...]
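Per the configuration page cited above, spark.dynamicAllocation.initialExecutors is documented to default to spark.dynamicAllocation.minExecutors, which is the behavior SPARK-9092 broke in 1.5.0 and SPARK-10790 restored. A hedged sketch of the documented resolution (a model of the contract, not Spark's actual code):

```python
def initial_executor_target(conf):
    """Model of the documented default: initialExecutors falls back to
    minExecutors when it is not set explicitly. `conf` is a plain dict of
    string settings, standing in for SparkConf."""
    min_execs = int(conf.get("spark.dynamicAllocation.minExecutors", "0"))
    return int(conf.get("spark.dynamicAllocation.initialExecutors", str(min_execs)))

# With minExecutors=50 and initialExecutors unset, the initial target
# should already be 50 -- which is why setting initialExecutors=50
# explicitly worked around the 1.5.0 bug.
print(initial_executor_target({"spark.dynamicAllocation.minExecutors": "50"}))
```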
Re: Re: A Problem About Running Spark 1.5 on YARN with Dynamic Allocation
Thank you very much. After changing to a newer version, it worked well!

2015-11-24 17:15 GMT+08:00 Saisai Shao:
> The document is right. Because of a bug introduced in https://issues.apache.org/jira/browse/SPARK-9092, this configuration fails to work.
>
> It is fixed in https://issues.apache.org/jira/browse/SPARK-10790; you could upgrade to a newer version of Spark.
> [...]
Re: A Problem About Running Spark 1.5 on YARN with Dynamic Allocation
Hi Tingwen,

Would you mind sharing your changes in ExecutorAllocationManager#addExecutors()?

From my understanding and testing, dynamic allocation works when you set the min and max number of executors to the same number.

Please check your Spark and YARN logs to make sure the executors are started correctly; the warning log means there is currently not enough resource to submit tasks.

Thanks
Saisai

On Mon, Nov 23, 2015 at 8:41 PM, 谢廷稳 wrote:
> Hi all,
> I ran SparkPi on YARN with dynamic allocation enabled and set spark.dynamicAllocation.maxExecutors equal to spark.dynamicAllocation.minExecutors, then I submitted the application using:
> ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --driver-memory 4g --executor-memory 8g lib/spark-examples*.jar 200
>
> The application was submitted successfully, but the AppMaster kept saying "15/11/23 20:13:08 WARN cluster.YarnClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources", and when I turned on DEBUG logging I found "15/11/23 20:24:00 DEBUG ExecutorAllocationManager: Not adding executors because our current target total is already 50 (limit 50)" in the console.
>
> I have fixed it by modifying code in ExecutorAllocationManager.addExecutors. Is this a bug, or was it designed so that we can't set maxExecutors equal to minExecutors?
>
> Thanks,
> Weber
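The submit command quoted above does not show the dynamic-allocation settings themselves. For context, a sketch of what the full invocation would look like with the min == max setup described in the message (assuming the settings were otherwise supplied via spark-defaults.conf, and that the external shuffle service, which dynamic allocation requires, is already configured as a YARN auxiliary service):

```shell
# Sketch only: reproduces the reported setup with dynamic allocation
# enabled and minExecutors pinned to maxExecutors (50 each).
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  --driver-memory 4g \
  --executor-memory 8g \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.dynamicAllocation.minExecutors=50 \
  --conf spark.dynamicAllocation.maxExecutors=50 \
  lib/spark-examples*.jar 200
```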
Re: A Problem About Running Spark 1.5 on YARN with Dynamic Allocation
I don't think it is a bug; maybe something is wrong with your Spark / YARN configuration.

On Tue, Nov 24, 2015 at 12:13 PM, 谢廷稳 wrote:
> OK, the YARN cluster was used only by myself. It has 6 nodes, which can run over 100 executors, and the YARN RM logs showed that the Spark application did not request resources from it.
>
> Is this a bug? Should I create a JIRA for this problem?
>
> 2015-11-24 12:00 GMT+08:00 Saisai Shao:
>> OK, so this looks like your YARN cluster does not allocate the containers, which you expect should be 50. Does the YARN cluster have enough resources left after allocating the AM container? If not, that is the problem.
>>
>> From my reading of your description, the problem does not lie in dynamic allocation. As I said, min and max executors set to the same number works for me.
>>
>> On Tue, Nov 24, 2015 at 11:54 AM, 谢廷稳 wrote:
>>> Hi Saisai,
>>> I'm sorry I did not describe it clearly. The YARN debug log said I have 50 executors, but the ResourceManager showed that I only have 1 container, for the AppMaster.
>>>
>>> I have checked the YARN RM logs: after the AppMaster changed state from ACCEPTED to RUNNING, there were no further logs about this job. So the problem is that I do not have any executors, but ExecutorAllocationManager thinks I do. Would you mind running a test in your cluster environment?
>>> Thanks,
>>> Weber
>>>
>>> 2015-11-24 11:00 GMT+08:00 Saisai Shao:
>>>> I think this behavior is expected: since you already have 50 executors launched, there is no need to acquire additional executors. Your change is not solid; it is just hiding the log. Again, I think you should check the logs of YARN and Spark to see whether the executors started correctly, and why resources are still not enough when you already have 50 executors.
>>>>
>>>> On Tue, Nov 24, 2015 at 10:48 AM, 谢廷稳 wrote:
>>>>> Hi SaiSai,
>>>>> I have changed "if (numExecutorsTarget >= maxNumExecutors)" to "if (numExecutorsTarget > maxNumExecutors)" in the first line of ExecutorAllocationManager#addExecutors() and it ran well.
>>>>> In my opinion, when minExecutors equals maxExecutors, the first time executors are added, numExecutorsTarget already equals maxNumExecutors, and it repeatedly prints "DEBUG ExecutorAllocationManager: Not adding executors because our current target total is already 50 (limit 50)".
>>>>> Thanks
>>>>> Weber
>>>>> [...]
Re: A Problem About Running Spark 1.5 on YARN with Dynamic Allocation
Hi Saisai,
Would you mind giving me some tips about this problem? After checking the YARN RM logs, I think the Spark application didn't request resources from it, so I guess this problem is none of YARN's business. The Spark conf of my cluster is listed in the following:

spark.shuffle.service.enabled true
spark.dynamicAllocation.enabled true
spark.shuffle.service.port 7337
spark.dynamicAllocation.maxExecutors 50
spark.dynamicAllocation.minExecutors 50

If I change spark.dynamicAllocation.minExecutors from 50 to 49, or anything else less than 50, it works. So I think even if it isn't a bug in dynamic allocation, something else may be wrong. And since you said that you are OK with setting min and max executors to the same number, could you tell me your test cluster environment?
Thanks

2015-11-24 13:10 GMT+08:00 Saisai Shao:
> I don't think it is a bug; maybe something is wrong with your Spark / YARN
> configuration.
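For reference, the same settings can be made explicit on the command line instead of relying on spark-defaults.conf. A sketch of the SparkPi submission from this thread with the properties above passed via `--conf` (same jar path and arguments as the original command; this requires a running YARN cluster and the shuffle service configured on the NodeManagers):

```shell
# Sketch only: the SparkPi submission from this thread, with the
# cluster's spark-defaults settings passed explicitly via --conf.
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  --driver-memory 4g \
  --executor-memory 8g \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.shuffle.service.port=7337 \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.dynamicAllocation.minExecutors=50 \
  --conf spark.dynamicAllocation.maxExecutors=50 \
  lib/spark-examples*.jar 200
```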
Re: Re: A Problem About Running Spark 1.5 on YARN with Dynamic Allocation
Can you show your parameter values in your env?
yarn.nodemanager.resource.cpu-vcores
yarn.nodemanager.resource.memory-mb

cherrywayb...@gmail.com

From: 谢廷稳
Date: 2015-11-24 12:13
To: Saisai Shao
CC: spark users
Subject: Re: A Problem About Running Spark 1.5 on YARN with Dynamic Allocation

OK, the YARN cluster was used only by myself. It has 6 nodes which can run over 100 executors, and the YARN RM logs showed that the Spark application did not request resources from it.

Is this a bug? Should I create a JIRA for this problem?
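For anyone checking these parameters, they live in yarn-site.xml on each NodeManager. A sketch with placeholder values (the actual numbers depend on the cluster's hardware; these two properties cap how much memory and how many vcores each NodeManager can hand out to containers):

```xml
<!-- yarn-site.xml on each NodeManager; example values only -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>65536</value> <!-- total RAM this NM may allocate to containers, in MB -->
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>16</value>    <!-- total vcores this NM may allocate to containers -->
</property>
```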
Re: A Problem About Running Spark 1.5 on YARN with Dynamic Allocation
Hi Saisai,
I'm sorry I did not describe it clearly. The YARN debug log said I have 50 executors, but the ResourceManager showed that I only have 1 container, for the AppMaster.

I have checked the YARN RM logs: after the AppMaster changed state from ACCEPTED to RUNNING, there were no further logs about this job. So the problem is that I do not have any executors, but ExecutorAllocationManager thinks I do. Would you mind running a test in your cluster environment?
Thanks,
Weber
Re: A Problem About Running Spark 1.5 on YARN with Dynamic Allocation
Hi SaiSai,
I have changed "if (numExecutorsTarget >= maxNumExecutors)" to "if (numExecutorsTarget > maxNumExecutors)" in the first line of ExecutorAllocationManager#addExecutors() and it ran well.
In my opinion, when I set minExecutors equal to maxExecutors, then the first time executors are added, numExecutorsTarget equals maxNumExecutors and it repeatedly prints "DEBUG ExecutorAllocationManager: Not adding executors because our current target total is already 50 (limit 50)".
Thanks
Weber
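The boundary condition under discussion can be sketched in isolation. This is a simplified stand-in, not the actual Spark 1.5 source of ExecutorAllocationManager#addExecutors: it only mimics the shape of the `>=` guard to show why, with minExecutors == maxExecutors == 50, the very first call already satisfies `numExecutorsTarget >= maxNumExecutors`, so no executors are ever requested and only the DEBUG line is printed.

```scala
// Simplified sketch (NOT Spark source) of the guard discussed above.
object AddExecutorsSketch {
  val maxNumExecutors = 50
  // numExecutorsTarget starts at minExecutors; here min == max == 50.
  var numExecutorsTarget = 50

  /** Returns the number of executors requested by this call. */
  def addExecutors(): Int = {
    if (numExecutorsTarget >= maxNumExecutors) {
      // With min == max, this branch is taken on the very first call,
      // so the manager never issues a request -- it only logs.
      println(s"Not adding executors because our current target total " +
        s"is already $numExecutorsTarget (limit $maxNumExecutors)")
      return 0
    }
    // Otherwise grow the target toward the max and request the difference.
    val newTarget = math.min(numExecutorsTarget + 1, maxNumExecutors)
    val delta = newTarget - numExecutorsTarget
    numExecutorsTarget = newTarget
    delta
  }

  def main(args: Array[String]): Unit = {
    assert(addExecutors() == 0) // nothing requested, log message only
  }
}
```

Whether changing `>=` to `>` is the right fix is exactly the point of contention in this thread: it lets one more request through, but as noted above it may merely hide the log rather than address why YARN granted no containers.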
Re: A Problem About Running Spark 1.5 on YARN with Dynamic Allocation
I think this behavior is expected: since you already have 50 executors launched, there is no need to acquire additional executors. Your change is not solid; it just hides the log message.

Again, I think you should check the logs of YARN and Spark to see whether the executors are started correctly, and why resources are still not enough when you already have 50 executors.