Re: Re: A Problem About Running Spark 1.5 on YARN with Dynamic Allocation

2015-11-24 Thread 谢廷稳
X
>
> 15/11/24 16:16:00 INFO yarn.YarnRMClient: Registering the ApplicationMaster
>
> 15/11/24 16:16:00 INFO yarn.ApplicationMaster: Started progress reporter 
> thread with (heartbeat : 3000, initial allocation : 200) intervals
>
> 15/11/24 16:16:29 INFO cluster.YarnClusterSchedulerBackend: SchedulerBackend 
> is ready for scheduling beginning after waiting 
> maxRegisteredResourcesWaitingTime: 30000(ms)
>
> 15/11/24 16:16:29 INFO cluster.YarnClusterScheduler: 
> YarnClusterScheduler.postStartHook done
>
> 15/11/24 16:16:29 INFO spark.SparkContext: Starting job: reduce at 
> SparkPi.scala:36
>
> 15/11/24 16:16:29 INFO scheduler.DAGScheduler: Got job 0 (reduce at 
> SparkPi.scala:36) with 200 output partitions
>
> 15/11/24 16:16:29 INFO scheduler.DAGScheduler: Final stage: ResultStage 
> 0(reduce at SparkPi.scala:36)
>
> 15/11/24 16:16:29 INFO scheduler.DAGScheduler: Parents of final stage: List()
> 15/11/24 16:16:29 INFO scheduler.DAGScheduler: Missing parents: List()
>
> 15/11/24 16:16:29 INFO scheduler.DAGScheduler: Submitting ResultStage 0 
> (MapPartitionsRDD[1] at map at SparkPi.scala:32), which has no missing parents
>
> 15/11/24 16:16:30 INFO storage.MemoryStore: ensureFreeSpace(1888) called with 
> curMem=0, maxMem=2061647216
>
> 15/11/24 16:16:30 INFO storage.MemoryStore: Block broadcast_0 stored as 
> values in memory (estimated size 1888.0 B, free 1966.1 MB)
> 15/11/24 16:16:30 INFO storage.MemoryStore: ensureFreeSpace(1202) called with 
> curMem=1888, maxMem=2061647216
>
> 15/11/24 16:16:30 INFO storage.MemoryStore: Block broadcast_0_piece0 stored 
> as bytes in memory (estimated size 1202.0 B, free 1966.1 MB)
>
> 15/11/24 16:16:30 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in 
> memory on X.X.X.X:41830 (size: 1202.0 B, free: 1966.1 MB)
>
>
> 15/11/24 16:16:30 INFO spark.SparkContext: Created broadcast 0 from broadcast 
> at DAGScheduler.scala:861
>
> 15/11/24 16:16:30 INFO scheduler.DAGScheduler: Submitting 200 missing tasks 
> from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:32)
>
> 15/11/24 16:16:30 INFO cluster.YarnClusterScheduler: Adding task set 0.0 with 
> 200 tasks
>
> 15/11/24 16:16:45 WARN cluster.YarnClusterScheduler: Initial job has not 
> accepted any resources; check your cluster UI to ensure that workers are 
> registered and have sufficient resources
>
> 15/11/24 16:17:00 WARN cluster.YarnClusterScheduler: Initial job has not 
> accepted any resources; check your cluster UI to ensure that workers are 
> registered and have sufficient resources
>
> 15/11/24 16:17:15 WARN cluster.YarnClusterScheduler: Initial job has not 
> accepted any resources; check your cluster UI to ensure that workers are 
> registered and have sufficient resources
>
> 15/11/24 16:17:30 WARN cluster.YarnClusterScheduler: Initial job has not 
> accepted any resources; check your cluster UI to ensure that workers are 
> registered and have sufficient resources
>
> 15/11/24 16:17:45 WARN cluster.YarnClusterScheduler: Initial job has not 
> accepted any resources; check your cluster UI to ensure that workers are 
> registered and have sufficient resources
>
> 15/11/24 16:18:00 WARN cluster.YarnClusterScheduler: Initial job has not 
> accepted any resources; check your cluster UI to ensure that workers are 
> registered and have sufficient resources
>
>
2015-11-24 15:14 GMT+08:00 Saisai Shao <sai.sai.s...@gmail.com>:

> What about this configuration in YARN: "yarn.scheduler.maximum-allocation-mb"?
>
> I'm curious why 49 executors work but 50 fail. Could you provide your
> application master log? If a container request is issued, there will be log
> lines like:
>
> 15/10/14 17:35:37 INFO yarn.YarnAllocator: Will request 2 executor
> containers, each with 1 cores and 1408 MB memory including 384 MB overhead
> 15/10/14 17:35:37 INFO yarn.YarnAllocator: Container request (host: Any,
> capability: <memory:1408, vCores:1>)
> 15/10/14 17:35:37 INFO yarn.YarnAllocator: Container request (host: Any,
> capability: <memory:1408, vCores:1>)
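>
> (For a rough sense of scale: with --executor-memory 8g and Spark 1.5's
> default memory overhead of max(384 MB, 10% of executor memory), each
> executor container request comes to about 8192 + 819 = 9011 MB, so a
> yarn.scheduler.maximum-allocation-mb below that would keep any executor
> container from being granted.)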
>
>
>
> On Tue, Nov 24, 2015 at 2:56 PM, 谢廷稳 <xieting...@gmail.com> wrote:
>
>> OK, the YARN conf is listed below:
>>
>> yarn.nodemanager.resource.memory-mb:115200
>> yarn.nodemanager.resource.cpu-vcores:50
>>
>> I think the YARN resources are sufficient. As I said in my previous
>> message, I believe the Spark application did not request resources from YARN.
>>
>> Thanks
>>
>> 2015-11-24 14:30 GMT+08:00 cherrywayb...@gmail.com <
>> cherrywayb...@gmail.com>:
>>
>>> Can you show the parameter values in your env?
>>> yarn.nodemanager.resource.cpu-vcores
>>> yarn.nodemanager.resource.memory-mb

Re: Re: A Problem About Running Spark 1.5 on YARN with Dynamic Allocation

2015-11-23 Thread cherrywayb...@gmail.com
Can you show the parameter values in your env?
yarn.nodemanager.resource.cpu-vcores 
yarn.nodemanager.resource.memory-mb



cherrywayb...@gmail.com
 
From: 谢廷稳
Date: 2015-11-24 12:13
To: Saisai Shao
CC: spark users
Subject: Re: A Problem About Running Spark 1.5 on YARN with Dynamic Allocation
OK. The YARN cluster is used only by me; it has 6 nodes, which can run over 100 
executors, and the YARN RM logs showed that the Spark application did not 
request resources from it.

Is this a bug? Should I create a JIRA for this problem?

2015-11-24 12:00 GMT+08:00 Saisai Shao <sai.sai.s...@gmail.com>:
OK, so it looks like your YARN cluster does not allocate the containers, which 
you expected to be 50. Does the YARN cluster have enough resources left after 
allocating the AM container? If not, that is the problem.

From your description, my guess is that the problem does not lie in dynamic 
allocation. As I said, I'm fine with min and max executors set to the same 
number.
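
As a rough sanity check on capacity, the arithmetic with the numbers from this 
thread looks like the sketch below (assumptions: Spark 1.5's default memory 
overhead and the default of one core per executor):

    // Back-of-envelope capacity check using this thread's figures:
    // 6 NodeManagers, each with 115200 MB and 50 vcores, vs. 50 executors of 8 GB.
    object CapacityCheck extends App {
      val executorMemMb = 8 * 1024                                    // --executor-memory 8g
      val overheadMb    = math.max(384, (0.10 * executorMemMb).toInt) // default overhead: 819 MB
      val containerMb   = executorMemMb + overheadMb                  // 9011 MB per container
      val neededMb      = 50L * containerMb                           // ~450 GB for 50 executors
      val clusterMb     = 6L * 115200                                 // ~675 GB across the cluster
      val clusterCores  = 6 * 50                                      // 300 vcores in total
      println(s"memory: need $neededMb MB of $clusterMb MB available")
      println(s"vcores: need 50 of $clusterCores available")
    }

So on paper the cluster has room; the question is whether the requests are 
ever made.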

On Tue, Nov 24, 2015 at 11:54 AM, 谢廷稳 <xieting...@gmail.com> wrote:
Hi Saisai,
Sorry I did not describe it clearly: the YARN debug log says I have 50 
executors, but the ResourceManager shows only one container, which is for the 
AppMaster.

I have checked the YARN RM logs: after the AppMaster changed state from 
ACCEPTED to RUNNING, there were no further log entries about this job. So the 
problem is that I do not actually have any executors, but 
ExecutorAllocationManager thinks I do. Would you mind running a test in your 
cluster environment?
Thanks,
Weber

2015-11-24 11:00 GMT+08:00 Saisai Shao <sai.sai.s...@gmail.com>:
I think this behavior is expected: since you already have 50 executors 
launched, there is no need to acquire additional executors. Your change is not 
a real fix; it just hides the log message.

Again, I think you should check the YARN and Spark logs to see whether the 
executors started correctly, and why resources are still insufficient when you 
already have 50 executors.

On Tue, Nov 24, 2015 at 10:48 AM, 谢廷稳 <xieting...@gmail.com> wrote:
Hi SaiSai,
I changed "if (numExecutorsTarget >= maxNumExecutors)" to 
"if (numExecutorsTarget > maxNumExecutors)" on the first line of 
ExecutorAllocationManager#addExecutors(), and it ran well.
In my opinion, when minExecutors is set equal to maxExecutors, 
numExecutorsTarget already equals maxNumExecutors the first time executors 
would be added, so it repeatedly prints "DEBUG ExecutorAllocationManager: Not 
adding executors because our current target total is already 50 (limit 50)".
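
For reference, the guard in question sits at the top of that method; in Spark 
1.5 it looks roughly like this (a paraphrased sketch, with the rest of the 
method elided):

    // Sketch of ExecutorAllocationManager.addExecutors() in Spark 1.5
    // (paraphrased; only the guard is shown).
    private def addExecutors(maxNumExecutorsNeeded: Int): Int = {
      // With minExecutors == maxExecutors, numExecutorsTarget starts out equal
      // to the limit, so this branch is taken on every call and no container
      // request is ever issued from here.
      if (numExecutorsTarget >= maxNumExecutors) {
        logDebug(s"Not adding executors because our current target total " +
          s"is already $numExecutorsTarget (limit $maxNumExecutors)")
        numExecutorsAdded = 0
        return 0
      }
      // ... otherwise raise numExecutorsTarget and request containers ...
      0
    }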
Thanks
Weber

2015-11-23 21:00 GMT+08:00 Saisai Shao <sai.sai.s...@gmail.com>:
Hi Tingwen,

Would you mind sharing your changes to 
ExecutorAllocationManager#addExecutors()?

From my understanding and testing, dynamic allocation works when you set the 
min and max number of executors to the same number.

Please check your Spark and YARN logs to make sure the executors started 
correctly; the warning means there are currently not enough resources to 
submit tasks.

Thanks
Saisai


On Mon, Nov 23, 2015 at 8:41 PM, 谢廷稳 <xieting...@gmail.com> wrote:
Hi all,
I ran SparkPi on YARN with dynamic allocation enabled and 
spark.dynamicAllocation.maxExecutors set equal to 
spark.dynamicAllocation.minExecutors, then submitted the application using:
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master 
yarn-cluster --driver-memory 4g --executor-memory 8g lib/spark-examples*.jar 200
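
(The dynamic-allocation settings themselves do not appear in this command, so 
they presumably came from spark-defaults.conf. For illustration, a hypothetical 
equivalent in code, using standard Spark 1.5 property names and the executor 
count of 50 discussed below, would be:)

    import org.apache.spark.SparkConf

    // Hypothetical reconstruction of the configuration described in this
    // thread; in the actual run these were presumably in spark-defaults.conf.
    val conf = new SparkConf()
      .set("spark.dynamicAllocation.enabled", "true")
      .set("spark.shuffle.service.enabled", "true")      // required for dynamic allocation on YARN
      .set("spark.dynamicAllocation.minExecutors", "50") // min == max, as described above
      .set("spark.dynamicAllocation.maxExecutors", "50")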

The application was submitted successfully, but the AppMaster kept saying 
"15/11/23 20:13:08 WARN cluster.YarnClusterScheduler: Initial job has not 
accepted any resources; check your cluster UI to ensure that workers are 
registered and have sufficient resources", 
and when I enabled DEBUG logging, I found "15/11/23 20:24:00 DEBUG 
ExecutorAllocationManager: Not adding executors because our current target 
total is already 50 (limit 50)" in the console.

I worked around it by modifying code in ExecutorAllocationManager.addExecutors. 
Is this a bug, or is it by design that we can't set maxExecutors equal to 
minExecutors?

Thanks,
Weber