Re: problem getting fine grained scaling working

2016-06-08 Thread Stephen Gran
Hi,

Thanks for doing the update.  Let's see if I contribute a few more times;
if it becomes a pain for you or others to gatekeep me, we can revisit
access then.

Cheers,

On 08/06/16 13:49, Darin Johnson wrote:
> Will do today.  If you'd like to help with the documentation, I could give
> you access.
>
> On Wed, Jun 8, 2016 at 3:14 AM, Stephen Gran wrote:
>
>> Hi,
>>
>> Can someone with access please correct the screenshot here:
>> https://cwiki.apache.org/confluence/display/MYRIAD/Fine-grained+Scaling
>>
>> This gives the strong impression that you don't need an NM with non-zero
>> resources.  I think this is what initially steered me down the wrong path.
>>
>> Cheers,
>>
>> On 03/06/16 16:38, Darin Johnson wrote:
>>> That is correct, you need at least one node manager with the minimum
>>> requirements to launch an ApplicationMaster.  Otherwise YARN will throw
>>> an exception.
>>>
>>> On Fri, Jun 3, 2016 at 10:52 AM, yuliya Feldman wrote:
 I believe you need at least one NM that is not subject to fine-grained
 scaling.
 If the total resources on the cluster are less than what a single
 container for the AM needs, you won't be able to submit any app, as the
 exception below tells you:
 (Invalid resource request, requested memory < 0, or requested memory >
 max configured, requestedMemory=1536, maxMemory=0
   at)
 I believe that when a Myriad cluster starts, one NM with non-zero
 capacity should come up by default.
 In addition, check the RM log to see whether offers with resources are
 coming to the RM - this info should be in the log.

 From: Stephen Gran
 To: "dev@myriad.incubator.apache.org" <dev@myriad.incubator.apache.org>
 Sent: Friday, June 3, 2016 1:29 AM
 Subject: problem getting fine grained scaling working

 Hi,

 I'm trying to get fine-grained scaling going on a test Mesos cluster.  I
 have a single master and 2 agents.  I am running 2 node managers with
 the zero profile, one per agent.  I can see both of them in the RM UI
 reporting correctly as having 0 resources.

 I'm getting stack traces when I try to launch a sample application,
 though.  I feel like I'm just missing something obvious somewhere - can
 anyone shed any light?

 This is on a build of yesterday's git head.

 Cheers,

 root@master:/srv/apps/hadoop# bin/yarn jar
 share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar teragen 1
 /outDir
 16/06/03 08:23:33 INFO client.RMProxy: Connecting to ResourceManager at
 master.testing.local/10.0.5.3:8032
 16/06/03 08:23:34 INFO terasort.TeraSort: Generating 1 using 2
 16/06/03 08:23:34 INFO mapreduce.JobSubmitter: number of splits:2
 16/06/03 08:23:34 INFO mapreduce.JobSubmitter: Submitting tokens for
 job: job_1464902078156_0001
 16/06/03 08:23:35 INFO mapreduce.JobSubmitter: Cleaning up the staging
 area /tmp/hadoop-yarn/staging/root/.staging/job_1464902078156_0001
 java.io.IOException:
 org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException:
 Invalid resource request, requested memory < 0, or requested memory >
 max configured, requestedMemory=1536, maxMemory=0
   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:268)
   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:228)
   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:236)
   at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:385)
   at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:329)
   at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:281)
   at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:580)
   at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:218)
   at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:419)
   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
   at
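
As the exception above shows, maxMemory=0 means the RM has registered no
NodeManager that advertises any capacity. A quick way to confirm what the
RM thinks each node has; the hostname and port follow this thread's test
cluster and are illustrative:

  # List every registered NodeManager and the state/containers it reported
  bin/yarn node -list -all

  # Cluster-wide totals as the RM sees them; with only zero-profile NMs,
  # totalMB and totalVirtualCores will both be 0
  curl -s http://master.testing.local:8088/ws/v1/cluster/metrics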

Re: problem getting fine grained scaling working

2016-06-08 Thread Darin Johnson
Will do today.  If you'd like to help with the documentation, I could give
you access.

On Wed, Jun 8, 2016 at 3:14 AM, Stephen Gran wrote:

> Hi,
>
> Can someone with access please correct the screenshot here:
> https://cwiki.apache.org/confluence/display/MYRIAD/Fine-grained+Scaling
>
> This gives the strong impression that you don't need an NM with non-zero
> resources.  I think this is what initially steered me down the wrong path.
>
> Cheers,
>
> On 03/06/16 16:38, Darin Johnson wrote:
> > That is correct, you need at least one node manager with the minimum
> > requirements to launch an ApplicationMaster.  Otherwise YARN will throw
> > an exception.
> >
> > On Fri, Jun 3, 2016 at 10:52 AM, yuliya Feldman wrote:
> >
> >> I believe you need at least one NM that is not subject to fine-grained
> >> scaling.
> >> If the total resources on the cluster are less than what a single
> >> container for the AM needs, you won't be able to submit any app, as the
> >> exception below tells you:
> >> (Invalid resource request, requested memory < 0, or requested memory >
> >> max configured, requestedMemory=1536, maxMemory=0
> >>  at)
> >> I believe that when a Myriad cluster starts, one NM with non-zero
> >> capacity should come up by default.
> >> In addition, check the RM log to see whether offers with resources are
> >> coming to the RM - this info should be in the log.
> >>
> >>   From: Stephen Gran
> >>   To: "dev@myriad.incubator.apache.org" <dev@myriad.incubator.apache.org>
> >>   Sent: Friday, June 3, 2016 1:29 AM
> >>   Subject: problem getting fine grained scaling working
> >>
> >> Hi,
> >>
> >> I'm trying to get fine-grained scaling going on a test Mesos cluster.  I
> >> have a single master and 2 agents.  I am running 2 node managers with
> >> the zero profile, one per agent.  I can see both of them in the RM UI
> >> reporting correctly as having 0 resources.
> >>
> >> I'm getting stack traces when I try to launch a sample application,
> >> though.  I feel like I'm just missing something obvious somewhere - can
> >> anyone shed any light?
> >>
> >> This is on a build of yesterday's git head.
> >>
> >> Cheers,
> >>
> >> root@master:/srv/apps/hadoop# bin/yarn jar
> >> share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar teragen 1
> >> /outDir
> >> 16/06/03 08:23:33 INFO client.RMProxy: Connecting to ResourceManager at
> >> master.testing.local/10.0.5.3:8032
> >> 16/06/03 08:23:34 INFO terasort.TeraSort: Generating 1 using 2
> >> 16/06/03 08:23:34 INFO mapreduce.JobSubmitter: number of splits:2
> >> 16/06/03 08:23:34 INFO mapreduce.JobSubmitter: Submitting tokens for
> >> job: job_1464902078156_0001
> >> 16/06/03 08:23:35 INFO mapreduce.JobSubmitter: Cleaning up the staging
> >> area /tmp/hadoop-yarn/staging/root/.staging/job_1464902078156_0001
> >> java.io.IOException:
> >> org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException:
> >> Invalid resource request, requested memory < 0, or requested memory >
> >> max configured, requestedMemory=1536, maxMemory=0
> >>  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:268)
> >>  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:228)
> >>  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:236)
> >>  at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:385)
> >>  at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:329)
> >>  at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:281)
> >>  at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:580)
> >>  at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:218)
> >>  at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:419)
> >>  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> >>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
> >>  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> >>  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
> >>  at java.security.AccessController.doPrivileged(Native Method)
> >>  at javax.security.auth.Subject.doAs(Subject.java:422)
> >>  at

Re: problem getting fine grained scaling working

2016-06-08 Thread Stephen Gran
Hi,

Can someone with access please correct the screenshot here:
https://cwiki.apache.org/confluence/display/MYRIAD/Fine-grained+Scaling

This gives the strong impression that you don't need an NM with non-zero 
resources.  I think this is what initially steered me down the wrong path.

Cheers,

On 03/06/16 16:38, Darin Johnson wrote:
> That is correct, you need at least one node manager with the minimum
> requirements to launch an ApplicationMaster.  Otherwise YARN will throw an
> exception.
>
> On Fri, Jun 3, 2016 at 10:52 AM, yuliya Feldman wrote:
>
>> I believe you need at least one NM that is not subject to fine-grained
>> scaling.
>> If the total resources on the cluster are less than what a single
>> container for the AM needs, you won't be able to submit any app, as the
>> exception below tells you:
>> (Invalid resource request, requested memory < 0, or requested memory > max
>> configured, requestedMemory=1536, maxMemory=0
>>  at)
>> I believe that when a Myriad cluster starts, one NM with non-zero
>> capacity should come up by default.
>> In addition, check the RM log to see whether offers with resources are
>> coming to the RM - this info should be in the log.
>>
>>   From: Stephen Gran
>>   To: "dev@myriad.incubator.apache.org" <dev@myriad.incubator.apache.org>
>>   Sent: Friday, June 3, 2016 1:29 AM
>>   Subject: problem getting fine grained scaling working
>>
>> Hi,
>>
>> I'm trying to get fine-grained scaling going on a test Mesos cluster.  I
>> have a single master and 2 agents.  I am running 2 node managers with
>> the zero profile, one per agent.  I can see both of them in the RM UI
>> reporting correctly as having 0 resources.
>>
>> I'm getting stack traces when I try to launch a sample application,
>> though.  I feel like I'm just missing something obvious somewhere - can
>> anyone shed any light?
>>
>> This is on a build of yesterday's git head.
>>
>> Cheers,
>>
>> root@master:/srv/apps/hadoop# bin/yarn jar
>> share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar teragen 1
>> /outDir
>> 16/06/03 08:23:33 INFO client.RMProxy: Connecting to ResourceManager at
>> master.testing.local/10.0.5.3:8032
>> 16/06/03 08:23:34 INFO terasort.TeraSort: Generating 1 using 2
>> 16/06/03 08:23:34 INFO mapreduce.JobSubmitter: number of splits:2
>> 16/06/03 08:23:34 INFO mapreduce.JobSubmitter: Submitting tokens for
>> job: job_1464902078156_0001
>> 16/06/03 08:23:35 INFO mapreduce.JobSubmitter: Cleaning up the staging
>> area /tmp/hadoop-yarn/staging/root/.staging/job_1464902078156_0001
>> java.io.IOException:
>> org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException:
>> Invalid resource request, requested memory < 0, or requested memory >
>> max configured, requestedMemory=1536, maxMemory=0
>>  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:268)
>>  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:228)
>>  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:236)
>>  at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:385)
>>  at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:329)
>>  at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:281)
>>  at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:580)
>>  at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:218)
>>  at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:419)
>>  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>>  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>>  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>>  at java.security.AccessController.doPrivileged(Native Method)
>>  at javax.security.auth.Subject.doAs(Subject.java:422)
>>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>>  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>>
>>  at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:306)
>>  at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
>>  at

Re: problem getting fine grained scaling working

2016-06-06 Thread Darin Johnson
No worries, keep me posted.  I think we did a good proof of concept; we're
trying to make it solid now, so if you find any issues let us know.

Darin
On Jun 5, 2016 2:57 PM, "Stephen Gran"  wrote:

> Hi,
>
> Brilliant!  Working now.
>
> Thank you very much,
>
> On 05/06/16 18:09, Darin Johnson wrote:
> > Stephen,
> >
> > I was able to recreate the problem (it is specific to 2.7.2, which changed
> > the defaults of the following two properties to true).  Setting them to
> > false allowed me to run MapReduce jobs again.  I'll try to update the
> > documentation later today.
> >
> > <property>
> >   <name>yarn.nodemanager.pmem-check-enabled</name>
> >   <value>false</value>
> > </property>
> >
> > <property>
> >   <name>yarn.nodemanager.vmem-check-enabled</name>
> >   <value>false</value>
> > </property>
> >
> > Darin
> >
> > On Sun, Jun 5, 2016 at 10:30 AM, Stephen Gran 
> > wrote:
> >
> >> Hi,
> >>
> >> I think those are the properties I added when I started getting this
> >> error.  Removing them doesn't seem to make any difference, sadly.
> >>
> >> This is hadoop 2.7.2
> >>
> >> Cheers,
> >>
> >> On 05/06/16 14:45, Darin Johnson wrote:
> >>> Hey Stephen,
> >>>
> >>> I think you're pretty close.
> >>>
> >>> Looking at the config I'd suggest removing these properties:
> >>>
> >>> <property>
> >>>   <name>yarn.nodemanager.resource.memory-mb</name>
> >>>   <value>4096</value>
> >>> </property>
> >>> <property>
> >>>   <name>yarn.scheduler.maximum-allocation-vcores</name>
> >>>   <value>12</value>
> >>> </property>
> >>> <property>
> >>>   <name>yarn.scheduler.maximum-allocation-mb</name>
> >>>   <value>8192</value>
> >>> </property>
> >>> <property>
> >>>   <name>yarn.nodemanager.vmem-check-enabled</name>
> >>>   <value>false</value>
> >>>   <description>Whether virtual memory limits will be enforced for containers</description>
> >>> </property>
> >>> <property>
> >>>   <name>yarn.nodemanager.vmem-pmem-ratio</name>
> >>>   <value>4</value>
> >>>   <description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
> >>> </property>
> >>>
> >>> I'll try them out on my test cluster later today/tonight and see if I can
> >>> recreate the problem.  What version of hadoop are you running?  I'll make
> >>> sure I'm consistent with that as well.
> >>>
> >>> Thanks,
> >>>
> >>> Darin
> >>> On Jun 5, 2016 8:15 AM, "Stephen Gran" wrote:
> >>>
>  Hi,
> 
>  Attached.  Thanks very much for looking.
> 
>  Cheers,
> 
>  On 05/06/16 12:51, Darin Johnson wrote:
> > Hey Stephen, can you please send your yarn-site.xml? I'm guessing you're
> > on the right track.
> >
> > Darin
> > Hi,
> >
> > OK.  That helps, thank you.  I think I just misunderstood the docs (or
> > they never said explicitly that you did need at least some static
> > resource), and I scaled down the initial nm.medium that got started.  I
> > get a bit further now, and jobs start but are killed with:
> >
> > Diagnostics: Container
> > [pid=3865,containerID=container_1465112239753_0001_03_01] is running
> > beyond virtual memory limits. Current usage: 50.7 MB of 0B physical
> > memory used; 2.6 GB of 0B virtual memory used. Killing container
> >
> > When I've seen this in the past with yarn but without myriad, it was
> > usually about ratios of vmem to mem and things like that - I've tried
> > some of those knobs, but I didn't expect much result and didn't get any.
> >
> > What strikes me about the error message is that the vmem and mem
> > allocations are for 0.
> >
> > I'm sorry for asking what are probably naive questions here, I couldn't
> > find a different forum.  If there is one, please point me there so I
> > don't disrupt the dev flow here.
> >
> > I can see this in the logs:
> >
> >
> > 2016-06-05 07:39:25,687 INFO
> > org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
> > container_1465112239753_0001_03_01 Container Transitioned from NEW
> > to ALLOCATED
> > 2016-06-05 07:39:25,688 INFO
> > org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root
> > OPERATION=AM Allocated ContainerTARGET=SchedulerApp
> > RESULT=SUCCESS  APPID=application_1465112239753_0001
> > CONTAINERID=container_1465112239753_0001_03_01
> > 2016-06-05 07:39:25,688 INFO
> > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode:
> > Assigned container container_1465112239753_0001_03_01 of capacity
> >  on host slave2.testing.local:26688, which has 1
> > containers,  used and 
> > available after allocation
> > 2016-06-05 07:39:25,689 INFO
> > org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM:
> > Sending NMToken for nodeId : slave2.testing.local:26688 for container :
> > container_1465112239753_0001_03_01
> > 

Re: problem getting fine grained scaling working

2016-06-05 Thread Stephen Gran
Hi,

Brilliant!  Working now.

Thank you very much,

On 05/06/16 18:09, Darin Johnson wrote:
> Stephen,
>
> I was able to recreate the problem (it is specific to 2.7.2, which changed the
> defaults of the following two properties to true).  Setting them to false
> allowed me to run MapReduce jobs again.  I'll try to update the
> documentation later today.
>
> <property>
>   <name>yarn.nodemanager.pmem-check-enabled</name>
>   <value>false</value>
> </property>
>
> <property>
>   <name>yarn.nodemanager.vmem-check-enabled</name>
>   <value>false</value>
> </property>
>
> Darin
>
> On Sun, Jun 5, 2016 at 10:30 AM, Stephen Gran 
> wrote:
>
>> Hi,
>>
>> I think those are the properties I added when I started getting this
>> error.  Removing them doesn't seem to make any difference, sadly.
>>
>> This is hadoop 2.7.2
>>
>> Cheers,
>>
>> On 05/06/16 14:45, Darin Johnson wrote:
>>> Hey Stephen,
>>>
>>> I think you're pretty close.
>>>
>>> Looking at the config I'd suggest removing these properties:
>>>
>>> <property>
>>>   <name>yarn.nodemanager.resource.memory-mb</name>
>>>   <value>4096</value>
>>> </property>
>>> <property>
>>>   <name>yarn.scheduler.maximum-allocation-vcores</name>
>>>   <value>12</value>
>>> </property>
>>> <property>
>>>   <name>yarn.scheduler.maximum-allocation-mb</name>
>>>   <value>8192</value>
>>> </property>
>>> <property>
>>>   <name>yarn.nodemanager.vmem-check-enabled</name>
>>>   <value>false</value>
>>>   <description>Whether virtual memory limits will be enforced for containers</description>
>>> </property>
>>> <property>
>>>   <name>yarn.nodemanager.vmem-pmem-ratio</name>
>>>   <value>4</value>
>>>   <description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
>>> </property>
>>>
>>> I'll try them out on my test cluster later today/tonight and see if I can
>>> recreate the problem.  What version of hadoop are you running?  I'll make
>>> sure I'm consistent with that as well.
>>>
>>> Thanks,
>>>
>>> Darin
>>> On Jun 5, 2016 8:15 AM, "Stephen Gran"  wrote:
>>>
 Hi,

 Attached.  Thanks very much for looking.

 Cheers,

 On 05/06/16 12:51, Darin Johnson wrote:
> Hey Stephen, can you please send your yarn-site.xml? I'm guessing you're
> on the right track.
>
> Darin
> Hi,
>
> OK.  That helps, thank you.  I think I just misunderstood the docs (or
> they never said explicitly that you did need at least some static
> resource), and I scaled down the initial nm.medium that got started.  I
> get a bit further now, and jobs start but are killed with:
>
> Diagnostics: Container
> [pid=3865,containerID=container_1465112239753_0001_03_01] is running
> beyond virtual memory limits. Current usage: 50.7 MB of 0B physical
> memory used; 2.6 GB of 0B virtual memory used. Killing container
>
> When I've seen this in the past with yarn but without myriad, it was
> usually about ratios of vmem to mem and things like that - I've tried
> some of those knobs, but I didn't expect much result and didn't get any.
>
> What strikes me about the error message is that the vmem and mem
> allocations are for 0.
>
> I'm sorry for asking what are probably naive questions here, I couldn't
> find a different forum.  If there is one, please point me there so I
> don't disrupt the dev flow here.
>
> I can see this in the logs:
>
>
> 2016-06-05 07:39:25,687 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
> container_1465112239753_0001_03_01 Container Transitioned from NEW
> to ALLOCATED
> 2016-06-05 07:39:25,688 INFO
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root
> OPERATION=AM Allocated ContainerTARGET=SchedulerApp
> RESULT=SUCCESS  APPID=application_1465112239753_0001
> CONTAINERID=container_1465112239753_0001_03_01
> 2016-06-05 07:39:25,688 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode:
> Assigned container container_1465112239753_0001_03_01 of capacity
>  on host slave2.testing.local:26688, which has 1
> containers,  used and 
> available after allocation
> 2016-06-05 07:39:25,689 INFO
> org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM:
> Sending NMToken for nodeId : slave2.testing.local:26688 for container :
> container_1465112239753_0001_03_01
> 2016-06-05 07:39:25,696 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
> container_1465112239753_0001_03_01 Container Transitioned from
> ALLOCATED to ACQUIRED
> 2016-06-05 07:39:25,696 INFO
> org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM:
> Clear node set for appattempt_1465112239753_0001_03
> 2016-06-05 07:39:25,696 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> 
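
With the two checks disabled, the sample job from earlier in the thread
should run to completion. A quick way to verify, reusing the exact command
and paths already shown in this thread:

  # Re-run the sample job now that the pmem/vmem checks are disabled
  bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar teragen 1 /outDir

  # A _SUCCESS marker in the output directory confirms the job finished
  bin/hdfs dfs -ls /outDir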

Re: problem getting fine grained scaling working

2016-06-05 Thread Darin Johnson
Stephen,

I was able to recreate the problem (it is specific to 2.7.2, which changed the
defaults of the following two properties to true).  Setting them to false
allowed me to run MapReduce jobs again.  I'll try to update the
documentation later today.

  

<property>
  <name>yarn.nodemanager.pmem-check-enabled</name>
  <value>false</value>
</property>

<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>

Darin

On Sun, Jun 5, 2016 at 10:30 AM, Stephen Gran 
wrote:

> Hi,
>
> I think those are the properties I added when I started getting this
> error.  Removing them doesn't seem to make any difference, sadly.
>
> This is hadoop 2.7.2
>
> Cheers,
>
> On 05/06/16 14:45, Darin Johnson wrote:
> > Hey Stephen,
> >
> > I think you're pretty close.
> >
> > Looking at the config I'd suggest removing these properties:
> >
> > <property>
> >   <name>yarn.nodemanager.resource.memory-mb</name>
> >   <value>4096</value>
> > </property>
> > <property>
> >   <name>yarn.scheduler.maximum-allocation-vcores</name>
> >   <value>12</value>
> > </property>
> > <property>
> >   <name>yarn.scheduler.maximum-allocation-mb</name>
> >   <value>8192</value>
> > </property>
> > <property>
> >   <name>yarn.nodemanager.vmem-check-enabled</name>
> >   <value>false</value>
> >   <description>Whether virtual memory limits will be enforced for containers</description>
> > </property>
> > <property>
> >   <name>yarn.nodemanager.vmem-pmem-ratio</name>
> >   <value>4</value>
> >   <description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
> > </property>
> >
> > I'll try them out on my test cluster later today/tonight and see if I can
> > recreate the problem.  What version of hadoop are you running?  I'll make
> > sure I'm consistent with that as well.
> >
> > Thanks,
> >
> > Darin
> > On Jun 5, 2016 8:15 AM, "Stephen Gran"  wrote:
> >
> >> Hi,
> >>
> >> Attached.  Thanks very much for looking.
> >>
> >> Cheers,
> >>
> >> On 05/06/16 12:51, Darin Johnson wrote:
> >>> Hey Stephen, can you please send your yarn-site.xml? I'm guessing you're
> >>> on the right track.
> >>>
> >>> Darin
> >>> Hi,
> >>>
> >>> OK.  That helps, thank you.  I think I just misunderstood the docs (or
> >>> they never said explicitly that you did need at least some static
> >>> resource), and I scaled down the initial nm.medium that got started.  I
> >>> get a bit further now, and jobs start but are killed with:
> >>>
> >>> Diagnostics: Container
> >>> [pid=3865,containerID=container_1465112239753_0001_03_01] is running
> >>> beyond virtual memory limits. Current usage: 50.7 MB of 0B physical
> >>> memory used; 2.6 GB of 0B virtual memory used. Killing container
> >>>
> >>> When I've seen this in the past with yarn but without myriad, it was
> >>> usually about ratios of vmem to mem and things like that - I've tried
> >>> some of those knobs, but I didn't expect much result and didn't get any.
> >>>
> >>> What strikes me about the error message is that the vmem and mem
> >>> allocations are for 0.
> >>>
> >>> I'm sorry for asking what are probably naive questions here, I couldn't
> >>> find a different forum.  If there is one, please point me there so I
> >>> don't disrupt the dev flow here.
> >>>
> >>> I can see this in the logs:
> >>>
> >>>
> >>> 2016-06-05 07:39:25,687 INFO
> >>> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
> >>> container_1465112239753_0001_03_01 Container Transitioned from NEW
> >>> to ALLOCATED
> >>> 2016-06-05 07:39:25,688 INFO
> >>> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root
> >>>   OPERATION=AM Allocated ContainerTARGET=SchedulerApp
> >>> RESULT=SUCCESS  APPID=application_1465112239753_0001
> >>> CONTAINERID=container_1465112239753_0001_03_01
> >>> 2016-06-05 07:39:25,688 INFO
> >>> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode:
> >>> Assigned container container_1465112239753_0001_03_01 of capacity
> >>>  on host slave2.testing.local:26688, which has 1
> >>> containers,  used and 
> >>> available after allocation
> >>> 2016-06-05 07:39:25,689 INFO
> >>> org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM:
> >>> Sending NMToken for nodeId : slave2.testing.local:26688 for container :
> >>> container_1465112239753_0001_03_01
> >>> 2016-06-05 07:39:25,696 INFO
> >>> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
> >>> container_1465112239753_0001_03_01 Container Transitioned from
> >>> ALLOCATED to ACQUIRED
> >>> 2016-06-05 07:39:25,696 INFO
> >>> org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM:
> >>> Clear node set for appattempt_1465112239753_0001_03
> >>> 2016-06-05 07:39:25,696 INFO
> >>> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> >>> Storing attempt: AppId: application_1465112239753_0001 AttemptId:
> >>> appattempt_1465112239753_0001_03 MasterContainer: Container:
> >>> [ContainerId: 
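
One way to confirm that the two overrides are actually in effect on a
running cluster is the standard /conf servlet that Hadoop daemons expose;
the host and port below are illustrative (8088 is the default RM web port):

  # Dump the RM's effective configuration and pull out the two memory checks
  curl -s http://master.testing.local:8088/conf \
    | grep -E 'pmem-check-enabled|vmem-check-enabled'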

Re: problem getting fine grained scaling working

2016-06-05 Thread Darin Johnson
Hey Stephen,

I think you're pretty close.

Looking at the config I'd suggest removing these properties:

   
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>4096</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <value>12</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>8192</value>
</property>
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
  <description>Whether virtual memory limits will be enforced for containers</description>
</property>
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>4</value>
  <description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
</property>

I'll try them out on my test cluster later today/tonight and see if I can
recreate the problem.  What version of hadoop are you running?  I'll make
sure I'm consistent with that as well.

Thanks,

Darin
On Jun 5, 2016 8:15 AM, "Stephen Gran"  wrote:

> Hi,
>
> Attached.  Thanks very much for looking.
>
> Cheers,
>
> On 05/06/16 12:51, Darin Johnson wrote:
> > Hey Stephen, can you please send your yarn-site.xml? I'm guessing you're on
> > the right track.
> >
> > Darin
> > Hi,
> >
> > OK.  That helps, thank you.  I think I just misunderstood the docs (or
> > they never said explicitly that you did need at least some static
> > resource), and I scaled down the initial nm.medium that got started.  I
> > get a bit further now, and jobs start but are killed with:
> >
> > Diagnostics: Container
> > [pid=3865,containerID=container_1465112239753_0001_03_01] is running
> > beyond virtual memory limits. Current usage: 50.7 MB of 0B physical
> > memory used; 2.6 GB of 0B virtual memory used. Killing container
> >
> > When I've seen this in the past with yarn but without myriad, it was
> > usually about ratios of vmem to mem and things like that - I've tried
> > some of those knobs, but I didn't expect much result and didn't get any.
> >
> > What strikes me about the error message is that the vmem and mem
> > allocations are for 0.
> >
> > I'm sorry for asking what are probably naive questions here, I couldn't
> > find a different forum.  If there is one, please point me there so I
> > don't disrupt the dev flow here.
> >
> > I can see this in the logs:
> >
> >
> > 2016-06-05 07:39:25,687 INFO
> > org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
> > container_1465112239753_0001_03_01 Container Transitioned from NEW
> > to ALLOCATED
> > 2016-06-05 07:39:25,688 INFO
> > org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root
> >  OPERATION=AM Allocated ContainerTARGET=SchedulerApp
> > RESULT=SUCCESS  APPID=application_1465112239753_0001
> > CONTAINERID=container_1465112239753_0001_03_01
> > 2016-06-05 07:39:25,688 INFO
> > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode:
> > Assigned container container_1465112239753_0001_03_01 of capacity
> >  on host slave2.testing.local:26688, which has 1
> > containers,  used and 
> > available after allocation
> > 2016-06-05 07:39:25,689 INFO
> > org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM:
> > Sending NMToken for nodeId : slave2.testing.local:26688 for container :
> > container_1465112239753_0001_03_01
> > 2016-06-05 07:39:25,696 INFO
> > org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
> > container_1465112239753_0001_03_01 Container Transitioned from
> > ALLOCATED to ACQUIRED
> > 2016-06-05 07:39:25,696 INFO
> > org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM:
> > Clear node set for appattempt_1465112239753_0001_03
> > 2016-06-05 07:39:25,696 INFO
> > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> > Storing attempt: AppId: application_1465112239753_0001 AttemptId:
> > appattempt_1465112239753_0001_03 MasterContainer: Container:
> > [ContainerId: container_1465112239753_0001_03_01, NodeId:
> > slave2.testing.local:26688, NodeHttpAddress: slave2.testing.local:24387,
> > Resource: , Priority: 0, Token: Token { kind:
> > ContainerToken, service: 10.0.5.5:26688 }, ]
> > 2016-06-05 07:39:25,697 INFO
> > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> > appattempt_1465112239753_0001_03 State change from SCHEDULED to
> > ALLOCATED_SAVING
> > 2016-06-05 07:39:25,698 INFO
> > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> > appattempt_1465112239753_0001_03 State change from ALLOCATED_SAVING
> > to ALLOCATED
> > 2016-06-05 07:39:25,699 INFO
> > org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher:
> > Launching masterappattempt_1465112239753_0001_03
> > 2016-06-05 07:39:25,705 INFO
> > org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher:
> > Setting up container Container: [ContainerId:
> > container_1465112239753_0001_03_01, NodeId:
> > slave2.testing.local:26688, 
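
A quick way to see which of the properties suggested for removal are still
explicitly set, assuming HADOOP_CONF_DIR points at the directory holding
yarn-site.xml:

  # Flag any explicit overrides of the properties suggested for removal
  for p in yarn.nodemanager.resource.memory-mb \
           yarn.scheduler.maximum-allocation-vcores \
           yarn.scheduler.maximum-allocation-mb \
           yarn.nodemanager.vmem-check-enabled \
           yarn.nodemanager.vmem-pmem-ratio; do
    if grep -q "$p" "$HADOOP_CONF_DIR/yarn-site.xml"; then
      echo "override present: $p"
    else
      echo "not set (default): $p"
    fi
  done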

Re: problem getting fine grained scaling working

2016-06-05 Thread Stephen Gran
Hi,

Attached.  Thanks very much for looking.

Cheers,

On 05/06/16 12:51, Darin Johnson wrote:
> Hey Stephen, can you please send your yarn-site.xml? I'm guessing you're on
> the right track.
>
> Darin
> Hi,
>
> OK.  That helps, thank you.  I think I just misunderstood the docs (or
> they never said explicitly that you did need at least some static
> resource), and I scaled down the initial nm.medium that got started.  I
> get a bit further now, and jobs start but are killed with:
>
> Diagnostics: Container
> [pid=3865,containerID=container_1465112239753_0001_03_01] is running
> beyond virtual memory limits. Current usage: 50.7 MB of 0B physical
> memory used; 2.6 GB of 0B virtual memory used. Killing container
>
> When I've seen this in the past with yarn but without myriad, it was
> usually about ratios of vmem to mem and things like that - I've tried
> some of those knobs, but I didn't expect much result and didn't get any.
>
> What strikes me about the error message is that the vmem and mem
> allocations are for 0.
>
> I'm sorry for asking what are probably naive questions here, I couldn't
> find a different forum.  If there is one, please point me there so I
> don't disrupt the dev flow here.
>
> I can see this in the logs:
>
>
> 2016-06-05 07:39:25,687 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
> container_1465112239753_0001_03_01 Container Transitioned from NEW
> to ALLOCATED
> 2016-06-05 07:39:25,688 INFO
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root
>  OPERATION=AM Allocated ContainerTARGET=SchedulerApp
> RESULT=SUCCESS  APPID=application_1465112239753_0001
> CONTAINERID=container_1465112239753_0001_03_01
> 2016-06-05 07:39:25,688 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode:
> Assigned container container_1465112239753_0001_03_01 of capacity
>  on host slave2.testing.local:26688, which has 1
> containers,  used and 
> available after allocation
> 2016-06-05 07:39:25,689 INFO
> org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM:
> Sending NMToken for nodeId : slave2.testing.local:26688 for container :
> container_1465112239753_0001_03_01
> 2016-06-05 07:39:25,696 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
> container_1465112239753_0001_03_01 Container Transitioned from
> ALLOCATED to ACQUIRED
> 2016-06-05 07:39:25,696 INFO
> org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM:
> Clear node set for appattempt_1465112239753_0001_03
> 2016-06-05 07:39:25,696 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> Storing attempt: AppId: application_1465112239753_0001 AttemptId:
> appattempt_1465112239753_0001_03 MasterContainer: Container:
> [ContainerId: container_1465112239753_0001_03_01, NodeId:
> slave2.testing.local:26688, NodeHttpAddress: slave2.testing.local:24387,
> Resource: , Priority: 0, Token: Token { kind:
> ContainerToken, service: 10.0.5.5:26688 }, ]
> 2016-06-05 07:39:25,697 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> appattempt_1465112239753_0001_03 State change from SCHEDULED to
> ALLOCATED_SAVING
> 2016-06-05 07:39:25,698 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> appattempt_1465112239753_0001_03 State change from ALLOCATED_SAVING
> to ALLOCATED
> 2016-06-05 07:39:25,699 INFO
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher:
> Launching masterappattempt_1465112239753_0001_03
> 2016-06-05 07:39:25,705 INFO
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher:
> Setting up container Container: [ContainerId:
> container_1465112239753_0001_03_01, NodeId:
> slave2.testing.local:26688, NodeHttpAddress: slave2.testing.local:24387,
> Resource: , Priority: 0, Token: Token { kind:
> ContainerToken, service: 10.0.5.5:26688 }, ] for AM
> appattempt_1465112239753_0001_03
> 2016-06-05 07:39:25,705 INFO
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher:
> Command to launch container container_1465112239753_0001_03_01 :
> $JAVA_HOME/bin/java -Djava.io.tmpdir=$PWD/tmp
> -Dlog4j.configuration=container-log4j.properties
> -Dyarn.app.container.log.dir=
> -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
> -Dhadoop.root.logfile=syslog  -Xmx1024m
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1>/stdout
> 2>/stderr
> 2016-06-05 07:39:25,706 INFO
> org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager:
> Create AMRMToken for ApplicationAttempt:
> appattempt_1465112239753_0001_03
> 2016-06-05 07:39:25,707 INFO
> org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager:
> Creating password for 

Re: problem getting fine grained scaling working

2016-06-05 Thread Stephen Gran
Hi,

OK.  That helps, thank you.  I think I just misunderstood the docs (or 
they never said explicitly that you did need at least some static 
resource), and I scaled down the initial nm.medium that got started.  I 
get a bit further now, and jobs start but are killed with:

Diagnostics: Container 
[pid=3865,containerID=container_1465112239753_0001_03_01] is running 
beyond virtual memory limits. Current usage: 50.7 MB of 0B physical 
memory used; 2.6 GB of 0B virtual memory used. Killing container

When I've seen this in the past with yarn but without myriad, it was 
usually about ratios of vmem to mem and things like that - I've tried 
some of those knobs, but I didn't expect much result and didn't get any.

What strikes me about the error message is that the vmem and mem 
allocations are for 0.

I'm sorry for asking what are probably naive questions here, I couldn't 
find a different forum.  If there is one, please point me there so I 
don't disrupt the dev flow here.

I can see this in the logs:


2016-06-05 07:39:25,687 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
container_1465112239753_0001_03_01 Container Transitioned from NEW 
to ALLOCATED
2016-06-05 07:39:25,688 INFO 
org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root 
OPERATION=AM Allocated ContainerTARGET=SchedulerApp 
RESULT=SUCCESS  APPID=application_1465112239753_0001 
CONTAINERID=container_1465112239753_0001_03_01
2016-06-05 07:39:25,688 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: 
Assigned container container_1465112239753_0001_03_01 of capacity 
 on host slave2.testing.local:26688, which has 1 
containers,  used and  
available after allocation
2016-06-05 07:39:25,689 INFO 
org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: 
Sending NMToken for nodeId : slave2.testing.local:26688 for container : 
container_1465112239753_0001_03_01
2016-06-05 07:39:25,696 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
container_1465112239753_0001_03_01 Container Transitioned from 
ALLOCATED to ACQUIRED
2016-06-05 07:39:25,696 INFO 
org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: 
Clear node set for appattempt_1465112239753_0001_03
2016-06-05 07:39:25,696 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
Storing attempt: AppId: application_1465112239753_0001 AttemptId: 
appattempt_1465112239753_0001_03 MasterContainer: Container: 
[ContainerId: container_1465112239753_0001_03_01, NodeId: 
slave2.testing.local:26688, NodeHttpAddress: slave2.testing.local:24387, 
Resource: , Priority: 0, Token: Token { kind: 
ContainerToken, service: 10.0.5.5:26688 }, ]
2016-06-05 07:39:25,697 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
appattempt_1465112239753_0001_03 State change from SCHEDULED to 
ALLOCATED_SAVING
2016-06-05 07:39:25,698 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
appattempt_1465112239753_0001_03 State change from ALLOCATED_SAVING 
to ALLOCATED
2016-06-05 07:39:25,699 INFO 
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: 
Launching masterappattempt_1465112239753_0001_03
2016-06-05 07:39:25,705 INFO 
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: 
Setting up container Container: [ContainerId: 
container_1465112239753_0001_03_01, NodeId: 
slave2.testing.local:26688, NodeHttpAddress: slave2.testing.local:24387, 
Resource: , Priority: 0, Token: Token { kind: 
ContainerToken, service: 10.0.5.5:26688 }, ] for AM 
appattempt_1465112239753_0001_03
2016-06-05 07:39:25,705 INFO 
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: 
Command to launch container container_1465112239753_0001_03_01 : 
$JAVA_HOME/bin/java -Djava.io.tmpdir=$PWD/tmp 
-Dlog4j.configuration=container-log4j.properties 
-Dyarn.app.container.log.dir= 
-Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
-Dhadoop.root.logfile=syslog  -Xmx1024m 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1>/stdout 
2>/stderr
2016-06-05 07:39:25,706 INFO 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: 
Create AMRMToken for ApplicationAttempt: 
appattempt_1465112239753_0001_03
2016-06-05 07:39:25,707 INFO 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: 
Creating password for appattempt_1465112239753_0001_03
2016-06-05 07:39:25,727 INFO 
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: 
Done launching container Container: [ContainerId: 
container_1465112239753_0001_03_01, NodeId: 
slave2.testing.local:26688, NodeHttpAddress: slave2.testing.local:24387, 
Resource: 
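
The 0B limits in the kill message line up with the container's allocated
resources: the logs above show the AM container granted with zero memory
and vcores, so the NM's memory monitor enforces a limit computed from zero.
The RM's nodes endpoint shows per-node capacity as registered, which makes
the zero-capacity case easy to spot (host and port are illustrative):

  # Show each node's used and available memory as the RM sees it;
  # zero-profile NMs report 0 available MB
  curl -s http://master.testing.local:8088/ws/v1/cluster/nodes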

Re: problem getting fine grained scaling working

2016-06-03 Thread Darin Johnson
That is correct, you need at least one node manager with the minimum
requirements to launch an ApplicationMaster.  Otherwise YARN will throw an
exception.

On Fri, Jun 3, 2016 at 10:52 AM, yuliya Feldman wrote:

> I believe you need at least one NM that is not subject to fine-grained
> scaling.
> If the total resources on the cluster are less than what a single
> container for the AM needs, you won't be able to submit any app, as the
> exception below tells you:
> (Invalid resource request, requested memory < 0, or requested memory > max
> configured, requestedMemory=1536, maxMemory=0
> at)
> I believe that when a Myriad cluster starts, one NM with non-zero
> capacity should come up by default.
> In addition, check the RM log to see whether offers with resources are
> coming to the RM - this info should be in the log.
>
>   From: Stephen Gran
>   To: "dev@myriad.incubator.apache.org" <dev@myriad.incubator.apache.org>
>   Sent: Friday, June 3, 2016 1:29 AM
>   Subject: problem getting fine grained scaling working
>
> Hi,
>
> I'm trying to get fine-grained scaling going on a test Mesos cluster.  I
> have a single master and 2 agents.  I am running 2 node managers with
> the zero profile, one per agent.  I can see both of them in the RM UI
> reporting correctly as having 0 resources.
>
> I'm getting stack traces when I try to launch a sample application,
> though.  I feel like I'm just missing something obvious somewhere - can
> anyone shed any light?
>
> This is on a build of yesterday's git head.
>
> Cheers,
>
> root@master:/srv/apps/hadoop# bin/yarn jar
> share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar teragen 1
> /outDir
> 16/06/03 08:23:33 INFO client.RMProxy: Connecting to ResourceManager at
> master.testing.local/10.0.5.3:8032
> 16/06/03 08:23:34 INFO terasort.TeraSort: Generating 1 using 2
> 16/06/03 08:23:34 INFO mapreduce.JobSubmitter: number of splits:2
> 16/06/03 08:23:34 INFO mapreduce.JobSubmitter: Submitting tokens for
> job: job_1464902078156_0001
> 16/06/03 08:23:35 INFO mapreduce.JobSubmitter: Cleaning up the staging
> area /tmp/hadoop-yarn/staging/root/.staging/job_1464902078156_0001
> java.io.IOException:
> org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException:
> Invalid resource request, requested memory < 0, or requested memory >
> max configured, requestedMemory=1536, maxMemory=0
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:268)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:228)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:236)
> at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:385)
> at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:329)
> at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:281)
> at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:580)
> at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:218)
> at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:419)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>
> at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:306)
> at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
> at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
>