Hi,

Attached.  Thanks very much for looking.

Cheers,

On 05/06/16 12:51, Darin Johnson wrote:
> Hey Stephen, can you please send your yarn-site.xml? I'm guessing you're
> on the right track.
>
> Darin
>
> Hi,
>
> OK.  That helps, thank you.  I think I just misunderstood the docs (or
> they never said explicitly that you did need at least some static
> resource), and I scaled down the initial nm.medium that got started.  I
> get a bit further now, and jobs start but are killed with:
>
> Diagnostics: Container
> [pid=3865,containerID=container_1465112239753_0001_03_000001] is running
> beyond virtual memory limits. Current usage: 50.7 MB of 0B physical
> memory used; 2.6 GB of 0B virtual memory used. Killing container
>
> When I've seen this in the past with YARN but without Myriad, it was
> usually about the ratio of vmem to physical memory and settings like
> that.  I've tried some of those knobs, but I didn't expect much result
> and didn't get any.
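>
> For reference, the knobs I mean are the usual YARN vmem settings; they
> are currently set like this in my yarn-site.xml:
>
>     <property>
>         <name>yarn.nodemanager.vmem-check-enabled</name>
>         <value>false</value>
>     </property>
>     <property>
>         <name>yarn.nodemanager.vmem-pmem-ratio</name>
>         <value>4</value>
>     </property>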
>
> What strikes me about the error message is that the vmem and mem
> allocations are both 0.
>
> I'm sorry for asking what are probably naive questions here; I couldn't
> find a different forum.  If there is one, please point me there so I
> don't disrupt the dev flow here.
>
> I can see this in the logs:
>
>
> 2016-06-05 07:39:25,687 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
> container_1465112239753_0001_03_000001 Container Transitioned from NEW
> to ALLOCATED
> 2016-06-05 07:39:25,688 INFO
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root
>      OPERATION=AM Allocated Container        TARGET=SchedulerApp
> RESULT=SUCCESS  APPID=application_1465112239753_0001
> CONTAINERID=container_1465112239753_0001_03_000001
> 2016-06-05 07:39:25,688 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode:
> Assigned container container_1465112239753_0001_03_000001 of capacity
> <memory:0, vCores:0> on host slave2.testing.local:26688, which has 1
> containers, <memory:0, vCores:0> used and <memory:4096, vCores:1>
> available after allocation
> 2016-06-05 07:39:25,689 INFO
> org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM:
> Sending NMToken for nodeId : slave2.testing.local:26688 for container :
> container_1465112239753_0001_03_000001
> 2016-06-05 07:39:25,696 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
> container_1465112239753_0001_03_000001 Container Transitioned from
> ALLOCATED to ACQUIRED
> 2016-06-05 07:39:25,696 INFO
> org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM:
> Clear node set for appattempt_1465112239753_0001_000003
> 2016-06-05 07:39:25,696 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> Storing attempt: AppId: application_1465112239753_0001 AttemptId:
> appattempt_1465112239753_0001_000003 MasterContainer: Container:
> [ContainerId: container_1465112239753_0001_03_000001, NodeId:
> slave2.testing.local:26688, NodeHttpAddress: slave2.testing.local:24387,
> Resource: <memory:0, vCores:0>, Priority: 0, Token: Token { kind:
> ContainerToken, service: 10.0.5.5:26688 }, ]
> 2016-06-05 07:39:25,697 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> appattempt_1465112239753_0001_000003 State change from SCHEDULED to
> ALLOCATED_SAVING
> 2016-06-05 07:39:25,698 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> appattempt_1465112239753_0001_000003 State change from ALLOCATED_SAVING
> to ALLOCATED
> 2016-06-05 07:39:25,699 INFO
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher:
> Launching masterappattempt_1465112239753_0001_000003
> 2016-06-05 07:39:25,705 INFO
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher:
> Setting up container Container: [ContainerId:
> container_1465112239753_0001_03_000001, NodeId:
> slave2.testing.local:26688, NodeHttpAddress: slave2.testing.local:24387,
> Resource: <memory:0, vCores:0>, Priority: 0, Token: Token { kind:
> ContainerToken, service: 10.0.5.5:26688 }, ] for AM
> appattempt_1465112239753_0001_000003
> 2016-06-05 07:39:25,705 INFO
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher:
> Command to launch container container_1465112239753_0001_03_000001 :
> $JAVA_HOME/bin/java -Djava.io.tmpdir=$PWD/tmp
> -Dlog4j.configuration=container-log4j.properties
> -Dyarn.app.container.log.dir=<LOG_DIR>
> -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
> -Dhadoop.root.logfile=syslog  -Xmx1024m
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout
> 2><LOG_DIR>/stderr
> 2016-06-05 07:39:25,706 INFO
> org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager:
> Create AMRMToken for ApplicationAttempt:
> appattempt_1465112239753_0001_000003
> 2016-06-05 07:39:25,707 INFO
> org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager:
> Creating password for appattempt_1465112239753_0001_000003
> 2016-06-05 07:39:25,727 INFO
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher:
> Done launching container Container: [ContainerId:
> container_1465112239753_0001_03_000001, NodeId:
> slave2.testing.local:26688, NodeHttpAddress: slave2.testing.local:24387,
> Resource: <memory:0, vCores:0>, Priority: 0, Token: Token { kind:
> ContainerToken, service: 10.0.5.5:26688 }, ] for AM
> appattempt_1465112239753_0001_000003
> 2016-06-05 07:39:25,728 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> appattempt_1465112239753_0001_000003 State change from ALLOCATED to LAUNCHED
> 2016-06-05 07:39:25,736 WARN
> org.apache.myriad.scheduler.event.handlers.StatusUpdateEventHandler:
> Task: yarn_container_1465112239753_0001_03_000001 not found, status:
> TASK_RUNNING
> 2016-06-05 07:39:26,510 INFO org.apache.hadoop.yarn.util.RackResolver:
> Resolved slave1.testing.local to /default-rack
> 2016-06-05 07:39:26,517 WARN
> org.apache.myriad.scheduler.fgs.NMHeartBeatHandler: FineGrainedScaling
> feature got invoked for a NM with non-zero capacity. Host:
> slave1.testing.local, Mem: 4096, CPU: 0. Setting the NM's capacity to
> (0G,0CPU)
> 2016-06-05 07:39:26,517 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl:
> slave1.testing.local:29121 Node Transitioned from NEW to RUNNING
> 2016-06-05 07:39:26,518 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
> Added node slave1.testing.local:29121 cluster capacity: <memory:4096,
> vCores:1>
> 2016-06-05 07:39:26,519 INFO
> org.apache.myriad.scheduler.fgs.YarnNodeCapacityManager:
> afterSchedulerEventHandled: NM registration from node slave1.testing.local
> 2016-06-05 07:39:26,528 INFO
> org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService:
> received container statuses on node manager register :[container_id {
> app_attempt_id { application_id { id: 1 cluster_timestamp: 1465112239753
> } attemptId: 2 } id: 1 } container_state: C_RUNNING resource { memory: 0
> virtual_cores: 0 } priority { priority: 0 } diagnostics: ""
> container_exit_status: -1000 creation_time: 1465112356478]
> 2016-06-05 07:39:26,530 INFO
> org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService:
> NodeManager from node slave1.testing.local(cmPort: 29121 httpPort:
> 20456) registered with capability: <memory:0, vCores:0>, assigned nodeId
> slave1.testing.local:29121
> 2016-06-05 07:39:26,611 INFO
> org.apache.myriad.scheduler.fgs.YarnNodeCapacityManager: Setting
> capacity for node slave1.testing.local to <memory:4637, vCores:6>
> 2016-06-05 07:39:26,611 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler:
> Update resource on node: slave1.testing.local from: <memory:0,
> vCores:0>, to: <memory:4637, vCores:6>
> 2016-06-05 07:39:26,615 INFO
> org.apache.myriad.scheduler.fgs.YarnNodeCapacityManager: Setting
> capacity for node slave1.testing.local to <memory:0, vCores:0>
> 2016-06-05 07:39:26,616 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler:
> Update resource on node: slave1.testing.local from: <memory:4637,
> vCores:6>, to: <memory:0, vCores:0>
> 2016-06-05 07:39:26,691 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
> container_1465112239753_0001_03_000001 Container Transitioned from
> ACQUIRED to RUNNING
> 2016-06-05 07:39:26,835 WARN
> org.apache.myriad.scheduler.event.handlers.StatusUpdateEventHandler:
> Task: yarn_container_1465112239753_0001_03_000001 not found, status:
> TASK_FINISHED
> 2016-06-05 07:39:27,603 INFO
> org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler:
> Received offers 1
> 2016-06-05 07:39:27,748 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
> container_1465112239753_0001_03_000001 Container Transitioned from
> RUNNING to COMPLETED
> 2016-06-05 07:39:27,748 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt:
> Completed container: container_1465112239753_0001_03_000001 in state:
> COMPLETED event:FINISHED
> 2016-06-05 07:39:27,748 INFO
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root
>      OPERATION=AM Released Container TARGET=SchedulerApp
> RESULT=SUCCESS  APPID=application_1465112239753_0001
> CONTAINERID=container_1465112239753_0001_03_000001
> 2016-06-05 07:39:27,748 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode:
> Released container container_1465112239753_0001_03_000001 of capacity
> <memory:0, vCores:0> on host slave2.testing.local:26688, which currently
> has 0 containers, <memory:0, vCores:0> used and <memory:4096, vCores:1>
> available, release resources=true
> 2016-06-05 07:39:27,748 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
> Application attempt appattempt_1465112239753_0001_000003 released
> container container_1465112239753_0001_03_000001 on node: host:
> slave2.testing.local:26688 #containers=0 available=<memory:4096,
> vCores:1> used=<memory:0, vCores:0> with event: FINISHED
> 2016-06-05 07:39:27,749 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> Updating application attempt appattempt_1465112239753_0001_000003 with
> final state: FAILED, and exit status: -103
> 2016-06-05 07:39:27,750 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> appattempt_1465112239753_0001_000003 State change from LAUNCHED to
> FINAL_SAVING
> 2016-06-05 07:39:27,751 INFO
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
> Unregistering app attempt : appattempt_1465112239753_0001_000003
> 2016-06-05 07:39:27,751 INFO
> org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager:
> Application finished, removing password for
> appattempt_1465112239753_0001_000003
> 2016-06-05 07:39:27,751 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> appattempt_1465112239753_0001_000003 State change from FINAL_SAVING to
> FAILED
> 2016-06-05 07:39:27,751 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The
> number of failed attempts is 2. The max attempts is 2
> 2016-06-05 07:39:27,753 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Updating
> application application_1465112239753_0001 with final state: FAILED
> 2016-06-05 07:39:27,756 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
> application_1465112239753_0001 State change from ACCEPTED to FINAL_SAVING
> 2016-06-05 07:39:27,757 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
> Application appattempt_1465112239753_0001_000003 is done. finalState=FAILED
> 2016-06-05 07:39:27,757 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo:
> Application application_1465112239753_0001 requests cleared
> 2016-06-05 07:39:27,758 INFO
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore:
> Updating info for app: application_1465112239753_0001
> 2016-06-05 07:39:27,758 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
> Application application_1465112239753_0001 failed 2 times due to AM
> Container for appattempt_1465112239753_0001_000003 exited with
> exitCode: -103
> For more detailed output, check application tracking
> page:
> http://master.testing.local:8088/cluster/app/application_1465112239753_0001
> Then, click on links to logs of each attempt.
> Diagnostics: Container
> [pid=3865,containerID=container_1465112239753_0001_03_000001] is running
> beyond virtual memory limits. Current usage: 50.7 MB of 0B physical
> memory used; 2.6 GB of 0B virtual memory used. Killing container.
> Dump of the process-tree for container_1465112239753_0001_03_000001 :
>           |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
>           |- 3873 3865 3865 3865 (java) 80 26 2770927616 12614
> /usr/lib/jvm/java-8-openjdk-amd64/bin/java
> -Djava.io.tmpdir=/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1465112239753_0001/container_1465112239753_0001_03_000001/tmp
> -Dlog4j.configuration=container-log4j.properties
> -Dyarn.app.container.log.dir=/srv/apps/hadoop-2.7.2/logs/userlogs/application_1465112239753_0001/container_1465112239753_0001_03_000001
> -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
> -Dhadoop.root.logfile=syslog -Xmx1024m
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster
>           |- 3865 3863 3865 3865 (bash) 0 1 11427840 354 /bin/bash -c
> /usr/lib/jvm/java-8-openjdk-amd64/bin/java
> -Djava.io.tmpdir=/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1465112239753_0001/container_1465112239753_0001_03_000001/tmp
> -Dlog4j.configuration=container-log4j.properties
> -Dyarn.app.container.log.dir=/srv/apps/hadoop-2.7.2/logs/userlogs/application_1465112239753_0001/container_1465112239753_0001_03_000001
> -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
> -Dhadoop.root.logfile=syslog  -Xmx1024m
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster
> 1>/srv/apps/hadoop-2.7.2/logs/userlogs/application_1465112239753_0001/container_1465112239753_0001_03_000001/stdout
> 2>/srv/apps/hadoop-2.7.2/logs/userlogs/application_1465112239753_0001/container_1465112239753_0001_03_000001/stderr
>
>
> Container killed on request. Exit code is 143
> Container exited with a non-zero exit code 143
> Failing this attempt. Failing the application.
>
>
>
> On 03/06/16 15:52, yuliya Feldman wrote:
>> I believe you need at least one NM that is not subject to fine-grained
>> scaling. As long as the total resources on the cluster are less than what
>> a single container needs for the AM, you won't be able to submit any app,
>> as the exception below tells you:
>> (Invalid resource request, requested memory < 0, or requested memory > max
>> configured, requestedMemory=1536, maxMemory=0)
>> I believe that by default, when the Myriad cluster is started, one NM with
>> non-zero capacity should come up.
>> In addition, check the RM log to see whether offers with resources are
>> coming in to the RM - this info should be in the log.
>>
>>         From: Stephen Gran <stephen.g...@piksel.com>
>>    To: "dev@myriad.incubator.apache.org" <dev@myriad.incubator.apache.org>
>>    Sent: Friday, June 3, 2016 1:29 AM
>>    Subject: problem getting fine grained scaling working
>>
>> Hi,
>>
>> I'm trying to get fine-grained scaling going on a test Mesos cluster.  I
>> have a single master and 2 agents.  I am running 2 node managers with
>> the zero profile, one per agent.  I can see both of them in the RM UI
>> reporting correctly as having 0 resources.
>>
>> I'm getting stack traces when I try to launch a sample application,
>> though.  I feel like I'm just missing something obvious somewhere - can
>> anyone shed any light?
>>
>> This is on a build of yesterday's git head.
>>
>> Cheers,
>>
>> root@master:/srv/apps/hadoop# bin/yarn jar
>> share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar teragen 10000
>> /outDir
>> 16/06/03 08:23:33 INFO client.RMProxy: Connecting to ResourceManager at
>> master.testing.local/10.0.5.3:8032
>> 16/06/03 08:23:34 INFO terasort.TeraSort: Generating 10000 using 2
>> 16/06/03 08:23:34 INFO mapreduce.JobSubmitter: number of splits:2
>> 16/06/03 08:23:34 INFO mapreduce.JobSubmitter: Submitting tokens for
>> job: job_1464902078156_0001
>> 16/06/03 08:23:35 INFO mapreduce.JobSubmitter: Cleaning up the staging
>> area /tmp/hadoop-yarn/staging/root/.staging/job_1464902078156_0001
>> java.io.IOException:
>> org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException:
>> Invalid resource request, requested memory < 0, or requested memory >
>> max configured, requestedMemory=1536, maxMemory=0
>>           at
>>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:268)
>>           at
>>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:228)
>>           at
>>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:236)
>>           at
>>
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:385)
>>           at
>>
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:329)
>>           at
>>
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:281)
>>           at
>>
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:580)
>>           at
>>
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:218)
>>           at
>>
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:419)
>>           at
>>
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>>           at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>>           at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>>           at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>>           at java.security.AccessController.doPrivileged(Native Method)
>>           at javax.security.auth.Subject.doAs(Subject.java:422)
>>           at
>>
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>>           at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>>
>>           at
> org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:306)
>>           at
>>
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
>>           at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
>>           at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
>>           at java.security.AccessController.doPrivileged(Native Method)
>>           at javax.security.auth.Subject.doAs(Subject.java:422)
>>           at
>>
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>>           at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
>>           at
> org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
>>           at
> org.apache.hadoop.examples.terasort.TeraGen.run(TeraGen.java:301)
>>           at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>           at
> org.apache.hadoop.examples.terasort.TeraGen.main(TeraGen.java:305)
>>           at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>           at
>>
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>           at
>>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>           at java.lang.reflect.Method.invoke(Method.java:497)
>>           at
>>
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>>           at
> org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>>           at
> org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>>           at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>           at
>>
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>           at
>>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>           at java.lang.reflect.Method.invoke(Method.java:497)
>>           at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>>           at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
>> Caused by:
>> org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException:
>> Invalid resource request, requested memory < 0, or requested memory >
>> max configured, requestedMemory=1536, maxMemory=0
>>           at
>>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:268)
>>           at
>>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:228)
>>           at
>>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:236)
>>           at
>>
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:385)
>>           at
>>
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:329)
>>           at
>>
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:281)
>>           at
>>
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:580)
>>           at
>>
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:218)
>>           at
>>
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:419)
>>           at
>>
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>>           at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>>           at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>>           at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>>           at java.security.AccessController.doPrivileged(Native Method)
>>           at javax.security.auth.Subject.doAs(Subject.java:422)
>>           at
>>
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>>           at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>>
>>           at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method)
>>           at
>>
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>>           at
>>
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>           at
> java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>>           at
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>>           at
>>
> org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:101)
>>           at
>>
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.submitApplication(ApplicationClientProtocolPBClientImpl.java:239)
>>           at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>           at
>>
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>           at
>>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>           at java.lang.reflect.Method.invoke(Method.java:497)
>>           at
>>
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
>>           at
>>
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>>           at com.sun.proxy.$Proxy13.submitApplication(Unknown Source)
>>           at
>>
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:253)
>>           at
>>
> org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:290)
>>           at
> org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:290)
>>           ... 24 more
>> Caused by:
>>
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException):
>> Invalid resource request, requested memory < 0, or requested memory >
>> max configured, requestedMemory=1536, maxMemory=0
>>           at
>>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:268)
>>           at
>>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:228)
>>           at
>>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:236)
>>           at
>>
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:385)
>>           at
>>
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:329)
>>           at
>>
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:281)
>>           at
>>
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:580)
>>           at
>>
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:218)
>>           at
>>
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:419)
>>           at
>>
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>>           at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>>           at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>>           at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>>           at java.security.AccessController.doPrivileged(Native Method)
>>           at javax.security.auth.Subject.doAs(Subject.java:422)
>>           at
>>
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>>           at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>>
>>           at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>>           at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>>           at
>>
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>>           at com.sun.proxy.$Proxy12.submitApplication(Unknown Source)
>>           at
>>
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.submitApplication(ApplicationClientProtocolPBClientImpl.java:236)
>>           ... 34 more
>>
>>
>> Cheers,
>> --
>> Stephen Gran
>> Senior Technical Architect
>>
>> picture the possibilities | piksel.com
>

-- 
Stephen Gran
Senior Technical Architect

picture the possibilities | piksel.com

yarn-site.xml (attached):

<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>

<!-- Site specific YARN configuration properties -->

  <!-- Resource Manager Configs -->
    <property>
      <description>The hostname of the RM.</description>
      <name>yarn.resourcemanager.hostname</name>
      <value>master.testing.local</value>
    </property>    
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
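    <!-- Note: yarn.nodemanager.aux-services is declared again just below; the
         later definition (mapreduce_shuffle,myriad_executor) should take
         precedence. -->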
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle,myriad_executor</value>
        <!-- If using MapR distro, please use the following value:
        <value>mapreduce_shuffle,mapr_direct_shuffle,myriad_executor</value> -->
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.myriad_executor.class</name>
        <value>org.apache.myriad.executor.MyriadExecutorAuxService</value>
    </property>
    <property>
        <name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
        <value>2000</value>
    </property>
    <property>
        <name>yarn.am.liveness-monitor.expiry-interval-ms</name>
        <value>10000</value>
    </property>
    <property>
        <name>yarn.resourcemanager.nm.liveness-monitor.interval-ms</name>
        <value>1000</value>
    </property>
<!-- (more) Site-specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>${nodemanager.resource.cpu-vcores}</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>${nodemanager.resource.memory-mb}</value>
    </property>
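    <!-- Note: yarn.nodemanager.resource.memory-mb is set again further down
         (to 4096); the later value should take precedence over this
         placeholder. -->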
 
 
<!-- Dynamic Port Assignment enablement by Mesos -->
    <property>
        <name>yarn.nodemanager.address</name>
        <value>${myriad.yarn.nodemanager.address}</value>
    </property>
    <property>
        <name>yarn.nodemanager.webapp.address</name>
        <value>${myriad.yarn.nodemanager.webapp.address}</value>
    </property>
    <property>
        <name>yarn.nodemanager.webapp.https.address</name>
        <value>${myriad.yarn.nodemanager.webapp.address}</value>
    </property>
    <property>
        <name>yarn.nodemanager.localizer.address</name>
        <value>${myriad.yarn.nodemanager.localizer.address}</value>
    </property>
 
<!-- Myriad Scheduler configuration -->
    <property>
        <name>yarn.resourcemanager.scheduler.class</name>
        <value>org.apache.myriad.scheduler.yarn.MyriadFairScheduler</value>
    </property>
 
<!-- Needed for Fine Grain Scaling -->
    <property>
        <name>yarn.scheduler.minimum-allocation-vcores</name>
        <value>0</value>
    </property>    
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>0</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>4096</value>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-vcores</name>
        <value>12</value>
    </property>    
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>8192</value>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
        <description>Whether virtual memory limits will be enforced for containers</description>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-pmem-ratio</name>
        <value>4</value>
        <description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
    </property>
 
<!-- Cgroups specific configuration -->
<!--
    <property>
        <description>Who will execute(launch) the containers.</description>
        <name>yarn.nodemanager.container-executor.class</name>
        <value>${yarn.nodemanager.container-executor.class}</value>
    </property>
    <property>
        <description>The class which should help the LCE handle resources.</description>
        <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
        <value>${yarn.nodemanager.linux-container-executor.resources-handler.class}</value>
    </property>
    <property>
        <name>yarn.nodemanager.linux-container-executor.cgroups.hierarchy</name>
        <value>${yarn.nodemanager.linux-container-executor.cgroups.hierarchy}</value>
    </property>
    <property>
        <name>yarn.nodemanager.linux-container-executor.cgroups.mount</name>
        <value>${yarn.nodemanager.linux-container-executor.cgroups.mount}</value>
    </property>
    <property>
        <name>yarn.nodemanager.linux-container-executor.cgroups.mount-path</name>
        <value>${yarn.nodemanager.linux-container-executor.cgroups.mount-path}</value>
    </property>
    <property>
        <name>yarn.nodemanager.linux-container-executor.group</name>
        <value>${yarn.nodemanager.linux-container-executor.group}</value>
    </property>
    <property>
        <name>yarn.nodemanager.linux-container-executor.path</name>
        <value>${yarn.home}/bin/container-executor</value>
    </property>
-->


</configuration>
