Hi,
Sorry, here is some additional information.
Hadoop 2.2.0 had been running normally for several days since I started the
Hadoop server, and I could run jobs without any problems.
Today, jobs suddenly stopped running: after submission, every job's status
stays at “submitted”.
There are 3 slaves, and each slave has 32 GB of memory and 24 CPUs.
The contents of my fair-scheduler.xml are as follows:
<?xml version="1.0"?>
<allocations>
  <queue name="root">
    <minResources>10000mb,10vcores</minResources>
    <maxResources>90000mb,100vcores</maxResources>
    <maxRunningApps>50</maxRunningApps>
    <weight>2.0</weight>
    <schedulingMode>fair</schedulingMode>
    <aclSubmitApps> </aclSubmitApps>
    <aclAdministerApps> </aclAdministerApps>
    <queue name="queue1">
      <minResources>10000mb,10vcores</minResources>
      <maxResources>30000mb,30vcores</maxResources>
      <maxRunningApps>10</maxRunningApps>
      <weight>2.0</weight>
      <schedulingMode>fair</schedulingMode>
      <aclAdministerApps>xxx1,xxx2 admins</aclAdministerApps>
      <aclSubmitApps>xxx1,xxx2,xxx3 datadev</aclSubmitApps>
    </queue>
    <queue name="queue2">
      <minResources>10000mb,10vcores</minResources>
      <maxResources>30000mb,30vcores</maxResources>
      <maxRunningApps>10</maxRunningApps>
      <weight>2.0</weight>
      <schedulingMode>fair</schedulingMode>
      <aclAdministerApps>datadev admins</aclAdministerApps>
      <aclSubmitApps>xxx1 datadev</aclSubmitApps>
    </queue>
    <queue name="queue3">
      <minResources>5000mb,5vcores</minResources>
      <maxResources>10000mb,10vcores</maxResources>
      <maxRunningApps>10</maxRunningApps>
      <weight>2.0</weight>
      <schedulingMode>fair</schedulingMode>
      <aclAdministerApps>datadev admins</aclAdministerApps>
      <aclSubmitApps>xxx1,xxx2 datadev</aclSubmitApps>
    </queue>
    <queue name="default">
      <minResources>10000mb,10vcores</minResources>
      <maxResources>30000mb,30vcores</maxResources>
      <maxRunningApps>10</maxRunningApps>
      <weight>2.0</weight>
      <schedulingMode>fair</schedulingMode>
      <aclAdministerApps>xxx1 admins</aclAdministerApps>
      <aclSubmitApps>xxx1,xxx2,xxx3,root datadev</aclSubmitApps>
    </queue>
  </queue>
  <user name="xxx">
    <maxRunningApps>10</maxRunningApps>
  </user>
  <userMaxAppsDefault>10</userMaxAppsDefault>
</allocations>
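To compare the queue limits against what the cluster can actually offer, a small script along these lines may help. It is only a sketch: the function names are made up for illustration, it looks only at queues nested directly under the top-level queue, and it assumes resource strings of the form "<N>mb,<M>vcores" as used in the file above. The capacity YARN really sees depends on the NodeManager settings, not on the physical 32 GB and 24 CPUs per slave.

```python
import re
import xml.etree.ElementTree as ET


def parse_resources(text):
    """Parse a string such as '10000mb,10vcores' into (memory_mb, vcores)."""
    match = re.fullmatch(r"\s*(\d+)\s*mb\s*,\s*(\d+)\s*vcores\s*", text)
    if match is None:
        raise ValueError(f"unrecognized resource string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def summarize_queues(xml_text):
    """Map each queue directly under the top-level queue to its min/max resources."""
    allocations = ET.fromstring(xml_text)
    summary = {}
    # "./queue/queue" selects the queues nested one level below the top-level queue.
    for queue in allocations.findall("./queue/queue"):
        summary[queue.get("name")] = {
            "min": parse_resources(queue.findtext("minResources")),
            "max": parse_resources(queue.findtext("maxResources")),
        }
    return summary


def total(summary, key):
    """Sum the 'min' or 'max' resources over all queues in the summary."""
    memory_mb = sum(entry[key][0] for entry in summary.values())
    vcores = sum(entry[key][1] for entry in summary.values())
    return memory_mb, vcores
```

Calling summarize_queues() on the text of fair-scheduler.xml and then total(summary, "min") gives the combined guaranteed share of the child queues, which can be compared with the limits on the root queue and with the memory and vcores the NodeManagers actually advertise.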
From: Sandy Ryza [mailto:[email protected]]
Sent: November 27, 2013 16:33
To: [email protected]
Subject: Re: problems of FairScheduler in hadoop2.2.0
Hi,
Can you share the contents of your fair-scheduler.xml? If you submit just a
single job, does it run? What do you see if you go to
<resourcemanagerwebui>/ws/v1/cluster/scheduler?
-Sandy
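The scheduler endpoint mentioned above can also be queried from a script rather than a browser. The following is a rough sketch, not a definitive client: the base URL passed to fetch_scheduler_info() is whatever your ResourceManager web UI address is, the helper names are hypothetical, and the exact JSON layout differs between schedulers and Hadoop versions, so the walker simply collects every object that carries a "queueName" key (an assumption about the response).

```python
import json
import urllib.request


def fetch_scheduler_info(rm_webui, timeout=10):
    """GET <rm_webui>/ws/v1/cluster/scheduler and return the decoded JSON."""
    url = rm_webui.rstrip("/") + "/ws/v1/cluster/scheduler"
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def collect_queues(node):
    """Recursively collect every dict that has a 'queueName' key."""
    found = []
    if isinstance(node, dict):
        if "queueName" in node:
            found.append(node)
        for value in node.values():
            found.extend(collect_queues(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_queues(item))
    return found
```

For example, collect_queues(fetch_scheduler_info("http://<resourcemanagerwebui>")) would return one dict per queue reported by the ResourceManager, which can then be printed and pasted into a reply.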
On Wed, Nov 27, 2013 at 12:09 AM, 麦树荣
<[email protected]> wrote:
Hi all,
I have run into a problem running jobs on Hadoop 2.2.0. The ResourceManager
suddenly stopped working normally: when I submit jobs, their status all stays
at “submitted” and they never run.
I cannot find any answers on the Internet. Can anyone give me some help? Thanks.
The ResourceManager log is as follows:
2013-11-27 14:39:10,749 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1129_000001
2013-11-27 14:39:11,050 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1128_000001
2013-11-27 14:39:11,050 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1127_000001
2013-11-27 14:39:11,051 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1128_000001
2013-11-27 14:39:11,051 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1127_000001
2013-11-27 14:39:11,753 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1129_000001
2013-11-27 14:39:11,754 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1129_000001
2013-11-27 14:39:12,055 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1128_000001
2013-11-27 14:39:12,055 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1127_000001
2013-11-27 14:39:12,056 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1128_000001
2013-11-27 14:39:12,056 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1127_000001
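Because the same error repeats many times, it can be easier to see which application attempts are affected by counting the messages per attempt ID. The snippet below is a minimal sketch with a hypothetical function name; it assumes each log entry has been joined onto a single line (mail clients tend to wrap them) and that the message text matches the excerpt above.

```python
import re
from collections import Counter

# The message in the excerpt has no space between "attempt" and the ID,
# so the whitespace between them is optional here.
ATTEMPT_RE = re.compile(
    r"Request for appInfo of unknown attempt\s*(appattempt_\d+_\d+_\d+)"
)


def count_unknown_attempts(log_text):
    """Count 'unknown attempt' errors per application attempt ID."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ATTEMPT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Run over the ResourceManager log, this groups the errors by attempt; in the excerpt above they all belong to the first attempts of applications 1127, 1128 and 1129.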