On 1/27/15, 9:24 PM, [email protected] wrote:
I tested again, and I found that if I set mapreduce.map.cpu.vcores > 1, the job will hang.
Very similar to https://issues.apache.org/jira/browse/TEZ-704
I suspect this might be a YARN scheduler bug.
Are you using the FairScheduler or the CapacityScheduler?
I cannot reproduce this issue on my YARN-CS cluster, but I suspect the
CapacityScheduler there is automatically set up to do dominant-resource
scheduling.
Are you on FS + Fair instead of FS + DRF or CS?
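If it is the FairScheduler, switching the default queue policy to DRF in the allocation file (fair-scheduler.xml in a stock setup; just a sketch, adjust to your deployment) should make it account for vcores as well as memory:

<!-- fair-scheduler.xml: schedule on memory AND vcores, not memory alone -->
<allocations>
  <defaultQueueSchedulingPolicy>drf</defaultQueueSchedulingPolicy>
</allocations>

The CapacityScheduler equivalent is the resource calculator in capacity-scheduler.xml:

<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>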
Cheers,
Gopal
From: [email protected]
Date: 2015-01-28 10:29
To: user
Subject: Re: Re: tez map task and reduce task stay pending forever
Oh yeah, I fixed the problem.
I added this config to my hive-site.xml:
<property>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>1024</value>
</property>
<property>
  <name>yarn.app.mapreduce.am.resource.cpu-vcores</name>
  <value>1</value>
</property>
<property>
  <name>yarn.app.mapreduce.am.command-opts</name>
  <value>-Djava.net.preferIPv4Stack=true -Xmx825955249</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Djava.net.preferIPv4Stack=true -Xmx825955249</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Djava.net.preferIPv4Stack=true -Xmx825955249</value>
</property>
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>1024</value>
</property>
<property>
  <name>mapreduce.map.cpu.vcores</name>
  <value>1</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>1024</value>
</property>
<property>
  <name>mapreduce.reduce.cpu.vcores</name>
  <value>1</value>
</property>
And configured my tez-site.xml with just:
<property>
  <name>tez.lib.uris</name>
  <value>${fs.defaultFS}/apps/tez-0.5.3/tez-0.5.3-minimal.tar.gz</value>
</property>
<property>
  <name>tez.use.cluster.hadoop-libs</name>
  <value>true</value>
</property>
Everything is OK now.
I think some of the configured values in my cluster were too large.
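For reference, if the same sizes need pinning on the Tez side, the analogous tez-site.xml knobs in Tez 0.5.x would be the following (untested here, just the counterparts of the MR settings above):

<!-- Tez AM container size, counterpart of yarn.app.mapreduce.am.resource.* -->
<property>
  <name>tez.am.resource.memory.mb</name>
  <value>1024</value>
</property>
<property>
  <name>tez.am.resource.cpu.vcores</name>
  <value>1</value>
</property>
<!-- Tez task container size, counterpart of mapreduce.map/reduce.* -->
<property>
  <name>tez.task.resource.memory.mb</name>
  <value>1024</value>
</property>
<property>
  <name>tez.task.resource.cpu.vcores</name>
  <value>1</value>
</property>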
[email protected]
From: [email protected]
Date: 2015-01-28 10:24
To: user
Subject: Re: Re: tez map task and reduce task stay pending forever
No. With set hive.execution.engine=mr, it still hangs...
[email protected]
From: Jianfeng (Jeff) Zhang
Date: 2015-01-28 10:11
To: user
Subject: Re: Re: tez map task and reduce task stay pending forever
Can you run this query successfully using Hive on MR?
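That is, with hive.execution.engine set to mr, either via "set hive.execution.engine=mr;" in the Hive CLI or in hive-site.xml:

<property>
  <name>hive.execution.engine</name>
  <value>mr</value>
</property>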
Best Regards,
Jeff Zhang
On Wed, Jan 28, 2015 at 10:01 AM, [email protected] <[email protected]> wrote:
I checked the Tez documentation on the HDP page:
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.7/bk_installing_manually_book/content/rpm-chap-tez_configure_tez.html
The default value of tez.am.resource.memory.mb is 1536.
My Hadoop yarn.app.mapreduce.am.resource.mb value is 5734 MiB.
Could this configuration mismatch be causing the problem?
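If so, would pinning the Tez AM size explicitly in tez-site.xml help? Something like this (just my guess):

<property>
  <name>tez.am.resource.memory.mb</name>
  <value>1536</value>
</property>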
[email protected]
发件人: [email protected]
发送时间: 2015-01-27 17:59
收件人: user
主题: 回复: 回复: tez map task and reduce task stay pending forerver
Sorry Gopal V, I made a mistake: my mapreduce.map.memory.mb config is 2867.
[email protected]
发件人: [email protected]
发送时间: 2015-01-27 17:58
收件人: user
主题: 回复: 回复: tez map task and reduce task stay pending forerver
Hello Gopal V,
I checked my CDH config and found mapreduce.map.memory.mb is 2876.
[email protected]
发件人: [email protected]
发送时间: 2015-01-27 17:31
收件人: user
主题: 回复: Re: tez map task and reduce task stay pending forerver
I checked the hivetez.log; no kill request was triggered by Hive.
[email protected]
From: Gopal V
Date: 2015-01-27 17:17
To: user
Cc: [email protected]
Subject: Re: tez map task and reduce task stay pending forever
On 1/27/15, 12:50 AM, [email protected] wrote:
hive 0.14.0, tez 0.5.3, hadoop 2.3.0-cdh5.0.2
hive> select * from p_city order by id;
Query ID = zhoushugang_20150127163434_da70d957-6ac4-4b8b-a484-42b593838076
...
--------------------------------------------------------------------------------
VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
--------------------------------------------------------------------------------
Map 1 INITED 1 0 0 1 0 0
Reducer 2 INITED 1 0 0 1 0 0
Looks like all container requests are pending/unresponsive.
I see a container request in the log with
2015-01-27 15:43:15,434 INFO [TaskSchedulerEventHandlerThread]
rm.YarnTaskSchedulerService: Allocation request for task:
attempt_1419300485749_371785_1_00_000000_0 with request:
Capability[<memory:2867, vCores:3>]Priority[2] host:
yhd-jqhadoop11.int.yihaodian.com rack: null
...
2015-01-27 15:43:17,635 INFO [DelayedContainerManager]
rm.YarnTaskSchedulerService: Releasing held container as either there
are pending but unmatched requests or this is not a session,
containerId=container_1419300485749_371785_01_000002, pendingTasks=1,
isSession=true. isNew=true
That seems to indicate that a container allocation request was made, but
the YARN ResourceManager never responded with a container (or gave the
wrong container?).
Does the container size of 2867 suggest anything about where that value might come from?
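One thing worth double-checking (speculating here) is the scheduler's per-container maximums in yarn-site.xml, since a request bigger than these can never be satisfied; example values shown, yours will differ:

<property>
  <!-- must be >= the 2867 MB requested above -->
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>8192</value>
</property>
<property>
  <!-- must be >= the 3 vcores requested above -->
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <value>4</value>
</property>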
Cheers,
Gopal