Hi,

We are performing a left join across 5-6 large tables, and the job hangs at around 95%. All the mappers complete quickly, and some of the reducers also complete quickly, but a few reducers stay in a hanging state because a single task is processing a large amount of data. Below is a capture of the mapper and reducer status.
[inline image: mapper and reducer task status]

* Is there a way to move work from the reducer phase to the mapper phase? I mean by tweaking memory settings or modifying the query so that it has more mapper tasks than reducer tasks.
* Is there a way to find out which part of the query the long-running task is executing, or how many rows it is processing (so that I can consider partitioning or an alternative approach)?
* Are there any other memory settings that would resolve the hanging issue?

Below are our memory settings:

SET hive.tez.container.size = -1;
SET hive.execution.engine=tez;
SET hive.mapjoin.hybridgrace.hashtable=FALSE;
SET hive.optimize.ppd=true;
SET hive.cbo.enable =true;
SET hive.compute.query.using.stats =true;
SET hive.exec.parallel=true;
SET hive.vectorized.execution.enabled=true;
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.auto.convert.join=false;
SET hive.auto.convert.join.noconditionaltask=false;
set hive.tez.java.opts = "-Xmx3481m";
set hive.tez.container.size = 4096;
--SET mapreduce.map.memory.mb=4096;
--SET mapreduce.map.java.opts = -Xmx3000M;
--SET mapreduce.reduce.memory.mb = 2048;
--SET mapreduce.reduce.java.opts = -Xmx1630M;
SET fs.block.size=67108864;

Thanks in advance,
Mahender
