Hey N.N. Gesli,

(Inline)

On Fri, Oct 28, 2011 at 12:38 PM, N.N. Gesli <[email protected]> wrote:
> Hello,
>
> We have a 12-node Hadoop cluster running Hadoop 0.20.2-cdh3u0. Each
> node has 8 cores and 144GB RAM (don't ask). So I want to take advantage of
> this huge RAM and run the map-reduce jobs mostly in memory with no spill, if
> possible. We use Hive for most of the processes. I have set:
> mapred.tasktracker.map.tasks.maximum = 16
> mapred.tasktracker.reduce.tasks.maximum = 8

This is *crazy* for an 8-core machine. Try to keep the total of map
and reduce slots well below 8 instead - you're probably CPU-thrashing
in this setup once a large number of tasks get booted.

> mapred.child.java.opts = 6144

You can also raise io.sort.mb to 2000, and tweak io.sort.factor.

Raising the child opts to ~6 GB looks a bit unnecessary, since most
of your tasks work on a per-record basis and won't care much about
total RAM. Perhaps use all that RAM for a service like HBase, which
can leverage caching nicely!
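For reference, the sort-buffer tweaks would look something like this
in mapred-site.xml (the io.sort.factor value and the 3 GB heap are
just a sketch for your nodes - adjust to taste; also note that
mapred.child.java.opts takes a JVM options string like -Xmx...m, not
a bare number):

  <property>
    <name>io.sort.mb</name>
    <value>2000</value>
  </property>
  <property>
    <name>io.sort.factor</name>
    <value>100</value>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx3072m</value>
  </property>

Keep io.sort.mb comfortably below the child heap size, since the sort
buffer is allocated inside the task JVM.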

> One of my Hive queries produces a 6-stage map-reduce job. On the third
> stage, when it queries a 200GB table, the last 14 reducers hang. I
> changed mapred.task.timeout to 0 to see if they really hang. It has been 5
> hours, so something is terribly wrong in my setup. Parts of the log are below.

It is probably just your slot settings. You may be massively
over-subscribing your CPU resources with 16 map task slots + 8 reduce
task slots. In the worst case, that means 24 JVMs competing over 8
available physical processors. Doesn't make sense to me at least -
make it more like 7 maps / 2 reduces or so :)
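That would be something like the below in mapred-site.xml on each
TaskTracker (a sketch - pick your own split, and restart the
TaskTrackers after changing it):

  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>7</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>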

> My questions:
> * What should my configuration be to make the reducers run in memory?
> * Why does it keep waiting for map outputs?

It has to fetch map outputs to get some data to start with. And it
pulls the map outputs a few at a time, so as not to overload the
network during the shuffle phase of several reducers across the cluster.
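If you want to see or tune that fetch parallelism, the per-reducer
copier count is controlled by mapred.reduce.parallel.copies (5 by
default), e.g.:

  <property>
    <name>mapred.reduce.parallel.copies</name>
    <value>5</value>
  </property>

Raising it only helps once the maps themselves are finishing promptly,
though - fix the slot over-subscription first.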

> * What does it mean "dup hosts"?

Duplicate hosts - hosts it already knows about and has already
scheduled fetch work on.

<snip>

-- 
Harsh J
