On Tue, Sep 23, 2008 at 9:52 AM, Tarandeep Singh <[EMAIL PROTECTED]> wrote:

> I am running a small cluster of 4 nodes, each node having quad cores and 8
> GB of RAM.
>
> dfs.replication: I have set it to 2.


Probably reasonable given the small cluster.

> Will I get a performance boost if I set it to 4 (= the number of nodes)?


It will likely hurt performance. Reads will be trivially faster, but
writes will be substantially slower, since every block has to go through a
four-node write pipeline instead of a two-node one.


> If this is true, how much replication do people use
> when they run a cluster of, say, 1000 nodes? Do they replicate petabytes of
> data?


We usually use 3, because it gives better expected reliability. The
relevant measure is how likely it is that the second machine holding a
block crashes before the data has been re-replicated off of it.
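
For reference, if you do change it, it's a single entry in hadoop-site.xml
(a minimal sketch; dfs.replication only sets the default for newly created
files, since replication is per-file):

    <property>
      <name>dfs.replication</name>
      <!-- Default replication factor for new files; 3 is the common choice. -->
      <value>3</value>
    </property>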


> mapred.child.java.opts: -Xms4096m -Xmx7500m. I tried different min and max
> memory settings and found no improvement in performance. I was
> thinking that giving more memory to the process would help it do the
> sorting/shuffling etc. more quickly, but it seems my thinking is not correct. Can
> anyone comment on this parameter and what the optimal value should be?
>
> fs.inmemory.size.mb: I have set it to 225. Increasing it further does not
> help. Also, can someone explain in detail how this parameter
> affects performance?


Increasing it only helps up to the point where your reduce has no more than
io.sort.factor spills to merge. (In 0.19, there are more options, including not
spilling at all; I assume you are talking about 0.18.) For your job, the
amount of input data to each reduce should fit in fs.inmemory.size.mb *
io.sort.factor. If it doesn't, the framework will run a multi-level merge, which is
slower. In 0.19, if you give the framework enough memory, it never
writes the reduce inputs to disk.
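
To put rough numbers on that (assuming the stock io.sort.factor of 10,
which I believe is the 0.18 default): with your fs.inmemory.size.mb of 225,
each reduce can take about 225 MB * 10 = ~2.2 GB of map output before the
merge goes multi-level; raising io.sort.factor to 100, as I suggest below,
lifts that bound to roughly 22 GB.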

> io.sort.mb: I have set it to 200. Increasing it further does not help, at
> least in my jobs. Anyone with more details about this parameter?


This is used to control the buffering of the map outputs. As long as each
map's output fits into io.sort.mb, you are doing fine.
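
For reference, here is how the two buffers from this thread would look in
hadoop-site.xml (your current values, which are reasonable starting points):

    <property>
      <name>io.sort.mb</name>
      <!-- Map-side sort buffer, in MB; each map's output should fit here. -->
      <value>200</value>
    </property>
    <property>
      <name>fs.inmemory.size.mb</name>
      <!-- Reduce-side in-memory buffer for merging map outputs, in MB. -->
      <value>225</value>
    </property>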

> mapred.map.tasks: After reading the description, I set its value to 41
> (the nearest prime close to 10 * the number of nodes).
> mapred.reduce.tasks: I set its value to 5 (the nearest prime close to the
> number of nodes).


*Ugh* I seriously need to update those comments. Prime numbers don't matter
at all. Don't set the number of map tasks; the framework does pretty well
if left to its own devices. Reduce tasks should be set to the number of reduce
slots on the cluster. (On a big cluster, I'd suggest using 98% or so of them, but on
a 4-node cluster, it isn't required.)
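
For your cluster, the arithmetic is simple (assuming the stock 2 reduce
slots per tasktracker, which I believe is the 0.18 default and which you
say you kept): 4 nodes * 2 slots = 8, so in your job config:

    <property>
      <name>mapred.reduce.tasks</name>
      <!-- One reduce per reduce slot: 4 nodes * 2 slots = 8. -->
      <value>8</value>
    </property>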


> However, I noticed there was not much performance gain. If I use the default
> values, I get similar performance. But I ran the test on a small amount of
> data and have not tested with a huge data set. I would like to know how
> these parameters are going to affect performance.


In my experience, having large maps and reduces generally helps throughput.


> I used the default values for:
>
> mapred.tasktracker.map.tasks.maximum
> mapred.tasktracker.reduce.tasks.maximum


I'd suggest upping the map slots per node to 6 or so (see the config sketch
at the end).

I'd also suggest setting io.sort.factor to 100 to allow more files to be
merged at once.

Setting the default block size (dfs.block.size) to 256 MB will help on large
data, although it requires more io.sort.mb.

You probably want io.file.buffer.size set to at least 128k (131072 bytes).

In terms of HDFS startup on small clusters, I'd suggest:
dfs.safemode.threshold.pct = 1.0
dfs.safemode.extension = 0
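
Pulling those suggestions together, a hadoop-site.xml sketch (the values are
the starting points suggested above, not tuned constants; the byte and
millisecond units are worth double-checking against your version's
hadoop-default.xml):

    <property>
      <name>mapred.tasktracker.map.tasks.maximum</name>
      <!-- 6 map slots per quad-core node. -->
      <value>6</value>
    </property>
    <property>
      <name>io.sort.factor</name>
      <!-- Merge up to 100 streams at once. -->
      <value>100</value>
    </property>
    <property>
      <name>dfs.block.size</name>
      <!-- 256 MB blocks for large data, in bytes. -->
      <value>268435456</value>
    </property>
    <property>
      <name>io.file.buffer.size</name>
      <!-- 128k I/O buffers, in bytes. -->
      <value>131072</value>
    </property>
    <property>
      <name>dfs.safemode.threshold.pct</name>
      <!-- Wait for 100% of blocks to report before leaving safe mode. -->
      <value>1.0</value>
    </property>
    <property>
      <name>dfs.safemode.extension</name>
      <!-- Leave safe mode as soon as the threshold is met, in ms. -->
      <value>0</value>
    </property>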

-- Owen
