On Tue, Sep 23, 2008 at 9:52 AM, Tarandeep Singh <[EMAIL PROTECTED]> wrote:
> I am running a small cluster of 4 nodes, each node having quad-cores and 8
> gb of RAM.
>
> dfs.replication: I have set it to 2.

Probably reasonable given the small cluster.

> Will I get a performance boost if I set it to 4 (= number of nodes)?

It will likely hurt performance. Reads will be trivially faster, but
writes will be substantially slower.

> If this is true, how much replication do people use when they run a
> cluster of, say, 1000 nodes? Do they replicate petabytes of data?

We usually use 3, because it gives us better expected reliability. The
relevant measure is how likely it is that a second machine will crash
before the data is replicated off.

> mapred.child.java.opts: -Xms4096m -Xmx7500m - I tried different min and
> max memory settings and found no improvement in performance. I was
> thinking that giving the process more memory would help it do the
> sorting/shuffling etc. more quickly, but it seems my thinking is not
> correct. Can anyone comment on this parameter and what the optimal value
> should be?
>
> fs.inmemory.size.mb: I have set it to 225. Increasing it further does not
> help. Also, can someone explain in detail how this parameter affects
> performance?

Increasing it only helps while your reduce has more than io.sort.factor
spills; once the spill count is at or below io.sort.factor, raising it
further buys you nothing. (In 0.19, there are more options, including not
spilling at all. I assume you are talking about 0.18.) For your job, the
amount of input data to the reduce should fit in fs.inmemory.size.mb *
io.sort.factor; with your 225 and the default io.sort.factor of 10, that
is about 2.2 gb per reduce. If not, the framework will run a multi-level
merge, which is slower. In 0.19, if you give the framework enough memory,
it doesn't ever write the reduce inputs to disk.

> io.sort.mb: I have set it to 200. Increasing it further does not help, at
> least in my jobs. Anyone with more details about this parameter?

This is used to control the buffering of the map outputs. As long as each
map's output fits into io.sort.mb, you are doing fine.

> mapred.map.tasks: After reading the description, I set its value to 41
> (nearest prime close to 10 * number of nodes).
> mapred.reduce.tasks: I set its value to 5 (nearest prime close to number
> of nodes).

*Ugh* I seriously need to update those comments. Prime numbers don't
matter at all. Don't set the number of map tasks; the framework does
pretty well if left to its own devices. Reduce tasks should be set to the
number of reduce slots on the cluster. (On a big cluster, I'd suggest
using 98% or so, but on a 4 node cluster, it isn't required.)

> However, I noticed there was not much performance gain. If I use the
> default values, I get similar performance. But I ran the test on a small
> amount of data; I have not tested with a huge data set. I would like to
> know how these parameters are going to affect performance.

In my experience, having large maps and reduces generally helps throughput.

> I used the default values for:
>
> mapred.tasktracker.map.tasks.maximum
> mapred.tasktracker.reduce.tasks.maximum

I'd suggest upping the map slots/node to 6 or so. I'd also suggest setting
io.sort.factor to 100 to allow more files to be merged at once. Setting
the default block size to 256mb will help on large data, although it
requires more io.sort.mb. You probably want io.file.buffer.size to be set
to at least 128k.

In terms of HDFS startup on small clusters, I'd suggest:

dfs.safemode.threshold.pct = 1.0
dfs.safemode.extension = 0

-- Owen
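P.S. To make the knobs above concrete, here is roughly what the sort/merge
tuning would look like in hadoop-site.xml (0.18-style property syntax).
The values are the ones from this thread, not recommended defaults; tune
them against your own jobs.

<property>
  <name>io.sort.mb</name>
  <value>200</value>
  <!-- buffer used to sort each map's output; ideally each map's output fits here -->
</property>
<property>
  <name>io.sort.factor</name>
  <value>100</value>
  <!-- number of spill files merged in a single pass -->
</property>
<property>
  <name>fs.inmemory.size.mb</name>
  <value>225</value>
  <!-- reduce-side in-memory merge space; reduce input should fit in this * io.sort.factor -->
</property>
<property>
  <name>io.file.buffer.size</name>
  <value>131072</value>
  <!-- 128k buffers for file I/O -->
</property>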
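The cluster-level settings go in the same file. The 6 map slots and 256mb
block size are the suggestions above; again, treat them as starting points
rather than universal values.

<property>
  <name>dfs.replication</name>
  <value>2</value>
  <!-- fine for a 4 node cluster; we usually use 3 on big clusters -->
</property>
<property>
  <name>dfs.block.size</name>
  <value>268435456</value>
  <!-- 256mb blocks for large data; pair with a bigger io.sort.mb -->
</property>
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>6</value>
  <!-- map slots per node -->
</property>
<property>
  <name>dfs.safemode.threshold.pct</name>
  <value>1.0</value>
  <!-- wait for every block to report before leaving safe mode -->
</property>
<property>
  <name>dfs.safemode.extension</name>
  <value>0</value>
  <!-- leave safe mode as soon as the threshold is met -->
</property>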
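And per job, the rule of thumb above (reduces = reduce slots on the
cluster) works out to nodes * mapred.tasktracker.reduce.tasks.maximum.
Assuming you keep the default of 2 reduce slots per node, that is 4 * 2 = 8
on your cluster:

<property>
  <name>mapred.reduce.tasks</name>
  <value>8</value>
  <!-- 4 nodes * 2 reduce slots/node (assumed default); recompute if you change the slot count -->
</property>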
