Hi

I have been running benchmarks with some Hadoop jobs on different numbers of nodes 
and disk configurations to see what configuration gives the optimum performance.


Here are some results that I have. Using the Hadoop job log, I added up the 
timings for each of the map tasks and reduce tasks and converted them to seconds 
by dividing by 1000. The job is terasort as provided in the examples.jar.

Column 1 is the number of nodes in the cluster.
Column 2 is the sum of the times of all the map tasks for a job, i.e. the sum 
over all TaskIDs of (FINISH_TIME - START_TIME) where TASK_TYPE=Map.
Column 3 is the same calculation for all the reduce tasks. In all cases all the 
tasks completed successfully.
Column 4 is FINISH_TIME - LAUNCH_TIME for the job. There are also a SETUP_TASK and 
a CLEANUP_TASK, but they take insignificant time.
Column 5 is the proportion of time taken by the tasks = (Col2 + Col3) / Col4.

And I am assuming that Total_Time - (MAP_Time+REDUCE_Time)  is basically the 
framework time.
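
In case it is useful, below is a minimal sketch of the kind of script I mean for the 
aggregation. It assumes the pre-MR2 job history format, i.e. one event per line made up 
of KEY="VALUE" pairs with millisecond timestamps, and it only looks at task-level 
"Task ..." lines; the exact field names and layout can vary between versions, so treat 
it as illustrative rather than a drop-in tool.

#!/usr/bin/env python
# Rough sketch: sum (FINISH_TIME - START_TIME) per task type from a
# pre-MR2 job history file. Assumes each event is one line of
# KEY="VALUE" pairs; field names/layout may differ between versions.
import re
import sys

PAIR = re.compile(r'(\w+)="([^"]*)"')

def aggregate(path):
    start, finish, kind = {}, {}, {}
    with open(path) as history:
        for line in history:
            # only task-level events; skip MapAttempt/ReduceAttempt lines
            if not line.startswith("Task "):
                continue
            fields = dict(PAIR.findall(line))
            tid = fields.get("TASKID")
            if not tid:
                continue
            if "TASK_TYPE" in fields:
                kind[tid] = fields["TASK_TYPE"].upper()
            if "START_TIME" in fields:
                start[tid] = int(fields["START_TIME"])
            if "FINISH_TIME" in fields:
                finish[tid] = int(fields["FINISH_TIME"])
    totals = {"MAP": 0.0, "REDUCE": 0.0}
    for tid, end in finish.items():
        # SETUP/CLEANUP task types fall outside totals and are ignored
        if kind.get(tid) in totals and tid in start:
            # timestamps are milliseconds since the epoch; convert to seconds
            totals[kind[tid]] += (end - start[tid]) / 1000.0
    return totals

if __name__ == "__main__":
    totals = aggregate(sys.argv[1])
    print("map total (s):    %.3f" % totals["MAP"])
    print("reduce total (s): %.3f" % totals["REDUCE"])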

Now, looking at the data, I see that the framework is taking roughly 38%-68% of the 
total job time.
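
For example, in the first 256-node run below, (405.42 + 751.376) / 3483 ≈ 0.33, so the 
framework accounts for roughly 67% of the wall-clock time, while the 512-node run with 
the highest task proportion gives 1 - 0.62 ≈ 38%.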

Here are my questions.

1. Is this a problem with my setup, or is this normal behaviour?
2. What can I do to reduce the time taken by everything else?
3. Does this mean that to get a reasonably efficient Hadoop cluster (10%-20% of the 
time spent in the framework) I need to go to a 1000-node cluster?
4. What is a typical "number" for the framework time?

I apologize for

1. Cross-posting. I am using CDH3B3, but I don't think my questions are specific 
to CDH3B3.
2. Not having provided all the details of the system, network, and disk 
configuration.
3. The fact that the different jobs used different disk configurations. Most of them 
ran about 2000 map and reduce tasks.



Nodes   Map total (s)   Reduce total (s)   Job total (s)   (Col2+Col3)/Col4
256     405.42          751.376            3483            0.332126328
256     411.574         711.841            3363            0.334051442
256     491.519         599.081            2955            0.369069374
512     491.034         1212.229           2989            0.56984376
512     471.421         947.025            2305            0.615377874
512     841             1932.633           4473            0.620083389
