Hi, Raj. Interesting analysis...
These numbers appear to be off. For example, 405 s for mappers + 751 s for
reducers = 1156 s for all tasks. If you have 2000 map and reduce tasks, each
task is spending roughly 500 ms doing actual work. That is a very low number
and seems impossible.

- P

On Wed, Feb 2, 2011 at 3:07 AM, Raj V <[email protected]> wrote:
>
> Hi,
>
> I have been running some benchmarks with Hadoop jobs on different node
> and disk configurations to see which configuration gives optimum
> performance.
>
> Here are some results. Using the Hadoop job log, I added up the timings
> for each of the map and reduce tasks and converted them to seconds by
> dividing by 1000. The job is terasort, as provided in examples.jar.
>
> Column 1 is the number of nodes in the cluster.
> Column 2 is the sum of the times of all map tasks, i.e. the sum over all
> task IDs of (FINISH_TIME - START_TIME) where TASK_TYPE = MAP.
> Column 3 is the same calculation for all the reduce tasks. In all cases,
> all tasks completed successfully.
> Column 4 is FINISH_TIME - LAUNCH_TIME for the job. There are a
> SETUP_TASK and a CLEANUP_TASK that take insignificant time.
> Column 5 is the proportion of time taken by the tasks:
> (Col2 + Col3) / Col4.
>
> I am assuming that Total_Time - (Map_Time + Reduce_Time) is basically
> the framework time.
>
> Looking at the data, I see that the framework is taking ~38%-67% of the
> total time.
>
> Here are my questions:
>
> 1. Is this a problem with my setup, or is this normal behaviour?
> 2. What can I do to reduce the time taken by everything else?
> 3. Does this mean that to get a reasonably efficient Hadoop cluster
> (10%-20% framework time) I need a 1000-node cluster?
> 4. What is the normal "number" for framework time?
>
> I apologize for:
>
> 1. Cross-posting. I am using CDH3B3, but I don't think my questions are
> specific to CDH3B3.
> 2. Not having provided all the details of the system, network, and disk
> configuration.
> 3. The different jobs having different disk configurations. Most of them
> run about 2000 map and reduce tasks.
>
> Nodes   Map (s)   Reduce (s)   Job (s)   (Col2+Col3)/Col4
> 256     405.420    751.376     3483      0.332126328
> 256     411.574    711.841     3363      0.334051442
> 256     491.519    599.081     2955      0.369069374
> 512     491.034   1212.229     2989      0.569843760
> 512     471.421    947.025     2305      0.615377874
> 512     841.000   1932.633     4473      0.620083389
>
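For anyone repeating the measurement, the column calculations described in the thread can be sketched in Python. This is a minimal sketch: the key="value" field regex assumes the Hadoop 0.20-era job history line format, and the function names and sample figures are illustrative, not code from the thread.

```python
import re

# Assumes the key="value" job history line format of Hadoop 0.20 / CDH3;
# the exact layout of a given log may differ.
FIELD = re.compile(r'(\w+)="([^"]*)"')

def task_time_sums(history_path):
    """Sum (FINISH_TIME - START_TIME) in seconds for MAP and REDUCE tasks."""
    totals = {"MAP": 0.0, "REDUCE": 0.0}
    with open(history_path) as log:
        for line in log:
            fields = dict(FIELD.findall(line))
            ttype = fields.get("TASK_TYPE")
            if ttype in totals and "START_TIME" in fields and "FINISH_TIME" in fields:
                # Timestamps are in milliseconds; convert to seconds.
                totals[ttype] += (int(fields["FINISH_TIME"]) -
                                  int(fields["START_TIME"])) / 1000.0
    return totals

def task_fraction(map_s, reduce_s, job_s):
    """Column 5: (Col2 + Col3) / Col4; 1 minus this is the 'framework' share."""
    return (map_s + reduce_s) / job_s

# First row of the table: 256 nodes, 405.42 s map, 751.376 s reduce, 3483 s job.
print(task_fraction(405.42, 751.376, 3483))  # ~0.332, i.e. ~67% framework time
```

Running this over each job's history file reproduces columns 2, 3, and 5 of the table above.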
