Inline.

On Wed, Feb 2, 2011 at 5:35 AM, Raj V <[email protected]> wrote:
> Patrick, all,
>
> Apologies. My brain must have just frozen. There are two problems here. The
> first problem is that my perl script already divides the time by 1000, so
> the time is in seconds, not milliseconds. The second problem is more
> fundamental: you can't just add all the task timings; that ignores the
> inherently parallel aspect.
>
> Here is the updated table.
>
> 256  405420   751376   3483  332.1263279  1.297368468
> 256  411574   711841   3363  334.0514422  1.304888446
> 256  491519   599081   2955  369.0693739  1.441677242
> 512  491034  1212229   2989  569.8437605  1.112976095
> 512  471421   947025   2305  615.3778742  1.201909911
> 512  841000  1932633   4473  620.0833892  1.21110037
>
> Column 1 is the number of nodes in the cluster.
> Column 2 is the total map time in seconds.
> Column 3 is the total reduce time in seconds.
> Column 4 is the job time in seconds.
> Column 5 is (Column 2 + Column 3) / Column 4.
> Column 6 is Column 5 / number of nodes.
>
> From this it appears that the infrastructure gives me around a 20-40%
> advantage over single-node serial execution. This does not include the
> other advantages such as failure tolerance and data availability.
>
> Questions:
>
> 1. Is this the norm?

Possibly, for terasort. You are hitting a network bottleneck. Remember that
with terasort as much data goes through the shuffle/reduce phase as your
input data. Terasort isn't representative of most workloads, however, and is
more a test of your network throughput than of disk I/O or CPU or anything
else.

> 2. Can I tune the system to get better numbers?

Not enough info here.

> 3. Are there others who can shed some thoughts on these or share data?
> 4. Any idea about the loss of efficiency due to the infrastructure?
> 5. The 256-node cluster seems to provide an efficiency of 1.3, while at 512
> nodes it decreases to around 1.2. Would this trend continue?

Yes, because of the network capacity.

> Raj
>
> *From:* Patrick Angeles <[email protected]>
> *To:* [email protected]
> *Cc:* "[email protected]" <[email protected]>
> *Sent:* Tuesday, February 1, 2011 7:24 PM
> *Subject:* Re: Hadoop Framework questions.
>
> Hi, Raj.
>
> Interesting analysis...
>
> These numbers appear to be off. For example, 405s for mappers + 751s for
> reducers = 1156s for all tasks. If you have 2000 map and reduce tasks, this
> means each task is spending roughly 500ms to do actual work. That is a very
> low number and seems impossible.
>
> - P
>
> On Wed, Feb 2, 2011 at 3:07 AM, Raj V <[email protected]> wrote:
>
> > Hi,
> >
> > I have been running some benchmarks with some hadoop jobs on different
> > node and disk configurations to see what is a good configuration to get
> > the optimum performance. Here are some results that I have. Using the
> > hadoop job log, I added up the timings for each of the map tasks and
> > reduce tasks and converted them to seconds by dividing by 1000. The job
> > is terasort as provided in examples.jar.
> >
> > Column 1 is the number of nodes in the cluster.
> > Column 2 represents the sum of the times of all the map tasks for a job,
> > i.e. the sum over all task IDs of (FINISH_TIME - START_TIME) where
> > TASK_TYPE = MAP.
> > Column 3 is the same calculation for all the reduce tasks. In all the
> > cases all the tasks completed successfully.
> > Column 4 is FINISH_TIME - LAUNCH_TIME for the job. There is a SETUP_TASK
> > and a CLEANUP_TASK, but they take insignificant time.
> > Column 5 is the proportion of time taken by the tasks
> > = (Col 2 + Col 3) / Col 4.
> >
> > I am assuming that Total_Time - (Map_Time + Reduce_Time) is basically the
> > framework time. Now looking at the data, I see that the framework is
> > taking ~38%-68% of the time. Here are my questions.
> >
> > 1. Is this a problem with my setup or is this the normal behaviour?
> > 2. What can I do to reduce the time taken for everything else?
> > 3. Does this mean that to get a reasonably efficient hadoop cluster
> > (10%-20% time for the framework), I need to get to a 1000-node cluster?
> > 4. What is the normal "number" for the framework time?
> >
> > I apologize for:
> > 1. Cross posting. I am using CDH3B3, but I don't think my questions are
> > specific to CDH3B3.
> > 2. Not having provided all the details of the system, network, and disk
> > configuration.
> > 3. The different jobs have different disk configurations. Most of them
> > are running about 2000 map and reduce jobs.
> >
> > 256  405.42    751.376  3483  0.332126328
> > 256  411.574   711.841  3363  0.334051442
> > 256  491.519   599.081  2955  0.369069374
> > 512  491.034  1212.229  2989  0.56984376
> > 512  471.421   947.025  2305  0.615377874
> > 512  841      1932.633  4473  0.620083389
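
(Editor's note: for anyone wanting to reproduce the tally described in the
original post above, here is a minimal sketch in Python, standing in for
Raj's unposted perl script. It assumes an old-style JobHistory log whose
records carry space-separated KEY="VALUE" attributes such as TASKID,
TASK_TYPE, START_TIME and FINISH_TIME in milliseconds; the attribute names
come from the post, but the record filtering and the command-line interface
are illustrative assumptions, not a definitive parser for any particular
Hadoop release.)

import re
import sys
from collections import defaultdict

# Matches the space-separated KEY="VALUE" attribute pairs used by the
# old-style JobHistory files (exact record layout varies by Hadoop version).
ATTR = re.compile(r'(\w+)="([^"]*)"')

def task_seconds(history_file):
    """Sum FINISH_TIME - START_TIME per task type, converted to seconds."""
    start, finish, ttype = {}, {}, {}
    with open(history_file) as fh:
        for line in fh:
            # Only per-task summary records, not per-attempt ones; the
            # record name is an assumption about the log layout.
            if not line.startswith("Task "):
                continue
            attrs = dict(ATTR.findall(line))
            tid = attrs.get("TASKID")
            if not tid:
                continue
            if "TASK_TYPE" in attrs:
                ttype[tid] = attrs["TASK_TYPE"]
            if "START_TIME" in attrs:
                start[tid] = int(attrs["START_TIME"])
            if "FINISH_TIME" in attrs:
                finish[tid] = int(attrs["FINISH_TIME"])
    totals = defaultdict(float)
    for tid, end in finish.items():
        if tid in start and tid in ttype:
            totals[ttype[tid]] += (end - start[tid]) / 1000.0  # ms -> s
    return totals

if __name__ == "__main__":
    # usage: python job_history_ratio.py <history file> <job seconds> <nodes>
    totals = task_seconds(sys.argv[1])
    map_s, red_s = totals.get("MAP", 0.0), totals.get("REDUCE", 0.0)
    job_s = float(sys.argv[2])   # job FINISH_TIME - LAUNCH_TIME, in seconds
    nodes = int(sys.argv[3])
    ratio = (map_s + red_s) / job_s          # Raj's column 5
    print("map=%.0fs reduce=%.0fs ratio=%.3f per-node=%.4f"
          % (map_s, red_s, ratio, ratio / nodes))  # Raj's column 6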

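(As a sanity check, the first 256-node row of the updated table gives
(405420 + 751376) / 3483 ≈ 332.1 for column 5 and 332.1 / 256 ≈ 1.30 for
column 6, which is where the 20-40% per-node figure in the reply comes from.)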