Hi Gokul, Would it be possible to post your benchmark code somewhere? I'm not clear on what exact thing you're measuring - IOUtils.copyBytes is a pretty general method.
Also, what disk hardware are you using? Have you configured dfs.data.dir to point to all local drives on each DN? Disk seeks could very well be what is limiting you.

-Todd

On Mon, Apr 12, 2010 at 10:18 PM, Gokulakannan M <gok...@huawei.com> wrote:

> @Todd,
>
> I am using 1 namenode and 10 datanodes. Replication factor is 3.
>
> These are the details about my test suite. It uses simple write code
> that calls IOUtils.copyBytes() and calculates the response time of each
> write for a given number of users (threads).
>
> From the namenode, I am running the test suite, which calculates the
> response time for writing a file for several concurrent users (1, 4, 9,
> 20). Whatever the file size may be (I tried with 100 MB, 1 GB and 5 GB
> files), I find that the response time increases linearly.
>
> These are the results for a 100 MB file.
>
>   No. of threads   Response time of each write (ms)
>   1                2199
>   4                2763, 2774, 34766, 64488
>   9                3620, 4897, 5018, 9991, 65501, 124156, 183639,
>                    243631, 314233
>   20               6702, 6784, 6985, 8987, 9202, 9271, 9925, 10752,
>                    68237, 70878, 73469, 75261, 77507, 82061, 129942,
>                    137098, 141822, 194199, 269353, 322328
>
> You can see the write time increases for each thread.
>
> @Sagar:
>
> I think the network is not the issue here. This is a private network
> with high bandwidth. Moreover, I tried running the same test suite with
> several configurations (pseudo-distributed mode, 1 NN 1 DN cluster, 1 NN
> 3 DN cluster) on my usual network (typically with low bandwidth) and
> obtained results like the above :(
>
> Thanks,
> Gokul
>
> ------------------------------
> *From:* Todd Lipcon [mailto:t...@cloudera.com]
> *Sent:* Monday, April 12, 2010 8:18 PM
> *To:* hdfs-user@hadoop.apache.org; gok...@huawei.com
> *Subject:* Re: response time increases lineraly for each thread!
>
> Can you be more specific about your benchmark? You're almost certainly
> being limited by either outgoing network from your single node, or by
> disk throughput on that node (writes will go local in addition to remote
> replicas if you are running a DN on the same host)
>
> -Todd
>
> On Mon, Apr 12, 2010 at 7:21 AM, Gokulakannan M <gok...@huawei.com> wrote:
>
> Hi,
>
> When I tested the performance of an 11-node Hadoop cluster (private
> network) for the write scenario with several threads, I noticed that the
> time for each write increases linearly (whatever the file size may be).
>
> Actually I'm running these threads as a single process on the system
> where the namenode is running. I'm getting inconsistent results each
> time I run this suite. The results are fluctuating. What might be the
> bottlenecks?
>
> I think running this in parallel by spawning multiple shells will give
> consistent results. But will running these on the same system give
> accurate results?
>
> *Any ways to measure the performance correctly?*
>
> Any help on this is appreciated.
>
> Thanks,
> Gokul
>
> --
> Todd Lipcon
> Software Engineer, Cloudera

--
Todd Lipcon
Software Engineer, Cloudera
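
A minimal sketch of the kind of per-thread write benchmark discussed in this thread, in case it helps pin down what is being measured. Everything in it is an assumption rather than Gokul's actual code: the class name WriteLatencyBench, the WriteTask hook, and the local temp-file write that stands in for the HDFS IOUtils.copyBytes() call so the example runs without a cluster. All threads are released from a latch at the same moment, so each reported time covers only that thread's own write.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class WriteLatencyBench {

    /** Hypothetical hook: one write of `bytes` bytes by thread `id`. */
    interface WriteTask {
        void write(int id, long bytes) throws IOException;
    }

    /** Stand-in for the HDFS write (assumption): writes zeros to a local temp file. */
    static void writeLocal(int id, long bytes) throws IOException {
        Path p = Files.createTempFile("bench-" + id + "-", ".dat");
        byte[] buf = new byte[64 * 1024];
        try (OutputStream out = Files.newOutputStream(p)) {
            long left = bytes;
            while (left > 0) {
                int n = (int) Math.min(buf.length, left);
                out.write(buf, 0, n);
                left -= n;
            }
        } finally {
            Files.deleteIfExists(p);
        }
    }

    /** Runs `threads` concurrent writes and returns each write's time in ms, sorted ascending. */
    static long[] run(int threads, long bytes, WriteTask task) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                final int id = i;
                Callable<Long> timed = () -> {
                    start.await();                      // wait until every thread is ready
                    long t0 = System.nanoTime();
                    task.write(id, bytes);
                    return (System.nanoTime() - t0) / 1_000_000;
                };
                futures.add(pool.submit(timed));
            }
            start.countDown();                          // release all writers at once
            long[] ms = new long[threads];
            for (int i = 0; i < threads; i++) {
                ms[i] = futures.get(i).get();
            }
            Arrays.sort(ms);
            return ms;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        long bytes = 8L * 1024 * 1024;                  // small size so the sketch runs quickly
        for (int n : new int[] {1, 4, 9, 20}) {         // thread counts from the thread above
            System.out.println(n + " threads: " + Arrays.toString(run(n, bytes, WriteLatencyBench::writeLocal)));
        }
    }
}
```

To measure HDFS rather than the local disk, writeLocal would be replaced by a WriteTask that opens an output stream on the cluster and copies the source data into it.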