Hi Gokul,

Would it be possible to post your benchmark code somewhere? I'm not clear on
what exact thing you're measuring - IOUtils.copyBytes is a pretty general
method.

Also, what disk hardware are you using? Have you configured dfs.data.dir to
point to all local drives on each DN? Could very well be a factor of disk
seeks limiting you.

-Todd


On Mon, Apr 12, 2010 at 10:18 PM, Gokulakannan M <gok...@huawei.com> wrote:

>  @Todd,
>
>
>
>             I am using 1 namenode and 10 datanodes. Replication factor is
> 3.
>
>
>
>             These are the details about my test suite. It uses the simple
> write code that uses IOUtils.copyBytes() and calculates the response time
> for each write for a given number of users(threads).
>
>
>
>             From namenode, I am running the test suite which calculates the
> response time for writing a file for several concurrent users(1 ,  4  , 9 ,
> 20 ). Whatever the file size may be(I tried with 100 MB , 1 GB and 5 GB
> files), I find that the response time increases linearly.
>
>
>
>             These are the results for a 100 MB file.
>
>
>
>
>
> No of Threads
>
> Response time(milli sec)
>
> 1
>
> 2199
>
>
>
>
>
> 4
>
> 2763
>
>
>
> 2774
>
>
>
> 34766
>
>
>
> 64488
>
>
>
>
>
> 9
>
> 3620
>
>
>
> 4897
>
>
>
> 5018
>
>
>
> 9991
>
>
>
> 65501
>
>
>
> 124156
>
>
>
> 183639
>
>
>
> 243631
>
>
>
> 314233
>
>
>
>
>
> 20
>
> 6702
>
>
>
> 6784
>
>
>
> 6985
>
>
>
> 8987
>
>
>
> 9202
>
>
>
> 9271
>
>
>
> 9925
>
>
>
> 10752
>
>
>
> 68237
>
>
>
> 70878
>
>
>
> 73469
>
>
>
> 75261
>
>
>
> 77507
>
>
>
> 82061
>
>
>
> 129942
>
>
>
> 137098
>
>
>
> 141822
>
>
>
> 194199
>
>
>
> 269353
>
>
>
> 322328
>
>
>
>
>
>             You can see the write time increases for each thread.
>
>
>
> @ Sagar:
>
>
>
>             I think network is not the issue here. This is a private
> network with high bandwidth. Moreover I tried the running the same test
> suite with several configurations(pseudo dist mode, 1 NN 1 DN Cluster , 1 NN
> 3 DN cluster) in my usual network(typically with low bandwidth) and obtained
> the results like above :(
>
>
>
>
>
>  Thanks,
>
>   Gokul
>
>
>
>
>
>
>   ------------------------------
>
> *From:* Todd Lipcon [mailto:t...@cloudera.com]
> *Sent:* Monday, April 12, 2010 8:18 PM
> *To:* hdfs-user@hadoop.apache.org; gok...@huawei.com
> *Subject:* Re: response time increases lineraly for each thread!
>
>
>
> Can you be more specific about your benchmark? You're almost certainly
> being limited by either outgoing network from your single node, or by disk
> throughput on that node (writes will go local in addition to remote replicas
> if you are running a DN on the same host)
>
>
>
> -Todd
>
> On Mon, Apr 12, 2010 at 7:21 AM, Gokulakannan M <gok...@huawei.com> wrote:
>
>
>
> Hi,
>
>
>
>             When I tested the performance of a 11 node hadoop
> cluster(private nw) for the write scenario with several threads, I noticed
> that the time for each write increases linearly(what ever the file size may
> be).
>
>
>
>             Actually I'm running these threads as a single process in the
> system where namenode is running. I'm getting inconsistent results when I
> run this suite each time. The results are fluctuating. What might be the
> bottlenecks?
>
>
>
>             I think running this in parallel by spawning multiple shells
> will give consistent result. But running these in the same system will give
> exact results??
>
>
>
>             *Any* *ways to measure the performance correctly?*
>
> * *
>
> *            *Appreciated any help in this.
>
>
>
>  Thanks,
>
>   Gokul
>
>
>
>
>
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>



-- 
Todd Lipcon
Software Engineer, Cloudera

<<image001.gif>>

Reply via email to