@Todd,

 

            I am using 1 namenode and 10 datanodes. Replication factor is 3.

 

            My test suite uses simple write code built on IOUtils.copyBytes()
and measures the response time of each write for a given number of users
(threads).

 

            I run the test suite from the namenode. It measures the response
time for writing a file with several numbers of concurrent users (1, 4, 9 and
20). Whatever the file size (I tried 100 MB, 1 GB and 5 GB files), I find
that the response time increases linearly.
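
A minimal sketch of this kind of harness, for reference. It is not the actual test suite: the class name WriteLatencyBench, the run() and copy() helpers, and the use of local temp files are all assumptions made so the example runs without a cluster. Against HDFS, the copy step would be a call to IOUtils.copyBytes() on a stream opened from a FileSystem handle. The sketch releases all writers at the same moment with a latch and records one elapsed time per thread.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical harness: one timed write per thread, all started together.
class WriteLatencyBench {

    // Local stand-in for the real copy (assumption: the HDFS version would
    // call IOUtils.copyBytes() on an HDFS output stream instead).
    static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[64 * 1024];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }

    // Returns one elapsed time in milliseconds per thread, in thread order.
    // Temp files are left behind for the OS to clean up.
    static List<Long> run(int threads, int bytesPerFile) {
        byte[] payload = new byte[bytesPerFile];
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        try {
            Path dir = Files.createTempDirectory("write-bench");
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                Path target = dir.resolve("file-" + i);
                futures.add(pool.submit(() -> {
                    start.await(); // block until every writer is queued
                    long t0 = System.nanoTime();
                    try (InputStream in = new ByteArrayInputStream(payload);
                         OutputStream out = Files.newOutputStream(target)) {
                        copy(in, out);
                    }
                    return (System.nanoTime() - t0) / 1_000_000;
                }));
            }
            start.countDown(); // release all writers at once
            List<Long> times = new ArrayList<>();
            for (Future<Long> f : futures) {
                times.add(f.get());
            }
            return times;
        } catch (IOException | InterruptedException | ExecutionException e) {
            throw new IllegalStateException("benchmark run failed", e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        for (int threads : new int[] {1, 4, 9, 20}) {
            System.out.println(threads + " threads: " + run(threads, 1 << 20) + " ms");
        }
    }
}
```

Timing each write inside its own thread, after the latch opens, keeps thread start-up cost out of the measurement.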

 

            These are the results for a 100 MB file.

 

            


No. of threads    Response time (ms)
--------------    ------------------
1                 2199
4                 2763, 2774, 34766, 64488
9                 3620, 4897, 5018, 9991, 65501, 124156, 183639, 243631,
                  314233
20                6702, 6784, 6985, 8987, 9202, 9271, 9925, 10752, 68237,
                  70878, 73469, 75261, 77507, 82061, 129942, 137098,
                  141822, 194199, 269353, 322328

 

 

            You can see that, within each run, the write time increases from
one thread to the next.
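
One way to check how the times grow is to look at the gaps between consecutive response times in a run. The sketch below does this for the 9-thread figures reported in this message; the class name ResponseGaps and the gaps() helper are hypothetical. Strictly linear growth would give roughly equal gaps throughout.

```java
import java.util.Arrays;

// Hypothetical helper: differences between consecutive sorted response times.
class ResponseGaps {

    static long[] gaps(long[] sortedMs) {
        long[] d = new long[Math.max(0, sortedMs.length - 1)];
        for (int i = 1; i < sortedMs.length; i++) {
            d[i - 1] = sortedMs[i] - sortedMs[i - 1];
        }
        return d;
    }

    public static void main(String[] args) {
        // Response times (ms) for the 9-thread run, 100 MB file.
        long[] nine = {3620, 4897, 5018, 9991, 65501, 124156, 183639, 243631, 314233};
        System.out.println(Arrays.toString(gaps(nine)));
        // prints [1277, 121, 4973, 55510, 58655, 59483, 59992, 70602]
    }
}
```

For this run, the first four writes finish within about 10 seconds of each other, and the remaining ones are spaced roughly 55 to 70 seconds apart.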

 

@ Sagar:

 

            I don't think the network is the issue here. This is a private
network with high bandwidth. Moreover, I ran the same test suite with
several configurations (pseudo-distributed mode, a 1 NN / 1 DN cluster, a
1 NN / 3 DN cluster) on my usual network (typically low bandwidth) and
obtained results like the ones above :(

            

 

 Thanks,

  Gokul

 

  

 

  _____  

From: Todd Lipcon [mailto:t...@cloudera.com] 
Sent: Monday, April 12, 2010 8:18 PM
To: hdfs-user@hadoop.apache.org; gok...@huawei.com
Subject: Re: response time increases lineraly for each thread!

 

Can you be more specific about your benchmark? You're almost certainly being
limited by either outgoing network from your single node, or by disk
throughput on that node (writes will go local in addition to remote replicas
if you are running a DN on the same host)

 

-Todd

On Mon, Apr 12, 2010 at 7:21 AM, Gokulakannan M <gok...@huawei.com> wrote:

 

Hi,

 

            When I tested the write performance of an 11-node Hadoop cluster
(private network) with several threads, I noticed that the time for each
write increases linearly (whatever the file size).

 

            Actually I'm running these threads as a single process in the
system where namenode is running. I'm getting inconsistent results when I
run this suite each time. The results are fluctuating. What might be the
bottlenecks? 

 

            I think running this in parallel by spawning multiple shells
would give consistent results. But will running them all on the same system
give accurate results?

 

            Is there a way to measure the performance correctly?

 

            Any help with this would be appreciated.

 

 Thanks,

  Gokul

 

 




-- 
Todd Lipcon
Software Engineer, Cloudera
