Obviously the algorithm matters, but here are some very old numbers (things 
today are much better), but you do see the 'linear' scaling with both nodes and 
datasets:

http://developer.yahoo.com/blogs/hadoop/posts/2009/05/hadoop_sorts_a_petabyte_in_162/
100TB Sort - 97 mins
1000 TB Sort - 975 mins

Arun

On Jan 17, 2013, at 7:09 PM, Thiago Vieira wrote:

> Hello!
> 
> Is common to see this sentence: "Hadoop Scales Linearly". But, is there any 
> performance evaluation to confirm this? 
> 
> In my evaluations, Hadoop processing capacity scales linearly, but not 
> proportional to number of nodes, the processing capacity achieved with 20 
> nodes is not the double of the processing capacity achieved with 10 nodes. Is 
> there any evaluation about this?
> 
> Thank you!
> 
> --
> Thiago Vieira

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/


Reply via email to