On Aug 25, 2011, at 6:26 AM, Robert Evans wrote:

> I saw an article yesterday saying that GlusterFS 3.3 now has Hadoop bindings.
> I also ran across XtreemFS a while back, which supports Hadoop bindings as
> well.  Both of them claim to be faster and more scalable than HDFS.

... for various values of "faster" and "scalable".  

        For example, in the case of gluster, I haven't seen any references to 
PB-sized filesystems.  But gluster likely handles small files better.  Both are 
measurements of scale, but is one more scalable than the other?  Depending upon 
use case, obviously yes.

        As Hadoop gains in importance, we're seeing more and more of these types 
of overly broad statements.  Consumers just need to be smart and do their 
research to find the correct bits for them.  I just hope folks actually dig 
into the details before spending their cash.

> Has anyone in the community done some actual benchmarks on the same hardware
> for some HDFS replacements?  I would love to see how true their claims are
> and what we need to do to beat them.

        I don't think it is a matter of 'beating' them.  Certain environments 
have needs that aren't met by HDFS and they will go out looking for 
alternatives.  Competition is also good in the sense that it drives people to 
improve the base stuff.  (The sudden importance of HA in certain camps is a 
great example of this.)

        Besides, for a lot of these commercial companies, the fact that they 
aren't submitting (viable) patches to be included in Apache Hadoop puts them at 
a disadvantage from the start. 
