Re: FUSE HDFS significantly slower

Hazem Mahmoud Tue, 26 Oct 2010 11:26:13 -0700

That raises a question I am currently looking into, and I would appreciate 
any and all advice people have.

We are replacing our current NetApp solution, which has served us well but 
which we have now outgrown.

I am looking at either upgrading to a bigger and meaner NetApp or possibly 
going with Hadoop (HDFS and FUSE).

I need to mount the "storage solution" (HDFS or SAN) on about 5 or 6 systems. 
I'm a little concerned about using HDFS/FUSE for a couple of reasons:
1. Performance of FUSE (how does it compare to an iSCSI SAN solution, for 
example?). I know it probably depends on a lot of things, but I'm interested 
in general impressions or any experiences anyone has had.
2. Security/permissions (the owner of all files shows up as "nobody").

Another question: are there other options for mounting HDFS on these 5 or 6 
systems for pure filesystem access (using NFS, etc.)?

Thanks everyone!

-Hazem

On Oct 26, 2010, at 5:43 AM, Brian Bockelman wrote:

> In general, unless you run newer kernels and versions of FUSE as that ticket 
> suggests, it is significantly slower in raw throughput.
> 
> However, a day generally doesn't go by at my site where we don't push 
> FUSE over 30Gbps, as the bandwidth is spread across nodes.  Additionally, 
> as we are limited by the latency of spinning disks and random reads, we 
> aren't particularly hurt by going "only" 60MB/s on our nodes.  If we wanted 
> to go faster, we would use the native clients.
> 
> Of course, if anyone wants to donate a lowly university 1.5PB of SSDs, I'm 
> all ears :)
> 
> Brian
> 
> On Oct 26, 2010, at 12:40 AM, Ted Yu wrote:
> 
>> https://issues.apache.org/jira/browse/HADOOP-3805 tried to mitigate this
>> problem.
>> 
>> On Mon, Oct 25, 2010 at 10:17 PM, aniket ray <[email protected]> wrote:
>> 
>>> Hi,
>>> 
>>> I'm seeing in my experiments that Fuse-HDFS is significantly slower (around
>>> 3x slower) than using the Java HDFS API directly.
>>> I wanted to ask whether this slowness is the norm, or whether there is
>>> something wrong with my configuration.
>>> Also, is this purely JNI slowness, or is there something deeper to it?
>>> 
>>> 
>>> My experiment basically opens a file in write mode and calls write
>>> multiple times (close to 5GB of data in total) on that file.
>>> 
>>> Thanks for the help,
>>> aniket ray
>>> 
>
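
For anyone who wants to repeat the write experiment described in the original 
message against a mounted filesystem, here is a minimal sketch in Python. It 
is not the script used in this thread. The function name, the 1 MiB chunk size 
and the example mount path are assumptions for illustration only. It writes a 
given number of bytes to a path in fixed-size chunks and reports MB/s, with 
the time to close the file included in the measurement.

```python
import time


def measure_write_throughput(path, total_bytes, chunk_bytes=1 << 20):
    """Write total_bytes to path in chunk_bytes pieces and return MB/s.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    chunk = b"\0" * chunk_bytes
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < total_bytes:
            # The last chunk may be shorter than chunk_bytes.
            n = min(chunk_bytes, total_bytes - written)
            f.write(chunk[:n])
            written += n
    # The timer stops after the file is closed, so any flush done on
    # close is part of the measured time.
    elapsed = time.perf_counter() - start
    return written / (1024 * 1024) / max(elapsed, 1e-9)


if __name__ == "__main__":
    import sys

    # Example: python bench.py /mnt/hdfs/tmp/bench.bin 5368709120
    # (the mount point above is an assumed path, not one from this thread)
    target, size = sys.argv[1], int(sys.argv[2])
    print(f"{measure_write_throughput(target, size):.1f} MB/s")
```

Running the same script once against the FUSE mount and once against local 
disk gives a rough baseline, but it does not exercise the Java HDFS API, so it 
cannot by itself reproduce the 3x comparison reported above.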
