Hey Robert,
I'd chime in to say that our usage of FUSE results in a network
transfer rate of about 30 MB/s, and it does not seem to be a limiting
factor (right now, we're CPU-bound).
In our (limited) tests, we've achieved 80 Gbps of reads in our cluster
overall. This did not appear to push the limits of FUSE or Hadoop.
Since we've applied the patches (which are in 0.18.2 by default), we
haven't had any corruption issues. Our application has rather heavy-
handed internal file checksums, and the jobs would crash immediately
if they were reading in garbage.
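Application-level verification like Brian describes can be sketched roughly as follows. This is a minimal, hypothetical Python example (the actual checksum scheme in their application is not specified; SHA-256 and the prepended-digest layout are assumptions for illustration):

```python
import hashlib

DIGEST_LEN = 32  # bytes in a SHA-256 digest

def write_with_checksum(path, data):
    # Prepend a SHA-256 digest so reads can be verified end to end.
    digest = hashlib.sha256(data).digest()
    with open(path, "wb") as f:
        f.write(digest + data)

def read_with_checksum(path):
    # Fail fast -- as the jobs described above do -- if the payload
    # does not match its stored checksum.
    with open(path, "rb") as f:
        blob = f.read()
    digest, data = blob[:DIGEST_LEN], blob[DIGEST_LEN:]
    if hashlib.sha256(data).digest() != digest:
        raise IOError("checksum mismatch: file is corrupt")
    return data
```

The point is simply that corruption anywhere in the transport (FUSE layer included) surfaces immediately as a failed read rather than silently propagating garbage.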
Brian
On Nov 4, 2008, at 10:07 AM, Robert Krüger wrote:
Thanks! This is good news. So it's fast enough for our purposes if it
turns out to be the same order of magnitude on our systems.
Have you used this with rsync? If so, any known issues with that
(reading or writing)?
Thanks in advance,
Robert
Pete Wyckoff wrote:
- Reads are 20-30% slower.
- Writes are 33% slower before https://issues.apache.org/jira/browse/HADOOP-3805
- You need a kernel > 2.6.26-rc* to test 3805, which I don't have :(
These numbers are with Hadoop 0.17 and the 0.18.2 version of fuse-dfs.
-- pete
On 11/2/08 6:23 AM, "Robert Krüger" <[EMAIL PROTECTED]> wrote:
Hi Pete,
thanks for the info. That helps a lot. We will probably test it for our
use cases then. Did you benchmark throughput when reading/writing files
through fuse-dfs and compare it to command-line tool or API access? Is
there a notable difference?
Thanks again,
Robert
Pete Wyckoff wrote:
It has come a long way since 0.18, and Facebook keeps our (0.17)
DFS mounted via FUSE and uses that for some operations.
There have recently been some problems with fuse-dfs when used in
a multithreaded environment, but those have been fixed in 0.18.2
and 0.19. (Do not use 0.18 or 0.18.1.)
The current (known) issues are:
1. Wrong semantics when copying over an existing file - namely, it
does a delete and then re-creates the file, so ownership/permissions
may end up wrong. There is a patch for this.
2. When directories have 10s of thousands of files, performance
can be very poor.
3. POSIX truncate is supported only for truncating a file to size 0,
since HDFS doesn't support truncate.
4. Appends are not supported - this is a libhdfs problem and
there is a patch for it.
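The delete-and-re-create behavior in issue 1 can be mitigated at the application level by capturing and re-applying the destination's permission bits around the copy. A minimal, hypothetical Python sketch (this is not fuse-dfs code; restoring ownership as well would additionally require chown privileges):

```python
import os
import shutil
import stat

def copy_preserving_mode(src, dst):
    # If dst already exists, remember its permission bits before the
    # copy replaces it (mirroring the delete + re-create semantics
    # described in issue 1 above).
    old_mode = None
    if os.path.exists(dst):
        old_mode = stat.S_IMODE(os.stat(dst).st_mode)
        os.remove(dst)            # delete ...
    shutil.copyfile(src, dst)     # ... then re-create
    if old_mode is not None:
        os.chmod(dst, old_mode)   # restore the original permissions
```

Without the final chmod, the re-created file simply inherits the writer's default mode, which is how permissions "end up wrong" after a copy over an existing file.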
It is still a pre-1.0 product for sure, but it has been pretty
stable for us.
-- pete
On 10/31/08 9:08 AM, "Robert Krüger" <[EMAIL PROTECTED]> wrote:
Hi,
could anyone tell me what the current status of FUSE support for HDFS
is? Is this something that can be expected to be usable in a few
weeks/months in a production environment? We have been really
happy/successful with HDFS in our production system. However, some
software we use in our application simply requires an OS-level file
system, which currently requires us to do a lot of copying between HDFS
and a regular file system for processes that require that software, and
FUSE support would really eliminate that one disadvantage we have with
HDFS. We wouldn't even require the performance of that to be
outstanding, because just by eliminating the copy step we would greatly
increase the throughput of those processes.
Thanks for sharing any thoughts on this.
Regards,
Robert