[ 
https://issues.apache.org/jira/browse/HADOOP-6253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gregory Farnum updated HADOOP-6253:
-----------------------------------

    Status: Patch Available  (was: Open)

I've attached a patch which includes the CephFileSystem and IOStream classes, 
as well as package documentation. To actually use it you're going to need an 
installation of Ceph (ceph.newdream.net).
I have *not* included any unit tests, as the code depends on the libhadoopceph 
shared library, and without a Ceph install testing seems sort of pointless -- 
about all I can see to do is make sure that calling the methods throws an 
IOException for being uninitialized. Still, most of the other filesystems came 
up with something, so if you have any suggestions for useful test cases, let me 
know and I can add them. :)
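
For illustration, the "throws an IOException for being uninitialized" check 
could look roughly like the sketch below. It is only a sketch: StubFileSystem, 
its methods, and the FsCall helper are hypothetical stand-ins invented here so 
the example runs without libhadoopceph or a Ceph install; they are not the 
CephFileSystem class from the patch.

```java
import java.io.IOException;

public class UninitializedFsSketch {

    // Hypothetical stand-in for a filesystem client that needs a native
    // library and a running cluster before it can serve any call.
    static class StubFileSystem {
        private boolean initialized = false;

        void initialize() {
            initialized = true;
        }

        // Every public operation goes through this guard first.
        private void checkInitialized() throws IOException {
            if (!initialized) {
                throw new IOException("filesystem not initialized");
            }
        }

        boolean exists(String path) throws IOException {
            checkInitialized();
            return false; // stub: there is no real backing store
        }

        boolean mkdirs(String path) throws IOException {
            checkInitialized();
            return true; // stub: pretend the directory was created
        }
    }

    // Small functional interface so calls that throw IOException can be
    // passed around as lambdas.
    interface FsCall {
        void run() throws IOException;
    }

    // Returns true if the call fails with an IOException, false otherwise.
    static boolean throwsIOException(FsCall call) {
        try {
            call.run();
            return false;
        } catch (IOException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        StubFileSystem fs = new StubFileSystem();
        boolean ok = throwsIOException(() -> fs.exists("/tmp"))
                && throwsIOException(() -> fs.mkdirs("/tmp/a"));
        System.out.println(ok ? "uninitialized calls rejected" : "missing check");
    }
}
```

A real test would presumably make the same calls against CephFileSystem 
itself and would still say nothing about behaviour against a live cluster, 
which is why it seems of limited value.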

In very basic testing (~900MB and ~6GB worth of data), this patch with the 
current Ceph code is roughly equivalent in speed to HDFS when running a 
MapReduce job via the hadoop-examples jar from 0.20, using the default values 
for both systems; Ceph tends to be slightly faster in a put and slightly slower 
in the MapReduce job (~3:35 versus ~3:20 on the 6GB test case). However, Ceph, 
while still highly experimental and in development, is a full filesystem with 
both a Linux kernel client and a full userspace client; it also distinguishes 
itself from HDFS by having no single point of failure -- it uses a Paxos-based 
monitor cluster for managing state and multiple metadata servers instead of the 
single HDFS namenode (though of course you can also run the entire system on 
one machine).

> Add a Ceph FileSystem interface.
> --------------------------------
>
>                 Key: HADOOP-6253
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6253
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs
>            Reporter: Gregory Farnum
>            Priority: Minor
>
> The experimental distributed filesystem Ceph does not have a single point of 
> failure, and might be of use to some Hadoop users.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
