Hi Ian,

> what is the fundamental difference between KFS and hadoop such that 2
> separate projects are required?
Backing up a bit:
 - Hadoop: a map-reduce engine + HDFS
 - KFS: a filesystem

So your question is probably more about the difference between KFS and HDFS. Toby's reply to your mail gives me a bit more credit :-)

> * KFS supports atomic append, Hadoop does not

KFS currently DOES NOT support atomic record append. We designed the system for atomic record append, but we have held off on this feature as others have taken priority. There is support for multiple concurrent writes to a file: the chunkserver that holds the write lease serializes the writes.

> * KFS supports rebalancing, Hadoop does not

Currently, this is the case. Owen replied to one of the posts over the weekend/last week about adding block rebalancing to HDFS in an upcoming Hadoop release.

> * KFS exports a POSIX file interface, Hadoop does not (GFS does not, either)

To be a bit more precise:
 - with HDFS, you open a file for writing once and write sequentially, start->end (sketched at the end of this mail)
 - with KFS, you can open a file for writing multiple times, seek anywhere, and write; KfsClient::Open() supports O_APPEND for POSIX-style appends, not record append

There is also the issue of HDFS writes to a file becoming visible only on close. With KFS:
 - when a client creates a file, the file is immediately part of the fs namespace
 - writes are cached at the client
 - whenever the cache is full or the application calls flush, the data gets pushed out to the chunkservers
 - once data is written to the chunkservers, it is visible

> Maybe KFS can be integrated with Hadoop's MapReduce to make up for the
> current lack of such from Kosmix?

Toby: this *is* done. I have filed a JIRA issue and submitted the code. See HADOOP-1963 (usage sketched below).

Sriram
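To make the HDFS write model concrete, here is a minimal Java sketch against the stock org.apache.hadoop.fs API (the path is a placeholder): the file is opened for writing exactly once, written sequentially, and the bytes become readable by others only on close().

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class HdfsWriteOnce {
      public static void main(String[] args) throws Exception {
          Configuration conf = new Configuration();  // picks up the cluster config
          FileSystem fs = FileSystem.get(conf);      // the configured HDFS instance

          // Open for writing exactly once; there is no reopen-and-seek.
          FSDataOutputStream out = fs.create(new Path("/tmp/example.log"));

          // Writes proceed sequentially, start -> end.
          out.write("record 1\n".getBytes());
          out.write("record 2\n".getBytes());

          // Only on close do the written bytes become visible to readers.
          out.close();
      }
  }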

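And a rough sketch of using KFS from Hadoop once HADOOP-1963 is in: the KFS client gets wired in as an implementation of Hadoop's FileSystem interface under a kfs:// URI. Treat the fs.kfs.* key names, the KosmosFileSystem class name, and the host/port as assumptions on my part; check the JIRA patch for the exact wiring.

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class KfsFromHadoop {
      public static void main(String[] args) throws Exception {
          Configuration conf = new Configuration();
          // Assumed wiring: map the kfs:// scheme to the glue class and
          // point it at the KFS metaserver (host/port are placeholders).
          conf.set("fs.kfs.impl", "org.apache.hadoop.fs.kfs.KosmosFileSystem");
          conf.set("fs.kfs.metaServerHost", "meta.example.com");
          conf.set("fs.kfs.metaServerPort", "20000");
          conf.set("fs.default.name", "kfs://meta.example.com:20000");

          FileSystem fs = FileSystem.get(conf);  // same FileSystem API as HDFS

          // Per the KFS semantics above, data pushed to the chunkservers
          // is readable on flush, not just on close.
          FSDataOutputStream out = fs.create(new Path("/tmp/kfs-example.log"));
          out.write("hello from kfs\n".getBytes());
          out.flush();  // push the client-side cache out to the chunkservers
          out.close();
      }
  }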