On Oct 3, 2013, at 12:17 PM, Milind Bhandarkar wrote:

> Exec Summary: For the last couple of months, we, at Pivotal, along with a
> couple of folks in the community have been working on making Namespace
> implementation in the namenode pluggable. We have demonstrated that it can
> be done without major surgery on the namenode, and does not have noticeable
> performance impact. We would like to contribute it back to Apache if there
> is sufficient interest. Please let us know if you are interested, and we
> will create a Jira and update the patch for in-progress work.
> ……


Milind,
A reasonable idea, but the actual details are best discussed in a jira. Some 
initial thoughts, to clear up some of the confusion (and accusations) in this 
thread:

HDFS pluggability (and relation to pluggability added as part of Federation)
 - Pluggability and federation are orthogonal, although we did improve the 
pluggability of HDFS as part of the federation implementation. As Vinod has 
noted, the *block layer* was separated out as part of the federation work, 
which makes the general development of new HDFS namespace implementations 
easier. Federation's pluggability was targeted towards someone writing a new NN 
and reusing the block storage layer via a library, optionally living 
side-by-side with different implementations of the NN within the same cluster. 
Hence we added the notion of block pools and separated out the block management 
layer.
 - So your proposed work is clearly not in conflict with Federation, or even 
with the pluggability that Federation added; philosophically, your proposal is 
complementary.

Considerations: A Public API?
The FileSystem/AbstractFileSystem APIs and the newly proposed 
AbstractFSNamesystem target very different kinds of pluggability into Hadoop. 
The former takes a thin application API (FileSystem and FileContext) and makes 
it easy for users to plug in different filesystems (S3, LocalFS, etc.) as 
Hadoop-compatible filesystems. In contrast, the latter (the proposed 
AbstractFSNamesystem) is a fatter interface inside the depths of the HDFS 
implementation and makes parts of the impl pluggable.
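
To make the thin-versus-fat distinction concrete, here is a minimal sketch. 
Every name in it (ThinFileSystem, InMemoryFileSystem, SketchNamesystem and its 
methods) is hypothetical and invented for illustration; none of it is the real 
FileSystem/FileContext API or the proposed AbstractFSNamesystem, whose shape 
has not been posted yet.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical thin, application-facing API. It only stands in for the role
// that FileSystem/FileContext play; it is NOT the real Hadoop interface.
interface ThinFileSystem {
    void write(String path, String data);
    String read(String path);
}

// One pluggable backend. An S3- or local-disk-backed class would plug in the
// same way, and application code would not change.
class InMemoryFileSystem implements ThinFileSystem {
    private final Map<String, String> files = new HashMap<>();

    @Override
    public void write(String path, String data) {
        files.put(path, data);
    }

    @Override
    public String read(String path) {
        return files.get(path);
    }
}

// Hypothetical fat, internal extension point. It only stands in for the kind
// of interface the proposed AbstractFSNamesystem might be; the methods are
// invented. Note how it exposes implementation concepts (inode ids, block
// ids) that the thin API hides from applications.
abstract class SketchNamesystem {
    abstract long createInode(String path);
    abstract void attachBlock(long inodeId, long blockId);
    abstract int blockCount(long inodeId);
    abstract boolean deleteInode(String path);
}
```

An application written against ThinFileSystem never sees inodes or blocks, so 
that interface can stay stable while backends change. Anyone subclassing 
something like SketchNamesystem is coupled to internal concepts, which is why 
the stability question matters for the second kind of interface.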

I would not make your proposed AbstractFSNamesystem a public stable Hadoop API, 
but would instead direct it towards HDFS developers who want to extend the 
implementation of HDFS more easily. Were you envisioning the 
AbstractFSNamesystem to be a stable public Hadoop API? If someone has their own 
private implementation of this new abstract class, would the HDFS community 
have the freedom to modify the abstract class in incompatible ways? These are 
discussions for the Jira.

A somewhat related piece of work:
Since Milind motivated his pluggability by a new NN implementation (that 
happens to use HBase), I will briefly mention an experiment in building a new 
NN that stores only a partial namespace in memory. The goal of this experiment 
was *not* to make the NN code more pluggable, but instead to provide an 
alternate implementation of the NN; hence it is orthogonal. A PhD student who 
worked as an intern at Hortonworks implemented a NN that stores only a partial 
namespace in RAM. She presented this to a HUG in Aug 2013 in Sunnyvale. I have 
encouraged her to file a jira, but she wants to finish some more experiments 
before filing; I will file a jira on her behalf and refer to her work in the 
next day or so. It is a prototype that helps us understand how well the 
particular implementation choice for this alternate NN works. It would be 
interesting to see if her code changes fit into Milind's newly proposed 
AbstractFSNamesystem. My initial view is that they may not, but I will wait 
till Milind posts an initial strawman of the AbstractFSNamesystem before 
commenting. (While subclassing interfaces can work very well, subclassing 
implementations can be very tricky to get right.)
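
As a generic illustration of why subclassing an implementation is tricky, 
consider the sketch below. It has nothing to do with the actual namenode code; 
BaseStore and CountingStore are hypothetical names. The base class's addAll 
happens to call its own add, an implementation detail a subclass author cannot 
see from the method signatures alone.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base implementation: addAll is written in terms of add.
class BaseStore {
    private final List<String> items = new ArrayList<>();

    public void add(String item) {
        items.add(item);
    }

    public void addAll(List<String> more) {
        for (String item : more) {
            add(item); // self-use: dispatches to a subclass override, if any
        }
    }

    public int size() {
        return items.size();
    }
}

// Hypothetical subclass that tries to count insertions by overriding both
// methods. It looks reasonable, but it double counts.
class CountingStore extends BaseStore {
    private int added = 0;

    @Override
    public void add(String item) {
        added++;
        super.add(item);
    }

    @Override
    public void addAll(List<String> more) {
        added += more.size();
        super.addAll(more); // the base loop calls the overridden add() again
    }

    public int addedCount() {
        return added;
    }
}
```

Adding three items through addAll leaves three items in the store but an 
addedCount of six. A subclass written against a pure interface cannot hit this 
problem, because there is no inherited behavior whose internal call pattern it 
depends on.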

Milind, please file the jira for further discussions.

sanjay


