> That is a different motivation. The document talks about why you should use
> federation. I am asking about the motivation for supporting the code base
> while not using it. At least this is how I understand Allen's question and
> some of my colleagues'.
>

Namenode code is not changed at all. Datanode code changes to add the notion
of a block pool and a thread per NN. With a single NN, the datanode is
equivalent to the current datanode. If you argue that there should not be any
code change, I am not sure how features like this can be added to HDFS. There
is no change from the user's perspective or in the performance of the system,
and no additional complexity compared to the existing system.


> If you could put some numbers in the jira for the reference.
>
Will do.


>
> Also it is interesting to know whether there is a benefit in splitting
> the namespace. Can I e.g. do more getBlockLocations per second?
> This is one of the aspects of scaling, right?
>

I do not understand your question. This feature does not scale
getBlockLocations per second for a single NN. When you use many NNs, total
requests per second do scale for the entire cluster.

> As we developed this feature, some significant improvements have been made
> to the system - fast snapshots (snapshot time down from 1hr 45 mins to 1
> min!), fast startup, cleanup of storage, fixing multi threading issues in
> several places, decommissioning improvements etc.
>

> > This is a valid concern. Hence the single namenode configuration that most
> > installations run today will run as is. We put a lot of development and
> > testing effort into ensuring this.
> >
>
> I don't know what you mean by "as is". My experience with this word in real
> estate tells me it can be anything.
>

I used the word with the following meaning:
http://www.merriam-webster.com/dictionary/as%20is
— *as is*
*:* in the presently existing condition without modification
