Doug,

> 1. Can you please describe the significant advantages this approach has
> over a symlink-based approach?

Federation is complementary to the symlink approach. You could choose to
provide an integrated namespace using symlinks. However, client side mount
tables seem a better approach for several reasons:
# Unlike symbolic links, client side mount tables can go directly to the right
namenode based on configuration. This avoids unnecessary RPCs to the
namenodes to discover the target of a symlink.
# With mount tables, the unavailability of a namenode where a symbolic link
would otherwise be configured does not affect reaching the target.
# With mount tables, symbolic links need not be configured on every namenode
in the cluster, and future changes need not be propagated to multiple
namenodes. The mapping lives in a central configuration.
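
To make the first point concrete, here is a rough sketch of how a client could resolve a path against a mount table held in its own configuration, without asking any namenode. It is an illustration only, not the code in the branch: the class name, method names, and namenode URIs are all hypothetical, and it assumes a simple longest-prefix match on mount points.

```java
import java.util.TreeMap;

// Hypothetical sketch of a client side mount table; not the HDFS implementation.
class MountTableSketch {
    // Mount point -> namenode URI. In a real deployment this would be loaded
    // from the client configuration; here it is filled in by hand.
    private final TreeMap<String, String> mounts = new TreeMap<>();

    void addMount(String mountPoint, String namenodeUri) {
        mounts.put(normalize(mountPoint), namenodeUri);
    }

    // Resolve a path locally by the longest matching mount point.
    // No RPC to any namenode is needed to find the target.
    String resolve(String path) {
        String p = normalize(path);
        String best = null;
        for (String mount : mounts.keySet()) {
            boolean matches = mount.equals("/")
                    || p.equals(mount)
                    || p.startsWith(mount + "/");
            if (matches && (best == null || mount.length() > best.length())) {
                best = mount;
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("No mount point for " + path);
        }
        return mounts.get(best);
    }

    // Strip a trailing slash, except for the root path.
    private static String normalize(String path) {
        if (path.length() > 1 && path.endsWith("/")) {
            return path.substring(0, path.length() - 1);
        }
        return path;
    }

    public static void main(String[] args) {
        MountTableSketch table = new MountTableSketch();
        table.addMount("/user", "hdfs://nn1.example.com:8020"); // assumed URI
        table.addMount("/data", "hdfs://nn2.example.com:8020"); // assumed URI
        System.out.println(table.resolve("/user/alice/file.txt"));
        System.out.println(table.resolve("/data/logs"));
    }
}
```

Because the lookup happens entirely on the client, a namenode being down only affects paths mounted on that namenode.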

If a deployment still wants to use symbolic links, federation does not
preclude it.

> It seems to me that one could run multiple namenodes on separate boxes
> and run multiple datanode processes per storage box

There are several advantages to using a single datanode:
# When you have a large number of namenodes (say 20), the cost of running
separate datanodes, in terms of process resources such as memory, is huge.
# Disk I/O management and storage utilization are much better with a single
datanode, as it has a complete view of the storage.
# In the approach you are proposing, you have several clusters to manage.
With federation, all datanodes are in a single cluster with a single
configuration, which is operationally easier to manage.

> The patch modifies much of the logic of Hadoop's central component, upon
> which the performance and reliability of most other components of the
> ecosystem depend.

That is not true.

# Namenode is mostly unchanged in this feature.
# Read/write pipelines are unchanged.
# The changes are mainly in datanode:
#* the storage, FSDataset, Directory and Disk scanners now have another
level to incorporate block pool ID into the hierarchy. This is not a
significant change that should cause performance or stability concerns.
#* datanodes use a separate thread per NN, just like the existing thread
that communicates with NN.
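
As a rough illustration of the extra level, here is a sketch of a datanode-side block map keyed first by block pool ID and then by block ID. It is not the FSDataset code from the branch: the class, its methods, and the "BP-1"/"BP-2" pool IDs are hypothetical, and it assumes blocks are tracked only by their length.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of block storage with a block pool level; not FSDataset.
class BlockPoolStoreSketch {
    // blockPoolId -> (blockId -> length in bytes).
    // Before federation, the outer level would not exist.
    private final Map<String, Map<Long, Long>> pools = new HashMap<>();

    void addBlock(String blockPoolId, long blockId, long numBytes) {
        pools.computeIfAbsent(blockPoolId, id -> new HashMap<>())
             .put(blockId, numBytes);
    }

    // The same block ID may exist in different pools, so lookups need both keys.
    boolean hasBlock(String blockPoolId, long blockId) {
        Map<Long, Long> pool = pools.get(blockPoolId);
        return pool != null && pool.containsKey(blockId);
    }

    // Space used by one block pool, i.e. by one namenode's namespace.
    long usedBytes(String blockPoolId) {
        Map<Long, Long> pool =
                pools.getOrDefault(blockPoolId, Collections.emptyMap());
        long sum = 0;
        for (long numBytes : pool.values()) {
            sum += numBytes;
        }
        return sum;
    }

    // Space used across all pools: the single datanode sees all of its storage.
    long totalUsedBytes() {
        long sum = 0;
        for (String blockPoolId : pools.keySet()) {
            sum += usedBytes(blockPoolId);
        }
        return sum;
    }

    public static void main(String[] args) {
        BlockPoolStoreSketch store = new BlockPoolStoreSketch();
        store.addBlock("BP-1", 1001L, 64L); // assumed pool IDs and sizes
        store.addBlock("BP-2", 1001L, 128L);
        System.out.println(store.usedBytes("BP-1"));
        System.out.println(store.totalUsedBytes());
    }
}
```

The inner map is what a datanode already keeps today; the only structural change sketched here is the outer key.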

> Can you please tell me how this has been tested beyond unit tests?

As regards testing, we have passed 600+ tests. In Hadoop, these tests
are mostly integration tests and not pure unit tests.

While these tests have been extensive, we have also been testing this branch
for the last 4 months, with QA validation that reflects our production
environment. We have found the system to be stable and performing well, and
have not found any blockers with the branch so far.

HDFS-1052 has been open for more than a year now. I also sent an email about
this merge around 2 months ago. There are 90 subtasks that have been worked
on over the last couple of months under HDFS-1052. Given that there was enough
time to ask these questions, your email a day before I am planning to merge
the branch into trunk seems late!

-- 
Regards,
Suresh
