[
https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16152037#comment-16152037
]
He Tianyi commented on HDFS-10467:
----------------------------------
[~elgoiri], thanks for asking.
Our federated cluster has grown to 7000+ nodes this year. I can share some
lessons learned from running nnproxy in production:
* Data rebalancing between subclusters can be sped up with 'fastcopy' or a
similar method, which effectively reduces resource consumption when there is
intensive rebalance work.
* Isolation between subclusters is probably required for request forwarding on
a single router; otherwise an outage of any subcluster could also affect the
others (from the client's point of view) because of shared resources, e.g. the
thread pool (IPC handlers) and the client connection pool (IPC client). We did
this by implementing a fully nonblocking version of the proxy, which also uses
multiple client connections to forward requests (as HADOOP-13144 suggests).
* Global quota: we've disabled quota on each NameNode. Quota is computed by a
separate service that reads/tails the fsimage and edit log from all
subclusters, while nnproxy enforces it (for example, rejecting file creation
when usage exceeds the limit).
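
To make the global quota point more concrete, here is a minimal sketch of what
the enforcement side in the proxy could look like. This is not nnproxy's actual
code: the class QuotaEnforcer and the methods setLimit, updateUsage and
mayCreate are hypothetical names, and it assumes the external quota service
periodically pushes per-directory usage (in bytes) to the proxy.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical proxy-side quota check (illustrative only, not nnproxy's API). */
class QuotaEnforcer {
    // Directory -> configured limit in bytes (assumed to be set by an admin).
    private final Map<String, Long> limits = new ConcurrentHashMap<>();
    // Directory -> last usage reported by the external quota service.
    private final Map<String, Long> usage = new ConcurrentHashMap<>();

    void setLimit(String dir, long limitBytes) {
        limits.put(dir, limitBytes);
    }

    /** Assumed to be called when the quota service publishes new numbers. */
    void updateUsage(String dir, long usedBytes) {
        usage.put(dir, usedBytes);
    }

    /**
     * Returns false if the path itself or any ancestor directory has a limit
     * and its reported usage exceeds that limit; the proxy would then reject
     * the create request instead of forwarding it to a NameNode.
     */
    boolean mayCreate(String path) {
        String dir = path;
        while (true) {
            Long limit = limits.get(dir);
            if (limit != null && usage.getOrDefault(dir, 0L) > limit) {
                return false;
            }
            if (dir.equals("/")) {
                return true;
            }
            // Step up to the parent directory; stop at the root.
            int slash = dir.lastIndexOf('/');
            dir = (slash <= 0) ? "/" : dir.substring(0, slash);
        }
    }
}
```

Since usage arrives asynchronously from the fsimage/edit log tailer, a check
like this is only as fresh as the last update, so a directory can overshoot its
limit slightly before creates start being rejected.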
> Router-based HDFS federation
> ----------------------------
>
> Key: HDFS-10467
> URL: https://issues.apache.org/jira/browse/HDFS-10467
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: fs
> Affects Versions: 2.8.1
> Reporter: Íñigo Goiri
> Assignee: Íñigo Goiri
> Fix For: HDFS-10467
>
> Attachments: HDFS-10467.002.patch, HDFS-10467.PoC.001.patch,
> HDFS-10467.PoC.patch, HDFS Router Federation.pdf,
> HDFS-Router-Federation-Prototype.patch
>
>
> Add a Router to provide a federated view of multiple HDFS clusters.