[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577562#comment-16577562 ]

Hudson commented on HDFS-10467:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14752 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14752/])
HDFS-12790: [SPS]: Rebasing HDFS-10285 branch after HDFS-10467, (umamahesh: rev 9b83f94f35eb8cd20d9f3e0cbbeecb6ffb5b)
* (edit) hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestStoragePolicyCommands.java
* (add) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestStoragePolicySatisfyAdminCommands.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestStoragePolicySatisfier.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestStoragePolicySatisfierWithStripedFile.java

> Router-based HDFS federation
>
> Key: HDFS-10467
> URL: https://issues.apache.org/jira/browse/HDFS-10467
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: fs
> Affects Versions: 2.8.1
> Reporter: Íñigo Goiri
> Assignee: Íñigo Goiri
> Priority: Major
> Labels: RBF
> Fix For: 2.9.0, 3.0.0
>
> Attachments: HDFS Router Federation.pdf, HDFS-10467.002.patch, HDFS-10467.PoC.001.patch, HDFS-10467.PoC.patch, HDFS-Router-Federation-Prototype.patch
>
> Add a Router to provide a federated view of multiple HDFS clusters.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199097#comment-16199097 ]

Hudson commented on HDFS-10467:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13061 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13061/])
HADOOP-14939. Update project release notes with HDFS-10467 for 3.0.0. (wang: rev 132cdac0ddb5c38205a96579a23b55689ea5a8e3)
* (edit) hadoop-project/src/site/markdown/index.md.vm
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16197695#comment-16197695 ]

Íñigo Goiri commented on HDFS-10467:
------------------------------------

Thanks [~andrew.wang]. I updated the release notes for this JIRA and created HADOOP-14939 updating the {{index.md.vm}}. Not my finest piece of literature so feel free to suggest comments.
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16197575#comment-16197575 ]

Andrew Wang commented on HDFS-10467:
------------------------------------

Thanks for working on this Inigo. Do you mind adding a release note to this JIRA? We should also update hadoop-project/src/site/markdown/index.md.vm with links to the docs.
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16195522#comment-16195522 ]

Hudson commented on HDFS-10467:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13045 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13045/])
HDFS-12223. Rebasing HDFS-10467. Contributed by Inigo Goiri. (inigoiri: rev 0ec82b8cdfaaa5f23d1a0f7f7fb8c9187c5e309b)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java
HDFS-12312. Rebasing HDFS-10467 (2). Contributed by Inigo Goiri. (inigoiri: rev 346c9fce43ebf6a90fc56e0dc7c403f97cc5391f)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs
HDFS-12430. Rebasing HDFS-10467 After HDFS-12269 and HDFS-12218. (inigoiri: rev 1f06b81ecb14044964176dd16fafaa0ee96bfe3d)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java
HDFS-12580. Rebasing HDFS-10467 after HDFS-12447. Contributed by Inigo (inigoiri: rev 6c69e23dcdf1cdbddd47bacdf2dace5c9f06e3ad)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16195513#comment-16195513 ]

Íñigo Goiri commented on HDFS-10467:
------------------------------------

The vote passed, so I merged HDFS-10467 into trunk and branch-3.0. [~andrew.wang], I compiled and tested both branches and they seem to work. Let me know if there are any issues. I may well have forgotten to close the JIRA properly (target versions and so on), so feel free to clean that up if so.

With this, I complete HDFS-10467 with all sub-tasks closed and move to phase 2 in HDFS-12615. Thanks everybody for the comments!
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193982#comment-16193982 ]

Íñigo Goiri commented on HDFS-10467:
------------------------------------

The [vote for merging|http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201709.mbox/%3CCAB1dGgoyfOM5ydbHrFQSbxq9Q3Kt-B1Xz3qFiXAe%2BNWyrbc3-Q%40mail.gmail.com%3E] is finishing tomorrow. At this point, the only thing open is the issue with the names described in HDFS-12577. The security patch can be moved to v2 (I'll open a new umbrella with the remaining issues).
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172141#comment-16172141 ]

Íñigo Goiri commented on HDFS-10467:
------------------------------------

There are a couple of improvements for HDFS-12273 that would go into separate JIRAs. As they are not required for the merge, I'm thinking of adding them outside this umbrella with the suffix RBF (for Router-Based Federation).
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16152037#comment-16152037 ]

He Tianyi commented on HDFS-10467:
----------------------------------

[~elgoiri], thanks for asking. Our federated cluster has grown to 7000+ nodes this year. I can share some lessons learned in production with nnproxy:
* Data rebalancing between subclusters can be sped up with 'fastcopy' or a similar method, which effectively reduces resource consumption when there is intensive rebalancing work.
* Isolation between subclusters for request forwarding on a single router is probably required; otherwise an outage of any subcluster could also affect the others (from the client's point of view) because of shared resources, i.e. the thread pool (IPC handlers) and the client connection pool (IPC client). We did this by implementing a fully nonblocking version of the proxy, which also uses multiple client connections to forward requests (as HADOOP-13144 suggests).
* Global quota: we've disabled quota on each NameNode. Quota is computed by a separate service that reads/tails the fsimage and edit log from all subclusters, while nnproxy enforces the quota (for example, refusing to create a file when usage exceeds the limit).
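The isolation point above can be illustrated with a much simpler technique than the fully nonblocking proxy described in the comment: a per-subcluster cap on in-flight requests (a "bulkhead"), so that one slow subcluster cannot consume every shared handler. This is only a sketch under assumptions; the class {{SubclusterBulkhead}}, its methods, and the nameservice ids are hypothetical and are not part of nnproxy or the Router code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical sketch: caps concurrent forwarded requests per subcluster. */
final class SubclusterBulkhead {
  // One semaphore per nameservice id; each one bounds in-flight requests.
  private final Map<String, Semaphore> permits = new ConcurrentHashMap<>();
  private final int maxInFlight;

  SubclusterBulkhead(int maxInFlight) {
    this.maxInFlight = maxInFlight;
  }

  /** Runs the request if the subcluster has a free slot, else fails fast. */
  <T> T call(String nsId, Supplier<T> request) {
    Semaphore slot =
        permits.computeIfAbsent(nsId, k -> new Semaphore(maxInFlight));
    if (!slot.tryAcquire()) {
      // Reject instead of queueing, so other subclusters keep their handlers.
      throw new RejectedExecutionException("Subcluster " + nsId + " is saturated");
    }
    try {
      return request.get();
    } finally {
      slot.release();
    }
  }

  /** Free slots for a subcluster (the full budget if it was never used). */
  int availablePermits(String nsId) {
    Semaphore slot = permits.get(nsId);
    return slot == null ? maxInFlight : slot.availablePermits();
  }

  public static void main(String[] args) {
    SubclusterBulkhead bulkhead = new SubclusterBulkhead(2);
    System.out.println(bulkhead.call("ns0", () -> "forwarded to ns0"));
  }
}
```

Failing fast is a design choice made for this sketch; a real proxy might queue with a timeout instead, at the cost of holding a handler thread while it waits.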
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151369#comment-16151369 ]

Íñigo Goiri commented on HDFS-10467:
------------------------------------

[~He Tianyi], now that we are starting to have most of the patches ready and are discussing what it would take to merge the branch into trunk, would you mind taking a look at the code? Let me know if there is any feature from NNProxy you think should be covered here.
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983422#comment-15983422 ]

Inigo Goiri commented on HDFS-10467:
------------------------------------

[~fabbri], the tasks in the current JIRA are the basic ones to get the Router-based federation working. There are a bunch more that we can add:
* Web interface
* Metrics system
* Router heartbeating
* Router safe mode
* Rebalancing

All of these are already implemented and running in our clusters. A version from a couple of months ago is available at: https://github.com/goiri/hadoop/tree/branch-2.6.1-hdfs-router (I can update it with the latest if needed.)

At this point it is a matter of reviewing the code in the subtasks. It's hard to give a time frame because it depends on getting reviews, so any reviews on the subtasks are highly appreciated.
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983390#comment-15983390 ]

Aaron Fabbri commented on HDFS-10467:
-------------------------------------

Thanks for the update [~elgoiri]. I'm trying to get a feel for the overall progress of this. Are there any work items that are not already covered in the subtasks here? Any other details on how much work is left, or when you expect to have basic features completed, is welcomed.
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933260#comment-15933260 ]

Hadoop QA commented on HDFS-10467:
----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 17s{color} | {color:red} HDFS-10467 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-10467 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12815804/HDFS-10467.PoC.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/18772/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15727451#comment-15727451 ]

Hadoop QA commented on HDFS-10467:
----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 11s{color} | {color:red} HDFS-10467 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-10467 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12815804/HDFS-10467.PoC.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/17784/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15727440#comment-15727440 ]

Inigo Goiri commented on HDFS-10467:
------------------------------------

Created a fork based on branch-2.6.1: https://github.com/goiri/hadoop/tree/branch-2.6.1-hdfs-router
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665202#comment-15665202 ]

Inigo Goiri commented on HDFS-10467:
------------------------------------

I will create a fork on GitHub with the code we are running in production for people to try. Which version should I use as the base: trunk, 2.8, or branch-2?
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15514336#comment-15514336 ]

Jason Kace commented on HDFS-10467:
-----------------------------------

[~subru], thank you for the feedback!

1) Using jcache for the query caches is a good idea. One of my TODOs is to increase the scalability of the caches and/or to prune older entries; jcache seems to handle these well. I'll check out YARN to see if there is a cache manager we can reuse. For the internal caches of state store records, I'm not convinced jcache provides any benefits, as these caches are closely synchronized with internal data structures such as the tree representation of the mount table.

2) I'll work on Curator for ZK. It will simplify the codebase and connection management.

3) I'll add versioning. For HDFS federation, there are multiple APIs implemented in different classes; I recommend that each of these be versioned (i.e. Registration, MountTable, RouterState, Rebalancer, etc.). The driver interface is separate from the interface APIs and should also be versioned. Each of the data records and/or API request/response objects could potentially be versioned, but I think it is best to keep their version tied to the interface API, as there is a 1:many relationship between an interface and its objects.
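For the "prune older entries" TODO in point 1, a minimal sketch of a size-bounded query cache is shown below. It uses only {{java.util.LinkedHashMap}} in access order rather than jcache, so it is a stand-in for the idea and not the approach the thread settled on. The class name {{LruQueryCache}} and the keys in the example are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: a bounded cache that evicts the least recently used entry. */
final class LruQueryCache<K, V> {
  private final Map<K, V> entries;

  LruQueryCache(final int maxEntries) {
    // accessOrder=true moves an entry to the end of the iteration order on get().
    this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each put(); returning true drops the eldest entry.
        return size() > maxEntries;
      }
    };
  }

  // LinkedHashMap is not thread-safe, and get() mutates the access order,
  // so every operation is synchronized.
  synchronized V get(K key) {
    return entries.get(key);
  }

  synchronized void put(K key, V value) {
    entries.put(key, value);
  }

  synchronized int size() {
    return entries.size();
  }

  public static void main(String[] args) {
    LruQueryCache<String, String> cache = new LruQueryCache<>(2);
    cache.put("/data/a", "ns0");
    cache.put("/data/b", "ns1");
    cache.get("/data/a");        // touch /data/a so /data/b becomes the eldest
    cache.put("/data/c", "ns0"); // exceeds the bound and evicts /data/b
    System.out.println(cache.get("/data/b") == null);
  }
}
```

A single lock is enough for a sketch; a cache on a hot request path would likely want a concurrent implementation, which is one of the reasons a library such as jcache was suggested.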
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15514209#comment-15514209 ]

Inigo Goiri commented on HDFS-10467:
------------------------------------

[~subru], I agree on {{RecordFactory}}. I just created HADOOP-13642 to move this from YARN to Common. Thanks for the feedback!
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15511886#comment-15511886 ]

Hadoop QA commented on HDFS-10467:
----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s{color} | {color:red} HDFS-10467 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-10467 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12815804/HDFS-10467.PoC.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/16828/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15511831#comment-15511831 ]

Subru Krishnan commented on HDFS-10467:
---------------------------------------

Thanks [~jakace] and [~goiri] for the refactored patch. I made a quick pass in the context of HADOOP-13378 and I think that we can represent the YARN {{FederationStateStore}} using the generic {{StateStoreDriver}} you guys have defined. Personally I prefer the push mechanism we have in YARN-3671, as it's much simpler than the pull mechanism proposed here, though I do agree both achieve the same result.

A couple of comments based on my quick scan:
* We should add versioning to the generic {{StateStoreDriver}}. Refer to [FederationStateStore|https://github.com/apache/hadoop/blob/YARN-2915/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/store/FederationStateStore.java].
* Use _jcache_ instead of writing custom key-caches based on _ConcurrentHashMaps_. In fact, I feel we can refactor the (jcache-based) cache in [FederationStateStoreFacade|https://github.com/apache/hadoop/blob/YARN-2915/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/utils/FederationStateStoreFacade.java] and use it across both efforts.
* Lastly, I would suggest using [Curator|http://curator.apache.org/] for the *ZooKeeper* implementation, as we have moved to it in YARN (YARN-4438 and follow-up work).
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403257#comment-15403257 ]

Inigo Goiri commented on HDFS-10467:
------------------------------------

Regarding the rebalancing operations, we are currently proposing to disallow write accesses from the Routers. The problem is that we then also have to disallow direct accesses to the Namenodes to prevent writes at that level. For this reason, we could leverage the concept of immutable folders/files from HDFS-3154 and, more recently, HDFS-7568. I'm not sure how likely those efforts are to move forward, though.
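The Router-side half of this proposal (refusing writes under a subtree while it is being rebalanced) could look roughly like the sketch below. The class {{ReadOnlyMountGuard}} and its methods are hypothetical names used for illustration only. As the comment notes, a check like this does nothing for clients that talk to a Namenode directly.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: Router-side write guard for subtrees being rebalanced. */
final class ReadOnlyMountGuard {
  // Normalized paths of subtrees that are currently read-only.
  private final Set<String> readOnly = ConcurrentHashMap.newKeySet();

  void markReadOnly(String mountPoint) {
    readOnly.add(normalize(mountPoint));
  }

  void clearReadOnly(String mountPoint) {
    readOnly.remove(normalize(mountPoint));
  }

  /** True if no read-only subtree contains the given path. */
  boolean isWriteAllowed(String path) {
    String p = normalize(path);
    for (String mount : readOnly) {
      if (isUnder(p, mount)) {
        return false;
      }
    }
    return true;
  }

  // "/data" covers "/data" and "/data/x" but not "/database".
  private static boolean isUnder(String path, String mount) {
    return mount.equals("/") || path.equals(mount) || path.startsWith(mount + "/");
  }

  // Drops a trailing slash so "/data/" and "/data" compare equal.
  private static String normalize(String path) {
    if (path.length() > 1 && path.endsWith("/")) {
      return path.substring(0, path.length() - 1);
    }
    return path;
  }

  public static void main(String[] args) {
    ReadOnlyMountGuard guard = new ReadOnlyMountGuard();
    guard.markReadOnly("/data/");
    System.out.println(guard.isWriteAllowed("/data/file"));
    System.out.println(guard.isWriteAllowed("/database/file"));
  }
}
```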
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15382547#comment-15382547 ]

Inigo Goiri commented on HDFS-10467:

Thanks [~vinayrpet] for the comments.

bq. Why can't we use DFSClient with HA here directly? Like how current clients connect to HA, each subcluster can be connected to using an HA-configured DFSClient. DFSClient itself will handle switching between NNs in case of failover. This DFSClient can be kept as-is and re-used later for next requests on the same subcluster. So it should know the current active namenode.

To provide a fully federated view, we think it is best to track the state of all the Namenodes. In this way, we can expose the federation view in the web UI. Given that we have this information, we can use it as hints for the clients. Actually, there was some discussion in HDFS-7858 about using the information in ZK to reach the Active namenode faster. This was discarded because of the additional complexity. I think this might be a good opportunity to go in that direction. Our current implementation (using the Active hint) is faster than the regular failover and produces less load than the hedging approach.

bq. How about supporting a default mount point for '/' as well? It could also be optional, instead of rejecting requests for paths which don't match any other mount point. There might be some use cases where there are multiple first-level directories other than the mounted paths, which could go under /.

Yes, this is a common use case. We already support a default / set using {{dfs.router.default.nameserviceId}}. We may want to make it more explicit/clear.
> Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.7.2 >Reporter: Inigo Goiri >Assignee: Inigo Goiri > Attachments: HDFS Router Federation.pdf, HDFS-10467.PoC.001.patch, > HDFS-10467.PoC.patch, HDFS-Router-Federation-Prototype.patch > > > Add a Router to provide a federated view of multiple HDFS clusters. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
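The default mount point behaviour discussed in the reply above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Router's actual code: the class {{MountTableSketch}}, its methods, and the nameservice IDs are hypothetical, and the constructor argument merely stands in for the value of {{dfs.router.default.nameserviceId}}. It assumes longest-prefix matching against the mount points and falls back to the default nameservice when nothing matches.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch; not the Router's actual mount table resolver. */
final class MountTableSketch {
  // Mount point -> nameservice ID. All entries are illustrative.
  private final Map<String, String> mounts = new TreeMap<>();
  // Stand-in for the value configured in dfs.router.default.nameserviceId.
  private final String defaultNameservice;

  MountTableSketch(String defaultNameservice) {
    this.defaultNameservice = defaultNameservice;
  }

  void addMount(String mountPoint, String nameservice) {
    mounts.put(mountPoint, nameservice);
  }

  /** Returns the nameservice of the longest matching mount point, or the default. */
  String resolve(String path) {
    String best = null;
    for (String mount : mounts.keySet()) {
      // Match whole path components only, so /database does not match /data.
      boolean matches = path.equals(mount) || path.startsWith(mount + "/");
      if (matches && (best == null || mount.length() > best.length())) {
        best = mount;
      }
    }
    return best == null ? defaultNameservice : mounts.get(best);
  }

  public static void main(String[] args) {
    MountTableSketch table = new MountTableSketch("ns0");
    table.addMount("/data", "ns1");
    table.addMount("/data/logs", "ns2");
    System.out.println(table.resolve("/data/logs/app.log"));
    System.out.println(table.resolve("/data/a"));
    System.out.println(table.resolve("/tmp/x"));
  }
}
```

With this shape, paths outside every mount point are served by the default nameservice instead of being rejected, which is the behaviour the question asks for.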
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15382535#comment-15382535 ]

Inigo Goiri commented on HDFS-10467:

[~mingma], thank you for the comments. A few answers/clarifications.

bq. Support for mergeFs HADOOP-8298. We should be able to extend the design to support this. There might be some issues around how to provision a new sub folder (which namespace should own that) and how it works with rebalancer. This could be a good addition for future work section.

In the prototype we actually started this, but we haven't tested it yet. In addition, I think merge points go somewhat in the direction of N-Fly in HADOOP-12077. I think we should support both of them together. I'll add the reference explicitly to the document.

bq. Handling of inconsistent state. Given routers cache which namenodes are active, the state could be different from the actual namenode at that moment. Thus routers might get {{StandbyException}} and need to retry on another namenode. If so, does it mean the routers should leverage ipc {{FailoverOnNetworkExceptionRetry}} or use {{DFSClient}} with hint for active namenode?

In the current implementation we use the client with the hint. We first try the Namenode marked as active in the State Store and we capture {{StandbyExceptions}} etc. This is in HDFS-10629 in {{RouterRpcServer#invokeMethod()}}.

bq. Soft state vs hard state. While subcluster active namenode machine and load/space are soft state that can be reconstructed from namenodes, mount table is hard state that needs to be persisted. Is there any benefit separating them out to use different state stores as they have different persistence requirements, access patterns (mount table doesn't change much while load/space update is frequent) and admin interface? For example, admin might want to update mount table on demand; but not load/space state.

True, this is easy to implement right now. We should see if people are OK with the additional complexity of configuring two backends. I guess we can discuss in HDFS-10630.

bq. Usage of subcluster load/space state. Is it correct that the only consumer of subcluster's load/space state is the rebalancer? I imagine initially we would run rebalancer manually. For that, the rebalancer can just pull subcluster's load/space state from namenodes on demand. Then we don't have to store subcluster load/space state in state store.

Correct. Right now we are not even storing load/space data in the State Store. Actually, in our Rebalancer prototypes, we are collecting the space externally. For now, we will keep the usage state out of the State Store and, once we go into the Rebalancer, we can discuss what's best.

bq. Admin's modification of mount table. Besides rebalancer, admin might want to update mount table during cluster initial setup as well as addition of new namespace with new mount entry. If we continue to use mounttable.xml, then admins can push the update the same way as viewFs setup. If we use ZK store, then we need to provide tools to update state store.

Right now, our admin tool goes through the Routers to modify the mount table. We could also go directly to the State Store. I just created HDFS-10646 to develop this.

bq. What is the performance optimization in your latest patch, based on async RPC client?

Our current optimization is based on being able to use more sockets. The current client has a single thread pool per connection and we were limited by this. We haven't explored async extensively and we are not yet sure it will give us the performance we need. We need to explore this. I'll update the document accordingly.
> Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.7.2 >Reporter: Inigo Goiri >Assignee: Inigo Goiri > Attachments: HDFS Router Federation.pdf, HDFS-10467.PoC.001.patch, > HDFS-10467.PoC.patch, HDFS-Router-Federation-Prototype.patch > > > Add a Router to provide a federated view of multiple HDFS clusters. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
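The "client with the hint" behaviour described in the reply above (try the Namenode marked as active in the State Store first, catch standby errors, then try the others) can be sketched as below. This is a minimal, self-contained illustration and not the code in {{RouterRpcServer#invokeMethod()}}: {{ActiveHintSketch}}, {{StandbyStub}}, and the Namenode names are hypothetical, and {{StandbyStub}} is only a local stand-in for Hadoop's {{StandbyException}} so that the example compiles without Hadoop on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of "try the hinted active Namenode first". */
final class ActiveHintSketch {

  /** Local stand-in for Hadoop's StandbyException (assumption for this sketch). */
  static final class StandbyStub extends RuntimeException {
    StandbyStub(String namenode) {
      super(namenode + " is in standby state");
    }
  }

  /**
   * Invokes the call on the hinted Namenode first, then on the remaining
   * Namenodes in their configured order, skipping any that report standby.
   */
  static <T> T invoke(List<String> namenodes, String activeHint,
      Function<String, T> call) {
    List<String> order = new ArrayList<>();
    if (activeHint != null && namenodes.contains(activeHint)) {
      order.add(activeHint);
    }
    for (String nn : namenodes) {
      if (!nn.equals(activeHint)) {
        order.add(nn);
      }
    }
    StandbyStub last = null;
    for (String nn : order) {
      try {
        return call.apply(nn);
      } catch (StandbyStub e) {
        last = e; // stale hint or standby node: try the next one
      }
    }
    if (last != null) {
      throw last;
    }
    throw new IllegalStateException("No Namenodes configured");
  }

  public static void main(String[] args) {
    List<String> namenodes = Arrays.asList("nn1", "nn2");
    // The hint is stale here: nn1 is assumed to have failed over to nn2.
    String result = invoke(namenodes, "nn1", nn -> {
      if (nn.equals("nn1")) {
        throw new StandbyStub(nn);
      }
      return "handled by " + nn;
    });
    System.out.println(result);
  }
}
```

When the hint is correct, the request costs a single call; a stale hint costs one extra attempt per standby Namenode, which is the trade-off the thread contrasts with regular failover and with hedged requests.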
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15382072#comment-15382072 ]

Vinayakumar B commented on HDFS-10467:
--

Hi, this will be a nice addition to Federation. Coincidentally, I was also working on a similar feature, which has almost the same design. The design looks great; I have some comments though.

{quote}3.3.2 Namenode heartbeat HA
For high availability and flexibility, multiple Routers can monitor the same Namenode and heartbeat the information to the State Store{quote}
{quote}If a Router tries to contact the active Namenode but is unable to do it, the Router will try the other Namenodes in the subcluster.{quote}

Why can't we use DFSClient with HA here directly? Like how current clients connect to HA, each subcluster can be connected to using an HA-configured DFSClient. DFSClient itself will handle switching between NNs in case of failover. This DFSClient can be kept as-is and re-used later for the next requests on the same subcluster. So it should know the current active namenode. By doing this, there will be no need for a heartbeat between Router and NameNode to monitor the NameNode status.

bq. MountTable

How about supporting a default mount point for '/' as well? It could also be optional, instead of rejecting requests for paths which don't match any other mount point. There might be some use cases where there are multiple first-level directories other than the mounted paths, which could go under /. This is similar to Linux's file system mounts.

I will try to review the code on sub jiras.
> Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.7.2 >Reporter: Inigo Goiri >Assignee: Inigo Goiri > Attachments: HDFS Router Federation.pdf, HDFS-10467.PoC.001.patch, > HDFS-10467.PoC.patch, HDFS-Router-Federation-Prototype.patch > > > Add a Router to provide a federated view of multiple HDFS clusters. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15381726#comment-15381726 ]

Ming Ma commented on HDFS-10467:

[~elgoiri], nice work! Here are a couple more questions about the design. I will post code review questions in sub jiras.

* Support for mergeFs HADOOP-8298. We should be able to extend the design to support this. There might be some issues around how to provision a new sub folder (which namespace should own that) and how it works with rebalancer. This could be a good addition for future work section.
* Handling of inconsistent state. Given routers cache which namenodes are active, the state could be different from the actual namenode at that moment. Thus routers might get {{StandbyException}} and need to retry on another namenode. If so, does it mean the routers should leverage ipc {{FailoverOnNetworkExceptionRetry}} or use DFSClient with hint for active namenode?
* Soft state vs hard state. While subcluster active namenode machine and load/space are soft state that can be reconstructed from namenodes, mount table is hard state that needs to be persisted. Is there any benefit separating them out to use different state stores as they have different persistence requirements, access patterns (mount table doesn't change much while load/space update is frequent) and admin interface? For example, admin might want to update mount table on demand; but not load/space state.
* Usage of subcluster load/space state. Is it correct that the only consumer of subcluster's load/space state is the rebalancer? I imagine initially we would run rebalancer manually. For that, the rebalancer can just pull subcluster's load/space state from namenodes on demand. Then we don't have to store subcluster load/space state in state store.
* Admin's modification of mount table. Besides rebalancer, admin might want to update mount table during cluster initial setup as well as addition of new namespace with new mount entry. If we continue to use mounttable.xml, then admins can push the update the same way as viewFs setup. If we use ZK store, then we need to provide tools to update state store.
* What is the performance optimization in your latest patch, based on async RPC client?

> Router-based HDFS federation
>
> Key: HDFS-10467
> URL: https://issues.apache.org/jira/browse/HDFS-10467
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: fs
> Affects Versions: 2.7.2
> Reporter: Inigo Goiri
> Assignee: Inigo Goiri
> Attachments: HDFS Router Federation.pdf, HDFS-10467.PoC.001.patch, HDFS-10467.PoC.patch, HDFS-Router-Federation-Prototype.patch
>
> Add a Router to provide a federated view of multiple HDFS clusters.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15377362#comment-15377362 ] Jing Zhao commented on HDFS-10467: -- Assigned the jira to [~elgoiri]. bq. Probably, it's a good idea to create a new branch for this effort. +1. I've created the feature branch HDFS-10467. Please feel free to use it for next step development. > Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.7.2 >Reporter: Inigo Goiri >Assignee: Inigo Goiri > Attachments: HDFS Router Federation.pdf, HDFS-10467.PoC.001.patch, > HDFS-10467.PoC.patch, HDFS-Router-Federation-Prototype.patch > > > Add a Router to provide a federated view of multiple HDFS clusters. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376170#comment-15376170 ]

Inigo Goiri commented on HDFS-10467:

After checking the code, I think there might be a bunch of overlaps between this work and YARN-2915. I'd like to explore what we could move into Hadoop commons to manage a federated space. I would probably open a new JIRA for that.

In addition, given the feedback collected during the last few weeks, it seems like the community is OK with going in this direction, so I'd like to start moving the review process forward. To simplify the review, I propose to convert this JIRA into an umbrella and split the current patch into smaller subtasks. For now, I would like to start with:
# Minimum Router
# State Store interface
# ZooKeeper State Store implementation

We can add more tasks if people think this is the way to go. Probably, it's a good idea to create a new branch for this effort.

Thoughts? Opinions?

> Router-based HDFS federation
>
> Key: HDFS-10467
> URL: https://issues.apache.org/jira/browse/HDFS-10467
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: fs
> Affects Versions: 2.7.2
> Reporter: Inigo Goiri
> Attachments: HDFS Router Federation.pdf, HDFS-10467.PoC.001.patch, HDFS-10467.PoC.patch, HDFS-Router-Federation-Prototype.patch
>
> Add a Router to provide a federated view of multiple HDFS clusters.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373984#comment-15373984 ]

Inigo Goiri commented on HDFS-10467:

This optimization significantly increases the performance of the Router.

> Router-based HDFS federation
>
> Key: HDFS-10467
> URL: https://issues.apache.org/jira/browse/HDFS-10467
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: fs
> Affects Versions: 2.7.2
> Reporter: Inigo Goiri
> Attachments: HDFS Router Federation.pdf, HDFS-10467.PoC.001.patch, HDFS-10467.PoC.patch, HDFS-Router-Federation-Prototype.patch
>
> Add a Router to provide a federated view of multiple HDFS clusters.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359625#comment-15359625 ]

Hadoop QA commented on HDFS-10467:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 25s | Docker mode activated. |
| 0 | shelldocs | 0m 1s | Shelldocs was not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 15 new or modified test files. |
| 0 | mvndep | 0m 41s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 43s | trunk passed |
| +1 | compile | 7m 14s | trunk passed |
| +1 | checkstyle | 1m 45s | trunk passed |
| +1 | mvnsite | 1m 56s | trunk passed |
| +1 | mvneclipse | 0m 31s | trunk passed |
| +1 | findbugs | 3m 8s | trunk passed |
| +1 | javadoc | 1m 45s | trunk passed |
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 29s | the patch passed |
| +1 | compile | 6m 49s | the patch passed |
| +1 | cc | 6m 49s | the patch passed |
| -1 | javac | 6m 49s | root generated 3 new + 706 unchanged - 2 fixed = 709 total (was 708) |
| -0 | checkstyle | 1m 48s | root: The patch generated 478 new + 1183 unchanged - 5 fixed = 1661 total (was 1188) |
| +1 | mvnsite | 2m 1s | the patch passed |
| +1 | mvneclipse | 0m 31s | the patch passed |
| -1 | shellcheck | 0m 12s | The patch generated 4 new + 74 unchanged - 1 fixed = 78 total (was 75) |
| -1 | whitespace | 0m 1s | The patch 36 line(s) with tabs. |
| +1 | xml | 0m 2s | The patch has no ill-formed XML file. |
| -1 | findbugs | 2m 7s | hadoop-hdfs-project/hadoop-hdfs generated 37 new + 0 unchanged - 0 fixed = 37 total (was 0) |
| -1 | javadoc | 1m 1s | hadoop-hdfs-project_hadoop-hdfs generated 4 new + 7 unchanged - 0 fixed = 11 total (was 7) |
| +1 | unit | 8m 48s | hadoop-common in the patch passed. |
| -1 | unit | 60m 50s | hadoop-hdfs in the patch failed. |
| -1 | asflicense | 0m 28s | The patch generated 11 ASF License warnings. |
| | | 114m 18s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
| | org.apache.hadoop.hdfs.server.federation.locator.PathTreeNode.toString(int) concatenates strings using + in a loop At PathTreeNode.java:in a loop At PathTreeNode.java:[line 130] |
| | Unread field:StateStoreMetrics.java:[line 56] |
| | Synchronization performed on java.util.concurrent.CopyOnWriteArrayList in
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15332455#comment-15332455 ]

Inigo Goiri commented on HDFS-10467:

[~zhz], thanks for the feedback. Some clarifications to your comments:
# We considered modifying ViewFs to check a remote and centralized mount table (I think this option is pretty much what you propose, right?). We didn't go this route for a couple of reasons: (1) modifications to the client, and (2) challenging rebalance. In addition, with our approach we get some side advantages like a unified view of the federation, a Router to isolate the NameNodes from the clients, and better HA management.
# We haven't gone into hard-linking of DNs but that could be an improvement to the DistCp approach. We are open to improvements there but it might imply changes to the NNs.
# Our current implementation of the Subcluster Rebalancer is a tool similar to the regular Rebalancer and is also manually triggered. Right now, it's in a separate package; I can post a patch just for it. Our ultimate goal is to have some service that monitors the subclusters and triggers the proper Subcluster Rebalancer operation (this is work in progress).
# In our environment, we co-locate with other services (related to YARN-5215). I think this is orthogonal to the rebalancing but we can always go into that.
# The rebalancing itself is the most open part at this point. We've been targeting a tool that supports as many options as possible and lets the admin decide. For now, we support both locking and not locking.
# At some point we considered NN-level locking. Actually, [~jira.shegalov] had a couple of proposals for this based on permissions. We can refine this over time and maybe even implement locking at NN level.
# Regarding the rebalancing protocol, as I said, we are targeting to make it as broad as possible and allow the admin to pick their options.
* I think it'd be better to support rebalancing of different subtrees at the same time. Only rebalancing within a subtree that is under rebalancing would be disallowed. We can always add options for that.
* Again, this is an option we added based on internal feedback; the Subcluster Rebalancer has an option to wait or not. The Router membership is in the State Store and it's done by the Router; this is already in the PoC patch. And yes, the main reason to do this is the caching of the mount table. Having the Router membership is also useful from an administration point of view to see the whole status of the federation.

In general, I think we should start a separate effort for the Subcluster Rebalancer as it has many design choices that can be changed. Obviously, we also need to transform this into an umbrella; right now it is too big. If people are positive about this effort, we should start discussing ways to split it.

> Router-based HDFS federation
>
> Key: HDFS-10467
> URL: https://issues.apache.org/jira/browse/HDFS-10467
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: fs
> Affects Versions: 2.7.2
> Reporter: Inigo Goiri
> Attachments: HDFS Router Federation.pdf, HDFS-10467.PoC.patch, HDFS-Router-Federation-Prototype.patch
>
> Add a Router to provide a federated view of multiple HDFS clusters.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
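The idea in the reply above of allowing different subtrees to be rebalanced at the same time, while disallowing rebalancing inside a subtree that is already being rebalanced, could look roughly like the sketch below. Everything here is an assumption for illustration: {{SubtreeRebalanceLocks}} and its methods are hypothetical names, the state is kept in memory rather than in the State Store, and the sketch additionally rejects a request for an ancestor of an active subtree, which the thread does not spell out.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory sketch of per-subtree rebalancing admission. */
final class SubtreeRebalanceLocks {
  // Subtrees that currently have a rebalancing operation in progress.
  private final Set<String> active = new HashSet<>();

  /** True if path equals ancestor or lies underneath it (whole components only). */
  private static boolean isSameOrUnder(String ancestor, String path) {
    String prefix = ancestor.endsWith("/") ? ancestor : ancestor + "/";
    return path.equals(ancestor) || path.startsWith(prefix);
  }

  /** Starts rebalancing a subtree unless it overlaps one already in progress. */
  synchronized boolean tryStart(String subtree) {
    for (String other : active) {
      // Assumption: both nesting directions count as an overlap.
      if (isSameOrUnder(other, subtree) || isSameOrUnder(subtree, other)) {
        return false;
      }
    }
    active.add(subtree);
    return true;
  }

  /** Marks the rebalancing of a subtree as finished or aborted. */
  synchronized void finish(String subtree) {
    active.remove(subtree);
  }

  public static void main(String[] args) {
    SubtreeRebalanceLocks locks = new SubtreeRebalanceLocks();
    System.out.println(locks.tryStart("/data/a"));   // disjoint subtrees may proceed
    System.out.println(locks.tryStart("/data/b"));
    System.out.println(locks.tryStart("/data/a/x")); // nested in an active subtree
    locks.finish("/data/a");
    System.out.println(locks.tryStart("/data/a/x"));
  }
}
```

A real implementation would also have to deal with multiple Routers and with clients that bypass the Routers, which is exactly the concern raised about subtree locking in the review comments.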
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15332129#comment-15332129 ]

Zhe Zhang commented on HDFS-10467:
--

I read more about the design and the patch. It looks really interesting. Great work here, [~elgoiri]. Below are some questions and comments:
# Have you considered and compared with the option where the client first checks with the Router to get the NN address, before doing actual RPCs? Or where the client directly checks the mount table at the StateStore to get the NN address?
# Have you considered using hard-linking on DNs for rebalancing?
# Just to clarify, in the posted design and patch, is the Subcluster Rebalancer a tool that should always be manually started? Or is some form of automatic rebalancing in scope? In the patch, which component / class contains the logic of the Rebalancer? The {{Rebalancer}} interface doesn't look like it.
# bq. We may also find (4) scenarios where too much load or high space requirements in a subcluster start to interfere with the primary tenants of the subcluster.
What are "primary tenants" in this context? Non-Hadoop workloads running on the same physical nodes?
# Locking the mount entry during rebalancing sounds too disruptive to applications. Alternatively, we can abort the rebalancing when there is an incoming write? Coupled with the 5.2.1 Precondition, the chances of aborted rebalancing shouldn't be too high.
# Locking a mount point is a little tricky. Technically, an HDFS client has full control of the local config, and can be configured to talk directly to the NN of a subtree. In a production environment, this could be a legacy config file, or a temporary workaround to bypass the router. This could lead to data corruption. Not sure if we should consider adding subtree locking in HDFS (any previous discussions / JIRAs)?
# About the rebalancing protocol in 5.6:
#* In step 1, can we simplify it by adding a limit of at most one rebalancing effort at any given time? A new rebalancing effort would then have to wait for the current rebalancing to either finish or be aborted.
#* Steps 7 and 9 involve waiting for _all_ routers to acknowledge some state change. Is this too heavyweight? Who maintains router memberships? Are we doing this because of router caching of mount table data?

> Router-based HDFS federation
>
> Key: HDFS-10467
> URL: https://issues.apache.org/jira/browse/HDFS-10467
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: fs
> Affects Versions: 2.7.2
> Reporter: Inigo Goiri
> Attachments: HDFS Router Federation.pdf, HDFS-10467.PoC.patch, HDFS-Router-Federation-Prototype.patch
>
> Add a Router to provide a federated view of multiple HDFS clusters.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15319822#comment-15319822 ] Inigo Goiri commented on HDFS-10467: And we targeted to minimize the impact on those 12 classes :) Actually, we expect that based on feedback we can reduce the impact on the {{Client}} and the {{Server}}. Right now, we are using those extensions to allow more connections between {{Router}} and {{NameNode}}. > Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.7.2 >Reporter: Inigo Goiri > Attachments: HDFS Router Federation.pdf, HDFS-10467.PoC.patch, > HDFS-Router-Federation-Prototype.patch > > > Add a Router to provide a federated view of multiple HDFS clusters. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15319564#comment-15319564 ] Zhe Zhang commented on HDFS-10467: -- Thanks. Impressive that only {{12 deletions(-)}} :) > Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.7.2 >Reporter: Inigo Goiri > Attachments: HDFS Router Federation.pdf, HDFS-10467.PoC.patch, > HDFS-Router-Federation-Prototype.patch > > > Add a Router to provide a federated view of multiple HDFS clusters. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15319411#comment-15319411 ]

Hadoop QA commented on HDFS-10467:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| 0 | shelldocs | 0m 1s | Shelldocs was not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 14 new or modified test files. |
| 0 | mvndep | 0m 15s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 38s | trunk passed |
| +1 | compile | 8m 25s | trunk passed |
| +1 | checkstyle | 1m 39s | trunk passed |
| +1 | mvnsite | 1m 51s | trunk passed |
| +1 | mvneclipse | 0m 29s | trunk passed |
| +1 | findbugs | 3m 5s | trunk passed |
| +1 | javadoc | 2m 3s | trunk passed |
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 40s | the patch passed |
| +1 | compile | 6m 50s | the patch passed |
| +1 | cc | 6m 50s | the patch passed |
| -1 | javac | 6m 50s | root generated 4 new + 695 unchanged - 2 fixed = 699 total (was 697) |
| -1 | checkstyle | 1m 49s | root: The patch generated 609 new + 1185 unchanged - 5 fixed = 1794 total (was 1190) |
| +1 | mvnsite | 1m 49s | the patch passed |
| +1 | mvneclipse | 0m 27s | the patch passed |
| -1 | shellcheck | 0m 12s | The patch generated 4 new + 80 unchanged - 1 fixed = 84 total (was 81) |
| -1 | whitespace | 0m 0s | The patch has 48 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| -1 | whitespace | 0m 2s | The patch 37 line(s) with tabs. |
| +1 | xml | 0m 3s | The patch has no ill-formed XML file. |
| -1 | findbugs | 1m 56s | hadoop-hdfs-project/hadoop-hdfs generated 44 new + 0 unchanged - 0 fixed = 44 total (was 0) |
| -1 | javadoc | 1m 5s | hadoop-hdfs-project_hadoop-hdfs generated 65 new + 7 unchanged - 0 fixed = 72 total (was 7) |
| +1 | unit | 16m 15s | hadoop-common in the patch passed. |
| -1 | unit | 83m 50s | hadoop-hdfs in the patch failed. |
| -1 | asflicense | 0m 30s | The patch generated 1 ASF License warnings. |
| | | 145m 55s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
| | Unread field:StateStoreMetrics.java:[line 55] |
| | org.apache.hadoop.hdfs.server.federation.router.FederationConnectionId doesn't override
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15319098#comment-15319098 ] Inigo Goiri commented on HDFS-10467: I went through the rebase onto trunk and there are just a couple of changes in {{Server}}, {{Client}}, and a couple of related classes. It should be easy to keep rebasing the patch as needed. I haven't been able to fully test it on trunk yet but we'll go over it during the day. > Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.7.2 >Reporter: Inigo Goiri > Attachments: HDFS Router Federation.pdf, HDFS-10467.PoC.patch, > HDFS-Router-Federation-Prototype.patch > > > Add a Router to provide a federated view of multiple HDFS clusters.
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317952#comment-15317952 ] Zhe Zhang commented on HDFS-10467: Thanks [~elgoiri]! I think most likely we will cut a feature branch for this work, so rebasing the patch on trunk won't be wasted effort. After that, I suggest you push {{trunk + your PoC}} to a personal GitHub branch. Otherwise, trunk itself is a moving target and it will be hard to apply and evaluate the PoC patch again. > Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.7.2 >Reporter: Inigo Goiri > Attachments: HDFS Router Federation.pdf, > HDFS-Router-Federation-Prototype.patch > > > Add a Router to provide a federated view of multiple HDFS clusters.
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317941#comment-15317941 ] Inigo Goiri commented on HDFS-10467: [~zhz], true, it was done on our internal 2.6 branch. I'll prepare a patch for trunk and disable the SQL driver by tomorrow. Let me know if some branch other than trunk is better as a base. > Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.7.2 >Reporter: Inigo Goiri > Attachments: HDFS Router Federation.pdf, > HDFS-Router-Federation-Prototype.patch > > > Add a Router to provide a federated view of multiple HDFS clusters.
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317925#comment-15317925 ] Zhe Zhang commented on HDFS-10467: Thanks for the design doc and patch [~elgoiri], very interesting work. A quick suggestion on the PoC patch first: it doesn't really apply on branch-2.6. I suggest you attach either a link to your PoC GitHub branch or a PoC patch based on some stable branch, so that people can view it as a working PoC project. The dependency on SQL Server also seems to be causing trouble in building. > Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.7.2 >Reporter: Inigo Goiri > Attachments: HDFS Router Federation.pdf, > HDFS-Router-Federation-Prototype.patch > > > Add a Router to provide a federated view of multiple HDFS clusters.
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317636#comment-15317636 ] He Tianyi commented on HDFS-10467: +1. This also reduces latency for the first request from a client (no failover on the client side, and the Router can remember the current active peer). > Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.7.2 >Reporter: Inigo Goiri > Attachments: HDFS Router Federation.pdf, > HDFS-Router-Federation-Prototype.patch > > > Add a Router to provide a federated view of multiple HDFS clusters.
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317626#comment-15317626 ] Inigo Goiri commented on HDFS-10467: Another advantage of the proposed approach is that the Routers take care of Namenode failover, which simplifies the client side. Indirectly, this uses the approach of leveraging centralized information that was discarded in HDFS-7858. It was discarded in that context, but I think it is reasonable to do here. > Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.7.2 >Reporter: Inigo Goiri > Attachments: HDFS Router Federation.pdf, > HDFS-Router-Federation-Prototype.patch > > > Add a Router to provide a federated view of multiple HDFS clusters.
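The Router-side failover discussed in the two comments above (the Router handles Namenode failover and remembers the current active peer) might look roughly like the following Java sketch. This is not code from the attached patches: `FailoverInvokerSketch`, `SimulatedStandbyException`, and the plain string Namenode ids are hypothetical names used only for illustration, and a real Router would issue RPCs rather than call a `Function`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch: a Router-side invoker that remembers the active Namenode. */
class FailoverInvokerSketch {

    /** Hypothetical signal thrown by a simulated Namenode that is in standby state. */
    static class SimulatedStandbyException extends RuntimeException {
        SimulatedStandbyException(String namenode) {
            super(namenode + " is in standby state");
        }
    }

    private final List<String> namenodes; // candidate Namenodes of one subcluster
    private int activeIndex = 0;          // index of the last Namenode that answered

    FailoverInvokerSketch(List<String> namenodes) {
        this.namenodes = new ArrayList<>(namenodes);
    }

    /** Try the remembered active Namenode first, then the remaining ones in order. */
    synchronized <T> T invoke(Function<String, T> call) {
        SimulatedStandbyException last = null;
        for (int i = 0; i < namenodes.size(); i++) {
            int idx = (activeIndex + i) % namenodes.size();
            try {
                T result = call.apply(namenodes.get(idx));
                activeIndex = idx; // remember who answered for the next request
                return result;
            } catch (SimulatedStandbyException e) {
                last = e;          // this Namenode is standby; try the next one
            }
        }
        throw new IllegalStateException("no active Namenode found", last);
    }

    /** The Namenode that answered the most recent successful call. */
    String rememberedActive() {
        return namenodes.get(activeIndex);
    }
}
```

With two Namenodes where the first is standby, the first call pays for one failed attempt; later calls go straight to the remembered active Namenode, which is the latency benefit mentioned above. The client only ever talks to the Router and never sees the retry.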
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15310998#comment-15310998 ] Inigo Goiri commented on HDFS-10467: [~He Tianyi], our proposal requires additional components (State Store and Router) so it might be a little too complex for what you want. Let me post a patch with our prototype during the week and if it sounds reasonable to you, you can decide whether to merge efforts. > Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.7.2 >Reporter: Inigo Goiri > Attachments: HDFS Router Federation.pdf > > > Add a Router to provide a federated view of multiple HDFS clusters.
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15305716#comment-15305716 ] He Tianyi commented on HDFS-10467: Thanks for sharing. I've implemented a similar approach as a separate project, see https://github.com/bytedance/nnproxy. I am currently using it in production to back 2 namenodes with a mount table of 20+ entries, and it has worked well (about 12K TPS). It looks like HDFS Router Federation includes more features. Shall we work together? > Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.7.2 >Reporter: Inigo Goiri > Attachments: HDFS Router Federation.pdf > > > Add a Router to provide a federated view of multiple HDFS clusters.
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304679#comment-15304679 ] Inigo Goiri commented on HDFS-10467: Thanks to [~chris.douglas], [~jira.shegalov], and [~subru] for the early feedback. > Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.7.2 >Reporter: Inigo Goiri > Attachments: HDFS Router Federation.pdf > > > Add a Router to provide a federated view of multiple HDFS clusters.
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304670#comment-15304670 ] Inigo Goiri commented on HDFS-10467: [~He Tianyi] for awareness. > Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.7.2 >Reporter: Inigo Goiri > Attachments: HDFS Router Federation.pdf > > > Add a Router to provide a federated view of multiple HDFS clusters.
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304668#comment-15304668 ] Inigo Goiri commented on HDFS-10467: This approach should support the scenarios in the mail thread. Right now we support the most typical operations of the RPC interface and the basic ones for REST. We don't proxy requests to the DNs as HttpFs does, but at some point we might; for now we just point HttpFs to our Routers. Instead of extending HttpFs, we chose to mimic the NN so that the Router provides the full interface (i.e., RPC) and presents the image of a single large NN. The main difference from the approach proposed in the mail thread is the addition of the State Store as centralized storage for the federation state. This mimics the architecture of YARN federation in YARN-2915. If the document is not enough, I can provide a patch with the full approach (including Router, State Store and a simple cluster rebalancer). I think later on we should split it into multiple subtasks. > Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.7.2 >Reporter: Inigo Goiri > Attachments: HDFS Router Federation.pdf > > > Add a Router to provide a federated view of multiple HDFS clusters.
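To make the role of the State Store in the comment above more concrete, here is a minimal Java sketch of Routers that keep no federation state of their own and read it from one shared store. Everything in it is an assumption for illustration: `StateStoreSketch`, `InMemoryStateStoreSketch`, `StatelessRouterSketch`, and the `active/<subcluster>` key layout are hypothetical and are not taken from the design document or the patches.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical pluggable backend holding the federation state. */
interface StateStoreSketch {
    void put(String key, String value);
    Optional<String> get(String key);
}

/** In-memory stand-in for a real backend; used here only so the sketch runs. */
class InMemoryStateStoreSketch implements StateStoreSketch {
    private final Map<String, String> records = new ConcurrentHashMap<>();

    @Override
    public void put(String key, String value) {
        records.put(key, value);
    }

    @Override
    public Optional<String> get(String key) {
        return Optional.ofNullable(records.get(key));
    }
}

/** A Router that keeps no federation state locally and always asks the store. */
class StatelessRouterSketch {
    private final StateStoreSketch store;

    StatelessRouterSketch(StateStoreSketch store) {
        this.store = store;
    }

    /** Look up which Namenode is recorded as active for a subcluster. */
    String activeNamenode(String subcluster) {
        // The "active/<subcluster>" key layout is an assumption of this sketch.
        return store.get("active/" + subcluster)
            .orElseThrow(() -> new IllegalStateException("unknown subcluster " + subcluster));
    }
}
```

Because every Router reads the same store, any number of Routers can be placed in front of the subclusters and an update written once is visible to all of them. Hiding the backend behind an interface is also one way to swap implementations, which matters because the thread mentions more than one State Store implementation.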
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304646#comment-15304646 ] Chris Nauroth commented on HDFS-10467: [~elgoiri], thank you for sharing this. A similar discussion came up recently on the hdfs-...@hadoop.apache.org mailing list, so it appears you are not alone in this requirement. http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201605.mbox/%3C1462210332.1520687.595811233.2B297F6A%40webmail.messagingengine.com%3E > Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.7.2 >Reporter: Inigo Goiri > Attachments: HDFS Router Federation.pdf > > > Add a Router to provide a federated view of multiple HDFS clusters.
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304441#comment-15304441 ] Inigo Goiri commented on HDFS-10467:

The advantages of this approach are:
* Transparent to the users: A user can use the regular HDFS client because the Router looks like a regular Namenode. No maintenance of the mount table, etc.
* Transparent rebalancing of subclusters: With this design, we can move data between subclusters and hide it behind the Router.
* No additional changes to current HDFS: The Routers and State Store are completely independent from the current NNs and DNs, so no changes to the current HDFS code are required (we could add some functionality to the NN for a minor performance improvement, but it is not needed).

We have a prototype running with 4 subclusters and a couple of implementations for the State Store side. Once we agree on the design, I can start creating the subtasks.

> Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.7.2 >Reporter: Inigo Goiri > Attachments: HDFS Router Federation.pdf > > > Add a Router to provide a federated view of multiple HDFS clusters.
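One way to picture the transparency and rebalancing points in the comment above is a mount table that only the Router consults. The Java sketch below assumes longest-prefix matching on absolute paths, which the comment itself does not spell out; `MountTableSketch` and the subcluster ids `ns0`, `ns1`, `ns2` are hypothetical names, and the sketch is not the implementation from the prototype.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a Router-side mount table: mount point -> subcluster id. */
class MountTableSketch {
    private final Map<String, String> entries = new HashMap<>();

    /** Add an entry, or re-point an existing mount point to another subcluster. */
    void put(String mountPoint, String subcluster) {
        entries.put(normalize(mountPoint), subcluster);
    }

    /** Resolve an absolute path to the subcluster of its longest matching mount point. */
    String resolve(String path) {
        String p = normalize(path);
        while (true) {
            String target = entries.get(p);
            if (target != null) {
                return target;
            }
            if (p.equals("/")) {
                throw new IllegalArgumentException("no mount point covers " + path);
            }
            // Drop the last path component and try the parent directory.
            int slash = p.lastIndexOf('/');
            p = (slash == 0) ? "/" : p.substring(0, slash);
        }
    }

    /** Require an absolute path and strip a trailing slash (except for the root). */
    private static String normalize(String path) {
        if (!path.startsWith("/")) {
            throw new IllegalArgumentException("path must be absolute: " + path);
        }
        if (path.length() > 1 && path.endsWith("/")) {
            return path.substring(0, path.length() - 1);
        }
        return path;
    }
}
```

In this picture, rebalancing is a data copy between subclusters followed by a single `put` that re-points the mount point. Clients keep using the same path through the Router and never see the change, which is the sense in which the move is hidden behind the Router.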