Hi,

We have tested the HDFS-9806 branch in two settings:

(i) 26-node bare-metal cluster, with PROVIDED storage configured to point to another instance of HDFS (containing 468 files, total of ~400GB of data). Half of the Datanodes are configured with only DISK volumes and the other half have both DISK and PROVIDED volumes.

(ii) 8 VMs on Azure, with PROVIDED storage configured to point to a WASB account (containing 26,074 files and ~1.3TB of data). All Datanodes are configured with DISK and PROVIDED volumes.

(i) was tested using both the text-based alias map (TextFileRegionAliasMap) and the in-memory leveldb-based alias map (InMemoryLevelDBAliasMapClient), while (ii) was tested using the text-based alias map only.

Steps followed:
(0) Build from apache/HDFS-9806. (Note that for the leveldb-based alias map, the patch posted to HDFS-12912 <https://issues.apache.org/jira/browse/HDFS-12912> needs to be applied; we will commit this to apache/HDFS-9806 after review).
(1) Generate the FSImage using the image generation tool with the appropriate remote location (hdfs:// in (i) and wasb:// in (ii)).
(2) Bring up the HDFS cluster.
(3) Verify that the remote namespace is reflected correctly and that data on the remote store can be accessed. Commands run: ls, copyToLocal, fsck, getrep, setrep, getStoragePolicy.
(4) Run Sort and Gridmix jobs on the data in the remote location with the input paths pointing to the local HDFS.
(5) Increase replication of the PROVIDED files and verify, using fsck, that local (DISK) replicas were created for the PROVIDED replicas.
(6) Verify that PROVIDED storage capacity is shown correctly on the NN and Datanode Web-UI.
(7) Bring down Datanodes, one by one. When all are down, verify that the NN reports all PROVIDED files as missing. Bringing back up any one Datanode makes all the data available.
(8) Restart the NN and verify that data is still accessible.
(9) Verify that writes to local HDFS continue to work.
(10) Bring down all Datanodes except one. Start decommissioning the remaining Datanode. Verify that the data in the PROVIDED storage is still accessible.

Apart from the above, we ported the changes in HDFS-9806 to branch-2.7 and deployed it on a ~800 node cluster as one of the sub-clusters in a Router-based Federated HDFS of nearly 4000 nodes (with help from Inigo Goiri). We mounted about 1000 files, 650TB of remote data (~2.6 million blocks with 256MB block size) in this cluster using the text-based alias map. We verified that the basic commands (ls, copyToLocal, setrep) work. We also ran Spark jobs against this cluster.

-Virajith

On Fri, Dec 8, 2017 at 3:44 PM, Chris Douglas <cdoug...@apache.org> wrote:

> Discussion thread: https://s.apache.org/kxT1
>
> We're down to the last few issues and are preparing the branch to
> merge to trunk. We'll post merge patches to HDFS-9806 [1]. Minor,
> "cleanup" tasks (checkstyle, findbugs, naming, etc.) will be tracked
> in HDFS-12712 [2].
>
> We've tried to ensure that when this feature is disabled, HDFS is
> unaffected. For those reviewing this, please look for places where
> this might add overheads and we'll address them before the merge. The
> site documentation [3] and design doc [4] should be up to date and
> sufficient to try this out. Again, please point out where it is
> unclear and we can address it.
>
> This has been a long effort and we're grateful for the support we've
> received from the community. In particular, thanks to Íñigo Goiri,
> Andrew Wang, Anu Engineer, Steve Loughran, Sean Mackrory, Lukas
> Majercak, Uma Gunuganti, Kai Zheng, Rakesh Radhakrishnan, Sriram Rao,
> Lei Xu, Zhe Zhang, Jing Zhao, Bharat Viswanadham, ATM, Chris Nauroth,
> Sanjay Radia, Atul Sikaria, and Peng Li for all your input into the
> design, testing, and review of this feature.
>
> The vote will close no earlier than one week from today, 12/15.
> -C
>
> [1]: https://issues.apache.org/jira/browse/HDFS-9806
> [2]: https://issues.apache.org/jira/browse/HDFS-12712
> [3]: https://github.com/apache/hadoop/blob/HDFS-9806/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsProvidedStorage.md
> [4]: https://issues.apache.org/jira/secure/attachment/12875791/HDFS-9806-design.002.pdf
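
For anyone wanting to sanity-check the block count quoted in the test report (650TB of remote data at a 256MB block size, across about 1000 files), here is a rough sketch of the arithmetic. It assumes binary units (1TB = 1024^4 bytes) and treats "about 1000 files" as exactly 1000; the function names are made up for illustration and are not part of HDFS.

```python
# Figures quoted in the report. Binary units are an assumption.
TOTAL_BYTES = 650 * 1024**4   # 650TB of remote data
BLOCK_BYTES = 256 * 1024**2   # 256MB block size
FILE_COUNT = 1000             # "about 1000 files", taken as exact here


def min_block_count(total_bytes: int, block_bytes: int) -> int:
    """Lower bound: all data packed into full blocks (integer ceiling)."""
    return -(-total_bytes // block_bytes)


def max_block_count(total_bytes: int, block_bytes: int, files: int) -> int:
    """Upper bound: every file may end in one extra, partially filled block."""
    return total_bytes // block_bytes + files


if __name__ == "__main__":
    low = min_block_count(TOTAL_BYTES, BLOCK_BYTES)
    high = max_block_count(TOTAL_BYTES, BLOCK_BYTES, FILE_COUNT)
    print(f"between {low:,} and {high:,} blocks")
```

Under these assumptions the bounds come out at roughly 2.66 million blocks, which is in line with the ~2.6 million figure in the report.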