[ https://issues.apache.org/jira/browse/HDFS-14627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16904877#comment-16904877 ]

Yang Yun commented on HDFS-14627:
---------------------------------

Uploaded a patch for this proposal, including:
 1. A way to set special configuration for a particular StorageType.
 2. Change the timeout for different StorageTypes.
 3. Add an option to disable the block scanner for a StorageType.
 4. Check the filesystem for a given StorageType. (For some remotely mounted filesystems, this check is important to make sure the storage is mounted correctly.)
 5. Sleep for a while during a long checkAndUpdate if the difference between disk and memory is big. (For example, when many datanodes with slow disks start at the same time.)
 6. Add an option to save the replica cache file to another location. (On a slow disk, saving the replica info may take a long time and may not finish within the shutdownHook.)
 7. Save the capacity of the volume to reduce DF system calls. (On some remote disks, DF is expensive.)

> Improvements to make slow archive storage works on HDFS
> -------------------------------------------------------
>
>                 Key: HDFS-14627
>                 URL: https://issues.apache.org/jira/browse/HDFS-14627
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Yang Yun
>            Priority: Minor
>         Attachments: HDFS-14627.patch, data_flow_between_datanode_and_aws_s3.jpg
>
>
> In our setup, we mount archival storage from a remote location. The write
> speed is about 20M/Sec, the read speed is about 40M/Sec, and normal file
> operations, for example 'ls', are time consuming.
> We add some improvements to make this kind of archive storage work in the
> current HDFS system.
> 1. Apply a multiplier to the read/write timeout if the block is saved on
> archive storage.
> 2. Save the replica cache file of archive storage to another, faster disk
> for a quick datanode restart; the shutdownHook may not execute if the saving
> takes too long.
> 3. Check the mounted file system before using mounted archive storage.
> 4. Reduce or avoid calling DF while generating the heartbeat report for
> archive storage.
> 5. Add an option to skip archive blocks during decommission.
> 6. Use multiple threads to scan archive storage.
> 7. Retry a number of times when checking archive storage for errors.
> 8. Add an option to disable block scanning on archive storage.
> 9. Sleep for one heartbeat interval if there are too many differences when
> calling checkAndUpdate in DirectoryScanner.
> 10. An automatic service to scan the fsimage and set the storage policy of
> files according to policy.
> 11. An automatic service to call the mover to move blocks to the right
> storage.
> 12. Deduplicate files on remote storage if the storage is reliable.
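
As an illustration of item 4 in the description (reducing DF calls while generating heartbeat reports), a minimal Java sketch of caching a volume's capacity for a fixed interval could look like the following. The class name CachedCapacity, its constructor parameters, and the 60-second interval are hypothetical; they are not taken from the attached patch or from the Hadoop code base. The expensive capacity probe is passed in as a plain function so the idea can be shown without touching any real filesystem.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical helper for illustration only; not part of Hadoop or HDFS-14627.patch. */
class CachedCapacity {
    private final LongSupplier probe;  // expensive call, e.g. a DF-style capacity query
    private final LongSupplier clock;  // returns the current time in milliseconds
    private final long ttlMillis;      // how long a cached value stays valid
    private long cachedValue;
    private long lastRefresh;
    private boolean loaded;

    CachedCapacity(LongSupplier probe, LongSupplier clock, long ttlMillis) {
        this.probe = probe;
        this.clock = clock;
        this.ttlMillis = ttlMillis;
    }

    /** Returns the cached capacity, calling the probe only when the cache is empty or expired. */
    synchronized long get() {
        long now = clock.getAsLong();
        if (!loaded || now - lastRefresh >= ttlMillis) {
            cachedValue = probe.getAsLong();
            lastRefresh = now;
            loaded = true;
        }
        return cachedValue;
    }

    public static void main(String[] args) {
        AtomicLong calls = new AtomicLong();    // counts how often the probe runs
        AtomicLong fakeNow = new AtomicLong(0); // fake clock so the demo is deterministic
        CachedCapacity capacity = new CachedCapacity(
            () -> { calls.incrementAndGet(); return 1_000_000L; },
            fakeNow::get,
            60_000L);

        capacity.get();        // first call runs the probe
        capacity.get();        // served from the cache
        fakeNow.set(60_000L);  // advance the fake clock past the interval
        capacity.get();        // cache expired, probe runs again
        System.out.println("probe calls: " + calls.get()); // prints "probe calls: 2"
    }
}
```

With a cache like this, a heartbeat that fires every few seconds would trigger the expensive probe at most once per interval. The trade-off is that the reported capacity can be stale by up to that interval.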