[
https://issues.apache.org/jira/browse/HDFS-14627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16904877#comment-16904877
]
Yang Yun commented on HDFS-14627:
-
Uploaded a patch for this proposal. It includes:
# A way to set special configuration for a specific StorageType.
# Change the timeout for different StorageTypes.
# Add an option to disable the block scanner for a StorageType.
# Check the filesystem for a given StorageType. (For some remotely mounted
filesystems, this check is important to make sure the storage is mounted
correctly.)
# Sleep for a while during a long checkAndUpdate if the difference between
disk and memory is large. (For example, when many datanodes with slow disks
start at the same time.)
# Add an option to save the replica cache file to another location. (On a slow
disk, saving replica info may take a long time and may not finish within the
time allowed for the shutdownHook.)
# Cache the capacity of the volume to reduce DF system calls. (On some remote
disks, DF is expensive.)
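As an illustration of the last item only, here is a minimal sketch of caching a
capacity value so that the expensive probe runs at most once per refresh
interval. This is not the code in the attached patch. The class name
CachedCapacity, the injected probe and clock, and the refresh interval are all
hypothetical names chosen for the example.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not part of Hadoop or of the attached patch):
 * caches the result of an expensive capacity probe for a fixed interval.
 */
final class CachedCapacity {
    private final LongSupplier probe;    // expensive call, e.g. a DF-style query
    private final LongSupplier clockMs;  // injectable clock, returns milliseconds
    private final long refreshIntervalMs;

    private long cachedValue;
    private long lastRefreshMs;
    private boolean initialized;

    CachedCapacity(LongSupplier probe, LongSupplier clockMs, long refreshIntervalMs) {
        this.probe = probe;
        this.clockMs = clockMs;
        this.refreshIntervalMs = refreshIntervalMs;
    }

    /** Returns the cached capacity, re-running the probe only when the cache is stale. */
    synchronized long get() {
        long now = clockMs.getAsLong();
        if (!initialized || now - lastRefreshMs >= refreshIntervalMs) {
            cachedValue = probe.getAsLong();
            lastRefreshMs = now;
            initialized = true;
        }
        return cachedValue;
    }
}
```

With a long interval, repeated heartbeat reports would reuse the cached value
instead of calling the probe each time. The trade-off is that the reported
capacity can be stale by up to one interval.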
> Improvements to make slow archive storage works on HDFS
> ---
>
> Key: HDFS-14627
> URL: https://issues.apache.org/jira/browse/HDFS-14627
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Yang Yun
> Priority: Minor
> Attachments: HDFS-14627.patch,
> data_flow_between_datanode_and_aws_s3.jpg
>
>
> In our setup, we mount archival storage from a remote location. The write
> speed is about 20M/Sec, the read speed is about 40M/Sec, and normal file
> operations, for example 'ls', are time consuming.
> We added some improvements to make this kind of archive storage work in the
> current HDFS system.
> 1. Add a multiplier to the read/write timeout if the block is saved on
> archive storage.
> 2. Save the replica cache file of archive storage to another, faster disk so
> the datanode can restart quickly; the shutdownHook may not execute if the
> save takes too long.
> 3. Check the mounted file system before using mounted archive storage.
> 4. Reduce or avoid calling DF while generating the heartbeat report for
> archive storage.
> 5. Add an option to skip archive blocks during decommission.
> 6. Use multiple threads to scan archive storage.
> 7. Check archive storage errors with a retry count.
> 8. Add an option to disable block scanning on archive storage.
> 9. Sleep for one heartbeat interval if there are too many differences when
> calling checkAndUpdate in DirectoryScanner.
> 10. An auto-service to scan the fsimage and set the storage policy of files
> according to policy.
> 11. An auto-service to call the mover to move blocks to the right storage.
> 12. Dedup files on remote storage if the storage is reliable.
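To make item 1 of the description concrete, here is a small sketch of a
per-storage-type timeout multiplier. It is only an illustration under assumed
names and is not taken from the attached patch: the StorageKind enum is a local
stand-in defined for this example, and StorageTimeouts, setMultiplier and
effectiveTimeoutMs are hypothetical.

```java
import java.util.EnumMap;
import java.util.Map;

/** Local stand-in enum for this example only. */
enum StorageKind { DISK, SSD, ARCHIVE }

/**
 * Hypothetical helper (not part of Hadoop or of the attached patch):
 * scales a base read/write timeout by a per-storage-type multiplier.
 */
final class StorageTimeouts {
    private final Map<StorageKind, Integer> multipliers = new EnumMap<>(StorageKind.class);

    /** Registers a multiplier for one storage kind; kinds without one use 1. */
    StorageTimeouts setMultiplier(StorageKind kind, int multiplier) {
        if (multiplier < 1) {
            throw new IllegalArgumentException("multiplier must be >= 1");
        }
        multipliers.put(kind, multiplier);
        return this;
    }

    /** Returns the base timeout multiplied by the factor configured for the kind. */
    long effectiveTimeoutMs(StorageKind kind, long baseTimeoutMs) {
        return baseTimeoutMs * multipliers.getOrDefault(kind, 1);
    }
}
```

For example, with a multiplier of 4 for ARCHIVE and a base timeout of 60000 ms,
a transfer to archive storage would be allowed 240000 ms, while DISK keeps the
base timeout.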