[ https://issues.apache.org/jira/browse/HDFS-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040032#comment-14040032 ]
Mark Paget commented on HDFS-6584:
----------------------------------

Perhaps rack topology could allow for tagging low priority. Then manual mechanism to tag or automation for least frequently used.

> Support archival storage
> ------------------------
>
>                 Key: HDFS-6584
>                 URL: https://issues.apache.org/jira/browse/HDFS-6584
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Tsz Wo Nicholas Sze
>
> In most of the Hadoop clusters, as more and more data is stored for longer
> time, the demand for storage is outstripping the compute. Hadoop needs a cost
> effective and easy to manage solution to meet this demand for storage.
> Current solution is:
> - Delete the old unused data. This comes at operational cost of identifying
> unnecessary data and deleting them manually.
> - Add more nodes to the clusters. This adds along with storage capacity
> unnecessary compute capacity to the cluster.
> Hadoop needs a solution to decouple growing storage capacity from compute
> capacity. Nodes with higher density and less expensive storage with low
> compute power are becoming available and can be used as cold storage in the
> clusters. Based on policy the data from hot storage can be moved to cold
> storage. Adding more nodes to the cold storage can grow the storage
> independent of the compute capacity in the cluster.

--
This message was sent by Atlassian JIRA
(v6.2#6252)