[ 
https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15940055#comment-15940055
 ] 

Kai Zheng commented on HDFS-7343:
---------------------------------

Thanks Anoop for the thoughts and good questions!

bq. Tracking of data hotness and movement is at what level? Block level? Or 
only file level?
We would like to support block level, but we are not sure how useful it would 
be, since most files in modern HDFS deployments may be single-block files. If 
typical clusters turn out to hold many large multi-block files whose blocks 
differ in hotness, we may consider adding block-level support. I wonder if 
[~andrew.wang] could comment on this. Thanks.
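
To make the granularity question concrete, here is a minimal sketch of the 
two views side by side. The class and method names are hypothetical and this 
is not SSM code; it only assumes that hotness is approximated by a read count.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not SSM's actual data structure.
class HotnessSketch {
    // File-level view: one read counter per file path.
    private final Map<String, Long> fileReads = new HashMap<>();
    // Block-level view: one read counter per (file path, block index) pair.
    private final Map<String, Long> blockReads = new HashMap<>();

    // Record one read of the given block of the given file in both views.
    void recordRead(String path, int blockIndex) {
        fileReads.merge(path, 1L, Long::sum);
        blockReads.merge(path + "#" + blockIndex, 1L, Long::sum);
    }

    long fileCount(String path) {
        return fileReads.getOrDefault(path, 0L);
    }

    long blockCount(String path, int blockIndex) {
        return blockReads.getOrDefault(path + "#" + blockIndex, 0L);
    }

    public static void main(String[] args) {
        HotnessSketch t = new HotnessSketch();
        // Single-block file: both views carry the same information.
        t.recordRead("/data/small", 0);
        t.recordRead("/data/small", 0);
        // Multi-block file where only block 0 is ever read.
        for (int i = 0; i < 5; i++) {
            t.recordRead("/data/large", 0);
        }
        System.out.println("file-level count:  " + t.fileCount("/data/large"));
        System.out.println("block 3 count:     " + t.blockCount("/data/large", 3));
    }
}
```

For the single-block file the two counters always agree, so block-level 
tracking adds cost without adding information. Only the multi-block file 
shows a difference: the file as a whole looks hot while block 3 is never read.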

bq. HBase, being a user, we will compact our HFiles into one ...
After discussing this with Anoop offline, I got the point. HBase does its own 
fine-grained caching, so it does not need the HDFS cache, and therefore SSM 
can't help HBase on the cache path. In other cases, HBase may have cold tables 
and even cold regions, so the underlying HDFS could hold blocks of different 
temperatures, and there HDFS HSM could help. SSM aims to ease HDFS-HSM 
deployment and usage, so SSM can help HBase in such cases.
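
As a rough sketch of what "different temperatures" could mean for HSM, the 
snippet below maps a read count to one of the built-in HDFS storage policy 
names (HOT, WARM, COLD). The class name, method name and thresholds are 
assumptions made up for illustration; they are not taken from SSM or HBase.

```java
// Hypothetical illustration only; thresholds are invented for the example.
class PolicySketch {
    // Pick a storage policy name from the number of reads seen in some
    // observation window. HOT, WARM and COLD are existing HDFS policy names.
    static String policyFor(long readsInWindow) {
        if (readsInWindow >= 100) {
            return "HOT";
        }
        if (readsInWindow >= 10) {
            return "WARM";
        }
        return "COLD";
    }

    public static void main(String[] args) {
        // A rarely read (cold) table directory versus a busy one.
        System.out.println("cold table -> " + policyFor(2));
        System.out.println("busy table -> " + policyFor(500));
    }
}
```

In a real deployment the chosen policy would still have to be set on the 
table or region directory and the data migrated afterwards, which is the 
part SSM is meant to make easier.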

For HBase, I think [~anu] has some considerations. Anu, could you share your 
thoughts on this? Thanks!


> HDFS smart storage management
> -----------------------------
>
>                 Key: HDFS-7343
>                 URL: https://issues.apache.org/jira/browse/HDFS-7343
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Kai Zheng
>            Assignee: Wei Zhou
>         Attachments: HDFSSmartStorageManagement-General-20170315.pdf, 
> HDFS-Smart-Storage-Management.pdf, 
> HDFSSmartStorageManagement-Phase1-20170315.pdf, 
> HDFS-Smart-Storage-Management-update.pdf, move.jpg
>
>
> As discussed in HDFS-7285, it would be better to have a comprehensive and 
> flexible storage policy engine considering file attributes, metadata, data 
> temperature, storage type, EC codec, available hardware capabilities, 
> user/application preferences, etc.
> Modified the title for re-purpose.
> We'd extend this effort a bit and aim to work on a comprehensive solution 
> that provides a smart storage management service for convenient, 
> intelligent and effective use of erasure coding or replicas, the HDFS cache 
> facility, the HSM offering, and all kinds of tools (balancer, mover, disk 
> balancer and so on) in a large cluster.


