Xiao Chen commented on HDFS-7343:

Thanks all for the great documentation and discussions. It will be an 
interesting undertaking. :)

It may be too early to ask, but in order to do HDFS management work, the SSM 
has to run as the hdfs superuser, right?

And related to Andrew's question on performance-based decisions, is it manual 
or automatic (or both)?
The doc says {{SSM can make prediction on a file’s read based on read 
historical information and cache the file automatically before the read 
operation happens}}, and later gives an example of a similar rule ({{every 1d 
at 0:00 | age lt 30d | cache}}). I think that means both: the description 
covers the automatic part, and the rule shows the same behavior under manual 
control. Is that right?
If the query is not latency-sensitive, automatic caching and uncaching may be 
unnecessary. Is it possible to disable the automatic behavior for some 
workloads? I can think of similar cases where converting between EC and 
replication may not be necessary.
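
To make the manual side concrete, here is a minimal sketch of how a rule string 
like the one quoted above could be split into its parts. It assumes the three 
pipe-separated fields are a trigger, a condition, and an action; that reading 
is my guess from the example, and the design doc may define the grammar 
differently. The names {{Rule}} and {{parse_rule}} are hypothetical and are not 
part of SSM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical container for one rule; field names are assumptions."""
    trigger: str    # when the rule is evaluated, e.g. "every 1d at 0:00"
    condition: str  # which files match, e.g. "age lt 30d"
    action: str     # what to do with matching files, e.g. "cache"


def parse_rule(text: str) -> Rule:
    """Split a pipe-separated rule string into exactly three non-empty fields."""
    parts = [p.strip() for p in text.split("|")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected 'trigger | condition | action', got: {text!r}")
    return Rule(*parts)


# The rule string quoted from the design doc.
rule = parse_rule("every 1d at 0:00 | age lt 30d | cache")
print(rule)
```

A per-workload opt-out from the automatic behavior could then be a separate 
setting, kept apart from rules like this one that an admin writes by hand.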

> HDFS smart storage management
> -----------------------------
>                 Key: HDFS-7343
>                 URL: https://issues.apache.org/jira/browse/HDFS-7343
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Kai Zheng
>            Assignee: Wei Zhou
>         Attachments: HDFS-Smart-Storage-Management.pdf
> As discussed in HDFS-7285, it would be better to have a comprehensive and 
> flexible storage policy engine considering file attributes, metadata, data 
> temperature, storage type, EC codec, available hardware capabilities, 
> user/application preference, etc.
> Modified the title for re-purpose.
> We'd extend this effort a bit and aim to work on a comprehensive solution 
> to provide a smart storage management service for convenient, 
> intelligent and effective use of erasure coding or replicas, HDFS cache 
> facility, HSM offering, and all kinds of tools (balancer, mover, disk 
> balancer and so on) in a large cluster.

