[ 
https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15576163#comment-15576163
 ] 

Anu Engineer commented on HDFS-7343:
------------------------------------

[~zhouwei]

Thanks for posting a detailed design document. I was not part of the earlier 
discussions, so I may not have the right context for many of my comments. Please 
feel free to fill me in on anything I am missing.


I think this proposal addresses one of the biggest issues in running an HDFS 
cluster. We rely on a lot of operator interaction to keep the cluster running, 
when much of that work should be automated. I completely agree with the sentiment. 

A couple of minor comments:


1. Is this service attempting to become the answer for all administrative 
problems of HDFS? In other words, is this service trying to be a catch-all 
service?

I am not able to see what is common between caching and file placement and 
between running distcp for remote replication and running balancer and disk 
balancer.

My personal view is that this service should pick a clear focus and concentrate 
on those issues. For example, the first 5 use cases are all related to 
performance and file placement. 

But then in the latter parts of the document we drag in distcp, disk balancer 
and balancer issues. SSM might be the right place for these in the long run, 
but my suggestion is to focus on the core parts of the service and then extend 
it to other things once the core is stabilized. 


2. Do we need a new rules language? Would you please consider using a language 
that admins already know? For example, if we can write these rules in 
Python or even JavaScript, you don't need to invent a whole new language. Every 
time I have to configure Kerberos rules, I have to look up what the mini-regex 
syntax means. I am worried that this little rule language will blow up, and once 
it is in, it is something that we will need to support for the long term. 
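
To make the suggestion concrete, here is a minimal sketch of a placement rule 
written as ordinary Python instead of a custom rule language. It is not taken 
from the design document: the FileStats fields, the thresholds, the "/data/" 
prefix and the action names "cache" and "archive" are all hypothetical and 
exist only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileStats:
    """Hypothetical per-file statistics a rule could inspect."""
    path: str
    accesses_last_day: int
    age_days: int


def placement_rule(f: FileStats) -> Optional[str]:
    """Return a hypothetical action name, or None when no rule matches."""
    # Frequently read files under /data/ are candidates for caching.
    if f.path.startswith("/data/") and f.accesses_last_day >= 10:
        return "cache"
    # Old files that nobody reads are candidates for archival storage.
    if f.age_days > 90 and f.accesses_last_day == 0:
        return "archive"
    return None
```

A rule in this form needs no new parser or grammar, and admins can test it 
with the tooling they already use for any other Python function.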


3. This design proposes a push model in which the DataNodes and the NameNode 
push data to some Kafka endpoint. I would prefer that the NameNode and DataNodes 
not be aware of this service at all. The service can easily connect to the 
NameNode and read almost all the data it needs. If you need extra RPCs to be 
added to the DataNode and NameNode, that would be an option too. But creating a 
dependency from the NameNode and all DataNodes in a cluster on this service is 
something you want to do only after very careful consideration. If we move to a 
pull model, you might not even need a Kafka service to be running in the initial 
versions.
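
As a rough sketch of the pull model, the service could own the polling loop 
and read from the NameNode through whatever interface already exists, so the 
NameNode and DataNodes need no knowledge of the service. Everything named here 
is hypothetical: PullCollector, the metric names, and fake_namenode_fetch, 
which stands in for a real client call.

```python
from typing import Callable, Dict, List

Metrics = Dict[str, int]


class PullCollector:
    """Collects samples by calling a fetch function that the service owns."""

    def __init__(self, fetch: Callable[[], Metrics]) -> None:
        # The dependency points from this service to the NameNode,
        # never the other way around.
        self._fetch = fetch
        self.samples: List[Metrics] = []

    def poll_once(self) -> Metrics:
        """Pull one sample and keep it for later analysis."""
        sample = self._fetch()
        self.samples.append(sample)
        return sample


def fake_namenode_fetch() -> Metrics:
    """Stand-in for a real read from the NameNode (assumption)."""
    return {"files_total": 1000, "blocks_total": 4200}


collector = PullCollector(fake_namenode_fetch)
latest = collector.poll_once()
```

Because the collector only calls out, it can be stopped or replaced without 
touching the NameNode or DataNode configuration, and no message broker is 
needed for this path.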


> HDFS smart storage management
> -----------------------------
>
>                 Key: HDFS-7343
>                 URL: https://issues.apache.org/jira/browse/HDFS-7343
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Kai Zheng
>            Assignee: Wei Zhou
>         Attachments: HDFS-Smart-Storage-Management.pdf
>
>
> As discussed in HDFS-7285, it would be better to have a comprehensive and 
> flexible storage policy engine considering file attributes, metadata, data 
> temperature, storage type, EC codec, available hardware capabilities, 
> user/application preference and etc.
> Modified the title for re-purpose.
> We'd extend this effort some bit and aim to work on a comprehensive solution 
> to provide smart storage management service in order for convenient, 
> intelligent and effective utilizing of erasure coding or replicas, HDFS cache 
> facility, HSM offering, and all kinds of tools (balancer, mover, disk 
> balancer and so on) in a large cluster.


