Wei Zhou commented on HDFS-7343:

Thanks [~anu] for reviewing the design document and for the great comments!
Regarding your comments:
1. Is this service attempting to become the answer for all administrative 
problems of HDFS? In other words, is this service trying to be a catch-all? 
I am not able to see what is common between caching and file placement, or 
between running distcp for remote replication and running balancer and disk 
balancer.
In the long run, SSM aims to provide users an end-to-end storage management 
automation solution, and any of these facilities can be used in this project 
toward that goal. The use cases listed in the document are just examples of 
possible scenarios where SSM can be applied and of what SSM can do; SSM can 
help from different angles by leveraging these facilities.
But then in the latter parts of the document we drag in distcp, disk balancer 
and balancer issues. SSM might be the right place for these in the long run, 
but my suggestion is to focus on the core parts of the service and then extend 
it to other things once the core is stabilized.
You are absolutely right: we should focus on implementing the core 
part/module first, since it is the basis of the other functions, instead of 
taking on too much beyond that at the same time.

2. Do we need a new rules language – Would you please consider using a language 
which admins will already know, for example, if we can write these rules in 
python or even JavaScript, you don’t need to invent a whole new language. Every 
time I have to configure Kerberos rules, I have to lookup the mini-regex 
meanings. I am worried that this little rule language will blow up and once it 
is in, this is something that we will need to support for the long term.
Yes, it's a very good question and we did think about it beforehand. We aim 
to provide administrators/users a simple, dedicated rule language that does 
not require touching anything beyond the rule logic itself. In fact, a rule is 
very simple: it only has to declare when a given action should be applied to 
some objects (a file, a node, etc.). A general-purpose language like Python or 
JavaScript may be too heavy for defining such a rule.
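Just to illustrate the "when / which objects / what action" shape we have in 
mind, a rule could stay as small as a single declarative line. The syntax 
below is purely hypothetical and is not the final grammar from the design 
document:

    every 5min : file.path matches "/hot/*" and file.age > 30d -> archive

Expressing the same condition and action in Python or JavaScript would pull in 
imports, client setup and control flow that have nothing to do with the rule 
logic itself, and that is the extra weight we would like to avoid.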

3. In this model we are proposing a push model where the DataNode and NameNode 
push data to some Kafka endpoint. I would prefer if the NameNode and DataNode 
were not aware of this service at all. This service can easily connect to the 
NameNode and read almost all the data that is needed. If you need extra RPCs 
to be added to the DataNode and NameNode, that would be an option too. But 
creating a dependency from the NameNode and all DataNodes in a cluster seems 
to be something that you want to do only after very careful consideration. If 
we move to a pull model, you might not even need the Kafka service to be 
running in the initial versions.
Good point! This is also a very good way to implement SSM.
With a pull model, the advantages are:
   (1) No dependency on the Kafka service, which indeed makes development, 
testing and deployment much easier (a rough sketch of what pulling from the 
NameNode could look like is given after the list below).
   (2) A closer relationship with HDFS, which may make it possible to support 
features that cannot be achieved in the model described in the design document.
The disadvantages are:
   (1) Potential performance issues. SSM has to learn about events in a timely 
manner to work effectively, so to keep the latency low it would have to query 
the NameNode at a very high frequency all the time; it is also very hard for 
SSM to query DataNodes one by one for messages in a large-scale cluster.
   (2) Message collection and management become harder. If SSM is stopped by 
the user or crashes while the HDFS cluster keeps working, messages from the 
nodes would be lost without Kafka, and it is also not friendly for SSM to 
collect historical data.
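To make the pull option a bit more concrete, here is a rough sketch of how SSM 
could consume namespace events by polling the NameNode through the existing 
HDFS inotify API (HdfsAdmin#getInotifyEventStream, added in HDFS-6634) instead 
of having the NameNode push to Kafka. The NameNode URI and the SSM-side 
handling inside the switch are only placeholders for illustration:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSInotifyEventInputStream;
import org.apache.hadoop.hdfs.client.HdfsAdmin;
import org.apache.hadoop.hdfs.inotify.Event;
import org.apache.hadoop.hdfs.inotify.EventBatch;

public class SsmNamespaceEventPoller {
  public static void main(String[] args) throws Exception {
    // Plain HDFS client connection; nothing has to change on the NameNode
    // side and no Kafka broker is involved. The URI is a placeholder.
    HdfsAdmin admin =
        new HdfsAdmin(URI.create("hdfs://namenode:8020"), new Configuration());
    DFSInotifyEventInputStream events = admin.getInotifyEventStream();

    while (true) {
      // take() blocks until the next batch of edit-log events is available.
      EventBatch batch = events.take();
      for (Event event : batch.getEvents()) {
        switch (event.getEventType()) {
          case CREATE:
          case APPEND:
          case CLOSE:
            // Hypothetical SSM hook: update file heat / access statistics here.
            break;
          default:
            break;
        }
      }
    }
  }
}

Something similar would have to be done against every DataNode for block- and 
disk-level statistics, which is exactly where the polling-frequency concern in 
disadvantage (1) shows up.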
To sum up, both models are workable and we may need more discussion on this. 
What's your opinion?

> HDFS smart storage management
> -----------------------------
>                 Key: HDFS-7343
>                 URL: https://issues.apache.org/jira/browse/HDFS-7343
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Kai Zheng
>            Assignee: Wei Zhou
>         Attachments: HDFS-Smart-Storage-Management.pdf
> As discussed in HDFS-7285, it would be better to have a comprehensive and 
> flexible storage policy engine considering file attributes, metadata, data 
> temperature, storage type, EC codec, available hardware capabilities, 
> user/application preferences, etc.
> Modified the title for re-purpose.
> We'd extend this effort a bit and aim to work on a comprehensive solution 
> to provide a smart storage management service for convenient, intelligent 
> and effective utilization of erasure coding or replicas, the HDFS cache 
> facility, HSM offerings, and all kinds of tools (balancer, mover, disk 
> balancer and so on) in a large cluster.
