[ 
https://issues.apache.org/jira/browse/HDFS-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010157#comment-14010157
 ] 

Arpit Agarwal commented on HDFS-5682:
-------------------------------------

Thanks for the feedback [~wuzesheng]. My responses are below.

bq.  1. About the storage type, because I didn't participate in the discussion in 
HDFS-2832, I am confused by the current storage types DISK and SSD. I think SSD 
is also one type of disk, so DISK and SSD are not orthogonal. Can we change the 
storage types to HDD and SSD? This would be more straightforward.
Good point, I'll look into making the names clearer. In a subsequent revision 
of the API we would like to eliminate the hard-coded names from code altogether.

bq.  2. About setStorageTypeSpaceQuota/getStorageTypeSpaceQuota, these two 
names are not very natural. From the literal meaning, it sounds like 
setting/getting space quota on some storage type other than some type of 
storage. I would suggest that setStorageSpaceQuota/getStorageSpaceQuota would be 
better. I am not a native English speaker, so if I am wrong, just ignore this.
The function name should communicate that this is disk space quota for a 
specific storage type, as opposed to the overall quotas which are set with 
{{setQuota}}. If the proposed name is hard to follow, how about 
{{getQuotaByStorageType}}/{{setQuotaByStorageType}}?
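
To illustrate the distinction between the overall quota and a quota scoped to one storage type, here is a rough sketch. It is not the HDFS API: the class, the enum, the sentinel value and the in-memory map are all hypothetical, and the overall-quota accessors are simplified stand-ins for what {{setQuota}} controls. Only the two {{...QuotaByStorageType}} method names come from the suggestion above.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch only; this is not the HDFS implementation.
final class DirectoryQuotaSketch {
    // Storage type names as currently used in this discussion (they may change).
    enum StorageType { DISK, SSD }

    // Assumed sentinel meaning "no quota configured".
    static final long QUOTA_UNSET = -1L;

    // Overall space quota for the directory, independent of storage type.
    private long overallSpaceQuota = QUOTA_UNSET;

    // Separate space quota per storage type.
    private final Map<StorageType, Long> quotaByType = new EnumMap<>(StorageType.class);

    // Simplified stand-in for the overall quota that setQuota controls.
    void setOverallSpaceQuota(long bytes) {
        overallSpaceQuota = bytes;
    }

    long getOverallSpaceQuota() {
        return overallSpaceQuota;
    }

    // Space quota that applies only to the given storage type.
    void setQuotaByStorageType(StorageType type, long bytes) {
        quotaByType.put(type, bytes);
    }

    long getQuotaByStorageType(StorageType type) {
        return quotaByType.getOrDefault(type, QUOTA_UNSET);
    }
}
```

In this sketch, setting a quota for SSD leaves both the DISK quota and the overall quota untouched, which is the property the method name should convey.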

bq.  3. About the command line, hdfs dfsadmin -get(set)StorageTypeSpaceQuota, I 
think getting (setting) one storage type at a time is simple and 
straightforward. If we get (set) more than one at a time, failure handling 
becomes complicated because there is no atomicity guarantee.
Yes, I think we can simplify the command line as you suggested.

bq. 4. About the StoragePreference class, as you said in the design doc in 
HDFS-2832, in the future HDFS will support placing replicas on different 
storages, such as 1 on SSD and 2 on HDD. I would suggest that the 
StoragePreference class support specifying the storage type of each replica 
now, so that we can easily support the above feature in the future.
Let's defer this for now. The API and protocol can both be easily extended in a 
backwards compatible manner in the future without affecting existing 
applications.

bq.  5. About the create file semantics, as you said in the doc "During file 
creation there must be sufficient quota to place at least one block times the 
replication factor on the target storage type, otherwise the request is failed 
immediately with QuotaExceededException", I think it will be more natural and 
friendly to first create the file on the default storage (HDD) if there's not 
enough space of the desired storage type, and then let the namenode replicate 
the block to the desired storage lazily when there's enough space available.
We have to differentiate between quota unavailability and disk space 
unavailability. The former will result in a quota violation exception; the 
latter will result in the behavior you described. We discuss the reasons for 
this in the HDFS-2832 design doc.
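
As a rough sketch of the two cases, assuming a simplified check at file creation time: every name below is hypothetical, the exception is an unchecked stand-in for the {{QuotaExceededException}} mentioned in the doc, DISK stands for the default storage, and the later lazy move to the desired storage is not modeled.

```java
// Hypothetical sketch only; this is not the NameNode logic.
final class CreatePlacementSketch {
    enum StorageType { DISK, SSD }

    // Unchecked stand-in for the QuotaExceededException named in the design doc.
    static final class QuotaExceededException extends RuntimeException {
        QuotaExceededException(String message) {
            super(message);
        }
    }

    // Decide where the first block of a new file would be placed.
    static StorageType chooseStorage(StorageType requested,
                                     long remainingQuotaBytes,
                                     long freeSpaceBytes,
                                     long blockSizeBytes,
                                     int replication) {
        // Space needed for one block times the replication factor.
        long needed = Math.multiplyExact(blockSizeBytes, (long) replication);

        // Case 1: not enough quota on the target storage type, so fail immediately.
        if (remainingQuotaBytes < needed) {
            throw new QuotaExceededException(
                "need " + needed + " bytes of quota on " + requested
                    + ", only " + remainingQuotaBytes + " remaining");
        }

        // Case 2: quota is fine but the target storage type lacks free space,
        // so fall back to the default storage.
        if (freeSpaceBytes < needed) {
            return StorageType.DISK;
        }

        return requested;
    }
}
```

With a block size of 128 bytes and replication 3, a request for SSD with 100 bytes of remaining quota throws, while the same request with ample quota but only 100 bytes of free SSD space falls back to DISK.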

> Heterogeneous Storage phase 2 - APIs to expose Storage Types
> ------------------------------------------------------------
>
>                 Key: HDFS-5682
>                 URL: https://issues.apache.org/jira/browse/HDFS-5682
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: Arpit Agarwal
>            Assignee: Arpit Agarwal
>         Attachments: 20140522-Heterogeneous-Storages-API.pdf
>
>
> Phase 1 (HDFS-2832) added support to present the DataNode as a collection of 
> discrete storages of different types.
> This Jira is to track phase 2 of the Heterogeneous Storage work which 
> involves exposing Storage Types to applications and adding Quota Management 
> support for administrators.
> This phase will also include tools support for administrators/users.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
