[ https://issues.apache.org/jira/browse/HDFS-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010157#comment-14010157 ]
Arpit Agarwal commented on HDFS-5682:
-------------------------------------
Thanks for the feedback [~wuzesheng]. My responses are below.
bq. 1. About the storage type: because I didn't participate in the discussion in
HDFS-2832, I am confused by the current storage types DISK and SSD. I think SSD
is also one type of disk, so DISK and SSD are not orthogonal. Can we change the
storage types to HDD and SSD? That would be more straightforward.
Good point, I'll look into making the names clearer. In a subsequent revision
of the API we would like to eliminate the hard-coded names from code altogether.
bq. 2. About setStorageTypeSpaceQuota/getStorageTypeSpaceQuota, these two
names are not very natural. Read literally, they sound like
setting/getting a space quota on some storage type rather than on some type of
storage. I would suggest that setStorageSpaceQuota/getStorageSpaceQuota would be
better. I am not a native English speaker; if I am wrong, just ignore this.
The function name should communicate that this is disk space quota for a
specific storage type, as opposed to the overall quotas which are set with
{{setQuota}}. If the proposed name is hard to follow, how about
{{getQuotaByStorageType}}/{{setQuotaByStorageType}}?
bq. 3. About the command line, hdfs dfsadmin -get(set)StorageTypeSpaceQuota, I
think getting (setting) one storage type at a time is simple and
straightforward. If we get (set) more than one at a time, there's no atomicity
guarantee, so failures are complicated to handle.
Yes, I think we can simplify the command line as you suggested.
bq. 4. About the StoragePreference class, as you said in the design doc in
HDFS-2832, in the future HDFS will support placing replicas on different
storages, such as 1 on SSD and 2 on HDD. I would suggest that the
StoragePreference class support specifying the storage type of each replica
now; that way, we can easily support the above feature in the future.
Let's defer this for now. The API and protocol can both be easily extended in a
backwards compatible manner in the future without affecting existing
applications.
bq. 5. About the create file semantics, as you said in the doc "During file
creation there must be sufficient quota to place at least one block times the
replication factor on the target storage type, otherwise the request is failed
immediately with QuotaExceededException", I think it would be more natural and
friendly to first create the file on the default storage (HDD) if there's not
enough space on the desired storage type, and then let the namenode replicate
the block to the desired storage lazily when enough space is available.
We have to differentiate between quota unavailability and disk space
unavailability. The former will result in a quota violation exception; the
latter will result in the behavior you described. We discuss the reasons for
this in the HDFS-2832 design doc.
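To make the distinction concrete, here is a minimal sketch of the two checks. Everything in it is an assumption for illustration: {{PlacementSketch}}, {{chooseStorage}}, the two maps, and the local unchecked {{QuotaExceededException}} stand-in are hypothetical names, not the proposed API, and the later lazy migration to the desired storage type is not modeled.

```java
import java.util.EnumMap;
import java.util.Map;

// Illustrative sketch only; none of these names are the HDFS-5682 API.
final class PlacementSketch {
    enum StorageType { DISK, SSD }

    // Local unchecked stand-in for the exception named in the design doc.
    static final class QuotaExceededException extends RuntimeException {
        QuotaExceededException(String message) { super(message); }
    }

    /**
     * Picks the storage type for a new file's first block.
     * remainingQuota: bytes of per-type quota still unused for the directory.
     * freeSpace: bytes physically free on storages of each type.
     */
    static StorageType chooseStorage(StorageType requested, long blockSize, int replication,
                                     Map<StorageType, Long> remainingQuota,
                                     Map<StorageType, Long> freeSpace) {
        long needed = blockSize * replication;
        // Quota unavailable: fail the create request immediately.
        if (remainingQuota.getOrDefault(requested, 0L) < needed) {
            throw new QuotaExceededException(
                "insufficient " + requested + " quota: need " + needed + " bytes");
        }
        // Quota is fine but the storage is full: fall back to the default type.
        if (freeSpace.getOrDefault(requested, 0L) < needed) {
            return StorageType.DISK;
        }
        return requested;
    }

    public static void main(String[] args) {
        Map<StorageType, Long> quota = new EnumMap<>(StorageType.class);
        Map<StorageType, Long> free = new EnumMap<>(StorageType.class);
        quota.put(StorageType.SSD, 1L << 30);   // 1 GiB of SSD quota left
        free.put(StorageType.SSD, 64L << 20);   // only 64 MiB of SSD space free
        // 128 MiB block x 3 replicas = 384 MiB: within quota, but not enough free SSD space.
        System.out.println(chooseStorage(StorageType.SSD, 128L << 20, 3, quota, free));
    }
}
```

The quota check comes first in this sketch so that a quota violation is always reported to the caller, while a shortage of physical space only changes where the block initially lands.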
> Heterogeneous Storage phase 2 - APIs to expose Storage Types
> ------------------------------------------------------------
>
> Key: HDFS-5682
> URL: https://issues.apache.org/jira/browse/HDFS-5682
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.0.0
> Reporter: Arpit Agarwal
> Assignee: Arpit Agarwal
> Attachments: 20140522-Heterogeneous-Storages-API.pdf
>
>
> Phase 1 (HDFS-2832) added support to present the DataNode as a collection of
> discrete storages of different types.
> This Jira is to track phase 2 of the Heterogeneous Storage work which
> involves exposing Storage Types to applications and adding Quota Management
> support for administrators.
> This phase will also include tools support for administrators/users.