[ https://issues.apache.org/jira/browse/HDFS-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13707400#comment-13707400 ]
Suresh Srinivas commented on HDFS-4672:
---------------------------------------
bq. What is the difference between this issue and HDFS-2832?
I have been waiting for an answer to this comment. Given the poorly toned
comments on HDFS-2832 and on public forums like Twitter, I am asking: why is
this not a dupe of HDFS-2832?
> Support tiered storage policies
> -------------------------------
>
> Key: HDFS-4672
> URL: https://issues.apache.org/jira/browse/HDFS-4672
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, hdfs-client, libhdfs, namenode
> Reporter: Andrew Purtell
>
> We would like to be able to create certain files on certain storage device
> classes (e.g. spinning media, solid state devices, RAM disk, non-volatile
> memory). HDFS-2832 enables heterogeneous storage at the DataNode, so the
> NameNode can gain awareness of what different storage options are available
> in the pool and where they are located, but no API is provided for clients or
> block placement plugins to perform device aware block placement. We would
> like to propose a set of extensions that also have broad applicability to use
> cases where storage device affinity is important:
>
> - Add an enum of generic storage device classes, borrowing from current
> taxonomy of the storage industry
>
> - Augment DataNode volume metadata in storage reports with this enum
>
> - Extend the namespace so pluggable block policies can be specified on a
> directory and storage device class can be tracked in the Inode. Perhaps this
> could be a larger discussion on adding support for extended attributes in the
> HDFS namespace. The Inode should track both the storage device class hint and
> the current actual storage device class. FileStatus should expose this
> information (or xattrs in general) to clients.
>
> - Extend the pluggable block policy framework so policies can also consider,
> and specify, affinity for a particular storage device class
>
> - Extend the file creation API to accept a storage device class affinity
> hint. Such a hint can be supplied directly as a parameter, or, if we are
> considering extended attribute support, then instead as one of a set of
> xattrs. The hint would be stored in the namespace and also used by the client
> to indicate to the NameNode/block placement policy/DataNode constraints on
> block placement. Furthermore, if xattrs or storage device class affinity
> hints are associated with directories, then the NameNode should provide the
> storage device affinity hint to the client in the create API response, so the
> client can provide the appropriate hint to DataNodes when writing new blocks.
>
> - The list of candidate DataNodes for new blocks supplied by the NameNode to
> clients should be weighted/sorted by availability of the desired storage
> device class.
>
> - Block replication should consider storage device affinity hints. If a
> client move()s a file from a location under a path with affinity hint X to
> a location under a path with affinity hint Y, then all blocks currently
> residing on media X should eventually be replicated onto media Y, and the
> resulting excess replicas on media X deleted.
>
> - Introduce the concept of degraded path: a path can be degraded if a block
> placement policy is forced to abandon a constraint in order to persist the
> block, when there may not be available space on the desired device class, or
> to maintain the minimum necessary replication factor. This concept is
> distinct from the corrupt path, where one or more blocks are missing. Paths
> in degraded state should be periodically reevaluated for re-replication.
>
> - The FSShell should be extended with commands for changing the storage
> device class hint for a directory or file.
>
> - Clients like DistCp which compare metadata should be extended to be aware
> of the storage device class hint. For DistCp specifically, there should be an
> option to ignore the storage device class hints, enabled by default.
>
> Suggested semantics:
>
> - The default storage device class should be the null class, or simply the
> “default class”, for all cases where a hint is not available. This should be
> configurable. hdfs-default.xml could provide the default as spinning media.
>
> - A storage device class hint should be provided (and is necessary) only when
> the default is not sufficient.
>
> - For backwards compatibility, any FSImage or edit log entry lacking a
> storage device class hint is interpreted as having affinity for the null
> class.
>
> - All blocks for a given file share the same storage device class. If the
> replication factor for this file is increased the replicas should all be
> placed on the same storage device class.
>
> - If one or more blocks for a given file cannot be placed on the required
> device class, then the file is marked as degraded. Files in degraded state
> should be periodically reevaluated for re-replication.
>
> - A directory or path can have only one storage device affinity hint. If the
> file inode specifies a hint, it is used; otherwise we walk up the path
> until a hint is found and use that one. If none is found, the default
> storage class is used.
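
The hint-resolution and degraded-file semantics in the quoted description could
be sketched roughly as follows. This is only an illustration in Java, not HDFS
code: StorageHintSketch, StorageClass, resolveHint and isDegraded are all
hypothetical names, the namespace is modeled as a plain map from absolute paths
to hints, and treating the default class as "no constraint" is an assumption.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names are existing HDFS APIs. */
class StorageHintSketch {

    /** Generic storage device classes mentioned in the description. */
    enum StorageClass { DEFAULT, SPINNING, SSD, RAM_DISK, NVM }

    /**
     * Resolve the effective hint for an absolute path: use the hint set on the
     * path itself if there is one, otherwise walk up toward the root, and fall
     * back to DEFAULT when no ancestor carries a hint.
     */
    static StorageClass resolveHint(String path, Map<String, StorageClass> hints) {
        String current = path;
        while (true) {
            StorageClass hint = hints.get(current);
            if (hint != null) {
                return hint;
            }
            if (current.equals("/")) {
                return StorageClass.DEFAULT;
            }
            // Strip the last path component; anything without a parent maps to "/".
            int slash = current.lastIndexOf('/');
            current = (slash <= 0) ? "/" : current.substring(0, slash);
        }
    }

    /**
     * A file counts as degraded when at least one replica sits on a device
     * class other than the required one. Assumption: DEFAULT imposes no
     * constraint, so such files are never degraded.
     */
    static boolean isDegraded(StorageClass required, List<StorageClass> replicaClasses) {
        if (required == StorageClass.DEFAULT) {
            return false;
        }
        for (StorageClass actual : replicaClasses) {
            if (actual != required) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, StorageClass> hints = new HashMap<>();
        hints.put("/hbase/wal", StorageClass.SSD); // hint set on a directory

        StorageClass wal = resolveHint("/hbase/wal/log1", hints);
        StorageClass tmp = resolveHint("/tmp/scratch", hints);
        System.out.println(wal); // SSD, inherited from /hbase/wal
        System.out.println(tmp); // DEFAULT, no hint on any ancestor

        // One replica fell back to spinning media, so the file is degraded.
        System.out.println(isDegraded(wal, List.of(StorageClass.SSD, StorageClass.SPINNING)));
    }
}
```

A real implementation would read the hint from the inode (or xattrs) rather
than a map, and the periodic re-evaluation of degraded files is not shown.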