[ https://issues.apache.org/jira/browse/HDFS-5157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763173#comment-13763173 ]
Eric Sirianni commented on HDFS-5157:
-------------------------------------
For full disclosure, my group is working on introducing a "locally shared
storage" model to HDFS which would more loosely couple Storages to DataNodes.
So my perspective is slanted towards this mental model.
I think #2 is a better long-term direction and fits the goal of
creating a slightly looser coupling between DataNodes and Storages. In a
shared storage model, the Storages and DataNodes are each separate resources
with distinct fault domains (e.g. HDFS-5168) and workloads. Having the
NameNode be more "storage-aware" seems to be the right step in this direction.
However, it is still desirable for the DataNode to retain some autonomy to
optimize its own storage. For example, moving blocks between storages of the
same type (for balancing) or of different types (for performance) should be
possible out of band, without involving the NameNode.
Another consideration is how accurate/timely the storage info (free space,
load, availability, etc.) is at the DataNode vs. the NameNode. For example, if
the NameNode picks storage A, but that storage has subsequently run out of
space (or has gone offline), can the DataNode allocate the block to storage B
instead?
I would favor a model like #2 that still allows the DataNode to override the
placement decision (with the override expected only in exceptional
circumstances).
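To make the override idea concrete, here is a minimal sketch of how a DataNode might treat the NameNode's choice as a hint and fall back only when that storage cannot take the block. Everything in it is hypothetical and for illustration only: the Storage class, the chooseStorage method, and the "prefer the same storage type, then most free space" fallback rule are my assumptions, not existing HDFS APIs or anything in the attached patches.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class StoragePlacementSketch {

    /** Hypothetical view of one storage attached to a DataNode; not an HDFS class. */
    static final class Storage {
        final String id;
        final String type;      // e.g. "DISK" or "SSD"
        final long freeBytes;
        final boolean online;

        Storage(String id, String type, long freeBytes, boolean online) {
            this.id = id;
            this.type = type;
            this.freeBytes = freeBytes;
            this.online = online;
        }

        boolean canHold(long blockSize) {
            return online && freeBytes >= blockSize;
        }
    }

    /**
     * Honors the NameNode's hinted storage when it is usable. Otherwise picks
     * the usable storage with the most free space, preferring the hinted
     * storage's type. Returns empty if no local storage can hold the block.
     */
    static Optional<Storage> chooseStorage(List<Storage> local, String hintedId, long blockSize) {
        Storage hinted = null;
        for (Storage s : local) {
            if (s.id.equals(hintedId)) {
                hinted = s;
            }
        }
        // Normal case: the NameNode's decision stands.
        if (hinted != null && hinted.canHold(blockSize)) {
            return Optional.of(hinted);
        }
        // Exceptional case: the hint is stale (full, offline, or unknown), so override it.
        Storage sameType = null;
        Storage anyType = null;
        for (Storage s : local) {
            if (!s.canHold(blockSize)) {
                continue;
            }
            if (anyType == null || s.freeBytes > anyType.freeBytes) {
                anyType = s;
            }
            if (hinted != null && s.type.equals(hinted.type)
                    && (sameType == null || s.freeBytes > sameType.freeBytes)) {
                sameType = s;
            }
        }
        return Optional.ofNullable(sameType != null ? sameType : anyType);
    }

    public static void main(String[] args) {
        List<Storage> local = Arrays.asList(
                new Storage("A", "DISK", 0L, true),    // hinted by the NameNode, but now full
                new Storage("B", "DISK", 500L, true),
                new Storage("C", "SSD", 900L, true));
        // The stale hint "A" is overridden with "B", the usable storage of the same type.
        System.out.println(chooseStorage(local, "A", 128L).get().id);
    }
}
```

In a scheme like this the DataNode would presumably also need to report the override back to the NameNode so that its view of block locations stays accurate; that part is not sketched here.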
> Datanode should allow choosing the target storage
> -------------------------------------------------
>
> Key: HDFS-5157
> URL: https://issues.apache.org/jira/browse/HDFS-5157
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode
> Affects Versions: Heterogeneous Storage (HDFS-2832)
> Reporter: Arpit Agarwal
> Assignee: Junping Du
> Attachments: HDFS-5157-v1.patch, HDFS-5157-v2.patch
>
>
> Datanode should allow choosing a target Storage or target Storage Type
> as a parameter when creating a new block. Currently there are two ways in
> which the target volume is chosen (via {{VolumeChoosingPolicy#chooseVolume}}):
> # AvailableSpaceVolumeChoosingPolicy
> # RoundRobinVolumeChoosingPolicy
> BlockReceiver and receiveBlock should also accept a new parameter for target
> storage or storage type.