[
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190091#comment-15190091
]
Lei (Eddy) Xu commented on HDFS-3702:
-------------------------------------
Thanks, [~arpitagarwal].
This patch shares many similarities with HDFS-4946, in the following ways:
This patch is orthogonal to the storage policy and to the original block
placement policy (i.e., the rack-aware policy). The storage policy, local rack,
etc. are all honored. This patch basically adds one special case for
{{excludeNodes}}. I'd say that the only thing it changes relates to HDFS-4946:
it is the opposite case to HDFS-4946, but only on a per-client /
DFSOutputStream basis. For cases similar to the one in this JIRA, using storage
policy alone does not necessarily provide better data availability (i.e.,
HBase still writes to the local SSD).
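To illustrate the idea of treating the client's own node as one more entry in {{excludeNodes}}, here is a minimal toy sketch in Python. It is not the patch itself: {{choose_targets}}, {{avoid_local}}, and the node names are hypothetical, the rack-aware and storage-policy logic is reduced to "local node first, then the rest", and the fallback when too few nodes remain is an assumption of this sketch.

```python
def choose_targets(cluster, client_host, replicas, avoid_local=False, exclude=()):
    """Toy block placement: local node first, then the remaining nodes in order.

    All names here are hypothetical; real placement is rack-aware and
    storage-policy-aware, which this sketch deliberately leaves out.
    """
    def pick(excluded):
        candidates = [n for n in cluster if n not in excluded]
        # Stable sort: the client's own node (key False) moves to the front.
        candidates.sort(key=lambda n: n != client_host)
        return candidates[:replicas]

    excluded = set(exclude)
    if avoid_local:
        # The special case: the client's node is simply added to the excluded set.
        targets = pick(excluded | {client_host})
        if len(targets) >= replicas:
            return targets
        # Assumption of this sketch: the flag is only a hint, so if too few
        # nodes remain, fall back to the normal choice.
    return pick(excluded)


cluster = ["dn1", "dn2", "dn3", "dn4"]
default_targets = choose_targets(cluster, "dn2", 3)
no_local_targets = choose_targets(cluster, "dn2", 3, avoid_local=True)
```

In this toy model the rest of the selection is untouched; only the excluded set differs between the two calls.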
bq. I am also curious about the answer to Devaraj's question. HDFS-2576 was
added specifically for HBase. Can it address your use case?
To some extent. HDFS-2576 needs each DFSClient to have the rest of the cluster
in the {{favoriteNodes}} to achieve the same purpose. It would also raise
questions like: would holding only a subset of DNs in {{favoriteNodes}} affect
the efficiency of data placement? Or should DFSClient constantly refresh this
list of nodes? A similar argument can be applied to HDFS-4946 as well.
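The refresh concern can be shown with a small Python sketch. It assumes, purely for illustration, that the client builds its node list as "every known DataNode except the local one"; {{favored_nodes_for}} and the node names are hypothetical and are not part of any HDFS API.

```python
def favored_nodes_for(cluster_nodes, client_host):
    """Hypothetical helper: every known DataNode except the client's own."""
    return [n for n in cluster_nodes if n != client_host]


# The client computes its list once, when the cluster has three DataNodes.
snapshot = favored_nodes_for(["dn1", "dn2", "dn3"], "dn1")

# Later dn4 joins the cluster. The old snapshot does not know about it,
# so the client would have to recompute the list to make use of the new node.
refreshed = favored_nodes_for(["dn1", "dn2", "dn3", "dn4"], "dn1")
```

A per-stream hint avoids this bookkeeping on the client side, since the list of eligible nodes stays with the NameNode.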
bq. The NameNode ignores this CreateFlag.
I am not sure that I understand this question. It is still {{BlockManager}} in
{{NameNode}} making the final decision of block placement (please see my first
point). {{CreateFlag}} is just a user visible flag to provide the _hints_.
These hints (and any future ones) are sent to the NameNode through
{{ClientNamenodeProtocol}} RPCs and processed by the NameNode.
bq. it will only work for DFSClient users e.g. not for WebHDFS.
At this time, I am not certain that it will not work for WebHDFS. If that is
the case, can we file a follow-up JIRA to fix it once the basic function is in
place?
I hope that the above explanations can answer your questions, [~arpitagarwal].
Looking forward to hearing from you.
> Add an option for NOT writing the blocks locally if there is a datanode on
> the same box as the client
> -----------------------------------------------------------------------------------------------------
>
> Key: HDFS-3702
> URL: https://issues.apache.org/jira/browse/HDFS-3702
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client
> Affects Versions: 2.5.1
> Reporter: Nicolas Liochon
> Assignee: Lei (Eddy) Xu
> Priority: Minor
> Labels: BB2015-05-TBR
> Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch,
> HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch,
> HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch,
> HDFS-3702.008.patch, HDFS-3702_Design.pdf
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that
> wrote them (the 'HBase regionserver') dies. This will likely come from a
> hardware failure, hence the corresponding datanode will be dead as well. So
> we're writing 3 replicas, but in reality only 2 of them are really useful.