[ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190091#comment-15190091
 ] 

Lei (Eddy) Xu commented on HDFS-3702:
-------------------------------------

Thanks, [~arpitagarwal].

This patch shares many similarities with HDFS-4946, in the following ways:


This patch is orthogonal to the storage policy and to the original block 
placement policy (i.e., the rack-aware policy). The storage policy, local rack, 
etc. are all honored. This patch basically adds one special case for 
{{excludeNodes}}. I'd say that the only thing it changes is relative to 
HDFS-4946: it is the opposite case to HDFS-4946, but only on a per-client / 
DFSOutputStream basis. For cases similar to the one in this JIRA, using storage 
policy alone does not necessarily provide better data availability (i.e., HBase 
still writes to the local SSD).
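
As a rough illustration of the "one special case for {{excludeNodes}}" idea 
(not the patch's actual code), the sketch below uses a toy target chooser. The 
class {{PlacementSketch}}, the enum value {{AVOID_LOCAL_NODE}} and the method 
{{chooseTargets}} are hypothetical names invented for this example, and the 
real placement policy also weighs racks, storage types and load, which are 
left out here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PlacementSketch {
    /** Hypothetical client hint; stands in for the flag the patch proposes. */
    enum Hint { AVOID_LOCAL_NODE }

    /**
     * Toy stand-in for a NameNode-side target chooser (an assumption, not HDFS code).
     * Candidates are simply taken in list order after the optional local node.
     */
    static List<String> chooseTargets(List<String> liveNodes, String clientHost,
                                      Set<String> excludeNodes, EnumSet<Hint> hints,
                                      int replication) {
        Set<String> excluded = new HashSet<>(excludeNodes);
        // The one special case: treat the writer's own node as excluded.
        if (hints.contains(Hint.AVOID_LOCAL_NODE)) {
            excluded.add(clientHost);
        }
        List<String> targets = new ArrayList<>();
        // Default behaviour: the local node goes first when it is eligible.
        if (replication > 0 && liveNodes.contains(clientHost)
                && !excluded.contains(clientHost)) {
            targets.add(clientHost);
        }
        // Fill the remaining replicas from the other eligible nodes.
        for (String node : liveNodes) {
            if (targets.size() >= replication) {
                break;
            }
            if (!excluded.contains(node) && !targets.contains(node)) {
                targets.add(node);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("dn1", "dn2", "dn3", "dn4");
        Set<String> none = Collections.emptySet();
        // Without the hint the writer's node dn2 is chosen first.
        System.out.println(chooseTargets(nodes, "dn2", none,
                EnumSet.noneOf(Hint.class), 3));
        // With the hint dn2 is skipped; the rest of the logic is untouched.
        System.out.println(chooseTargets(nodes, "dn2", none,
                EnumSet.of(Hint.AVOID_LOCAL_NODE), 3));
    }
}
```

In this toy model the first call yields [dn2, dn1, dn3] and the second yields 
[dn1, dn3, dn4]; only the exclusion set differs between the two runs.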

bq. I am also curious about the answer to Devaraj's question. HDFS-2576 was 
added specifically for HBase. Can it address your use case? 

To some extent, HDFS-2576 needs each DFSClient to have the rest of the cluster 
in the {{favoredNodes}} to achieve the same purpose. It would also raise 
questions such as: would holding only a subset of DNs in {{favoredNodes}} 
affect the efficiency of data placement? Or should DFSClient constantly refresh 
this list of nodes? A similar argument can be applied to HDFS-4946 as well.
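
To make the refresh concern concrete, here is a minimal sketch of emulating 
"do not write locally" with a favored list, assuming the client can somehow 
obtain the list of datanodes. {{FavoredNodesSketch}} and {{allButLocal}} are 
hypothetical names, and plain host strings stand in for real node addresses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FavoredNodesSketch {
    /** Builds a favored list holding every known datanode except the client's host. */
    static List<String> allButLocal(List<String> knownDataNodes, String clientHost) {
        List<String> favored = new ArrayList<>(knownDataNodes);
        favored.removeIf(node -> node.equals(clientHost));
        return favored;
    }

    public static void main(String[] args) {
        // Snapshot of the cluster taken when the client starts (illustrative data).
        List<String> snapshot = Arrays.asList("dn1", "dn2", "dn3");
        List<String> cached = allButLocal(snapshot, "dn2");
        System.out.println(cached);

        // A datanode joins later: the cached list never mentions dn4,
        // so the client would have to rebuild the list to make use of it.
        List<String> later = Arrays.asList("dn1", "dn2", "dn3", "dn4");
        System.out.println(allButLocal(later, "dn2"));
    }
}
```

The cached list stays at [dn1, dn3] until the client recomputes it, which is 
the constant-refresh question raised above.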

bq.  The NameNode ignores this CreateFlag.

I am not sure that I understand this question. It is still {{BlockManager}} in 
{{NameNode}} that makes the final decision on block placement (please see my 
first point). {{CreateFlag}} is just a user-visible flag that provides 
_hints_. These hints (and any future ones) are sent to the NameNode through 
{{ClientNamenodeProtocol}} RPCs and processed by the NameNode.

bq. it will only work for DFSClient users e.g. not for WebHDFS. 

At this time, I am not certain that it will not work for WebHDFS. If that is 
the case, can we file a follow-up JIRA to fix it once the basic function is in 
place?

I hope that the above explanations answer your questions, [~arpitagarwal]. 
Looking forward to hearing from you.

> Add an option for NOT writing the blocks locally if there is a datanode on 
> the same box as the client
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3702
>                 URL: https://issues.apache.org/jira/browse/HDFS-3702
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 2.5.1
>            Reporter: Nicolas Liochon
>            Assignee: Lei (Eddy) Xu
>            Priority: Minor
>              Labels: BB2015-05-TBR
>         Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
> HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, 
> HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, 
> HDFS-3702.008.patch, HDFS-3702_Design.pdf
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery 
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that 
> wrote them (the 'HBase regionserver') dies. This will likely come from a 
> hardware failure, hence the corresponding datanode will be dead as well. So 
> we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
