[ https://issues.apache.org/jira/browse/HDFS-7672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529341#comment-14529341 ]

Zhe Zhang commented on HDFS-7672:
---------------------------------

Thanks for the new patch Nicholas. A few comments:
Nits:
# Not a big deal, but seems to me that {{DataStreamer#endBlock}} uses _end_ as 
a verb ("end a block"). So {{putEndBlock}} and {{getEndBlock}} do not read that 
well. Not feeling strongly about it though.
# {{DataStreamer#toString}} has a bug that causes an NPE when DEBUG logging is 
turned on. I guess we should handle {{block == null}}.
# In {{DFSStripedOutputStream#writeChunk}}, can the following two _if_ 
statements be combined into an if-else?
{code}
    if (!current.isFailed()) {
        xxx
    }

    if (current.isFailed()) {
        xxx
    }
{code}
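The merge could look like the following minimal, self-contained sketch (the class and method names, and the string return values, are hypothetical placeholders, not the actual {{DFSStripedOutputStream}} code); it also illustrates the {{block == null}} guard suggested for {{toString}}:

```java
// Hypothetical sketch only: WriteChunkSketch, describe() and handle() are
// placeholder names, not real HDFS classes or methods.
class WriteChunkSketch {
  // Null-safe formatting, guarding block == null so DEBUG logging
  // cannot throw an NPE before a block has been allocated.
  static String describe(Object block) {
    return block == null ? "block==null" : block.toString();
  }

  // The two disjoint conditions collapsed into a single if-else,
  // so isFailed() is only evaluated once per call.
  static String handle(boolean failed) {
    if (!failed) {
      return "normal-path";   // placeholder for the healthy-streamer branch
    } else {
      return "failure-path";  // placeholder for the failed-streamer branch
    }
  }
}
```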

Structural:
# A large portion of this patch is about moving inter-streamer coordination 
into the new {{Coordinator}} class, which I think is a good idea. But I guess 
we should either split it out as a separate patch, or at least update the 
description / summary of this JIRA.
# I originally thought {{Coordinator#shouldLocateFollowingBlock}} indicates 
whether the leading streamer should locate the following block from the NN, but 
then I saw it does so even when {{shouldLocateFollowingBlock}} is false. So 
what does this variable mean? What scenarios would cause it to be false? It 
would be nice to add some Javadoc for it.
# Right now {{shouldLocateFollowingBlock}} is true as long as one of the 
streamers puts an {{endBlock}}. Shouldn't it be an {{&&}} instead of {{||}}? Of 
course this depends on the answer to the question above; I might be 
misunderstanding the variable.
# bq. Do you mean closing the current block group and then continues writing to 
a new BlockGroup using the variable block length feature?
Yes. So assuming the current block group bg_0 has encountered a failure, should 
we: i) end bg_0 and allocate bg_1, which will leave bg_0's internal blocks 
smaller than the configured block size; or ii) continue writing to bg_0 with 
fewer healthy DNs? I don't have a clear preference, but thought it'd be good to 
list the pros and cons of each option. I can think of the following:
#* The variable-length block method might increase block count
#* The keep-writing method leaves data under-protected for a certain window, 
and it's more expensive to reconstruct the data later (than directly writing 
parity from the client).
# bq. Writing with fewer health DNs is consistent with replication
We currently have multiple policies for pipeline recovery 
({{ReplaceDatanodeOnFailure#Policy}}), so maybe, as you suggested, we should 
eventually support both modes above in striping.
# All tests use a single {{length}}, which is smaller than a full block group.
# Right now the test class takes quite long to finish, since we need to 
actually transfer data. Should we just have 3 tests: leader (to be added 
later), data, and parity? Then we can also name the test methods more 
meaningfully.
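To make the single-{{length}} point concrete, a sketch of exercising several write lengths instead of one (all names and the cell/stripe/block-group sizes below are illustrative constants I made up, not the values used by the actual tests):

```java
// Hypothetical sketch: TestLengthSketch and its constants are placeholders,
// not the real HDFS test configuration.
class TestLengthSketch {
  static final int CELL_SIZE = 64 * 1024;                // one striping cell
  static final int DATA_BLOCKS = 6;                      // RS(6,3) data blocks
  static final int STRIPE_SIZE = CELL_SIZE * DATA_BLOCKS;
  static final int BLOCK_GROUP_SIZE = STRIPE_SIZE * 16;  // illustrative

  // Candidate write lengths covering the boundary cases a single
  // smaller-than-block-group length misses.
  static int[] lengths() {
    return new int[] {
        CELL_SIZE / 2,           // partial cell
        STRIPE_SIZE,             // exactly one full stripe
        BLOCK_GROUP_SIZE,        // a full block group
        BLOCK_GROUP_SIZE + 123   // spills into a second block group
    };
  }
}
```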

> Erasure Coding: handle write failure for stripping coding blocks
> ----------------------------------------------------------------
>
>                 Key: HDFS-7672
>                 URL: https://issues.apache.org/jira/browse/HDFS-7672
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Tsz Wo Nicholas Sze
>         Attachments: h7672_20150504.patch, h7672_20150504b.patch, 
> h7672_20150504c.patch, h7672_20150505.patch
>
>
> In *stripping* case, for (6, 3)-Reed-Solomon, a client writes to 6 data 
> blocks and 3 parity blocks concurrently.  We need to handle datanode or 
> network failures when writing a EC BlockGroup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
