[ 
https://issues.apache.org/jira/browse/HDFS-8383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14730042#comment-14730042
 ] 

Zhe Zhang commented on HDFS-8383:
---------------------------------

Continuing the review on the patch itself:
# Reading the current single-failure handling logic again, I think the 
{{BlockRecoveryTrigger}} should work. We are making the recovery transactional 
by setting the streamer's {{externalError}} before {{updateBlockForPipeline}} 
and resetting it after {{updatePipeline}}. I think it's the right approach at 
this stage.
# Why not incrementing {{numScheduled}} if it's already positive?
{code}
+        if (numScheduled == 0) {
+          numScheduled++;
+        }
{code}
# The error handling logic is quite complex now. We should use this chance to 
add more explanation. Below is my draft. [~walter.k.su] If it looks OK to you, 
could you help add to the patch?
{code}
  class Coordinator {
    /**
     * The next internal block to write to, allocated by the fastest streamer
     * (earliest to finish writing the current internal block) by calling
     * {@link StripedDataStreamer#locateFollowingBlock}.
     */
    private final MultipleBlockingQueue<LocatedBlock> followingBlocks;

    /**
     * Records the number of bytes actually written to the most recent internal
     * block. Used to calculate the size of the entire block group.
     */
    private final MultipleBlockingQueue<ExtendedBlock> endBlocks;

    /**
     * The following 2 queues are used to handle stream failures.
     *
     * When stream_i fails, the OutputStream notifies all other healthy
     * streamers by setting an external error on each of them, which triggers
     * {@link DataStreamer#processDatanodeError}. The first streamer reaching
     * the external error will call {@link DataStreamer#updateBlockForPipeline}
     * to get a new block with bumped generation stamp, and populate
     * {@link newBlocks} for other streamers. This first streamer will also
     * call {@link DataStreamer#updatePipeline} to update the NameNode state
     * for the block.
     */
    private final MultipleBlockingQueue<LocatedBlock> newBlocks;
    private final MultipleBlockingQueue<ExtendedBlock> updateBlocks;
{code}
# Naming suggestions:
{code}
BlockRecoveryTrigger -> PipelineRecoveryManager or PipelineRecoveryCoordinator 
(I don't have a strong opinion but we can also consider moving the class under 
Coordinator).
trigger() -> addRecoveryWork()
isRecovering() -> isUnderRecovery()
{code}
# The patch at HDFS-8704 removes the {{setFailed}} API, we need to coordinate 
the 2 efforts. Pinging [~libo-intel] for comments.

Follow-ons:
# {{waitLastRecoveryToFinish}} can be improved. The current logic waits for the 
slowest streamer to get out of {{externalError}} state.
# {{externalError}} is actually quite awkward in {{DataStreamer}} -- it's a 
null concept for non-EC {{DataStreamer}}.

> Tolerate multiple failures in DFSStripedOutputStream
> ----------------------------------------------------
>
>                 Key: HDFS-8383
>                 URL: https://issues.apache.org/jira/browse/HDFS-8383
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Walter Su
>         Attachments: HDFS-8383.00.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to