[
https://issues.apache.org/jira/browse/HDFS-8383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14730042#comment-14730042
]
Zhe Zhang commented on HDFS-8383:
---------------------------------
Continuing the review on the patch itself:
# Reading the current single-failure handling logic again, I think the
{{BlockRecoveryTrigger}} should work. We are making the recovery transactional
by setting the streamer's {{externalError}} before {{updateBlockForPipeline}}
and resetting it after {{updatePipeline}}. I think it's the right approach at
this stage.
# Why not incrementing {{numScheduled}} if it's already positive?
{code}
+ if (numScheduled == 0) {
+ numScheduled++;
+ }
{code}
# The error handling logic is quite complex now. We should use this chance to
add more explanation. Below is my draft. [~walter.k.su] If it looks OK to you,
could you help add to the patch?
{code}
class Coordinator {
/**
* The next internal block to write to, allocated by the fastest streamer
* (earliest to finish writing the current internal block) by calling
* {@link StripedDataStreamer#locateFollowingBlock}.
*/
private final MultipleBlockingQueue<LocatedBlock> followingBlocks;
/**
* Records the number of bytes actually written to the most recent internal
* block. Used to calculate the size of the entire block group.
*/
private final MultipleBlockingQueue<ExtendedBlock> endBlocks;
/**
* The following 2 queues are used to handle stream failures.
*
* When stream_i fails, the OutputStream notifies all other healthy
* streamers by setting an external error on each of them, which triggers
* {@link DataStreamer#processDatanodeError}. The first streamer reaching
* the external error will call {@link DataStreamer#updateBlockForPipeline}
* to get a new block with bumped generation stamp, and populate
* {@link newBlocks} for other streamers. This first streamer will also
* call {@link DataStreamer#updatePipeline} to update the NameNode state
* for the block.
*/
private final MultipleBlockingQueue<LocatedBlock> newBlocks;
private final MultipleBlockingQueue<ExtendedBlock> updateBlocks;
{code}
# Naming suggestions:
{code}
BlockRecoveryTrigger -> PipelineRecoveryManager or PipelineRecoveryCoordinator
(I don't have a strong opinion but we can also consider moving the class under
Coordinator).
trigger() -> addRecoveryWork()
isRecovering() -> isUnderRecovery()
{code}
# The patch at HDFS-8704 removes the {{setFailed}} API, we need to coordinate
the 2 efforts. Pinging [~libo-intel] for comments.
Follow-ons:
# {{waitLastRecoveryToFinish}} can be improved. The current logic waits for the
slowest streamer to get out of {{externalError}} state.
# {{externalError}} is actually quite awkward in {{DataStreamer}} -- it's a
null concept for non-EC {{DataStreamer}}.
> Tolerate multiple failures in DFSStripedOutputStream
> ----------------------------------------------------
>
> Key: HDFS-8383
> URL: https://issues.apache.org/jira/browse/HDFS-8383
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Tsz Wo Nicholas Sze
> Assignee: Walter Su
> Attachments: HDFS-8383.00.patch
>
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)