[
https://issues.apache.org/jira/browse/HDFS-8254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563982#comment-14563982
]
Zhe Zhang commented on HDFS-8254:
---------------------------------
Thanks Nicholas for the patch! Our initial design put {{locateFollowingBlock}}
logic in lead streamer for simplicity. I think it's a great idea to remove that
single point of failure.
# The new {{ConcurrentPoll}} class looks good overall. Let me know if this
understanding is correct: now the fastest streamer will take care of allocating
block group from NN and distributing to other streamers. Can we add some
Javadocs for the class and methods, and ideally, a design description on the
JIRA?
# The concurrent logic is a little complex and some parts could be fragile. For
example, the {{populate}} method in {{locateFollowingBlock}} directly changes
the {{block}} of the class. It's true that {{locateFollowingBlock}} is only
used by {{nextBlockOutputStream}}, which will reassign a correct value for
{{block}}. But this dependency makes {{locateFollowingBlock}} less
self-contained. It also looks like we could run into a race condition if 2
streamers enter {{locateFollowingBlock}} around the same time? They could both
pass {{isReady2Populate}} before either one has started taking from
{{endBlocks}}. Since {{DataStreamer#locateFollowingBlock}} is not complex, can
we do some refactoring and move it to the input stream level? This way the
{{coordinator}} can take care of the main logic and the fastest streamer just
has to trigger it. I haven't thought through {{updateBlockForPipeline}} and
{{updatePipeline}} yet but I guess the stories should be similar.
Nits:
# {{DFSStripedOutputStream}} already has the schema but the streamers are still
using constants. We should either use the schema or at least add some TODOs.
# New methods in {{DataStreamer}} could use some Javadoc.
# {{class Coordinator}} could be renamed to something like
{{StreamersCoordinator}}, just to be more specific.
# The Javadoc of {{StripedBlockUtil#checkBlocks}} should say that it checks the
two blocks are in the same block group
> In StripedDataStreamer, it is hard to tolerate datanode failure in the
> leading streamer
> ---------------------------------------------------------------------------------------
>
> Key: HDFS-8254
> URL: https://issues.apache.org/jira/browse/HDFS-8254
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Tsz Wo Nicholas Sze
> Assignee: Tsz Wo Nicholas Sze
> Attachments: h8254_20150526.patch, h8254_20150526b.patch
>
>
> StripedDataStreamer javadoc is shown below.
> {code}
> * The StripedDataStreamer class is used by {@link DFSStripedOutputStream}.
> * There are two kinds of StripedDataStreamer, leading streamer and ordinary
> * stream. Leading streamer requests a block group from NameNode, unwraps
> * it to located blocks and transfers each located block to its corresponding
> * ordinary streamer via a blocking queue.
> {code}
> Leading streamer is the streamer with index 0. When the datanode of the
> leading streamer fails, the other steamers cannot continue since no one will
> request a block group from NameNode anymore.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)