[
https://issues.apache.org/jira/browse/HDFS-1467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932362#action_12932362
]
Todd Lipcon commented on HDFS-1467:
-----------------------------------
Hey Konstantin. Here's the call sequence:
- DataXceiver.opWriteBlock(in, block) (DataXceiver.java:219)
-- DataXceiver.java:270: call BlockReceiver constructor with {{block}} object
from opWriteBlock argument
--- BlockReceiver.java:91 - we copy {{block}} to member variable
--- BlockReceiver.java:114 - we call {{datanode.data.append(...)}} while {{block}}
still has the old (correct) generation stamp - so this is fine on this node
--- BlockReceiver.java:118 - {{block.setGenerationStamp(newGs)}} - affects the
same block object from above
-- DataXceiver.java:304 - {{DataTransferProtocol.Sender.opWriteBlock}} with
same block object, which now has the *new* generation stamp.
Thus, the first DN in the pipeline correctly appends to the block, but the
second one fails, since it's asked to append to the *new* GS block, not the old.
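
To illustrate the aliasing in the call sequence above, here is a minimal standalone sketch. It is not the HDFS code: {{Block}}, {{Receiver}}, and {{forwardedName}} are hypothetical stand-ins, and the defensive copy is only one possible way to avoid the problem, not necessarily what the attached patch does.

```java
public class GenStampAliasingSketch {

    /** Hypothetical stand-in for a block: an id plus a mutable generation stamp. */
    static final class Block {
        final long id;
        long genStamp;

        Block(long id, long genStamp) {
            this.id = id;
            this.genStamp = genStamp;
        }

        /** Copy constructor, used for the defensive-copy variant. */
        Block(Block other) {
            this(other.id, other.genStamp);
        }

        @Override
        public String toString() {
            return "blk_" + id + "_" + genStamp;
        }
    }

    /** Hypothetical receiver: stores the block, then bumps its generation stamp. */
    static final class Receiver {
        private final Block block;

        Receiver(Block block, long newGs, boolean defensiveCopy) {
            // Without the copy, this field aliases the caller's object.
            this.block = defensiveCopy ? new Block(block) : block;
            // The local append would happen here, still seeing the old stamp.
            this.block.genStamp = newGs;
        }
    }

    /** Returns the block name the caller would forward to the next datanode. */
    static String forwardedName(boolean defensiveCopy) {
        Block block = new Block(3950121169366352479L, 1001L);
        new Receiver(block, 1002L, defensiveCopy);
        // The caller forwards its own object after the receiver was constructed.
        return block.toString();
    }

    public static void main(String[] args) {
        System.out.println("aliased: " + forwardedName(false));
        System.out.println("copied:  " + forwardedName(true));
    }
}
```

In the aliased case the caller ends up forwarding {{blk_3950121169366352479_1002}}, matching what the second datanode logs; with the copy, the caller's object keeps the {{_1001}} stamp.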
You can see this in the attached test failure log. On line 887, the first
datanode in the pipeline correctly opens the replica for append:
{{2010-10-19 20:09:21,075 INFO datanode.DataNode (FSDataset.java:append(1101))
- Appending to replica FinalizedReplica, blk_3950121169366352479_1001,
FINALIZED}}
But then when {{opWriteBlock}} is called for the second replica on line 901, it
has the _new_ genstamp:
{{2010-10-19 20:09:21,106 INFO datanode.DataNode
(DataXceiver.java:opWriteBlock(231)) - Receiving block
blk_3950121169366352479_1002 src: /127.0.0.1:59098 dest: /127.0.0.1:54152}}
I'll dig through the version control history and see if I can figure out how
this was introduced, if indeed it's a regression.
> Append pipeline never succeeds with more than one replica
> ---------------------------------------------------------
>
> Key: HDFS-1467
> URL: https://issues.apache.org/jira/browse/HDFS-1467
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: data-node
> Affects Versions: 0.22.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Blocker
> Attachments: failed-TestPipelines.txt, hdfs-1467.txt
>
>
> TestPipelines appears to be failing on trunk:
> Should be RBW replica after sequence of calls append()/write()/hflush()
> expected:<RBW> but was:<FINALIZED>
> junit.framework.AssertionFailedError: Should be RBW replica after sequence of
> calls append()/write()/hflush() expected:<RBW> but was:<FINALIZED>
> at
> org.apache.hadoop.hdfs.TestPipelines.pipeline_01(TestPipelines.java:109)