J.Andreina created HDFS-7820:
--------------------------------

             Summary: Client Write fails after rolling upgrade operation with 
"<block_id> already exist in finalized state"
                 Key: HDFS-7820
                 URL: https://issues.apache.org/jira/browse/HDFS-7820
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: J.Andreina
            Assignee: J.Andreina


Steps to Reproduce:
===================

Step 1:  Prepare the rolling upgrade using "hdfs dfsadmin -rollingUpgrade prepare".
Step 2:  Shut down the SNN and NN.
Step 3:  Start the NN with the "hdfs namenode -rollingUpgrade started" option.
Step 4:  Execute "hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> 
upgrade" and restart the Datanode.
Step 5:  Write 3 files to HDFS (block IDs assigned: blk_1073741831_1007, 
blk_1073741832_1008, blk_1073741833_1009).
Step 6:  Shut down both the NN and DN.
Step 7:  Start the NNs with the "hdfs namenode -rollingUpgrade rollback" option.
         Start the DNs with the "-rollback" option.
Step 8:  Write 2 files to HDFS.
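
The steps above can be sketched as a dry-run script. This is only an illustration under stated assumptions: the local file names, the DN_IPC variable, and the helper functions run, note and repro_steps are hypothetical, and the daemon stop/start steps are printed as notes because how they are performed depends on the deployment. By default nothing is executed; each client command is only printed.

```shell
#!/bin/sh
# Dry-run sketch of the reproduction steps. Set DRY_RUN=0 only on a
# disposable test cluster. DN_IPC and all file names are hypothetical.
DRY_RUN="${DRY_RUN:-1}"
DN_IPC="${DN_IPC:-<DATANODE_HOST:IPC_PORT>}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Print a manual step that this sketch does not automate.
note() {
  echo "# $*"
}

repro_steps() {
  run hdfs dfsadmin -rollingUpgrade prepare                 # Step 1
  note "Step 2: shut down the SNN and NN"
  note "Step 3: start the NN with 'hdfs namenode -rollingUpgrade started'"
  run hdfs dfsadmin -shutdownDatanode "$DN_IPC" upgrade     # Step 4
  note "Step 4 (cont.): restart the Datanode"
  for f in File_A File_B File_C; do                         # Step 5
    run hdfs dfs -put "$f" "/$f"
  done
  note "Step 6: shut down both the NN and DN"
  note "Step 7: start the NNs with 'hdfs namenode -rollingUpgrade rollback' and the DNs with '-rollback'"
  for f in AfterRollback_1 AfterRollback_2; do              # Step 8
    run hdfs dfs -put "$f" "/$f"
  done
}

repro_steps
```

With DRY_RUN left at its default, the script just lists the command sequence so it can be reviewed before being run against a real cluster.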

Issue:
=======
The client write failed with the following exception (from the Datanode log):
{noformat}
2015-02-23 16:00:12,896 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Receiving BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741832_1008 src: 
/XXXXXXXXXXX:48545 dest: /XXXXXXXXXXX:50010
2015-02-23 16:00:12,897 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
opWriteBlock BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741832_1008 
received exception 
org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block 
BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741832_1008 already exists in 
state FINALIZED and thus cannot be created.
{noformat}

Observations:
=============

1. On the Namenode side, block invalidation was sent for only 2 of the 3 blocks.
{noformat}
15/02/23 14:59:56 INFO BlockStateChange: BLOCK* InvalidateBlocks: add 
blk_1073741833_1009 to XXXXXXXXXXX:50010
15/02/23 14:59:56 INFO BlockStateChange: BLOCK* InvalidateBlocks: add 
blk_1073741831_1007 to XXXXXXXXXXX:50010
{noformat}
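
The gap can be checked mechanically by comparing the blocks written in Step 5 with the InvalidateBlocks entries in the Namenode log. The following is a minimal sketch, assuming the three block IDs and the two log lines quoted in this report (joined onto single lines); the function name not_invalidated is illustrative and not part of HDFS.

```shell
# Blocks written during the upgrade (Step 5), one per line.
written="blk_1073741831_1007
blk_1073741832_1008
blk_1073741833_1009"

# Namenode InvalidateBlocks log lines, copied from this report.
nn_log="15/02/23 14:59:56 INFO BlockStateChange: BLOCK* InvalidateBlocks: add blk_1073741833_1009 to XXXXXXXXXXX:50010
15/02/23 14:59:56 INFO BlockStateChange: BLOCK* InvalidateBlocks: add blk_1073741831_1007 to XXXXXXXXXXX:50010"

# Print every written block that has no InvalidateBlocks entry in the log.
not_invalidated() {
  invalidated=$(printf '%s\n' "$nn_log" |
    sed -n 's/.*InvalidateBlocks: add \(blk_[0-9_]*\) to .*/\1/p')
  printf '%s\n' "$written" | while read -r blk; do
    printf '%s\n' "$invalidated" | grep -qx "$blk" || echo "$blk"
  done
}

not_invalidated   # prints blk_1073741832_1008
```

The one block left over is the same block named in the ReplicaAlreadyExistsException.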

2. The fsck report does not show any information on blk_1073741832_1008.
{noformat}
FSCK started by Rex (auth:SIMPLE) from /XXXXXXXXXXX for path / at Mon Feb 23 
16:17:57 CST 2015

/File1:  Under replicated 
BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741825_1001. Target Replicas is 
3 but found 1 replica(s).

/File11:  Under replicated 
BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741827_1003. Target Replicas is 
3 but found 1 replica(s).

/File2:  Under replicated 
BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741826_1002. Target Replicas is 
3 but found 1 replica(s).

/AfterRollback_2:  Under replicated 
BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741831_1007. Target Replicas is 
3 but found 1 replica(s).

/Test1:  Under replicated 
BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741828_1004. Target Replicas is 
3 but found 1 replica(s).
Status: HEALTHY
 Total size:    31620 B
 Total dirs:    7
 Total files:   6
 Total symlinks:                0
 Total blocks (validated):      5 (avg. block size 6324 B)
 Minimally replicated blocks:   5 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       5 (100.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     1.0
 Corrupt blocks:                0
 Missing replicas:              10 (66.666664 %)
 Number of data-nodes:          1
 Number of racks:               1
FSCK ended at Mon Feb 23 16:17:57 CST 2015 in 3 milliseconds
{noformat}
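
Whether a given block appears in an fsck report can be checked with a simple text search. The sketch below assumes the report has been captured as text; it embeds two entries from the output above (joined onto single lines), and the function name block_in_report is hypothetical.

```shell
# Two entries copied from the fsck output in this report.
fsck_report="/AfterRollback_2:  Under replicated BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741831_1007. Target Replicas is 3 but found 1 replica(s).
/Test1:  Under replicated BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741828_1004. Target Replicas is 3 but found 1 replica(s)."

# Print "listed" or "absent" depending on whether the block ID ($1)
# appears anywhere in the report text.
block_in_report() {
  if printf '%s\n' "$fsck_report" | grep -qF "$1"; then
    echo "listed"
  else
    echo "absent"
  fi
}

block_in_report blk_1073741831_1007   # prints listed
block_in_report blk_1073741832_1008   # prints absent
```

On a live cluster the same search could be applied to the output of "hdfs fsck / -files -blocks", which lists the blocks of every file rather than only the problematic ones.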





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
