J.Andreina created HDFS-7820:
--------------------------------

             Summary: Client Write fails after rolling upgrade operation with "<block_id> already exist in finalized state"
                 Key: HDFS-7820
                 URL: https://issues.apache.org/jira/browse/HDFS-7820
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: J.Andreina
            Assignee: J.Andreina

Steps to Reproduce:
===================
Step 1: Prepare the rolling upgrade using "hdfs dfsadmin -rollingUpgrade prepare".
Step 2: Shut down the SNN and NN.
Step 3: Start the NN with the "hdfs namenode -rollingUpgrade started" option.
Step 4: Execute "hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> upgrade" and restart the Datanode.
Step 5: Write 3 files to HDFS (block IDs assigned: blk_1073741831_1007, blk_1073741832_1008, blk_1073741833_1009).
Step 6: Shut down both the NN and DN.
Step 7: Start the NNs with the "hdfs namenode -rollingUpgrade rollback" option. Start the DNs with the "-rollback" option.
Step 8: Write 2 files to HDFS.

Issue:
=======
The client write failed with the exception below:

{noformat}
2015-02-23 16:00:12,896 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741832_1008 src: /XXXXXXXXXXX:48545 dest: /XXXXXXXXXXX:50010
2015-02-23 16:00:12,897 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741832_1008 received exception org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741832_1008 already exists in state FINALIZED and thus cannot be created.
{noformat}

Observations:
=============
1. On the Namenode side, a block invalidation was sent for only 2 of the 3 blocks written in Step 5.

{noformat}
15/02/23 14:59:56 INFO BlockStateChange: BLOCK* InvalidateBlocks: add blk_1073741833_1009 to XXXXXXXXXXX:50010
15/02/23 14:59:56 INFO BlockStateChange: BLOCK* InvalidateBlocks: add blk_1073741831_1007 to XXXXXXXXXXX:50010
{noformat}

2. The fsck report shows no information for blk_1073741832_1008.

{noformat}
FSCK started by Rex (auth:SIMPLE) from /XXXXXXXXXXX for path / at Mon Feb 23 16:17:57 CST 2015
/File1: Under replicated BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741825_1001. Target Replicas is 3 but found 1 replica(s).
/File11: Under replicated BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741827_1003. Target Replicas is 3 but found 1 replica(s).
/File2: Under replicated BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741826_1002. Target Replicas is 3 but found 1 replica(s).
/AfterRollback_2: Under replicated BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741831_1007. Target Replicas is 3 but found 1 replica(s).
/Test1: Under replicated BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741828_1004. Target Replicas is 3 but found 1 replica(s).
Status: HEALTHY
 Total size: 31620 B
 Total dirs: 7
 Total files: 6
 Total symlinks: 0
 Total blocks (validated): 5 (avg. block size 6324 B)
 Minimally replicated blocks: 5 (100.0 %)
 Over-replicated blocks: 0 (0.0 %)
 Under-replicated blocks: 5 (100.0 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor: 3
 Average block replication: 1.0
 Corrupt blocks: 0
 Missing replicas: 10 (66.666664 %)
 Number of data-nodes: 1
 Number of racks: 1
FSCK ended at Mon Feb 23 16:17:57 CST 2015 in 3 milliseconds
{noformat}
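
The gap described in Observation 1 can be checked mechanically. The following is only a minimal sketch, not part of the original report: it assumes the Namenode log lines keep the "InvalidateBlocks: add <block> to <datanode>" shape quoted above, and the helper names (invalidated_blocks, not_invalidated) are hypothetical. It compares the block IDs written in Step 5 against the blocks the Namenode scheduled for invalidation.

```python
import re

# Block IDs assigned in Step 5 (copied from the report).
written_in_step5 = [
    "blk_1073741831_1007",
    "blk_1073741832_1008",
    "blk_1073741833_1009",
]

# Namenode log lines quoted under Observation 1.
namenode_log = """\
15/02/23 14:59:56 INFO BlockStateChange: BLOCK* InvalidateBlocks: add blk_1073741833_1009 to XXXXXXXXXXX:50010
15/02/23 14:59:56 INFO BlockStateChange: BLOCK* InvalidateBlocks: add blk_1073741831_1007 to XXXXXXXXXXX:50010
"""

# Assumption: every invalidation request is logged in this exact shape.
INVALIDATE_RE = re.compile(r"InvalidateBlocks: add (blk_\d+_\d+) to")


def invalidated_blocks(log_text):
    """Return the set of block IDs that appear in InvalidateBlocks log lines."""
    return set(INVALIDATE_RE.findall(log_text))


def not_invalidated(written, log_text):
    """Return the written blocks with no InvalidateBlocks entry, in input order."""
    seen = invalidated_blocks(log_text)
    return [block for block in written if block not in seen]


if __name__ == "__main__":
    print(not_invalidated(written_in_step5, namenode_log))
```

With the log excerpt above, the only block left without an invalidation entry is blk_1073741832_1008, the same block that the Datanode later reports as already existing in state FINALIZED.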