[
https://issues.apache.org/jira/browse/HDFS-11965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Surendra Singh Lilhore updated HDFS-11965:
------------------------------------------
Attachment: HDFS-11965-HDFS-10285.001.patch
I have attached an initial patch.
It adds retry logic for under-replicated files. Please review.
> [SPS] Fix TestPersistentStoragePolicySatisfier#testWithCheckpoint failure
> -------------------------------------------------------------------------
>
> Key: HDFS-11965
> URL: https://issues.apache.org/jira/browse/HDFS-11965
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: namenode
> Affects Versions: HDFS-10285
> Reporter: Surendra Singh Lilhore
> Assignee: Surendra Singh Lilhore
> Attachments: HDFS-11965-HDFS-10285.001.patch
>
>
> The test case fails because not all of the required replicas are moved to
> the expected storage. This happens because DataNode registration is delayed
> after a cluster restart.
> Scenario:
> 1. Start a cluster with 3 DataNodes.
> 2. Create a file and set its storage policy to WARM.
> 3. Restart the cluster.
> 4. The NameNode and two DataNodes start first, and both DataNodes register
> with the NameNode (one DataNode is not yet registered).
> 5. SPS schedules block movement based on the available DataNodes (it moves
> one replica to ARCHIVE, per the policy).
> 6. The block movement succeeds and the xattr is removed from the file
> because {{itemInfo.isAllBlockLocsAttemptedToSatisfy()}} returns true:
> {code}
> if (itemInfo != null
>     && !itemInfo.isAllBlockLocsAttemptedToSatisfy()) {
>   blockStorageMovementNeeded
>       .add(storageMovementAttemptedResult.getTrackId());
>   ....................
>   ......................
> } else {
>   ....................
>   ......................
>   this.sps.postBlkStorageMovementCleanup(
>       storageMovementAttemptedResult.getTrackId());
> }
> {code}
> 7. The third DataNode then registers with the NameNode and reports one more
> DISK replica, so the NameNode now sees two DISK replicas and one ARCHIVE
> replica.
> The test case checks the number of DISK replicas with this condition:
> {code} DFSTestUtil.waitExpectedStorageType(testFileName, StorageType.DISK, 1,
> timeout, fs);{code}
> This condition never becomes true, so the test case times out.
>
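The scenario above can be sketched as a small, self-contained model. This is only an illustration of the idea of retrying when replicas are still missing from their target storage. It is not the attached patch. The class `SpsRetrySketch`, its methods, and the local `StorageType` enum are all hypothetical, and a replication factor of 3 is assumed (one replica per DataNode in the scenario).

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch, not the HDFS-11965 patch. None of these names
// exist in HDFS; they only model the decision described in the scenario.
final class SpsRetrySketch {
    // Local stand-in for the storage types used in the scenario.
    enum StorageType { DISK, ARCHIVE }

    // WARM keeps one replica on DISK and the remaining ones on ARCHIVE.
    static Map<StorageType, Integer> warmTarget(int replication) {
        Map<StorageType, Integer> target = new EnumMap<>(StorageType.class);
        target.put(StorageType.DISK, 1);
        target.put(StorageType.ARCHIVE, replication - 1);
        return target;
    }

    // Replica counts per storage type as currently reported to the NameNode.
    static Map<StorageType, Integer> replicas(int disk, int archive) {
        Map<StorageType, Integer> reported = new EnumMap<>(StorageType.class);
        reported.put(StorageType.DISK, disk);
        reported.put(StorageType.ARCHIVE, archive);
        return reported;
    }

    // Returns true when the file should stay queued for another attempt.
    static boolean needsRetry(boolean allBlockLocsAttempted,
                              Map<StorageType, Integer> reported,
                              Map<StorageType, Integer> target) {
        if (!allBlockLocsAttempted) {
            return true; // the original condition: some locations not tried yet
        }
        // Assumed extra check: a target storage type still has fewer
        // replicas than the policy asks for, so cleanup would be premature.
        for (Map.Entry<StorageType, Integer> e : target.entrySet()) {
            if (reported.getOrDefault(e.getKey(), 0) < e.getValue()) {
                return true;
            }
        }
        return false;
    }
}
```

With `warmTarget(3)`, the state after step 6 is `replicas(1, 1)` and the state after step 7 is `replicas(2, 1)`. In both cases `needsRetry(true, ...)` returns true, so the file would be retried instead of cleaned up. Only `replicas(1, 2)` ends the retries.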