[ 
https://issues.apache.org/jira/browse/HDFS-11293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15807317#comment-15807317
 ] 

Uma Maheswara Rao G edited comment on HDFS-11293 at 1/7/17 11:22 AM:
---------------------------------------------------------------------

I spent some time on this issue and found the root cause.
Here is the reason for the failure:
  We currently pick source nodes from the block's storages, and we pick them 
randomly.
 In this case, the existing type [DISK] is available on all 3 nodes. 
Let's assume we have DN1[DISK, ARCHIVE], DN2[DISK, SSD], DN3[DISK, 
RAM_DISK]. When we set the storage policy to ONE_SSD and then satisfy it, we 
first need to find the overlap nodes. From the overlap report, DISK is the 
existing type and SSD is the expected type. Since we pick source nodes only 
from the existing storages, all 3 nodes have the existing storage type [DISK], 
and any of them can qualify as the source node.
{code}
for (StorageType existingType : existing) {
  iterator = existingBlockStorages.iterator();
  while (iterator.hasNext()) {
    DatanodeStorageInfo datanodeStorageInfo = iterator.next();
    StorageType storageType = datanodeStorageInfo.getStorageType();
    if (storageType == existingType) {
      iterator.remove();
      sourceWithStorageMap.add(new StorageTypeNodePair(storageType,
          datanodeStorageInfo.getDatanodeDescriptor()));
      break;
    }
  }
}
{code}

But in reality, if we choose DN1 or DN3 as the source node [with DISK], the 
target node [SSD] would obviously be DN2. Since DN2 already has a replica of 
the block, the move fails with ReplicaAlreadyExistsException. 
Sometimes the test may pass, when the source node happens to be picked as DN2 
(this is possible, because we just pick any one node among the storages). When 
the source and target node are the same, we move the block locally.
{code}
try {
  // Move the block to different storage in the same datanode
  if (proxySource.equals(datanode.getDatanodeId())) {
    ReplicaInfo oldReplica = datanode.data.moveBlockAcrossStorage(block,
        storageType);
    if (oldReplica != null) {
      LOG.info("Moved " + block + " from StorageType "
          + oldReplica.getVolume().getStorageType() + " to " + storageType);
    }
  } else {
{code}
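
For context on why the remote move fails: per the issue description, the 
receiving datanode tracks replicas per block pool in its volumeMap, so a 
transfer for a block it already holds is rejected. Below is a minimal, 
self-contained sketch of that behavior (an assumed, simplified model for 
illustration only, not the actual FsDatasetImpl code):
{code}
import java.util.HashMap;
import java.util.Map;

// Simplified model (assumption, not FsDatasetImpl itself) of how a datanode
// rejects an incoming transfer for a block it already holds in the block pool.
public class ReplicaMapSketch {

  static class ReplicaAlreadyExistsException extends RuntimeException {
    ReplicaAlreadyExistsException(String msg) { super(msg); }
  }

  // blockPoolId -> (blockId -> replica state)
  private final Map<String, Map<Long, String>> volumeMap = new HashMap<>();

  void createTemporary(String bpid, long blockId) {
    Map<Long, String> replicas =
        volumeMap.computeIfAbsent(bpid, k -> new HashMap<>());
    if (replicas.containsKey(blockId)) {
      // This is the DN2 case: it already holds the replica on DISK, so a
      // remote transfer aimed at its SSD is rejected.
      throw new ReplicaAlreadyExistsException(
          "Block " + blockId + " already exists in block pool " + bpid);
    }
    replicas.put(blockId, "TEMPORARY");
  }

  public static void main(String[] args) {
    ReplicaMapSketch dn2 = new ReplicaMapSketch();
    dn2.createTemporary("BP-1", 1001L); // existing replica on DN2[DISK]
    dn2.createTemporary("BP-1", 1001L); // remote move to DN2[SSD] -> throws
  }
}
{code}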

I have attached a patch that first finds the nodes which have both the source 
type and the target type, and then identifies the remaining sources.
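
To illustrate the pairing idea, here is a minimal sketch under assumptions 
(not the actual patch code; the StorageType enum is simplified and names like 
NodeStorages and pickSourceTargetPairs are made up for illustration):
{code}
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.EnumSet;
import java.util.Iterator;
import java.util.List;

public class PairingSketch {

  // Simplified stand-in for HDFS storage types.
  enum StorageType { DISK, SSD, ARCHIVE, RAM_DISK }

  // A datanode that holds the block, plus the storage types it offers.
  static class NodeStorages {
    final String node;
    final EnumSet<StorageType> types;
    NodeStorages(String node, EnumSet<StorageType> types) {
      this.node = node;
      this.types = types;
    }
  }

  // Prefer nodes that hold the block on the existing type AND also have the
  // expected type: those can do a local (same-datanode) move. Only the types
  // left over still need a remote source/target pair.
  static void pickSourceTargetPairs(List<NodeStorages> blockHolders,
      StorageType existingType, Deque<StorageType> expectedTypes) {
    for (Iterator<StorageType> it = expectedTypes.iterator(); it.hasNext();) {
      StorageType expected = it.next();
      for (NodeStorages n : blockHolders) {
        if (n.types.contains(existingType) && n.types.contains(expected)) {
          System.out.println("local move on " + n.node + ": "
              + existingType + " -> " + expected);
          it.remove();
          break;
        }
      }
    }
    System.out.println("types still needing remote moves: " + expectedTypes);
  }

  public static void main(String[] args) {
    List<NodeStorages> holders = Arrays.asList(
        new NodeStorages("DN1", EnumSet.of(StorageType.DISK, StorageType.ARCHIVE)),
        new NodeStorages("DN2", EnumSet.of(StorageType.DISK, StorageType.SSD)),
        new NodeStorages("DN3", EnumSet.of(StorageType.DISK, StorageType.RAM_DISK)));
    // ONE_SSD: one replica should move from DISK to SSD.
    pickSourceTargetPairs(holders, StorageType.DISK,
        new ArrayDeque<>(Collections.singletonList(StorageType.SSD)));
  }
}
{code}
In the HDFS-11293 scenario this pairs DN2 with itself, so the move is local 
and no remote transfer is attempted toward a node that already holds the 
replica.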

I ran the test several times; it passes consistently now.
 
[~yuanbo], thank you for sharing the test case by email. The included test 
case is yours and it passes now. Could you please verify?


> FsDatasetImpl throws ReplicaAlreadyExistsException in a wrong situation
> -----------------------------------------------------------------------
>
>                 Key: HDFS-11293
>                 URL: https://issues.apache.org/jira/browse/HDFS-11293
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Yuanbo Liu
>            Assignee: Yuanbo Liu
>            Priority: Critical
>         Attachments: HDFS-11293-HDFS-10285-00.patch
>
>
> In {{FsDatasetImpl#createTemporary}}, we use {{volumeMap}} to get replica 
> info by block pool id. But in this situation:
> {code}
> datanode A => {DISK, SSD}, datanode B => {DISK, ARCHIVE}.
> 1. the same block replica exists in A[DISK] and B[DISK].
> 2. the block pool id of datanode A and datanode B are the same.
> {code}
> Then we start to change the file's storage policy and move the block replica 
> in the cluster. Very likely we have to move the block from B[DISK] to A[SSD]; 
> at this time, datanode A throws ReplicaAlreadyExistsException, which is not 
> correct behavior.


