[ 
https://issues.apache.org/jira/browse/HDFS-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13745848#comment-13745848
 ] 

Uma Maheswara Rao G edited comment on HDFS-4504 at 8/21/13 7:52 AM:
--------------------------------------------------------------------

Vinay has pointed out a valid problem scenario here.
The client should not call complete() without knowing that it has received the
ack for the last packet. Otherwise the DN may still hold the block in RBW state
while the NN moves the block to COMMITTED because the client called complete().
The two states are then inconsistent, and internalReleaseLease cannot close the
file either, because the committed block never gets the minimum number of
replicas reported by a DN.

{code}
 case COMMITTED:
      // Close file if committed blocks are minimally replicated
      if(penultimateBlockMinReplication &&
          blockManager.checkMinReplication(lastBlock)) {
        finalizeINodeFileUnderConstruction(src, pendingFile,
            iip.getLatestSnapshot(), false);
        NameNode.stateChangeLog.warn("BLOCK*"
          + " internalReleaseLease: Committed blocks are minimally replicated,"
          + " lease removed, file closed.");
        return true;  // closed!
      }
      // Cannot close file right now, since some blocks 
      // are not yet minimally replicated.
      // This may potentially cause infinite loop in lease recovery
      // if there are no valid replicas on data-nodes.
      String message = "DIR* NameSystem.internalReleaseLease: " +
          "Failed to release lease for file " + src +
          ". Committed blocks are waiting to be minimally replicated." +
          " Try again later.";
      NameNode.stateChangeLog.warn(message);
{code}

Ideally, a committed block means the DN has finalized it as well, so the DN 
reports the block as finalized in that case. Here the DN has the block only in 
RBW state, because the write failed midway. Because of that failure the file 
was added to the zombie list, and the client will try to complete it without 
knowing whether it really received the ack for the last packet.
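
To make the mismatch concrete, here is a toy model of the decision described 
above. It is only a sketch and not HDFS code: the class, enum, and method names 
are hypothetical, and it assumes that only finalized replicas count toward 
minimum replication, which is how I read the COMMITTED branch. The recovery 
branch it models is the one quoted further down in this comment.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of the lease-release decision. Hypothetical names, not HDFS code. */
final class CommittedBlockSketch {
    enum NnBlockState { UNDER_CONSTRUCTION, UNDER_RECOVERY, COMMITTED }
    enum ReplicaState { RBW, FINALIZED }
    enum Outcome { FILE_CLOSED, RETRY_LATER, BLOCK_RECOVERY_STARTED }

    /**
     * Decide what one lease-release attempt does for the last block.
     * Assumption: a replica counts toward minReplicas only once it is FINALIZED.
     */
    static Outcome releaseLease(NnBlockState nnState,
                                List<ReplicaState> replicas,
                                int minReplicas) {
        long finalized = replicas.stream()
                .filter(r -> r == ReplicaState.FINALIZED)
                .count();
        switch (nnState) {
            case COMMITTED:
                // No block recovery is started on this path; the NN only waits.
                return finalized >= minReplicas
                        ? Outcome.FILE_CLOSED
                        : Outcome.RETRY_LATER;
            case UNDER_CONSTRUCTION:
            case UNDER_RECOVERY:
                // Block recovery is what would finalize RBW replicas on the DNs.
                return Outcome.BLOCK_RECOVERY_STARTED;
            default:
                throw new IllegalStateException("unexpected state " + nnState);
        }
    }

    public static void main(String[] args) {
        List<ReplicaState> onlyRbw = Arrays.asList(ReplicaState.RBW);
        // Client called complete() without the last ack: NN says COMMITTED,
        // DN still has RBW, so every attempt ends in RETRY_LATER.
        System.out.println(releaseLease(NnBlockState.COMMITTED, onlyRbw, 1));
        // Same replica state, but the NN block was left under construction.
        System.out.println(releaseLease(NnBlockState.UNDER_CONSTRUCTION, onlyRbw, 1));
    }
}
```

In this model the COMMITTED plus RBW combination never leaves RETRY_LATER, 
because nothing on that path asks the DN to finalize the replica.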

In the normal recovery case, the block is finalized by the normal recovery 
flow, as below:

{code}
    case UNDER_CONSTRUCTION:
    case UNDER_RECOVERY:
      final BlockInfoUnderConstruction uc =
          (BlockInfoUnderConstruction)lastBlock;
      // setup the last block locations from the blockManager if not known
      if (uc.getNumExpectedLocations() == 0) {
        uc.setExpectedLocations(blockManager.getNodes(lastBlock));
      }
      // start recovery of the last block for this file
      long blockRecoveryId = nextGenerationStamp(isLegacyBlock(uc));
      lease = reassignLease(lease, src, recoveryLeaseHolder, pendingFile);
      uc.initializeBlockRecovery(blockRecoveryId);
      leaseManager.renewLease(lease);
      // Cannot close file right now, since the last block requires recovery.
      // This may potentially cause infinite loop in lease recovery
      // if there are no valid replicas on data-nodes.
      NameNode.stateChangeLog.warn(
                "DIR* NameSystem.internalReleaseLease: " +
                "File " + src + " has not been closed." +
               " Lease recovery is in progress. " +
                "RecoveryId = " + blockRecoveryId + " for block " + lastBlock);
      break;
    }
{code}
This makes the DNs finalize their blocks if they are in RBW state. But if we 
have already changed the state to COMMITTED, the recovery flow is diverted and 
no one finalizes the block at the DN. I am afraid this change may cause 
problems like this one. So I think it is better to let the NN do the recovery, 
by just informing the NN about the zombie files when the NN is available. Once 
we have successfully informed the NN about a zombie file, we can remove that 
file's entry from the client's list; until then, keep trying to inform the NN 
about the zombie files. This may be the better choice, as it avoids risks like 
the scenario above.
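
A rough sketch of that idea, to show the intended bookkeeping only. Everything 
here is hypothetical: NameNodeNotifier stands in for whatever client-to-NN call 
would carry the notification (no such RPC is named in this issue), and the 
class is not part of any patch attached here.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

/** Client-side list of zombie files that still have to be reported to the NN. */
final class ZombieFileReporterSketch {
    /** Hypothetical stand-in for the client-to-NN notification call. */
    interface NameNodeNotifier {
        /** Returns true only if the NN acknowledged the notification. */
        boolean reportZombieFile(String path);
    }

    private final Set<String> zombieFiles = new LinkedHashSet<>();
    private final NameNodeNotifier notifier;

    ZombieFileReporterSketch(NameNodeNotifier notifier) {
        this.notifier = notifier;
    }

    /** Remember a file whose close failed; the client never calls complete(). */
    synchronized void addZombie(String path) {
        zombieFiles.add(path);
    }

    /**
     * One retry pass, e.g. driven by a periodic client thread.
     * An entry is dropped only after the NN has acknowledged it.
     * Returns the number of files still waiting to be reported.
     */
    synchronized int reportPending() {
        Iterator<String> it = zombieFiles.iterator();
        while (it.hasNext()) {
            String path = it.next();
            boolean acked;
            try {
                acked = notifier.reportZombieFile(path);
            } catch (RuntimeException e) {
                acked = false;  // NN unavailable: keep the entry, retry later
            }
            if (acked) {
                it.remove();
            }
        }
        return zombieFiles.size();
    }

    public static void main(String[] args) {
        final boolean[] nnAvailable = {false};
        ZombieFileReporterSketch reporter =
                new ZombieFileReporterSketch(path -> nnAvailable[0]);
        reporter.addZombie("/flume/events.1");
        reporter.addZombie("/flume/events.2");
        System.out.println(reporter.reportPending());  // NN down: both kept
        nnAvailable[0] = true;
        System.out.println(reporter.reportPending());  // NN up: list drained
    }
}
```

The point of the design is that the client never moves the block to COMMITTED 
itself, so the NN is still free to take the UNDER_CONSTRUCTION / UNDER_RECOVERY 
path and start block recovery.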
                
> DFSOutputStream#close doesn't always release resources (such as leases)
> -----------------------------------------------------------------------
>
>                 Key: HDFS-4504
>                 URL: https://issues.apache.org/jira/browse/HDFS-4504
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-4504.001.patch, HDFS-4504.002.patch, 
> HDFS-4504.007.patch, HDFS-4504.008.patch, HDFS-4504.009.patch, 
> HDFS-4504.010.patch, HDFS-4504.011.patch, HDFS-4504.014.patch, 
> HDFS-4504.015.patch, HDFS-4504.016.patch
>
>
> {{DFSOutputStream#close}} can throw an {{IOException}} in some cases.  One 
> example is if there is a pipeline error and then pipeline recovery fails.  
> Unfortunately, in this case, some of the resources used by the 
> {{DFSOutputStream}} are leaked.  One particularly important resource is file 
> leases.
> So it's possible for a long-lived HDFS client, such as Flume, to write many 
> blocks to a file, but then fail to close it.  Unfortunately, the 
> {{LeaseRenewerThread}} inside the client will continue to renew the lease for 
> the "undead" file.  Future attempts to close the file will just rethrow the 
> previous exception, and no progress can be made by the client.
