[
https://issues.apache.org/jira/browse/HDFS-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224407#comment-14224407
]
Vinayakumar B commented on HDFS-7342:
-------------------------------------
{quote}But the block has to be COMMITTED to be made COMPLETE. If it's not
COMMITTED yet (changing to COMMITTED is a request from client and it's
asynchronous) , even if it has min replication number of replications, it won't
be changed to COMPLETE. So I think we may still need to take care of changing
block's state to COMPLETE in FSNamesystem#internalReleaseLease. Right?{quote}
I agree that the client request and the DataNode's IBRs are asynchronous, but
both update the block state under the write lock.
The penultimate block is COMMITTED by the client's {{getAdditionalBlock()}}
request.
There are 3 possibilities:
1. All IBRs arrive before the block is even COMMITTED. In that case, if the
replica is FINALIZED on the DN, it will be accepted.
{code}
if (ucBlock.reportedState == ReplicaState.FINALIZED &&
    !block.findDatanode(storageInfo.getDatanodeDescriptor())) {
  addStoredBlock(block, storageInfo, null, true);
}
{code}
2. If the client request arrives after 2 (=minReplication) IBRs have been
received, the client request itself moves the block to COMPLETE immediately
after making it COMMITTED, in the following code of
{{BlockManager#commitOrCompleteLastBlock()}}:
{code}
final boolean b = commitBlock((BlockInfoUnderConstruction)lastBlock,
    commitBlock);
if(countNodes(lastBlock).liveReplicas() >= minReplication)
  completeBlock(bc, bc.numBlocks()-1, false);
return b;
{code}
If not enough IBRs have been received at that point, the block will just stay
COMMITTED.
3. If the IBRs arrive after the client request, i.e. after the block is
COMMITTED, then the block will be made COMPLETE while processing the second
IBR, in the code below.
{code}
if(storedBlock.getBlockUCState() == BlockUCState.COMMITTED &&
    numLiveReplicas >= minReplication) {
  storedBlock = completeBlock(bc, storedBlock, false);
{code}
So I couldn't find any way for a block to remain in the COMMITTED state with
minReplication met.
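To make the argument easier to check, here is a toy model of the three cases.
It is only a sketch and not HDFS code: {{BlockStateSketch}}, {{replay()}} and
the event letters are names I made up, {{synchronized}} stands in for the
namesystem write lock, and minReplication is assumed to be 2 as in the example
above.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of one block on the NameNode; illustrative only, not HDFS code. */
class BlockStateSketch {
    enum State { UNDER_CONSTRUCTION, COMMITTED, COMPLETE }

    static final int MIN_REPLICATION = 2; // assumed value for this sketch

    private State state = State.UNDER_CONSTRUCTION;
    private final Set<String> finalizedReplicas = new HashSet<>();

    /** Client request: commit, then complete at once if enough replicas were reported (case 2). */
    synchronized void clientCommit() {
        if (state == State.UNDER_CONSTRUCTION) {
            state = State.COMMITTED;
        }
        maybeComplete();
    }

    /** IBR for a FINALIZED replica: record it (case 1), complete if already COMMITTED (case 3). */
    synchronized void incrementalBlockReport(String datanode) {
        finalizedReplicas.add(datanode);
        maybeComplete();
    }

    private void maybeComplete() {
        if (state == State.COMMITTED && finalizedReplicas.size() >= MIN_REPLICATION) {
            state = State.COMPLETE;
        }
    }

    synchronized State state() {
        return state;
    }

    /** Replays events in order: 'C' = client commit, 'I' = IBR from a new DataNode. */
    static State replay(String events) {
        BlockStateSketch block = new BlockStateSketch();
        int dn = 0;
        for (char e : events.toCharArray()) {
            if (e == 'C') {
                block.clientCommit();
            } else if (e == 'I') {
                block.incrementalBlockReport("dn" + dn++);
            } else {
                throw new IllegalArgumentException("unknown event: " + e);
            }
        }
        return block.state();
    }

    public static void main(String[] args) {
        for (String order : new String[] {"IIC", "ICI", "CII", "CI", "II"}) {
            System.out.println(order + " -> " + replay(order));
        }
    }
}
```

In this model, every ordering that contains the commit and two IBRs ends in
COMPLETE, because both handlers run the same check under the same lock. A
block stays COMMITTED only while fewer than minReplication IBRs have arrived.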
{quote}{{recoverLeaseInternal()}} and {{internalReleaseLease()}} will need to
be made to distinguish the on-demand recovery from normal lease expiration. For
on-demand recovery, we might want it to fail if there is no live replicas, as a
file lease is normally recovered for subsequent append or copy(read). If there
is no data, they will fail.{quote}
I understood [~kihwal]'s suggestion as follows.
The {{recoverLease()}} call from the client passes a {{force}} flag to
{{recoverLeaseInternal()}}. Based on this flag, we can check the states of the
blocks (excluding the last block) and their number of replicas, and decide
whether to go ahead with recovery or not even initiate a request to the
DataNode.
So we need not worry about this case in {{commitBlockSynchronization()}}: it
can directly complete all blocks and close the file.
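A rough sketch of the pre-check I have in mind is below. Everything in it is
hypothetical: {{RecoverLeaseSketch}} and {{shouldStartRecovery()}} are not
existing HDFS methods, the block list is reduced to live replica counts, and I
am assuming that {{force}} means on-demand recovery and that "no live
replicas" is the condition for refusing.

```java
import java.util.List;

/** Hypothetical pre-check for on-demand lease recovery; names are illustrative, not HDFS APIs. */
class RecoverLeaseSketch {

    /**
     * @param liveReplicasPerBlock live replica count per block, in file order;
     *                             the final entry belongs to the last block
     * @param force                true for on-demand recovery via recoverLease() (assumption)
     * @return true if recovery should be initiated with the DataNodes
     */
    static boolean shouldStartRecovery(List<Integer> liveReplicasPerBlock, boolean force) {
        if (!force) {
            // Normal lease expiration: always attempt recovery (assumption).
            return true;
        }
        // On-demand recovery: check every block except the last one.
        for (int i = 0; i < liveReplicasPerBlock.size() - 1; i++) {
            if (liveReplicasPerBlock.get(i) == 0) {
                // No data for a later append or read, so do not contact any DataNode.
                return false;
            }
        }
        return true;
    }
}
```

With a check like this done up front, the decision is made before any request
reaches a DataNode.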
Am I right, [~kihwal]?
> Lease Recovery doesn't happen some times
> ----------------------------------------
>
> Key: HDFS-7342
> URL: https://issues.apache.org/jira/browse/HDFS-7342
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.0.0-alpha
> Reporter: Ravi Prakash
> Assignee: Ravi Prakash
> Attachments: HDFS-7342.1.patch, HDFS-7342.2.patch, HDFS-7342.3.patch
>
>
> In some cases, LeaseManager tries to recover a lease, but is not able to.
> HDFS-4882 describes a possibility of that. We should fix this