[ https://issues.apache.org/jira/browse/HDFS-14498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152138#comment-17152138 ]

Stephen O'Donnell commented on HDFS-14498:
------------------------------------------

I can create a problem file like this by doing the following, using the 
DFSClient to make the NN calls directly (the full reproducer is at the end of 
this comment):

1. Create a file on the NN.
2. Add a block to it.
3. Without setting up a DN pipeline, call `complete(...)` on the file, which 
will return false.

At that point, the file will have one "committed but not complete" block, and 
it will be stuck forever waiting for the lease to be recovered. Manual calls 
to recover the lease do not help; they fail with this:

{code}
Retrying in 5000 ms...
Retry #4
recoverLease got exception: DIR* NameSystem.internalReleaseLease: Failed to release lease for file /test/f1. Committed blocks are waiting to be minimally replicated. Try again later.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.internalReleaseLease(FSNamesystem.java:3690)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2835)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLease(FSNamesystem.java:2769)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.recoverLease(NameNodeRpcServer.java:852)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.recoverLease(ClientNamenodeProtocolServerSideTranslatorPB.java:824)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:532)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1020)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:948)
        at java.base/java.security.AccessController.doPrivileged(Native Method)
        at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1845)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2952)
{code}
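
The retry output above matches the debug admin recoverLease command; the 
invocation would have been along these lines (the retry count here is just a 
guess from the output above):

{code}
hdfs debug recoverLease -path /test/f1 -retries 5
{code}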

fsck shows:

{code}
/test/f1 0 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE:  OK

Under Construction Block:
0. BP-1860564593-172.26.0.3-1594050316675:blk_1073741825_1001 len=0 Expected_repl=1 [DatanodeInfoWithStorage[172.26.0.2:9866,DS-86adb84e-d5e8-4bcb-adb9-e8488b6eabef,DISK]]
{code}
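
For reference, the fsck output above can be reproduced with something like the 
following - the exact flags are an assumption, but -openforwrite is needed to 
report the under construction block:

{code}
hdfs fsck /test/f1 -files -blocks -locations -openforwrite
{code}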

I am not sure how the normal write path can get into this situation - perhaps 
if the client exits unexpectedly after allocating the file and block, but 
before setting up the DN pipeline. However, the `complete(...)` call still 
needs to be made against the NN. Without the `complete(...)` call, the 
recoverLease command works OK and simply removes the block, because it is 
empty.
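
To illustrate that last point, here is a variant of the reproducer at the end 
of this comment that stops before `complete(...)` and recovers the lease 
itself. This is only a sketch I have not run, reusing the same DFSClient setup 
as the full reproducer below:

{code}
// Same setup as the full reproducer below: create the file and add a block,
// but never call complete(). Sketch only.
LocatedBlock blk = c.getNamenode().addBlock("/test/f1", "test client",
    null, new DatanodeInfo[0], stat.getFileId(), new String[0], null);

// With no complete() call, the last block is still under construction rather
// than committed, so lease recovery succeeds and the empty block is simply
// removed.
boolean recovered = c.getNamenode().recoverLease("/test/f1", "another client");
{code}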

I wonder how we can detect a scenario like this so the lease can be recovered 
- perhaps if the final block is a "committed" block with zero length, we 
should simply remove that final block? When the client calls `complete(...)` 
it reports the expected block length to the namenode - if the expected length 
is zero, then the pending block is of no use anyway.
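
The idea would be something along these lines in the lease recovery path. This 
is very much a sketch rather than a patch - the helper calls for removing the 
last block and finalizing the file are written from memory and would need to 
follow whatever internalReleaseLease already does for the other cases:

{code}
// Sketch only: if the final block is COMMITTED but holds zero bytes, there is
// nothing on any DN to wait for, so drop the block and close the file instead
// of failing with "Committed blocks are waiting to be minimally replicated".
BlockInfo lastBlock = pendingFile.getLastBlock();
if (lastBlock != null
    && lastBlock.getBlockUCState() == BlockUCState.COMMITTED
    && lastBlock.getNumBytes() == 0) {
  pendingFile.removeLastBlock(lastBlock);
  finalizeINodeFileUnderConstruction(src, pendingFile,
      iip.getLatestSnapshotId(), false);
  return true;
}
{code}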

This is the code I used to create the problem, run against upstream trunk:

{code}
package com.sodonnel.Hadoop;

import java.io.IOException;
import java.util.EnumSet;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.crypto.CryptoProtocolVersion;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.hdfs.DFSClient;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;
import org.apache.hadoop.hdfs.protocol.HdfsFileStatus;
import org.apache.hadoop.hdfs.protocol.LocatedBlock;
import org.apache.hadoop.io.EnumSetWritable;

public class PartialFile {
  public static void main(String[] args) throws IOException, InterruptedException {

    Configuration configuration = new Configuration();
    FileSystem hdfs = FileSystem.get(configuration);

    DFSClient c = new DFSClient(configuration);

    // 1. Create the file on the NN.
    HdfsFileStatus stat = c.getNamenode().create("/test/f1",
        new FsPermission("777"), "test client",
        new EnumSetWritable<CreateFlag>(EnumSet.of(CreateFlag.CREATE)),
        true, (short)3, 1024*1024*128, new CryptoProtocolVersion[0], null, null);

    // 2. Add a block to it.
    LocatedBlock blk = c.getNamenode().addBlock("/test/f1", "test client",
        null, new DatanodeInfo[0], stat.getFileId(), new String[0], null);

    // 3. Call complete() without ever setting up a DN pipeline. The block is
    //    committed but can never be minimally replicated, so this returns
    //    false and the file is left open forever.
    boolean closed = c.getNamenode().complete("/test/f1", "test client",
        blk.getBlock(), stat.getFileId());
  }
}
{code}

> LeaseManager can loop forever on the file for which create has failed 
> ----------------------------------------------------------------------
>
>                 Key: HDFS-14498
>                 URL: https://issues.apache.org/jira/browse/HDFS-14498
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.9.0
>            Reporter: Sergey Shelukhin
>            Priority: Major
>
> The logs from file creation are long gone due to infinite lease logging, 
> however it presumably failed... the client who was trying to write this file 
> is definitely long dead.
> The version includes HDFS-4882.
> We get this log pattern repeating infinitely:
> {noformat}
> 2019-05-16 14:00:16,893 INFO 
> [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f] 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager: [Lease.  Holder: 
> DFSClient_NONMAPREDUCE_-20898906_61, pending creates: 1] has expired hard 
> limit
> 2019-05-16 14:00:16,893 INFO 
> [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f] 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering [Lease.  
> Holder: DFSClient_NONMAPREDUCE_-20898906_61, pending creates: 1], src=<snip>
> 2019-05-16 14:00:16,893 WARN 
> [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f] 
> org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.internalReleaseLease: 
> Failed to release lease for file <snip>. Committed blocks are waiting to be 
> minimally replicated. Try again later.
> 2019-05-16 14:00:16,893 WARN 
> [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f] 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager: Cannot release the path 
> <snip> in the lease [Lease.  Holder: DFSClient_NONMAPREDUCE_-20898906_61, 
> pending creates: 1]. It will be retried.
> org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: DIR* 
> NameSystem.internalReleaseLease: Failed to release lease for file <snip>. 
> Committed blocks are waiting to be minimally replicated. Try again later.
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.internalReleaseLease(FSNamesystem.java:3357)
>       at 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager.checkLeases(LeaseManager.java:573)
>       at 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor.run(LeaseManager.java:509)
>       at java.lang.Thread.run(Thread.java:745)
> $  grep -c "Recovering.*DFSClient_NONMAPREDUCE_-20898906_61, pending creates: 
> 1" hdfs_nn*
> hdfs_nn.log:1068035
> hdfs_nn.log.2019-05-16-14:1516179
> hdfs_nn.log.2019-05-16-15:1538350
> {noformat}
> Aside from an actual bug fix, it might make sense to make LeaseManager not 
> log so much, in case if there are more bugs like this...


