[ 
https://issues.apache.org/jira/browse/HDFS-11367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15850339#comment-15850339
 ] 

Manjunath Anand edited comment on HDFS-11367 at 2/2/17 7:10 PM:
----------------------------------------------------------------

Sure Dmitry, I think it's a good idea to have a best-practices/wiki document for 
the HDFS API.

As the stack trace below shows, when (for example) a TimeoutException occurs in 
the submitLineAppend method and writer.close() is then called, 
LeaseManager.removeLease is invoked in turn, which removes the cached lease.

{code}
"main@1" prio=5 tid=0x1 nid=NA waiting
  java.lang.Thread.State: WAITING
          at java.lang.Object.wait(Object.java:-1)
          at java.lang.Object.wait(Object.java:502)
          at org.apache.hadoop.ipc.Client.call(Client.java:1397)
          at org.apache.hadoop.ipc.Client.call(Client.java:1364)
          at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
          at com.sun.proxy.$Proxy15.complete(Unknown Source:-1)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-1)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          at java.lang.reflect.Method.invoke(Method.java:497)
          at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
          at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
          at com.sun.proxy.$Proxy15.complete(Unknown Source:-1)
          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:412)
          at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:2136)
          at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2119)
          - locked <0x11b6> (a org.apache.hadoop.hdfs.DFSOutputStream)
          at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:74)
          at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:108)
          at sun.nio.cs.StreamEncoder.implClose(StreamEncoder.java:320)
          at sun.nio.cs.StreamEncoder.close(StreamEncoder.java:149)
          - locked <0x11b4> (a java.io.OutputStreamWriter)
          at java.io.OutputStreamWriter.close(OutputStreamWriter.java:233)
          at java.io.BufferedWriter.close(BufferedWriter.java:266)
          at org.apache.hadoop.hdfs.MyAppender.submitLineAppend(MyAppender.java:71)
          - locked <0x11a9> (a java.util.HashMap)
          at org.apache.hadoop.hdfs.MyAppender.main(MyAppender.java:220)

"IPC Server handler 4 on 40660@3266" daemon prio=5 tid=0x28 nid=NA runnable
  java.lang.Thread.State: RUNNABLE
          at org.apache.hadoop.hdfs.server.namenode.LeaseManager.removeLease(LeaseManager.java:146)
          - locked <0x10f8> (a org.apache.hadoop.hdfs.server.namenode.LeaseManager)
          at org.apache.hadoop.hdfs.server.namenode.LeaseManager.removeLease(LeaseManager.java:158)
          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.finalizeINodeFileUnderConstruction(FSNamesystem.java:4204)
          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:3210)
          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:3141)
          at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:665)
          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:500)
          at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:-1)
          at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
          at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
          at java.security.AccessController.doPrivileged(AccessController.java:-1)
          at javax.security.auth.Subject.doAs(Subject.java:422)
          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
          at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2009)
{code}
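The lease interaction above can be illustrated without a cluster. Below is a deliberately simplified, hypothetical model in plain Java (no HDFS dependency): the lease table is a map, close() stands in for completeFile/removeLease, and an append by a client that still holds the lease fails the way FSNamesystem.recoverLeaseInternal does. This is a sketch of the mechanism, not the real NameNode logic:

{code}
import java.util.HashMap;
import java.util.Map;

// Hypothetical, cluster-free model of the NameNode-side lease bookkeeping.
public class LeaseRaceSketch {
    // path -> lease holder, a stand-in for LeaseManager's lease table
    static final Map<String, String> leases = new HashMap<>();

    // Models the append path: a client that still holds the lease on the
    // file cannot open it for append again ("current leaseholder is trying
    // to recreate file").
    static void append(String holder, String path) {
        if (holder.equals(leases.get(path))) {
            throw new IllegalStateException("failed to create file " + path
                    + " because current leaseholder is trying to recreate file");
        }
        leases.put(path, holder);
    }

    // Models completeFile -> finalizeINodeFileUnderConstruction -> removeLease:
    // a close that actually reaches the NameNode releases the lease.
    static void close(String path) {
        leases.remove(path);
    }

    public static void main(String[] args) {
        append("DFSClient_A", "/data/records.txt");
        close("/data/records.txt");                 // lease released
        append("DFSClient_A", "/data/records.txt"); // succeeds: lease was gone

        // Failure case: if close() fails client-side (e.g. the
        // TimeoutException above) and the lease is never released, the
        // retried append hits the exception:
        try {
            append("DFSClient_A", "/data/records.txt");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
{code}

The point of the sketch is the ordering: whether the retry succeeds depends entirely on whether the NameNode processed the close and released the lease first.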
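The client-side pattern that avoids repeated lease acquisition is the one the reporter describes: open the stream once, flush per record, and close exactly once at the end. Sketched here with java.nio in place of the HDFS stream (with HDFS, the writer would wrap an FSDataOutputStream and the per-record flush would be hflush(); names are illustrative):

{code}
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class SingleWriterSketch {
    // Writes all lines through one stream: one open, one close.
    static void writeAll(Path target, Iterable<String> lines) throws IOException {
        try (BufferedWriter w = Files.newBufferedWriter(target)) {
            for (String line : lines) {
                w.write(line);
                w.newLine();
                w.flush(); // stand-in for hflush(): make the record visible
            }
        } // close() happens exactly once, so the lease is released exactly once
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("records", ".txt");
        writeAll(tmp, List.of("a", "b", "c"));
        System.out.println(Files.readAllLines(tmp)); // prints [a, b, c]
    }
}
{code}

A close-and-reopen per record, by contrast, performs a lease release/reacquire round trip each time, which is exactly the window in which the retried append can race the lease removal.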



> AlreadyBeingCreatedException "current leaseholder is trying to recreate file" 
> when trying to append to file
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-11367
>                 URL: https://issues.apache.org/jira/browse/HDFS-11367
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.5.0
>         Environment: Red Hat Enterprise Linux Server release 6.8
>            Reporter: Dmitry Goldenberg
>            Assignee: Manjunath Anand
>         Attachments: Appender.java
>
>
> We have code which creates a file in HDFS and continuously appends lines to 
> the file, then closes the file at the end. This is done by a single dedicated 
> thread.
> We specifically instrumented the code to make sure only one 'client'/thread 
> ever writes to the file because we were seeing "current leaseholder is trying 
> to recreate file" errors.
> For some background see this for example: 
> https://community.cloudera.com/t5/Storage-Random-Access-HDFS/How-to-append-files-to-HDFS-with-Java-quot-current-leaseholder/m-p/41369
> This issue is very critical to us, as any error terminates a mission-critical 
> application in production.
> Intermittently, we see the exception below, even though our code simply 
> creates the file, keeps appending to it, and then closes it:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): failed to create file /data/records_20170125_1.txt for DFSClient_NONMAPREDUCE_-167421175_1 for client 1XX.2XX.1XX.XXX because current leaseholder is trying to recreate file.
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3075)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2905)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:3189)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:3153)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:612)
>         at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.append(AuthorizationProviderProxyClientProtocol.java:125)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:414)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1767)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
>  
>         at org.apache.hadoop.ipc.Client.call(Client.java:1411)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1364)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at com.sun.proxy.$Proxy24.append(Unknown Source)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:483)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy24.append(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.append(ClientNamenodeProtocolTranslatorPB.java:282)
>         at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1586)
>         at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1626)
>         at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1614)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:313)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:309)
>         at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:309)
>         at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1161)
>         at com.myco.MyAppender.getOutputStream(MyAppender.java:147)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
