[ https://issues.apache.org/jira/browse/HDFS-11367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15850339#comment-15850339 ]
Manjunath Anand edited comment on HDFS-11367 at 2/2/17 7:10 PM:
----------------------------------------------------------------
Sure Dmitry, I think it's a good idea to have a best-practice/wiki document for
the HDFS API.
As the stack trace below shows, when writer.close() is called (for example after
a TimeoutException in the submitLineAppend method), LeaseManager.removeLease is
in turn called, which removes the lease.
{code}
"main@1" prio=5 tid=0x1 nid=NA waiting
java.lang.Thread.State: WAITING
at java.lang.Object.wait(Object.java:-1)
at java.lang.Object.wait(Object.java:502)
at org.apache.hadoop.ipc.Client.call(Client.java:1397)
at org.apache.hadoop.ipc.Client.call(Client.java:1364)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy15.complete(Unknown Source:-1)
at sun.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-1)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy15.complete(Unknown Source:-1)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:412)
at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:2136)
at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2119)
- locked <0x11b6> (a org.apache.hadoop.hdfs.DFSOutputStream)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:74)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:108)
at sun.nio.cs.StreamEncoder.implClose(StreamEncoder.java:320)
at sun.nio.cs.StreamEncoder.close(StreamEncoder.java:149)
- locked <0x11b4> (a java.io.OutputStreamWriter)
at java.io.OutputStreamWriter.close(OutputStreamWriter.java:233)
at java.io.BufferedWriter.close(BufferedWriter.java:266)
at org.apache.hadoop.hdfs.MyAppender.submitLineAppend(MyAppender.java:71)
- locked <0x11a9> (a java.util.HashMap)
at org.apache.hadoop.hdfs.MyAppender.main(MyAppender.java:220)
"IPC Server handler 4 on 40660@3266" daemon prio=5 tid=0x28 nid=NA runnable
java.lang.Thread.State: RUNNABLE
at org.apache.hadoop.hdfs.server.namenode.LeaseManager.removeLease(LeaseManager.java:146)
- locked <0x10f8> (a org.apache.hadoop.hdfs.server.namenode.LeaseManager)
at org.apache.hadoop.hdfs.server.namenode.LeaseManager.removeLease(LeaseManager.java:158)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.finalizeINodeFileUnderConstruction(FSNamesystem.java:4204)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:3210)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:3141)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:665)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:500)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:-1)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(AccessController.java:-1)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2009)
{code}
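To make the lease bookkeeping described above concrete, here is a minimal
sketch in plain Java. It is a toy in-memory model, not HDFS code: the names
ToyLeaseTable, ToyWriter and reopenRejected, the path and the client name are
all hypothetical, and the real NameNode logic (soft/hard limits, lease
recovery) is deliberately left out. It only illustrates the two points made in
this thread: an append by the client that already holds the lease is rejected,
and close() releases the lease so that a later append can succeed.

```java
import java.util.HashMap;
import java.util.Map;

// Toy, in-memory model of per-file write leases. This is NOT HDFS code:
// ToyLeaseTable and ToyWriter are hypothetical names used only for illustration.
class LeaseSketch {

    static class ToyLeaseTable {
        private final Map<String, String> holderByPath = new HashMap<>();

        // Models an append request: rejected while any client still holds the lease.
        synchronized void acquireForAppend(String path, String client) {
            String holder = holderByPath.get(path);
            if (holder != null) {
                String reason = holder.equals(client)
                        ? "current leaseholder is trying to recreate file"
                        : "lease is held by " + holder;
                throw new IllegalStateException(
                        "failed to append to " + path + " because " + reason);
            }
            holderByPath.put(path, client);
        }

        // Models completing the file on close: the holder's lease is removed.
        synchronized void complete(String path, String client) {
            holderByPath.remove(path, client);
        }

        synchronized boolean isLeased(String path) {
            return holderByPath.containsKey(path);
        }
    }

    // Opening the writer takes the lease; close() gives it back.
    static class ToyWriter implements AutoCloseable {
        private final ToyLeaseTable table;
        private final String path;
        private final String client;

        ToyWriter(ToyLeaseTable table, String path, String client) {
            table.acquireForAppend(path, client);
            this.table = table;
            this.path = path;
            this.client = client;
        }

        @Override
        public void close() {
            table.complete(path, client);
        }
    }

    // Returns true if opening another writer for the same client is rejected.
    static boolean reopenRejected(ToyLeaseTable table, String path, String client) {
        try (ToyWriter ignored = new ToyWriter(table, path, client)) {
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ToyLeaseTable table = new ToyLeaseTable();
        String path = "/data/example.txt"; // hypothetical path
        String client = "client-1";        // hypothetical client name

        ToyWriter writer = new ToyWriter(table, path, client);
        // Re-opening without closing first is rejected, as in the reported exception.
        System.out.println("reopen before close rejected: "
                + reopenRejected(table, path, client));

        // Closing (for example in the timeout handler) removes the lease...
        writer.close();
        System.out.println("leased after close: " + table.isLeased(path));

        // ...so the same client can open the file for append again.
        try (ToyWriter again = new ToyWriter(table, path, client)) {
            System.out.println("reopen after close succeeded");
        }
    }
}
```

In this model the only way to hit the "current leaseholder" error is to open
the file again before the previous writer was closed, which is why the error
handling path in the appender should always close the writer before retrying.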
> AlreadyBeingCreatedException "current leaseholder is trying to recreate file"
> when trying to append to file
> -----------------------------------------------------------------------------------------------------------
>
> Key: HDFS-11367
> URL: https://issues.apache.org/jira/browse/HDFS-11367
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client
> Affects Versions: 2.5.0
> Environment: Red Hat Enterprise Linux Server release 6.8
> Reporter: Dmitry Goldenberg
> Assignee: Manjunath Anand
> Attachments: Appender.java
>
>
> We have code which creates a file in HDFS and continuously appends lines to
> the file, then closes the file at the end. This is done by a single dedicated
> thread.
> We specifically instrumented the code to make sure only one 'client'/thread
> ever writes to the file because we were seeing "current leaseholder is trying
> to recreate file" errors.
> For some background see this for example:
> https://community.cloudera.com/t5/Storage-Random-Access-HDFS/How-to-append-files-to-HDFS-with-Java-quot-current-leaseholder/m-p/41369
> This issue is critical to us, as any error terminates a mission-critical
> application in production.
> Intermittently, we see the exception below, even though all our code does is
> create the file, keep appending to it, then close it:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): failed to create file /data/records_20170125_1.txt for DFSClient_NONMAPREDUCE_-167421175_1 for client 1XX.2XX.1XX.XXX because current leaseholder is trying to recreate file.
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3075)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2905)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:3189)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:3153)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:612)
> at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.append(AuthorizationProviderProxyClientProtocol.java:125)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:414)
> at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1767)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
>
> at org.apache.hadoop.ipc.Client.call(Client.java:1411)
> at org.apache.hadoop.ipc.Client.call(Client.java:1364)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy24.append(Unknown Source)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy24.append(Unknown Source)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.append(ClientNamenodeProtocolTranslatorPB.java:282)
> at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1586)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1626)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1614)
> at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:313)
> at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:309)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:309)
> at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1161)
> at com.myco.MyAppender.getOutputStream(MyAppender.java:147)
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)