[ https://issues.apache.org/jira/browse/HDFS-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shengjinmiao updated HDFS-6592:
-------------------------------

    Summary: Use Fluentd to write data into Hadoop HDFS over WebHDFS. Throw the 
AlreadyBeingCreatedException exception  (was: Use Fluentd to collect data to 
append to HDFS. Throw the AlreadyBeingCreatedException exception)

> Use Fluentd to write data into Hadoop HDFS over WebHDFS. Throw the 
> AlreadyBeingCreatedException exception
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6592
>                 URL: https://issues.apache.org/jira/browse/HDFS-6592
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.3.0
>            Reporter: jack
>
> We use Fluentd to collect log data. The log data is appended to files in HDFS over WebHDFS.
> The cluster configuration:
> NameNode: namenode1 (hostname)
> Secondary NameNode: namenode2
> 3 DataNodes: datanode1, datanode2, datanode3
> Replication factor: 3
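> For context, each flush from the Fluentd WebHDFS output plugin issues a WebHDFS APPEND against an existing file. A minimal client-side sketch of that append path is shown below, assuming a Hadoop 2.x client and a reachable cluster; the URI, port, and path are placeholders, not our actual values:
>
>     // Minimal append sketch (illustrative URI and path only).
>     import java.net.URI;
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.fs.FSDataOutputStream;
>     import org.apache.hadoop.fs.FileSystem;
>     import org.apache.hadoop.fs.Path;
>
>     public class AppendSketch {
>         public static void main(String[] args) throws Exception {
>             Configuration conf = new Configuration();
>             FileSystem fs = FileSystem.get(URI.create("hdfs://namenode1:8020"), conf);
>             Path logFile = new Path("/logs/file_nameXXXXX");       // placeholder path
>             try (FSDataOutputStream out = fs.append(logFile)) {    // append() takes the file lease
>                 out.write("one buffered log chunk\n".getBytes("UTF-8"));
>                 out.hsync();
>             }   // close() completes the last block and releases the lease
>         }
>     }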
> Every few days, we run into the following exception:
> Exception in namenode1:
> 2014-06-22 09:54:41,892 WARN org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.append: Failed to create file [file_nameXXXXX] for [DFSClient_NONMAPREDUCE_-1425263782_2027206] on client [dataNode1], because this file is already being created by [DFSClient_NONMAPREDUCE_349196146_2027206] on [dataNode1]
> 2014-06-22 09:54:41,892 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hadoop (auth:SIMPLE) cause:org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: Failed to create file [file_nameXXXXX] for [DFSClient_NONMAPREDUCE_-1425263782_2027206] on client [dataNode1], because this file is already being created by [DFSClient_NONMAPREDUCE_349196146_2027206] on [dataNode1]
> Exception in datanode1:
> 2014-06-22 09:54:45,771 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hadoop (auth:SIMPLE) cause:java.io.IOException: Unable to close file because the last block does not have enough number of replicas.
> 2014-06-22 09:54:45,813 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hadoop (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): Failed to create file [file_nameXXXXX] for [DFSClient_NONMAPREDUCE_-1425263782_2027206] on client [dataNode1], because this file is already being created by [DFSClient_NONMAPREDUCE_349196146_2027206] on [dataNode1]
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2441)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2277)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:2505)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2468)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:516)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:340)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
> Based on the logs, we infer the flow of the exception as follows:
> 1. The NameNode updates the pipeline with just one datanode.
>       namenode1 log: 2014-06-22 09:54:16,604 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline (block=BP-1611177164-datanode1-1399894698024:blk_1074496235_1935947, newGenerationStamp=1935951, newLength=98839816, newNodes=[datanode1:50010], clientName=DFSClient_NONMAPREDUCE_349196146_2027206)
> 2. datanode1 throws an exception during close.
>       datanode1 log: 2014-06-22 09:54:26,569 INFO org.apache.hadoop.hdfs.DFSClient: Could not complete file_name retrying...
> 3. The subsequently collected data from Fluentd triggers another DFSClient to append to the same file.
>       namenode1 log: 2014-06-22 09:54:41,892 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hadoop (auth:SIMPLE) cause:org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: Failed to create file [file_name] for [DFSClient_NONMAPREDUCE_-1425263782_2027206] on client [datanode1], because this file is already being created by [DFSClient_NONMAPREDUCE_349196146_2027206] on [datanode1]
> 4. The subsequent DFSClient triggers lease recovery every LEASE_SOFTLIMIT_PERIOD (see the recovery sketch after this list).
>       namenode1 log: 2014-06-22 09:58:34,722 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: startFile: recover [Lease.  Holder: DFSClient_NONMAPREDUCE_349196146_2027206, pendingcreates: 1], src=file_name client DFSClient_NONMAPREDUCE_349196146_2027206
> 5. The lease recovery fails.
>       namenode1 log: 2014-06-22 09:58:34,722 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hadoop (auth:SIMPLE) cause:org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: DIR* NameSystem.internalReleaseLease: Failed to release lease for file file_name. Committed blocks are waiting to be minimally replicated. Try again later.
>       datanode1 log: 2014-06-22 09:56:15,149 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hadoop (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): DIR* NameSystem.internalReleaseLease: Failed to release lease for file file_name. Committed blocks are waiting to be minimally replicated. Try again later.
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.internalReleaseLease(FSNamesystem.java:3722)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2429)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2277)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:2505)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2468)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:516)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:340)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
> 6. The LEASE_HARDLIMIT_PERIOD expires.
>       namenode1 log: WARN org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.append: failed to create file file_name for DFSClient on client datanode1 because pendingCreates is non-null but no leases found.
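> As a possible client-side mitigation (a sketch only, not something we have verified on our cluster), the new DFSClient could explicitly ask the NameNode to recover the stale lease before retrying the append, e.g. via DistributedFileSystem.recoverLease(); the retry count and back-off below are illustrative:
>
>     // Hypothetical helper: recover the previous writer's lease, then retry the append.
>     // Assumes fs is backed by a DistributedFileSystem; depending on the code path the
>     // exception may also arrive wrapped in a RemoteException.
>     import org.apache.hadoop.fs.FSDataOutputStream;
>     import org.apache.hadoop.fs.FileSystem;
>     import org.apache.hadoop.fs.Path;
>     import org.apache.hadoop.hdfs.DistributedFileSystem;
>     import org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException;
>
>     public class AppendWithLeaseRecovery {
>         public static FSDataOutputStream appendWithRecovery(FileSystem fs, Path file)
>                 throws Exception {
>             try {
>                 return fs.append(file);
>             } catch (AlreadyBeingCreatedException e) {
>                 DistributedFileSystem dfs = (DistributedFileSystem) fs;
>                 // recoverLease() returns true once the NameNode has closed the file.
>                 for (int i = 0; i < 10 && !dfs.recoverLease(file); i++) {
>                     Thread.sleep(4000);   // illustrative back-off, not tuned
>                 }
>                 return fs.append(file);   // retry once the lease has been released
>             }
>         }
>     }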
> We found that HDFS-4504 may be related to our issue; however, this still needs to be checked.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
