[ 
https://issues.apache.org/jira/browse/HBASE-12201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14164582#comment-14164582
 ] 

Jingcheng Du commented on HBASE-12201:
--------------------------------------

bq. At the worst I believe the thrown exception blocked the timely zk node 
removal, but all the main work of the sweeper job completes.

Yes Jon, [~jmhsieh]. The logic won't be impacted, and I believe the zk will be 
closed too, since zk.close() resides in the finally block.
We don't close the output stream created by fs.create(path, true). When the JVM 
exits, the shutdown hook tries to close this output stream, but the file has 
already been deleted by the sweeper. I think the exception occurs at that point, 
which means all the logic of the sweeper had finished before this exception was 
thrown.
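For illustration only, here is a minimal sketch (not the actual patch) of 
closing such a writer in a finally block; the class, method and variable names 
below are made up:

{code:java}
// Minimal sketch, not the HBASE-12201 patch itself: the stream returned by
// fs.create(path, true) is closed in finally, so the DFSClient shutdown hook
// has nothing left to complete after the sweeper deletes the file.
// The FileSystem, Path and names inputs are illustrative.
import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WriterCloseSketch {
  static void writeNames(FileSystem fs, Path path, Iterable<String> names)
      throws IOException {
    FSDataOutputStream out = fs.create(path, true);
    try {
      for (String name : names) {
        out.writeBytes(name + "\n");
      }
    } finally {
      // Without this close, the HDFS lease is only released by the shutdown
      // hook, which then fails with LeaseExpiredException once the sweeper
      // has already deleted the file.
      out.close();
    }
  }
}
{code}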

> Close the writers in the MOB sweep tool
> ---------------------------------------
>
>                 Key: HBASE-12201
>                 URL: https://issues.apache.org/jira/browse/HBASE-12201
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: hbase-11339
>            Reporter: Jingcheng Du
>            Assignee: Jingcheng Du
>            Priority: Minor
>             Fix For: hbase-11339
>
>         Attachments: HBASE-12201-V2.diff, HBASE-12201.diff
>
>
>  When running the sweep tool, we encountered the following exception.
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /hbase/mobdir/.tmp/mobcompaction/SweepJob-SweepMapper-SweepReducer-testSweepToolExpiredNoMinVersion-data/working/names/all (inode 46500): File does not exist. Holder DFSClient_NONMAPREDUCE_-1863270027_1 does not have any open files.
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3319)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:3407)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:3377)
>   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:673)
>   at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.complete(AuthorizationProviderProxyClientProtocol.java:219)
>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:520)
>   at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1411)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1364)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at com.sun.proxy.$Proxy15.complete(Unknown Source)
>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:435)
>   at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>   at com.sun.proxy.$Proxy16.complete(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:2180)
>   at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2164)
>   at org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:908)
>   at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:925)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:861)
>   at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2687)
>   at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2704)
>   at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
> This is because several writers opened by fs.create(path, true) are not 
> closed properly.
> Meanwhile, in the current implementation, we save the temp files under 
> sweepJobDir/working/..., and when we remove the directory of the sweep job, 
> only the working subdirectory is deleted. We should remove the whole 
> sweepJobDir instead.
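
A rough sketch of that cleanup direction (not the committed change; the class 
name and the sweepJobDir variable are hypothetical):

{code:java}
// Sketch only: delete the whole per-job directory recursively instead of
// deleting just its working subdirectory.
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SweepJobCleanupSketch {
  static void cleanup(FileSystem fs, Path sweepJobDir) throws IOException {
    if (fs.exists(sweepJobDir)) {
      // recursive = true removes working/ and everything else under the job dir
      fs.delete(sweepJobDir, true);
    }
  }
}
{code}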



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
