[ 
https://issues.apache.org/jira/browse/HIVE-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988020#comment-15988020
 ] 

Yongzhi Chen commented on HIVE-15997:
-------------------------------------

The shutdown will check the running list and shut down each task in it; shutting 
down a task stops the query because of the failed task. But there is a chance the 
task has already been removed from the list (because it finished, for example) by 
the time the shutdown happens. Adding the if (driverContext.isShutdown()) check 
helps in this scenario. Even if we can catch the cancellation later, this check is 
the most prompt one. And if (driverContext.isShutdown()) is not an expensive call, 
so I think it is OK to have the check. 
{noformat}
  /**
   * Cleans up remaining tasks in case of failure
   */
  public synchronized void shutdown() {
    LOG.debug("Shutting down query " + ctx.getCmd());
    shutdown = true;
    for (TaskRunner runner : running) {
      if (runner.isRunning()) {
        Task<?> task = runner.getTask();
        LOG.warn("Shutting down task : " + task);
        try {
          task.shutdown();
        } catch (Exception e) {
          console.printError("Exception on shutting down task " + task.getId() + ": " + e);
        }
        Thread thread = runner.getRunner();
        if (thread != null) {
          thread.interrupt();
        }
      }
    }
    running.clear();
  }
{noformat}
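To illustrate the race the comment describes, here is a minimal, self-contained sketch of a launch loop that consults an isShutdown() flag before launching each task. The class and method names below (MiniDriverContext, launchAll) are simplified stand-ins for illustration, not Hive's actual DriverContext API; the point is only that a shutdown arriving after a task leaves the running list is still caught before the next task launches.
{noformat}
import java.util.ArrayList;
import java.util.List;

public class ShutdownCheckSketch {

    // Illustrative stand-in for a driver context holding a shutdown flag
    // and the list of tasks still to be launched.
    static class MiniDriverContext {
        private volatile boolean shutdown = false;
        private final List<String> runnable = new ArrayList<>();

        void shutdown() { shutdown = true; }
        boolean isShutdown() { return shutdown; }
        void addTask(String t) { runnable.add(t); }
        List<String> tasks() { return runnable; }
    }

    // Launch loop: without the isShutdown() check, a cancel that arrives
    // between two launches would go unnoticed and later tasks would still run.
    static List<String> launchAll(MiniDriverContext ctx) {
        List<String> launched = new ArrayList<>();
        for (String task : ctx.tasks()) {
            if (ctx.isShutdown()) {   // the cheap check the comment proposes
                break;                // stop launching: the query was cancelled
            }
            launched.add(task);
            if (task.equals("Stage-1")) {
                ctx.shutdown();       // simulate a cancel arriving mid-query
            }
        }
        return launched;
    }

    public static void main(String[] args) {
        MiniDriverContext ctx = new MiniDriverContext();
        ctx.addTask("Stage-1");
        ctx.addTask("Stage-2");
        ctx.addTask("Stage-3");
        // Only Stage-1 launches; the check stops Stage-2 and Stage-3.
        System.out.println(launchAll(ctx));
    }
}
{noformat}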

> Resource leaks when query is cancelled 
> ---------------------------------------
>
>                 Key: HIVE-15997
>                 URL: https://issues.apache.org/jira/browse/HIVE-15997
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Yongzhi Chen
>            Assignee: Yongzhi Chen
>             Fix For: 2.2.0
>
>         Attachments: HIVE-15997.1.patch
>
>
> There may be some resource leaks when a query is cancelled.
> We see the following stacks in the log:
> Possible files and folder leak: 
> {noformat} 
> 2017-02-02 06:23:25,410 WARN hive.ql.Context: [HiveServer2-Background-Pool: Thread-61]: Error Removing Scratch: java.io.IOException: Failed on local exception: java.nio.channels.ClosedByInterruptException; Host Details : local host is: "ychencdh511t-1.vpc.cloudera.com/172.26.11.50"; destination host is: "ychencdh511t-1.vpc.cloudera.com":8020; 
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) 
> at org.apache.hadoop.ipc.Client.call(Client.java:1476) 
> at org.apache.hadoop.ipc.Client.call(Client.java:1409) 
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) 
> at com.sun.proxy.$Proxy25.delete(Unknown Source) 
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:535) 
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
> at java.lang.reflect.Method.invoke(Method.java:606) 
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256) 
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) 
> at com.sun.proxy.$Proxy26.delete(Unknown Source) 
> at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2059) 
> at org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675) 
> at org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:671) 
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) 
> at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:671) 
> at org.apache.hadoop.hive.ql.Context.removeScratchDir(Context.java:405) 
> at org.apache.hadoop.hive.ql.Context.clear(Context.java:541) 
> at org.apache.hadoop.hive.ql.Driver.releaseContext(Driver.java:2109) 
> at org.apache.hadoop.hive.ql.Driver.closeInProcess(Driver.java:2150) 
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1472) 
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212) 
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207) 
> at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237) 
> at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88) 
> at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:415) 
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796) 
> at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306) 
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
> at java.lang.Thread.run(Thread.java:745) 
> Caused by: java.nio.channels.ClosedByInterruptException 
> at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) 
> at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:681) 
> at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192) 
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530) 
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494) 
> at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:615) 
> at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:714) 
> at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:376) 
> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1525) 
> at org.apache.hadoop.ipc.Client.call(Client.java:1448) 
> ... 35 more 
> 2017-02-02 12:26:52,706 INFO org.apache.hive.service.cli.operation.OperationManager: [HiveServer2-Background-Pool: Thread-23]: Operation is timed out,operation=OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=2af82100-94cf-4f26-abaa-c4b57c57b23c],state=CANCELED 
> {noformat} 
> Possible lock leak:
> {noformat}
> 2017-02-02 06:21:05,054 ERROR ZooKeeperHiveLockManager: [HiveServer2-Background-Pool: Thread-61]: Failed to release ZooKeeper lock: java.lang.InterruptedException
>       at java.lang.Object.wait(Native Method)
>       at java.lang.Object.wait(Object.java:503)
>       at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1342)
>       at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:871)
>       at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:238)
>       at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:233)
>       at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
>       at org.apache.curator.framework.imps.DeleteBuilderImpl.pathInForeground(DeleteBuilderImpl.java:230)
>       at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:214)
>       at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:41)
>       at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.unlockPrimitive(ZooKeeperHiveLockManager.java:488)
>       at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.unlockWithRetry(ZooKeeperHiveLockManager.java:466)
>       at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.unlock(ZooKeeperHiveLockManager.java:454)
>       at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.releaseLocks(ZooKeeperHiveLockManager.java:236)
>       at org.apache.hadoop.hive.ql.Driver.releaseLocksAndCommitOrRollback(Driver.java:1175)
>       at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1432)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207)
>       at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
>       at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
>       at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
>       at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
>       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)