[
https://issues.apache.org/jira/browse/HIVE-13960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15319063#comment-15319063
]
zhihai xu commented on HIVE-13960:
----------------------------------
Thanks for the review [~jxiang]! Checking opHandleSet.isEmpty() is to protect
asynchronous operations because an operation is only removed in function
closeOperation {{opHandleSet.remove(opHandle);}}, but not in function release.
After an asynchronous operation such as {{executeStatementAsync}} is running,
the pendingCount is zero and opHandleSet is not empty in function release
called from {{executeStatementAsync}}. So these two don't match in this case,
if we don't check opHandleSet in release for asynchronous operation, the
session may be expired when the asynchronous operation is still running (before
closeOperation is called). Checking pendingCount in release is to prevent
another race condition: if a long running synchronous operation is running in
{{executeStatement}} and another one is trying to call {{getInfo}} to get the
session information, then lastIdleTime will be changed to a non-zero value in
function release called from {{getInfo}} and the session may be expired when
the synchronous operation is still running.
> Session timeout may happen before HIVE_SERVER2_IDLE_SESSION_TIMEOUT for
> back-to-back synchronous operations.
> ------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-13960
> URL: https://issues.apache.org/jira/browse/HIVE-13960
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Reporter: zhihai xu
> Assignee: zhihai xu
> Attachments: HIVE-13960.000.patch
>
>
> Session timeout may happen before
> HIVE_SERVER2_IDLE_SESSION_TIMEOUT(hive.server2.idle.session.timeout) for
> back-to-back synchronous operations.
> This issue can happen with the following two operations op1 and op2: op2 is a
> synchronous long running operation, op2 is running right after op1 is closed.
>
> 1. closeOperation(op1) is called:
> this will set {{lastIdleTime}} with value System.currentTimeMillis() because
> {{opHandleSet}} becomes empty after {{closeOperation}} remove op1 from
> {{opHandleSet}}.
> 2. op2 is running for long time by calling {{executeStatement}} right after
> closeOperation(op1) is called.
> If op2 is running for more than HIVE_SERVER2_IDLE_SESSION_TIMEOUT, then the
> session will timeout even when op2 is still running.
> We hit this issue when we use PyHive to execute non-async operation
> The following is the exception we see:
> {code}
> File "/usr/local/lib/python2.7/dist-packages/pyhive/hive.py", line 126, in
> close
> _check_status(response)
> File "/usr/local/lib/python2.7/dist-packages/pyhive/hive.py", line 362, in
> _check_status
> raise OperationalError(response)
> OperationalError: TCloseSessionResp(status=TStatus(errorCode=0,
> errorMessage='Session does not exist!', sqlState=None,
> infoMessages=['*org.apache.hive.service.cli.HiveSQLException:Session does not
> exist!:12:11',
> 'org.apache.hive.service.cli.session.SessionManager:closeSession:SessionManager.java:311',
> 'org.apache.hive.service.cli.CLIService:closeSession:CLIService.java:221',
> 'org.apache.hive.service.cli.thrift.ThriftCLIService:CloseSession:ThriftCLIService.java:471',
>
> 'org.apache.hive.service.cli.thrift.TCLIService$Processor$CloseSession:getResult:TCLIService.java:1273',
>
> 'org.apache.hive.service.cli.thrift.TCLIService$Processor$CloseSession:getResult:TCLIService.java:1258',
> 'org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39',
> 'org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39',
> 'org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56',
>
> 'org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:285',
>
> 'java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1145',
>
> 'java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:615',
> 'java.lang.Thread:run:Thread.java:745'], statusCode=3))
> TCloseSessionResp(status=TStatus(errorCode=0, errorMessage='Session does not
> exist!', sqlState=None,
> infoMessages=['*org.apache.hive.service.cli.HiveSQLException:Session does not
> exist!:12:11',
> 'org.apache.hive.service.cli.session.SessionManager:closeSession:SessionManager.java:311',
> 'org.apache.hive.service.cli.CLIService:closeSession:CLIService.java:221',
> 'org.apache.hive.service.cli.thrift.ThriftCLIService:CloseSession:ThriftCLIService.java:471',
>
> 'org.apache.hive.service.cli.thrift.TCLIService$Processor$CloseSession:getResult:TCLIService.java:1273',
>
> 'org.apache.hive.service.cli.thrift.TCLIService$Processor$CloseSession:getResult:TCLIService.java:1258',
> 'org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39',
> 'org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39',
> 'org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56',
>
> 'org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:285',
>
> 'java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1145',
>
> 'java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:615',
> 'java.lang.Thread:run:Thread.java:745'], statusCode=3))
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)