[
https://issues.apache.org/jira/browse/HIVE-29286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18032042#comment-18032042
]
Stamatis Zampetakis commented on HIVE-29286:
--------------------------------------------
As explained under HIVE-11402, the objects that handle sessions in HS2 are
not thread safe, and much of the code assumes that only one thread uses a
session at any point in time. The scenario described in this issue violates
these assumptions, since multiple threads/clients operate on the same
session; thus the use case is not officially supported by HS2.
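
To make the violated assumption concrete, here is a minimal Java sketch of the
race. It is a toy model under stated assumptions, not Hive code: ToySession,
compileLock, and runScenario are hypothetical names, and the main thread plays
the roles of both C1 (holding the compile lock) and C3 (closing S2).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Toy model only: none of these classes belong to Hive.
class SessionRaceSketch {

    // Stands in for a session object whose state is cleared on close
    // without coordinating with operations that are still in flight.
    static final class ToySession {
        private volatile String state = "session-state";

        void close() { state = null; }

        // Mimics a post-compilation step that dereferences session state.
        int afterRun() { return state.length(); } // NPE if closed meanwhile
    }

    /** Returns "ok" if the statement finished, "NPE" if the session was closed under it. */
    static String runScenario() {
        ReentrantLock compileLock = new ReentrantLock(); // stand-in for the compile lock
        ToySession s2 = new ToySession();
        CountDownLatch q2Submitted = new CountDownLatch(1);
        String[] outcome = new String[1];

        compileLock.lock(); // "Q1" is compiling and holds the lock
        Thread c2 = new Thread(() -> {
            q2Submitted.countDown();
            compileLock.lock(); // "Q2" blocks until "Q1" releases the lock
            try {
                s2.afterRun();
                outcome[0] = "ok";
            } catch (NullPointerException e) {
                outcome[0] = "NPE";
            } finally {
                compileLock.unlock();
            }
        });
        c2.start();
        try {
            q2Submitted.await();
            s2.close();           // "C3" closes S2 from a different thread
            compileLock.unlock(); // "Q1" finishes compiling
            c2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return outcome[0];
    }

    public static void main(String[] args) {
        System.out.println(runScenario());
    }
}
```

In this model the close always happens before the blocked statement resumes,
so the statement fails on the cleared state, which mirrors the shape of the
NullPointerException reported in the issue.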
HIVE-14227 is very similar to this issue: it reports problems that arise when
sessions are used by different Thrift connections. The respective patch was
never merged since there was no consensus to move forward.
Overall, there is a general agreement that clients should avoid using sessions
from different connections since the results in most cases are unpredictable.
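
One way a client could follow this advice is to remember which connection
opened each session and refuse a close that comes from any other connection.
The sketch below is purely illustrative: SessionOwnershipSketch and its
methods are hypothetical and are not part of Hive or its Thrift API.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical client-side helper; not part of Hive or its Thrift API.
class SessionOwnershipSketch {
    // session id -> id of the connection that opened the session
    private final ConcurrentHashMap<String, String> owners = new ConcurrentHashMap<>();

    /** Records the opener; returns false if the session id is already registered. */
    boolean registerOpen(String sessionId, String connectionId) {
        return owners.putIfAbsent(sessionId, connectionId) == null;
    }

    /** Permits the close only when it is requested by the opening connection. */
    boolean mayClose(String sessionId, String connectionId) {
        // remove(key, value) is atomic: the entry is removed only if it maps to this value
        return owners.remove(sessionId, connectionId);
    }
}
```

With such a guard, a close of S2 requested through C3 would be rejected on the
client side, while the same request through C2 would be allowed exactly once.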
> Session close from different clients/threads leaks resources
> ------------------------------------------------------------
>
> Key: HIVE-29286
> URL: https://issues.apache.org/jira/browse/HIVE-29286
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Affects Versions: 4.2.0
> Reporter: Stamatis Zampetakis
> Priority: Major
> Attachments: session-close-txn-leak.patch
>
>
> Consider the following scenario where three clients/threads interact with
> HiveServer2 (HS2) directly through the Thrift APIs (not via JDBC/ODBC). To
> showcase the problem we assume that the following properties are set.
> {code:sql}
> set hive.driver.parallel.compilation=true;
> set hive.driver.parallel.compilation.global.limit=1;
> set hive.server2.compile.lock.timeout=0s;
> {code}
> * C1: Connect with HS2 and start session with handle S1
> * C1: Send Q1, a slow-compilation query, using S1
> * HS2: Obtain the compilation lock and start compiling Q1
> * C2: Connect with HS2 and start session with handle S2
> * C2: Send Q2, a fast-compilation query, using S2
> * HS2: Q2 blocks, waiting for the compilation lock to become available.
> * C3: Connect with HS2 and close session with handle S2
> C1, C2, C3 are different Thrift connections so they are handled by separate
> HS2 threads. C3 will successfully close/kill/stop the session. However, since
> Q2 was blocked in compilation it can't be stopped immediately. When Q2
> finally obtains the compilation lock and finishes compiling, the operation
> errors out since various session-related entries have been cleared.
> Thread-152 handles the requests from C2, and when it reaches the end of
> compilation it throws the following errors.
> {noformat}
> 2025-10-22T03:05:32,929 ERROR [87a053ce-f5d8-4baa-bdbc-8fd7bc1eef31
> HiveServer2-Handler-Pool: Thread-152] thrift.ThriftCLIService: Failed to
> execute statement [request:
> TExecuteStatementReq(sessionHandle:TSessionHandle(sessionId:THandleIdentifier(guid:87
> A0 53 CE F5 D8 4B AA BD BC
> 8F D7 BC 1E EF 31, secret:C5 A3 62 BB 51 0B 4A 76 B6 7A 04 0B 38 5C 0A 5A)),
> statement:EXPLAIN SELECT AVG(age) FROM person GROUP BY name, confOverlay:{},
> runAsync:false, queryTimeout:0)]
> java.lang.RuntimeException: java.lang.NullPointerException: Cannot invoke
> "org.apache.hadoop.hive.ql.session.SessionState.getSessionId()" because the
> return value of
> "org.apache.hive.service.cli.session.HiveSession.getSessionState()" is null
> at
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:89)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> java.base/java.security.AccessController.doPrivileged(AccessController.java:714)
> ~[?:?]
> at java.base/javax.security.auth.Subject.doAs(Subject.java:525) ~[?:?]
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1953)
> ~[hadoop-common-3.4.1.jar:?]
> at
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at jdk.proxy2/jdk.proxy2.$Proxy43.executeStatement(Unknown Source)
> ~[?:?]
> at
> org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:281)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:651)
> ~[hive-service-4.2.0-SNAPSHOT.jar:?]
> at
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1670)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1650)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
> ~[?:?]
> at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
> ~[?:?]
> at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
> Caused by: java.lang.NullPointerException: Cannot invoke
> "org.apache.hadoop.hive.ql.session.SessionState.getSessionId()" because the
> return value of
> "org.apache.hive.service.cli.session.HiveSession.getSessionState()" is null
> at
> org.apache.hive.service.cli.operation.Operation.afterRun(Operation.java:270)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:288)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:558)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:532)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
> ~[?:?]
> at java.base/java.lang.reflect.Method.invoke(Method.java:580) ~[?:?]
> at
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> ... 17 more
> 2025-10-22T03:05:32,930 ERROR [87a053ce-f5d8-4baa-bdbc-8fd7bc1eef31
> HiveServer2-Handler-Pool: Thread-152] thrift.ThriftCLIService: Failed to
> close the session
> org.apache.hive.service.cli.HiveSQLException: Session does not exist:
> SessionHandle [87a053ce-f5d8-4baa-bdbc-8fd7bc1eef31]
> at
> org.apache.hive.service.cli.session.SessionManager.closeSession(SessionManager.java:625)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.cli.CLIService.closeSession(CLIService.java:240)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.cli.thrift.ThriftCLIService.CloseSession(ThriftCLIService.java:611)
> ~[hive-service-4.2.0-SNAPSHOT.jar:?]
> at
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseSession.getResult(TCLIService.java:1620)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseSession.getResult(TCLIService.java:1600)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
> ~[?:?]
> at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
> ~[?:?]
> at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
> {noformat}
> The fact that C3 closes a session that was opened by another client leads to
> race conditions and may also cause resource leaks. The error may not always
> be the same as the one reported above, since it depends on the stage of
> compilation/execution at which the session is closed/killed.
> When the queries are over ACID tables, the abrupt termination of S2 fails
> to close the transaction that was opened for Q2, leading to a transaction
> leak.
> The scenario is inspired by [Hue|https://gethue.com/] workflows where
> session close can be initiated by user request or [configuration
> settings|https://docs.gethue.com/administrator/configuration/server/#idle-session-timeout].
--
This message was sent by Atlassian Jira
(v8.20.10#820010)