[ 
https://issues.apache.org/jira/browse/HIVE-29286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18032038#comment-18032038
 ] 

Stamatis Zampetakis commented on HIVE-29286:
--------------------------------------------

The [^session-close-txn-leak.patch] contains a unit test that reproduces the 
problem on current master (commit ad55d58eadd6c1aabc7fed51e25dfffe158a44f6).

> Session close from different clients/threads leaks resources
> ------------------------------------------------------------
>
>                 Key: HIVE-29286
>                 URL: https://issues.apache.org/jira/browse/HIVE-29286
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 4.2.0
>            Reporter: Stamatis Zampetakis
>            Priority: Major
>         Attachments: session-close-txn-leak.patch
>
>
> Consider the following scenario where three clients/threads interact with 
> HiveServer2 (HS2) directly through the Thrift APIs (not via JDBC/ODBC). To 
> showcase the problem we assume that the following properties are set.
> {code:sql}
> set hive.driver.parallel.compilation=true;
> set hive.driver.parallel.compilation.global.limit=1;
> set hive.server2.compile.lock.timeout=0s; 
> {code}
>  * C1: Connect with HS2 and start session with handle S1
>  * C1: Send Q1, a slow-compilation query, using S1
>  * HS2: Obtain the compilation lock and start compiling Q1
>  * C2: Connect with HS2 and start session with handle S2
>  * C2: Send Q2, a fast-compilation query, using S2
>  * HS2: Q2 blocks, waiting for the compilation lock to become available.
>  * C3: Connect with HS2 and close session with handle S2
> C1, C2, C3 are different Thrift connections so they are handled by separate 
> HS2 threads. C3 will successfully close/kill/stop the session. However, since 
> Q2 was blocked in compilation it can't be stopped immediately. When the thread 
> serving C2 finally obtains the compilation lock and finishes compiling, the 
> operation errors out since various session-related entries have already been 
> cleared.
> Thread-152 handles the requests from C2, and when it reaches the end of 
> compilation it throws the following errors.
> {noformat}
> 2025-10-22T03:05:32,929 ERROR [87a053ce-f5d8-4baa-bdbc-8fd7bc1eef31 
> HiveServer2-Handler-Pool: Thread-152] thrift.ThriftCLIService: Failed to 
> execute statement [request: 
> TExecuteStatementReq(sessionHandle:TSessionHandle(sessionId:THandleIdentifier(guid:87
>  A0 53 CE F5 D8 4B AA BD BC 
> 8F D7 BC 1E EF 31, secret:C5 A3 62 BB 51 0B 4A 76 B6 7A 04 0B 38 5C 0A 5A)), 
> statement:EXPLAIN SELECT AVG(age) FROM person GROUP BY name, confOverlay:{}, 
> runAsync:false, queryTimeout:0)]
> java.lang.RuntimeException: java.lang.NullPointerException: Cannot invoke 
> "org.apache.hadoop.hive.ql.session.SessionState.getSessionId()" because the 
> return value of 
> "org.apache.hive.service.cli.session.HiveSession.getSessionState()" is null
>         at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:89)
>  ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at 
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
>  ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at 
> java.base/java.security.AccessController.doPrivileged(AccessController.java:714)
>  ~[?:?]
>         at java.base/javax.security.auth.Subject.doAs(Subject.java:525) ~[?:?]
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1953)
>  ~[hadoop-common-3.4.1.jar:?]
>         at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>  ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at jdk.proxy2/jdk.proxy2.$Proxy43.executeStatement(Unknown Source) 
> ~[?:?]
>         at 
> org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:281) 
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:651)
>  ~[hive-service-4.2.0-SNAPSHOT.jar:?]
>         at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1670)
>  ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1650)
>  ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38) 
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38) 
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at 
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
>  ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250)
>  ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
>  ~[?:?]
>         at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
>  ~[?:?]
>         at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
> Caused by: java.lang.NullPointerException: Cannot invoke 
> "org.apache.hadoop.hive.ql.session.SessionState.getSessionId()" because the 
> return value of 
> "org.apache.hive.service.cli.session.HiveSession.getSessionState()" is null
>         at 
> org.apache.hive.service.cli.operation.Operation.afterRun(Operation.java:270) 
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at 
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:288) 
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:558)
>  ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:532)
>  ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at 
> java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
>  ~[?:?]
>         at java.base/java.lang.reflect.Method.invoke(Method.java:580) ~[?:?]
>         at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>  ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         ... 17 more
> 2025-10-22T03:05:32,930 ERROR [87a053ce-f5d8-4baa-bdbc-8fd7bc1eef31 
> HiveServer2-Handler-Pool: Thread-152] thrift.ThriftCLIService: Failed to 
> close the session
> org.apache.hive.service.cli.HiveSQLException: Session does not exist: 
> SessionHandle [87a053ce-f5d8-4baa-bdbc-8fd7bc1eef31]
>         at 
> org.apache.hive.service.cli.session.SessionManager.closeSession(SessionManager.java:625)
>  ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at 
> org.apache.hive.service.cli.CLIService.closeSession(CLIService.java:240) 
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.CloseSession(ThriftCLIService.java:611)
>  ~[hive-service-4.2.0-SNAPSHOT.jar:?]
>         at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseSession.getResult(TCLIService.java:1620)
>  ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseSession.getResult(TCLIService.java:1600)
>  ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38) 
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38) 
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at 
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
>  ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250)
>  ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
>         at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
>  ~[?:?]
>         at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
>  ~[?:?]
>         at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
> {noformat}
> The fact that C3 closes a session that was opened by another client leads to 
> race conditions and may also cause resource leaks. The error may not always 
> be the same as the one reported above since it depends on the stage of 
> compilation/execution at which the session is closed/killed.
> When the queries are over ACID tables, the abrupt termination of S2 
> fails to close the transaction that was opened for Q2, leading to a 
> transaction leak.
> The scenario is inspired by [Hue|https://gethue.com/] workflows where 
> session close can be initiated by user request or [configuration 
> settings|https://docs.gethue.com/administrator/configuration/server/#idle-session-timeout].
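
The interleaving in the scenario above can be illustrated with a small JDK-only simulation. This is a rough sketch and not the attached unit test: `SessionCloseRace`, `FakeSession`, and `simulate` are hypothetical names, a `ReentrantLock` stands in for the compilation lock with a global limit of 1, and a nullable string stands in for the session state that HS2 clears on close. No Hive classes are used.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;

public class SessionCloseRace {

  /** Hypothetical stand-in for an HS2 session; not the real HiveSession. */
  static final class FakeSession {
    private volatile String sessionState; // becomes null once the session is closed

    FakeSession(String id) { this.sessionState = id; }

    String getSessionState() { return sessionState; }

    void close() { sessionState = null; }
  }

  /** Replays the C1/C2/C3 interleaving and reports what the C2 thread observed. */
  static String simulate() {
    ReentrantLock compileLock = new ReentrantLock(); // models the single compilation slot
    FakeSession s2 = new FakeSession("S2");
    CountDownLatch q2Submitted = new CountDownLatch(1);
    AtomicReference<String> outcome = new AtomicReference<>("not run");

    compileLock.lock(); // C1: Q1 holds the compilation lock (slow compilation)

    Thread c2 = new Thread(() -> {
      q2Submitted.countDown();
      compileLock.lock(); // C2: Q2 blocks here until Q1 releases the lock
      try {
        // Post-compilation step that dereferences the session state.
        int len = s2.getSessionState().length();
        outcome.set("Q2 finished, session id length " + len);
      } catch (NullPointerException e) {
        outcome.set("NPE after compilation: session state was cleared");
      } finally {
        compileLock.unlock();
      }
    }, "C2-handler");

    try {
      c2.start();
      q2Submitted.await();  // Q2 has been submitted by C2
      s2.close();           // C3: closes S2 from a different thread
      compileLock.unlock(); // Q1 finishes compiling, Q2 may now proceed
      c2.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return outcome.get();
  }

  public static void main(String[] args) {
    System.out.println(simulate());
  }
}
```

Because the lock is released only after `close()` has run, the C2 thread always sees a cleared session state in this sketch. In HS2 the failure point varies with the stage at which the session is closed, as the description notes.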



