Stamatis Zampetakis created HIVE-29286:
------------------------------------------

             Summary: Session close from different clients/threads leaks 
resources
                 Key: HIVE-29286
                 URL: https://issues.apache.org/jira/browse/HIVE-29286
             Project: Hive
          Issue Type: Bug
          Components: HiveServer2
    Affects Versions: 4.2.0
            Reporter: Stamatis Zampetakis


Consider the following scenario where three clients/threads interact with 
HiveServer2 (HS2) by calling the Thrift APIs directly (not via JDBC/ODBC). To 
showcase the problem, we assume that the following properties are set.
{code:sql}
set hive.driver.parallel.compilation=true;
set hive.driver.parallel.compilation.global.limit=1;
set hive.server2.compile.lock.timeout=0s; 
{code}
C1: Connect with HS2 and start session with handle S1

C1: Send Q1, a slow-compilation query, using S1

HS2: Obtain the compilation lock and start compiling Q1

C2: Connect with HS2 and start session with handle S2

C2: Send Q2, a fast-compilation query, using S2

HS2: Q2 blocks, waiting for the compilation lock to become available.

C3: Connect with HS2 and close session with handle S2

C1, C2, and C3 are different Thrift connections, so they are handled by separate 
HS2 threads. C3 successfully closes/kills/stops session S2. However, since Q2 
is blocked waiting for the compilation lock, it can't be stopped immediately. 
When Q2 finally obtains the compilation lock and finishes compiling, the 
operation errors out because various session-related entries have already been 
cleared.
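
The interleaving above can be reproduced in miniature with plain JDK 
concurrency primitives. The following is only a sketch of the race, not HS2 
code: ToySession, runScenario and all variable names are hypothetical, a single 
ReentrantLock stands in for the compilation lock with a global limit of 1, and 
a nulled field stands in for the session state that gets cleared on close.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;

// Toy model only: none of these types are Hive classes.
class SessionCloseRace {

    // Stand-in for a session whose state is cleared on close.
    static final class ToySession {
        private volatile String sessionState;
        ToySession(String id) { this.sessionState = id; }
        String getSessionState() { return sessionState; }
        void close() { sessionState = null; }
    }

    // Runs the C1/C2/C3 interleaving once and returns what C2 observed.
    static String runScenario() {
        ReentrantLock compileLock = new ReentrantLock(); // models a global limit of 1
        ToySession s2 = new ToySession("S2");
        CountDownLatch q1Compiling = new CountDownLatch(1);
        CountDownLatch q1MayFinish = new CountDownLatch(1);
        AtomicReference<String> q2Outcome = new AtomicReference<>("not run");

        // C1: the slow compilation of Q1 holds the lock until released.
        Thread c1 = new Thread(() -> {
            compileLock.lock();
            try {
                q1Compiling.countDown();
                q1MayFinish.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                compileLock.unlock();
            }
        });

        // C2: Q2 waits for the lock, then touches session state after compiling.
        Thread c2 = new Thread(() -> {
            compileLock.lock();
            compileLock.unlock(); // fast compilation, nothing to do
            try {
                q2Outcome.set("ok, session id length " + s2.getSessionState().length());
            } catch (NullPointerException e) {
                q2Outcome.set("NullPointerException after compilation");
            }
        });

        try {
            c1.start();
            q1Compiling.await();
            c2.start();
            while (!compileLock.hasQueuedThreads()) {
                Thread.sleep(1); // wait until Q2 is really blocked on the lock
            }
            s2.close();              // C3: closes S2 while Q2 is still queued
            q1MayFinish.countDown(); // Q1 finishes, Q2 gets the lock
            c1.join();
            c2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return q2Outcome.get();
    }

    public static void main(String[] args) {
        System.out.println("Q2 outcome: " + runScenario());
    }
}
```

Because the close happens only after Q2 is queued on the lock, the toy model 
always ends with Q2 hitting a NullPointerException once it gets past 
compilation. In HS2 the timing is not controlled like this, so the point of 
failure can vary.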

Thread-152 handles the requests from C2, and when it reaches the end of 
compilation it throws the following errors.
{noformat}
2025-10-22T03:05:32,929 ERROR [87a053ce-f5d8-4baa-bdbc-8fd7bc1eef31 
HiveServer2-Handler-Pool: Thread-152] thrift.ThriftCLIService: Failed to 
execute statement [request: 
TExecuteStatementReq(sessionHandle:TSessionHandle(sessionId:THandleIdentifier(guid:87
 A0 53 CE F5 D8 4B AA BD BC 
8F D7 BC 1E EF 31, secret:C5 A3 62 BB 51 0B 4A 76 B6 7A 04 0B 38 5C 0A 5A)), 
statement:EXPLAIN SELECT AVG(age) FROM person GROUP BY name, confOverlay:{}, 
runAsync:false, queryTimeout:0)]
java.lang.RuntimeException: java.lang.NullPointerException: Cannot invoke 
"org.apache.hadoop.hive.ql.session.SessionState.getSessionId()" because the 
return value of 
"org.apache.hive.service.cli.session.HiveSession.getSessionState()" is null
        at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:89)
 ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at 
org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
 ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at 
java.base/java.security.AccessController.doPrivileged(AccessController.java:714)
 ~[?:?]
        at java.base/javax.security.auth.Subject.doAs(Subject.java:525) ~[?:?]
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1953)
 ~[hadoop-common-3.4.1.jar:?]
        at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
 ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at jdk.proxy2/jdk.proxy2.$Proxy43.executeStatement(Unknown Source) 
~[?:?]
        at 
org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:281) 
~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at 
org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:651)
 ~[hive-service-4.2.0-SNAPSHOT.jar:?]
        at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1670)
 ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1650)
 ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38) 
~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38) 
~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at 
org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
 ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250)
 ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
 ~[?:?]
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
 ~[?:?]
        at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
Caused by: java.lang.NullPointerException: Cannot invoke 
"org.apache.hadoop.hive.ql.session.SessionState.getSessionId()" because the 
return value of 
"org.apache.hive.service.cli.session.HiveSession.getSessionState()" is null
        at 
org.apache.hive.service.cli.operation.Operation.afterRun(Operation.java:270) 
~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at 
org.apache.hive.service.cli.operation.Operation.run(Operation.java:288) 
~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:558)
 ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:532)
 ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at 
java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
 ~[?:?]
        at java.base/java.lang.reflect.Method.invoke(Method.java:580) ~[?:?]
        at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
 ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        ... 17 more
2025-10-22T03:05:32,930 ERROR [87a053ce-f5d8-4baa-bdbc-8fd7bc1eef31 
HiveServer2-Handler-Pool: Thread-152] thrift.ThriftCLIService: Failed to close 
the session
org.apache.hive.service.cli.HiveSQLException: Session does not exist: 
SessionHandle [87a053ce-f5d8-4baa-bdbc-8fd7bc1eef31]
        at 
org.apache.hive.service.cli.session.SessionManager.closeSession(SessionManager.java:625)
 ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at 
org.apache.hive.service.cli.CLIService.closeSession(CLIService.java:240) 
~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at 
org.apache.hive.service.cli.thrift.ThriftCLIService.CloseSession(ThriftCLIService.java:611)
 ~[hive-service-4.2.0-SNAPSHOT.jar:?]
        at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseSession.getResult(TCLIService.java:1620)
 ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseSession.getResult(TCLIService.java:1600)
 ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38) 
~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38) 
~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at 
org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
 ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250)
 ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
 ~[?:?]
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
 ~[?:?]
        at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
{noformat}
The fact that C3 closes a session that was opened by another client leads to 
race conditions and may also cause resource leaks. The error may not always be 
the same as the one reported above, since it depends on the stage of 
compilation/execution at which the session is closed/killed.

When the queries are over ACID tables, the abrupt termination of session S2 
fails to close the transaction that was opened for Q2, leading to a transaction 
leak.

The scenario is inspired by [Hue|https://gethue.com/] workflows where session 
close can be initiated by user request or [configuration 
settings|https://docs.gethue.com/administrator/configuration/server/#idle-session-timeout].


