[ 
https://issues.apache.org/jira/browse/HIVE-20442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh updated HIVE-20442:
----------------------------------
    Attachment: HIVE-20442.3-branch-1.2.patch
        Status: Patch Available  (was: Open)

> Hive stale lock when the hiveserver2 background thread died with NPE
> --------------------------------------------------------------------
>
>                 Key: HIVE-20442
>                 URL: https://issues.apache.org/jira/browse/HIVE-20442
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Transactions
>    Affects Versions: 2.1.1, 1.2.0
>         Environment: Hive-2.1
>            Reporter: Rajkumar Singh
>            Assignee: Rajkumar Singh
>            Priority: Major
>         Attachments: HIVE-20442.01.branch-2.patch, 
> HIVE-20442.1-branch-1.2.patch, HIVE-20442.2-branch-1.2.patch, 
> HIVE-20442.3-branch-1.2.patch
>
>
> This looks like a race condition in which the background thread is not able to 
> release the lock it acquired.
> 1. The HiveServer2 background thread requests a lock:
> {code}
> 2018-08-20T14:13:38,813 INFO  [HiveServer2-Background-Pool: Thread-XXXXX]: 
> lockmgr.DbLockManager (DbLockManager.java:lock(100)) - Requesting: 
> queryId=hive_xxxxxxx LockRequest(component:[LockComponent(type:SHARED_READ, 
> level:TABLE, dbname:testdb, tablename:test_table, operationType:SELECT)], 
> txnid:0, user:hive, hostname:HOSTNAME, agentInfo:hive_xxxxxxx)
> {code}
> 2. It acquires the lock and starts heartbeating:
> {code}
> 2018-08-20T14:36:30,233 INFO  [HiveServer2-Background-Pool: Thread-XXXXX]: 
> lockmgr.DbTxnManager (DbTxnManager.java:startHeartbeat(517)) - Started 
> heartbeat with delay/interval = 150000/150000             MILLISECONDS for 
> query: agentInfo:hive_xxxxxxx
> {code}
> 3. Between events #1 and #2, the client disconnected and deleteContext 
> cleaned up the session dir:
> {code}
> 2018-08-21T15:39:57,820 INFO  [HiveServer2-Handler-Pool: Thread-XXX]: 
> thrift.ThriftCLIService (ThriftBinaryCLIService.java:deleteContext(136)) - 
> Session disconnected without closing properly.
> 2018-08-21T15:39:57,820 INFO  [HiveServer2-Handler-Pool: Thread-XXXX]: 
> thrift.ThriftCLIService (ThriftBinaryCLIService.java:deleteContext(140)) - 
> Closing the session: SessionHandle [3be07faf-5544-4178-8b50-8173002b171a]
> 2018-08-21T15:39:57,820 INFO  [HiveServer2-Handler-Pool: Thread-XXXX]: 
> service.CompositeService (SessionManager.java:closeSession(363)) - Session 
> closed, SessionHandle [xxxxxxxxxxxxxxxxxxxxxxx], current sessions:2
> {code}
> 4. The background thread died with an NPE while trying to get the query ID:
> {code}
> java.lang.NullPointerException: null
>         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1568) 
> ~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
>         at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1414) 
> ~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1211) 
> ~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1204) 
> ~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
>         at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:242)
>  [hive-service-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
>         at 
> org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
>  [hive-service-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
>         at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:336)
>  [hive-service-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
>         at java.security.AccessController.doPrivileged(Native Method) 
> [?:1.8.0_77]
>         at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_77]
> {code}
> As a result, the background thread did not get a chance to release the lock, 
> and the heartbeater thread continues to heartbeat indefinitely.
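
For illustration only, here is a minimal Java sketch of the general pattern that avoids this kind of stale lock: stop the heartbeat and release the lock in a finally block, so that an unexpected NPE in the query body cannot skip the cleanup. This is not the attached patch and does not use Hive's classes. FakeLock, runGuarded and the 10 ms heartbeat interval are hypothetical names and values invented for the example.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class LockReleaseSketch {

    /** Hypothetical stand-in for a lock held in a lock manager (not Hive's DbLockManager). */
    public static final class FakeLock {
        private final AtomicBoolean held = new AtomicBoolean(false);
        public void acquire() { held.set(true); }
        public void release() { held.set(false); }
        public boolean isHeld() { return held.get(); }
    }

    /**
     * Acquires the lock, starts a periodic heartbeat, runs the query body, and
     * always cancels the heartbeat and releases the lock, even if the body throws.
     * Returns the heartbeat future so the caller can confirm it was cancelled.
     */
    public static ScheduledFuture<?> runGuarded(FakeLock lock,
                                                ScheduledExecutorService pool,
                                                Runnable queryBody) {
        lock.acquire();
        // The heartbeat is a no-op here; a real one would ping the lock manager.
        ScheduledFuture<?> heartbeat =
                pool.scheduleAtFixedRate(() -> { }, 10, 10, TimeUnit.MILLISECONDS);
        try {
            queryBody.run();
        } catch (RuntimeException e) {
            // Demo only: report the failure. Real code would surface it to the caller.
            System.err.println("query failed: " + e);
        } finally {
            // Cleanup runs on both the success path and the failure path.
            heartbeat.cancel(false);
            lock.release();
        }
        return heartbeat;
    }

    public static void main(String[] args) {
        FakeLock lock = new FakeLock();
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        // Simulate the failure from step 4: the body dies with an NPE.
        ScheduledFuture<?> hb = runGuarded(lock, pool, () -> {
            throw new NullPointerException("simulated: session state already cleaned up");
        });
        pool.shutdownNow();
        System.out.println("lock held: " + lock.isHeld()
                + ", heartbeat cancelled: " + hb.isCancelled());
    }
}
```

Without the finally block, the simulated NPE would leave the lock marked as held and the heartbeat task scheduled, which mirrors the behaviour described in this issue.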



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
