Josh Elser created PHOENIX-2496:
-----------------------------------
Summary: ShutdownHook preventing JVM from exiting after SIGTERM
Key: PHOENIX-2496
URL: https://issues.apache.org/jira/browse/PHOENIX-2496
Project: Phoenix
Issue Type: Bug
Reporter: Josh Elser
Assignee: Josh Elser
Fix For: 4.7.0
[~cartershanklin] pointed out to me that he got into a case where sending a
SIGTERM to the Phoenix QueryServer resulted in it not exiting. I've been able
to reproduce this.
1. Start HBase and PQS
2. Stop HBase master
3. Try to run a query through PQS
4. {{kill -15 <pqs_pid>}}
At this point, the query thread from step 3 is still running in PQS, trying to
connect to HBase (following the normal HBase retry policy, which keeps retrying
for a period on the order of minutes). The ShutdownHook, which runs in an
attempt to clean up nicely, blocks while trying to close the instance because
the read lock is still held by the step 3 query. Because the JVM keeps running
until all shutdown hooks return, the outward effect is that PQS stays up until
HBase becomes available or the HBase retries time out.
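The blocking described above can be reproduced in isolation. The following is a minimal sketch, not Phoenix code: it assumes the close path needs the write side of a read/write lock whose read side the in-flight query holds, and it uses a plain {{ReentrantReadWriteLock}} with hypothetical names ({{ReadLockBlocksClose}}, {{closeAcquiresWhileQueryRuns}}). A latch stands in for the HBase retry loop.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadLockBlocksClose {

    /**
     * Simulates a query holding the read lock while a "close" path tries to
     * take the write lock. Returns true if the write lock was acquired within
     * waitMillis, false if the attempt timed out.
     */
    public static boolean closeAcquiresWhileQueryRuns(long waitMillis)
            throws InterruptedException {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        CountDownLatch readerHoldsLock = new CountDownLatch(1);
        CountDownLatch releaseReader = new CountDownLatch(1);

        // Stand-in for the step 3 query: takes the read lock and then waits,
        // much like a client stuck in connection retries.
        Thread query = new Thread(() -> {
            lock.readLock().lock();
            try {
                readerHoldsLock.countDown();
                releaseReader.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.readLock().unlock();
            }
        }, "stuck-query");
        query.setDaemon(true);
        query.start();
        readerHoldsLock.await();

        // Stand-in for the shutdown hook: a bounded attempt at the write lock.
        // An unbounded lock() here would block until the reader lets go.
        boolean acquired = lock.writeLock().tryLock(waitMillis, TimeUnit.MILLISECONDS);
        if (acquired) {
            lock.writeLock().unlock();
        }

        releaseReader.countDown(); // let the simulated query finish
        query.join();
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        // prints "close acquired lock: false"
        System.out.println("close acquired lock: " + closeAcquiresWhileQueryRuns(200));
    }
}
```

With an unbounded {{lock()}} in place of {{tryLock}}, the closing thread would wait for as long as the reader holds the lock, which matches the behaviour seen after SIGTERM.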
While the system will eventually fix itself, it's a bit awkward to send SIGTERM
to a process and not have it die within a few seconds. The code around the
shutdown hook registration also suggests that blocking there is unintentional.
A simple fix is to wrap the PhoenixDriver closing in a timeout so that we don't
rely on the HBase timeout to exit the JVM.
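One way the proposed timeout could look is sketched below. This is an illustration under assumptions, not the actual patch: {{BoundedShutdown}} and {{closeWithTimeout}} are hypothetical names, the 5 second limit is arbitrary, and the {{Runnable}} stands in for whatever closes the PhoenixDriver. The close runs on a daemon thread so that a stuck close cannot itself keep the JVM alive.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedShutdown {

    /**
     * Runs closeAction on a daemon thread and waits at most timeoutMillis.
     * Returns true if the action completed normally in time, false if it
     * timed out, threw, or the waiting thread was interrupted.
     */
    public static boolean closeWithTimeout(Runnable closeAction, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-close");
            t.setDaemon(true); // a stuck close must not block JVM exit
            return t;
        });
        Future<?> future = executor.submit(closeAction);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // best effort: interrupt the close attempt
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return false;
        } catch (ExecutionException e) {
            return false; // the close action itself failed
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            // Hypothetical stand-in for closing the driver instance.
            Runnable closeDriver = () -> { /* e.g. driver.close() */ };
            boolean closed = closeWithTimeout(closeDriver, 5000);
            if (!closed) {
                System.err.println("Driver close did not finish in time; exiting anyway");
            }
        }, "shutdown-hook"));
    }
}
```

The trade-off is that a timed-out close leaves resources to be reclaimed by process exit rather than released cleanly, which seems acceptable when the process has already been asked to terminate.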