David Manning created HBASE-28663:
-------------------------------------

             Summary: CanaryTool continues executing and scanning after timeout
                 Key: HBASE-28663
                 URL: https://issues.apache.org/jira/browse/HBASE-28663
             Project: HBase
          Issue Type: Bug
          Components: canary
    Affects Versions: 2.0.0, 3.0.0
            Reporter: David Manning
            Assignee: David Manning

If you run the {{CanaryTool}} in region mode until it reaches the configured timeout, the logs and sink results show that it can continue executing and scanning for up to 10 seconds after the timeout. This happens because the RegionTasks have already been submitted to an ExecutorService, which continues executing them after the timeout, and the Monitor continues executing on a separate thread.

The 10 seconds is seen in HBase 2.x, at least, because {{runMonitor}} closes the {{Connection}}. That process ([code|https://github.com/apache/hbase/blob/e865c852c0e9a1e9b55b9d1512d379072d3e7a7b/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/CanaryTool.java#L1054-L1094]) leads to {{ConnectionImplementation#close}} ([code|https://github.com/apache/hbase/blob/e865c852c0e9a1e9b55b9d1512d379072d3e7a7b/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java#L2272-L2300]), and inside {{shutdownPools}} we potentially wait the full 10 seconds of {{awaitTermination}} if client operations are still in progress.

The scenario can be improved by simply interrupting the monitor thread. At timeout we will often be inside an {{invokeAll}} call in a {{sniff}} method, so the interrupt propagates to the client threads, which then generally shut down properly and promptly. However, we could be more robust by also watching for a shutdown signal in the various tasks, such as {{RegionTask}}.
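
The two mitigations described above can be illustrated with a minimal, standalone sketch. It is not the CanaryTool code or the eventual patch. {{CanaryTimeoutSketch}}, {{FakeRegionTask}}, {{runWithTimeout}} and the {{stopped}} flag are hypothetical names, and {{Thread.sleep}} stands in for a blocking scan. It relies only on the standard {{java.util.concurrent}} behaviour that a caller interrupted while blocked in {{invokeAll}} has its unfinished tasks cancelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

final class CanaryTimeoutSketch {

  /** Hypothetical stand-in for RegionTask: checks a shared stop flag before doing work. */
  static final class FakeRegionTask implements Callable<Void> {
    private final AtomicBoolean stopped;
    private final AtomicInteger scansStarted;
    private final long workMillis;

    FakeRegionTask(AtomicBoolean stopped, AtomicInteger scansStarted, long workMillis) {
      this.stopped = stopped;
      this.scansStarted = scansStarted;
      this.workMillis = workMillis;
    }

    @Override
    public Void call() throws InterruptedException {
      if (stopped.get()) {
        return null; // shutdown signal seen: skip the scan entirely
      }
      scansStarted.incrementAndGet();
      Thread.sleep(workMillis); // simulated blocking scan; wakes up early on interrupt
      return null;
    }
  }

  /** Runs the fake tasks under a timeout and returns how many scans were ever started. */
  static int runWithTimeout(int tasks, int threads, long workMillis, long timeoutMillis)
      throws InterruptedException {
    AtomicBoolean stopped = new AtomicBoolean(false);
    AtomicInteger scansStarted = new AtomicInteger();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    List<Callable<Void>> work = new ArrayList<>();
    for (int i = 0; i < tasks; i++) {
      work.add(new FakeRegionTask(stopped, scansStarted, workMillis));
    }

    // The "monitor" thread blocks in invokeAll, as a sniff method would.
    Thread monitor = new Thread(() -> {
      try {
        pool.invokeAll(work);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // invokeAll has already cancelled unfinished tasks
      }
    }, "monitor");
    monitor.start();

    monitor.join(timeoutMillis); // wait for completion or for the timeout
    if (monitor.isAlive()) {
      stopped.set(true);   // shutdown signal that tasks can poll
      monitor.interrupt(); // unblocks invokeAll, which cancels the remaining tasks
    }
    monitor.join();
    pool.shutdownNow();
    pool.awaitTermination(5, TimeUnit.SECONDS);
    return scansStarted.get();
  }

  public static void main(String[] args) throws InterruptedException {
    int started = runWithTimeout(100, 2, 200, 300);
    System.out.println("scans started before shutdown: " + started + " of 100");
  }
}
```

In this sketch the interrupt alone stops queued tasks from running. The flag additionally covers any task that begins executing without having been cancelled, which is the extra robustness suggested for {{RegionTask}}.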