[ https://issues.apache.org/jira/browse/HBASE-25447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255841#comment-17255841 ]
Pankaj Kumar commented on HBASE-25447:
--------------------------------------

Discussed with [~Bo Cui] offline: the region was stuck in RIT until an HMaster failover occurred, because remoteProc was suspended after an OOM (unable to create new native thread) while dispatching the proc. Other chore services such as CJ and QuotaObserverChore also failed with OOM. Although this is an environment problem, it is better to abort the HMaster so that a healthy standby master can take over the HBase cluster operations once it becomes active.

> remoteProc is suspended due to OOM ERROR
> ----------------------------------------
>
>                 Key: HBASE-25447
>                 URL: https://issues.apache.org/jira/browse/HBASE-25447
>             Project: HBase
>          Issue Type: Bug
>          Components: proc-v2
>    Affects Versions: 3.0.0-alpha-1, 2.2.3
>            Reporter: Bo Cui
>            Assignee: Bo Cui
>            Priority: Major
>         Attachments: image-2020-12-26-11-49-38-018.png
>
>
> https://github.com/apache/hbase/blob/0f868da05d7ffabe4512a0cae110ed097b033ebf/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/RemoteProcedureDispatcher.java#L317
> If a resource leak occurs because of other components or for other reasons,
> BufferNode#dispatch() may fail. TimeoutExecutorThread then exits its
> while (running.get()) loop, and some procs get stuck.
> !image-2020-12-26-11-49-38-018.png!
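
For illustration only, here is a minimal sketch of the failure mode and the suggested reaction described in this thread: a worker thread with a while (running.get()) loop that treats any Throwable from a task (including an OutOfMemoryError) as fatal and hands it to an abort hook instead of dying silently. All names here (DispatchLoopSketch, Worker, abortHook, demo) are hypothetical and are not taken from the HBase code base. The hook merely stands in for whatever mechanism would actually abort the master, and a real OOM may leave too few resources for such a handler to run reliably.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical sketch only: none of these names come from HBase.
public class DispatchLoopSketch {

  // A worker whose loop has the same shape as `while (running.get())`.
  static final class Worker extends Thread {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final AtomicBoolean running = new AtomicBoolean(true);
    private final Consumer<Throwable> abortHook; // stand-in for "abort the master"

    Worker(Consumer<Throwable> abortHook) {
      super("dispatch-sketch");
      this.abortHook = abortHook;
      setDaemon(true);
    }

    void submit(Runnable task) {
      queue.add(task);
    }

    @Override
    public void run() {
      while (running.get()) {
        try {
          queue.take().run();
        } catch (InterruptedException e) {
          // Normal shutdown path: restore the flag and leave the loop.
          Thread.currentThread().interrupt();
          return;
        } catch (Throwable t) {
          // Without this branch an Error would kill the thread silently and
          // queued work would never be dispatched. Here it is reported instead.
          running.set(false);
          abortHook.accept(t);
        }
      }
    }
  }

  // Submits one task that simulates the OOM and returns what the hook received.
  static Throwable demo() {
    AtomicReference<Throwable> seen = new AtomicReference<>();
    CountDownLatch aborted = new CountDownLatch(1);
    Worker worker = new Worker(t -> {
      seen.set(t);
      aborted.countDown();
    });
    worker.start();
    worker.submit(() -> {
      throw new OutOfMemoryError("unable to create new native thread");
    });
    try {
      aborted.await(5, TimeUnit.SECONDS);
      worker.join(5000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return seen.get();
  }

  public static void main(String[] args) {
    System.out.println("abort hook received: " + demo());
  }
}
```

The point of the sketch is that the failure becomes visible to the process (so a standby can take over) rather than leaving a dead dispatcher thread behind with procedures still waiting on it.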