rmdmattingly opened a new pull request, #5461: URL: https://github.com/apache/hbase/pull/5461
On 2.x [the ServerManager registers admins in a HashMap](https://github.com/apache/hbase/blob/branch-2/hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java). This can result in thread safety issues: we recently observed an exception which caused a region to be indefinitely stuck in transition until we could manually intervene.

We saw the following exception in the HMaster logs:

```
2023-10-11 02:20:05.213 [RSProcedureDispatcher-pool-325] ERROR org.apache.hadoop.hbase.master.procedure.RSProcedureDispatcher: Unexpected error caught, this may cause the procedure to hang forever
java.lang.ClassCastException: class java.util.HashMap$Node cannot be cast to class java.util.HashMap$TreeNode (java.util.HashMap$Node and java.util.HashMap$TreeNode are in module java.base of loader 'bootstrap')
	at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1900) ~[?:?]
	at java.util.HashMap$TreeNode.treeify(HashMap.java:2016) ~[?:?]
	at java.util.HashMap.treeifyBin(HashMap.java:768) ~[?:?]
	at java.util.HashMap.putVal(HashMap.java:640) ~[?:?]
	at java.util.HashMap.put(HashMap.java:608) ~[?:?]
	at org.apache.hadoop.hbase.master.ServerManager.getRsAdmin(ServerManager.java:723)
```

cc @bbeaudreault @hgromer @eab148 @bozzkar
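For readers less familiar with this failure mode: the trace shows `HashMap.put` being called from an `RSProcedureDispatcher` pool thread, and `java.util.HashMap` makes no guarantees when several threads modify it without external synchronization. One common remedy is to back such a cache with a `ConcurrentHashMap` and populate it via `computeIfAbsent`. The sketch below illustrates that general pattern only. It is not HBase code and not necessarily the change made in this PR; `AdminRegistry`, `AdminStub`, and `getAdmin` are hypothetical names invented for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a per-server admin cache; not HBase code. */
class AdminRegistry {

  /** Hypothetical placeholder for whatever stub object would be cached. */
  static final class AdminStub {
    final String serverName;

    AdminStub(String serverName) {
      this.serverName = serverName;
    }
  }

  // ConcurrentHashMap supports concurrent reads and writes without
  // external locking, unlike a plain HashMap.
  private final ConcurrentMap<String, AdminStub> admins = new ConcurrentHashMap<>();

  // Counts how many stubs were actually constructed (for the demo only).
  private final AtomicInteger created = new AtomicInteger();

  AdminStub getAdmin(String serverName) {
    // computeIfAbsent is atomic per key: the factory runs at most once
    // for a given key, even when many threads ask for it at the same time.
    return admins.computeIfAbsent(serverName, name -> {
      created.incrementAndGet();
      return new AdminStub(name);
    });
  }

  int size() {
    return admins.size();
  }

  int createdCount() {
    return created.get();
  }

  public static void main(String[] args) throws InterruptedException {
    AdminRegistry registry = new AdminRegistry();
    int threads = 8;
    int keys = 1000;
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    CountDownLatch start = new CountDownLatch(1);

    for (int t = 0; t < threads; t++) {
      pool.submit(() -> {
        try {
          start.await(); // release all workers together to maximize contention
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        for (int i = 0; i < keys; i++) {
          registry.getAdmin("rs-" + i);
        }
      });
    }

    start.countDown();
    pool.shutdown();
    pool.awaitTermination(30, TimeUnit.SECONDS);

    System.out.println(registry.size() + " entries, "
        + registry.createdCount() + " stubs created");
  }
}
```

In this sketch every worker requests the same 1000 keys, so the registry should end up with one stub per key regardless of interleaving. Running the same loop against a plain `HashMap` with a check-then-put is not guaranteed to fail on any given run, which is part of what makes this class of bug hard to reproduce.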
