Duo Zhang created HBASE-29282:
---------------------------------

             Summary: Regions are left in CLOSED state after merging
                 Key: HBASE-29282
                 URL: https://issues.apache.org/jira/browse/HBASE-29282
             Project: HBase
          Issue Type: Bug
          Components: proc-v2, Region Assignment
            Reporter: Duo Zhang


When running ITBLL, some regions were left in CLOSED state for a long time
and were eventually cleaned up by CatalogJanitor.

After checking, these regions had been merged, so they should have been
removed from hbase:meta, but it seems they were still present in the
hbase:meta table in CLOSED state.

Need to dig more.

{noformat}
2025-05-01T00:08:32,903 INFO  [PEWorker-15] procedure2.ProcedureExecutor: Finished pid=3512, state=SUCCESS, hasLock=false; MergeTableRegionsProcedure table=IntegrationTestBigLinkedList, regions=[6a98dc86a491041b8d3ac584ac73c0a0, c9f07f77792feb0d8a845d6d9751f048], force=false in 734 msec
2025-05-01T00:11:26,333 WARN  [master/meta02:16000.Chore.1] janitor.CatalogJanitor: overlap=IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626716.6a98dc86a491041b8d3ac584ac73c0a0./IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626717.24435a6eefc045cf36ddff9a30409ff1., overlap=IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626717.24435a6eefc045cf36ddff9a30409ff1./IntegrationTestBigLinkedList,\xA2!RV,1746028626716.c9f07f77792feb0d8a845d6d9751f048.
2025-05-01T00:41:40,856 WARN  [master/meta02:16000.Chore.1] janitor.CatalogJanitor: 283c738f170f361157b470868f6ad89., overlap=IntegrationTestBigLinkedList,\x91\x10\xA3\x07\x03\xAC\xC7\xC3\xCCY\xAE\xE4!1\xD1i,1746029042178.815020ca73a2679bc0c0a298e4dddfda./IntegrationTestBigLinkedList,\x91\x10\xA3\x07\x03\xAC\xC7\xC3\xCCY\xAE\xE4!1\xD1i,1746029042179.278a2eeee359488f859ac5334ee3cde0., overlap=IntegrationTestBigLinkedList,\x91\x10\xA3\x07\x03\xAC\xC7\xC3\xCCY\xAE\xE4!1\xD1i,1746029042179.278a2eeee359488f859ac5334ee3cde0./IntegrationTestBigLinkedList,\x95U\x0D9}\xAB\xE1\x98\x80w\xED\xA7+\xF9\xA4\xED,1746029042178.b64120d20856552cd7d154b63bd2ce81., overlap=IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626716.6a98dc86a491041b8d3ac584ac73c0a0./IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626717.24435a6eefc045cf36ddff9a30409ff1., overlap=IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626717.24435a6eefc045cf36ddff9a30409ff1./IntegrationTestBigLinkedList,\xA2!RV,1746028626716.c9f07f77792feb0d8a845d6d9751f048.
2025-05-01T00:42:00,853 INFO  [PEWorker-12] procedure.FlushRegionProcedure: State of region {ENCODED => 6a98dc86a491041b8d3ac584ac73c0a0, NAME => 'IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626716.6a98dc86a491041b8d3ac584ac73c0a0.', STARTKEY => '\x99\x99\x99\x99\x99\x99\x99\x99', ENDKEY => '\xA2!RV'} is not OPEN or in transition. Skip pid=5810, ppid=5789, state=RUNNABLE, hasLock=true; org.apache.hadoop.hbase.master.procedure.FlushRegionProcedure ...
2025-05-01T00:44:32,339 INFO  [PEWorker-3] procedure.MasterProcedureScheduler: Took xlock for pid=5964, ppid=5943, state=RUNNABLE, hasLock=false; SnapshotRegionProcedure 6a98dc86a491041b8d3ac584ac73c0a0
2025-05-01T00:44:32,340 WARN  [PEWorker-3] procedure.SnapshotRegionProcedure: pid=5964, ppid=5943, state=RUNNABLE, hasLock=true; SnapshotRegionProcedure 6a98dc86a491041b8d3ac584ac73c0a0 can not run currently because region state of IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626716.6a98dc86a491041b8d3ac584ac73c0a0. is CLOSED, wait 1000 ms to retry
{noformat}
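
To cross-check a master log for this symptom, something like the following
sketch might help. It is not part of HBase; the function names and regular
expressions are hypothetical and are modelled only on the log lines quoted
above. It collects the parents of finished MergeTableRegionsProcedure entries
and intersects them with regions later reported as "is CLOSED". It ignores
timestamps, so it only flags candidates for a closer look.

```python
import re

# Assumption: patterns mirror the log lines quoted in this issue; other
# HBase versions may format these messages differently.
MERGE_RE = re.compile(
    r"state=SUCCESS[^;]*; MergeTableRegionsProcedure "
    r"table=([^,\s]+), regions=\[([^\]]*)\]"
)
CLOSED_RE = re.compile(r"\.([0-9a-f]{32})\.\s+is CLOSED")


def _normalize(log_text):
    # Collapse all whitespace so entries that were hard-wrapped
    # (for example by a mail client) still match.
    return " ".join(log_text.split())


def merged_parents(log_text):
    """Encoded names of regions listed in successful merge procedures."""
    parents = set()
    for match in MERGE_RE.finditer(_normalize(log_text)):
        for name in match.group(2).split(","):
            if name.strip():
                parents.add(name.strip())
    return parents


def closed_regions(log_text):
    """Encoded names of regions that some log entry reports as CLOSED."""
    return set(CLOSED_RE.findall(_normalize(log_text)))


def stale_merge_parents(log_text):
    """Merge parents that are also reported as CLOSED somewhere in the log."""
    return sorted(merged_parents(log_text) & closed_regions(log_text))
```

Run against the excerpt above, stale_merge_parents should report
6a98dc86a491041b8d3ac584ac73c0a0, the merge parent named in the
SnapshotRegionProcedure warning.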

{noformat}
2025-05-01 00:27:59,824 WARN [RPCClient-NioEventLoopGroup-1-2] org.apache.hadoop.hbase.client.AsyncNonMetaRegionLocator: Failed to locate region in 'IntegrationTestBigLinkedList', row='\xA6\x8B\x9E\xC1\xA98&K}g+7N/\xA1\x05', locateType=CURRENT
org.apache.hadoop.hbase.HBaseIOException: No location found for 'IntegrationTestBigLinkedList', row='\xA6\x8B\x9E\xC1\xA98&K}g+7N/\xA1\x05', locateType=CURRENT
        at org.apache.hadoop.hbase.client.AsyncNonMetaRegionLocator.onScanNext(AsyncNonMetaRegionLocator.java:322)
        at org.apache.hadoop.hbase.client.AsyncNonMetaRegionLocator$1.onNext(AsyncNonMetaRegionLocator.java:437)
        at org.apache.hadoop.hbase.client.AsyncScanSingleRegionRpcRetryingCaller.onComplete(AsyncScanSingleRegionRpcRetryingCaller.java:535)
        at org.apache.hadoop.hbase.client.AsyncScanSingleRegionRpcRetryingCaller.start(AsyncScanSingleRegionRpcRetryingCaller.java:636)
        at org.apache.hadoop.hbase.client.AsyncRpcRetryingCallerFactory$ScanSingleRegionCallerBuilder.start(AsyncRpcRetryingCallerFactory.java:322)
        at org.apache.hadoop.hbase.client.AsyncClientScanner.startScan(AsyncClientScanner.java:208)
        at org.apache.hadoop.hbase.client.AsyncClientScanner.lambda$openScanner$2(AsyncClientScanner.java:268)
        at org.apache.hadoop.hbase.util.FutureUtils.lambda$addListener$0(FutureUtils.java:71)
        at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
        at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
        at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
        at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
        at org.apache.hadoop.hbase.client.AsyncSingleRequestRpcRetryingCaller.lambda$call$4(AsyncSingleRequestRpcRetryingCaller.java:92)
        at org.apache.hadoop.hbase.util.FutureUtils.lambda$addListener$0(FutureUtils.java:71)
        at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
        at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
        at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
        at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
        at org.apache.hadoop.hbase.client.AsyncClientScanner.lambda$callOpenScanner$0(AsyncClientScanner.java:187)
        at org.apache.hbase.thirdparty.com.google.protobuf.RpcUtil$1.run(RpcUtil.java:56)
        at org.apache.hbase.thirdparty.com.google.protobuf.RpcUtil$1.run(RpcUtil.java:47)
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:400)
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:430)
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:425)
        at org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:117)
        at org.apache.hadoop.hbase.ipc.Call.setResponse(Call.java:149)
        at org.apache.hadoop.hbase.ipc.RpcConnection.finishCall(RpcConnection.java:396)
        at org.apache.hadoop.hbase.ipc.RpcConnection.readResponse(RpcConnection.java:461)
        at org.apache.hadoop.hbase.ipc.NettyRpcDuplexHandler.readResponse(NettyRpcDuplexHandler.java:125)
        at org.apache.hadoop.hbase.ipc.NettyRpcDuplexHandler.channelRead(NettyRpcDuplexHandler.java:140)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346)
        at org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at org.apache.hbase.thirdparty.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:289)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1357)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:868)
        at org.apache.hbase.thirdparty.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
        at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
        at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
        at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
        at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
        at org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
        at org.apache.hbase.thirdparty.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at org.apache.hbase.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:840)
{noformat}


