[
https://issues.apache.org/jira/browse/HBASE-29282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Duo Zhang resolved HBASE-29282.
-------------------------------
Fix Version/s: 2.6.3
2.5.12
Hadoop Flags: Reviewed
Resolution: Fixed
Pushed to all active branches.
Thanks [~reidchan] for reviewing!
> Regions are left in CLOSED state after merging
> ----------------------------------------------
>
> Key: HBASE-29282
> URL: https://issues.apache.org/jira/browse/HBASE-29282
> Project: HBase
> Issue Type: Bug
> Components: proc-v2, Region Assignment
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Critical
> Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.3, 2.5.12
>
>
> When running ITBLL, some regions are left in CLOSED state for a long time and
> finally were cleaned up by CatalogJanitor.
> After checking, the regions are merged, which should have been removed in
> hbase:meta, but seems they were still present in hbase:meta table with CLOSED
> state.
> Need to dig more.
> {noformat}
> 2025-05-01T00:08:32,903 INFO [PEWorker-15] procedure2.ProcedureExecutor:
> Finished pid=3512, state=SUCCESS, hasLock=false; MergeTableRegionsProcedure
> table=IntegrationTestBigLinkedList,
> regions=[6a98dc86a491041b8d3ac584ac73c0a0, c9f07f77792feb0d8a845d6d9751f048],
> force=false in 734 msec
> 2025-05-01T00:11:26,333 WARN [master/meta02:16000.Chore.1]
> janitor.CatalogJanitor:
> overlap=IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626716.6a98dc86a491041b8d3ac584ac73c0a0./IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626717.24435a6eefc045cf36ddff9a30409ff1.,
>
> overlap=IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626717.24435a6eefc045cf36ddff9a30409ff1./IntegrationTestBigLinkedList,\xA2!RV,1746028626716.c9f07f77792feb0d8a845d6d9751f048.
> 2025-05-01T00:41:40,856 WARN [master/meta02:16000.Chore.1]
> janitor.CatalogJanitor: 283c738f170f361157b470868f6ad89.,
> overlap=IntegrationTestBigLinkedList,\x91\x10\xA3\x07\x03\xAC\xC7\xC3\xCCY\xAE\xE4!1\xD1i,1746029042178.815020ca73a2679bc0c0a298e4dddfda./IntegrationTestBigLinkedList,\x91\x10\xA3\x07\x03\xAC\xC7\xC3\xCCY\xAE\xE4!1\xD1i,1746029042179.278a2eeee359488f859ac5334ee3cde0.,
>
> overlap=IntegrationTestBigLinkedList,\x91\x10\xA3\x07\x03\xAC\xC7\xC3\xCCY\xAE\xE4!1\xD1i,1746029042179.278a2eeee359488f859ac5334ee3cde0./IntegrationTestBigLinkedList,\x95U\x0D9}\xAB\xE1\x98\x80w\xED\xA7+\xF9\xA4\xED,1746029042178.b64120d20856552cd7d154b63bd2ce81.,
>
> overlap=IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626716.6a98dc86a491041b8d3ac584ac73c0a0./IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626717.24435a6eefc045cf36ddff9a30409ff1.,
>
> overlap=IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626717.24435a6eefc045cf36ddff9a30409ff1./IntegrationTestBigLinkedList,\xA2!RV,1746028626716.c9f07f77792feb0d8a845d6d9751f048.
> 2025-05-01T00:42:00,853 INFO [PEWorker-12] procedure.FlushRegionProcedure:
> State of region {ENCODED => 6a98dc86a491041b8d3ac584ac73c0a0, NAME =>
> 'IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626716.6a98dc86a491041b8d3ac584ac73c0a0.',
> STARTKEY => '\x99\x99\x99\x99\x99\x99\x99\x99', ENDKEY => '\xA2!RV'} is not
> OPEN or in transition. Skip pid=5810, ppid=5789, state=RUNNABLE,
> hasLock=true; org.apache.hadoop.hbase.master.procedure.FlushRegionProcedure
> ...
> 2025-05-01T00:44:32,339 INFO [PEWorker-3]
> procedure.MasterProcedureScheduler: Took xlock for pid=5964, ppid=5943,
> state=RUNNABLE, hasLock=false; SnapshotRegionProcedure
> 6a98dc86a491041b8d3ac584ac73c0a0
> 2025-05-01T00:44:32,340 WARN [PEWorker-3] procedure.SnapshotRegionProcedure:
> pid=5964, ppid=5943, state=RUNNABLE, hasLock=true; SnapshotRegionProcedure
> 6a98dc86a491041b8d3ac584ac73c0a0 can not run currently because region state
> of
> IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626716.6a98dc86a491041b8d3ac584ac73c0a0.
> is CLOSED, wait 1000 ms to retry
> {noformat}
> {noformat}
> 2025-05-01 00:27:59,824 WARN [RPCClient-NioEventLoopGroup-1-2]
> org.apache.hadoop.hbase.client.AsyncNonMetaRegionLocator: Failed to locate
> region in 'IntegrationTestBigLinkedList',
> row='\xA6\x8B\x9E\xC1\xA98&K}g+7N/\xA1\x05', locateType=CURRENT
> org.apache.hadoop.hbase.HBaseIOException: No location found for
> 'IntegrationTestBigLinkedList', row='\xA6\x8B\x9E\xC1\xA98&K}g+7N/\xA1\x05',
> locateType=CURRENT
> at
> org.apache.hadoop.hbase.client.AsyncNonMetaRegionLocator.onScanNext(AsyncNonMetaRegionLocator.java:322)
> at
> org.apache.hadoop.hbase.client.AsyncNonMetaRegionLocator$1.onNext(AsyncNonMetaRegionLocator.java:437)
> at
> org.apache.hadoop.hbase.client.AsyncScanSingleRegionRpcRetryingCaller.onComplete(AsyncScanSingleRegionRpcRetryingCaller.java:535)
> at
> org.apache.hadoop.hbase.client.AsyncScanSingleRegionRpcRetryingCaller.start(AsyncScanSingleRegionRpcRetryingCaller.java:636)
> at
> org.apache.hadoop.hbase.client.AsyncRpcRetryingCallerFactory$ScanSingleRegionCallerBuilder.start(AsyncRpcRetryingCallerFactory.java:322)
> at
> org.apache.hadoop.hbase.client.AsyncClientScanner.startScan(AsyncClientScanner.java:208)
> at
> org.apache.hadoop.hbase.client.AsyncClientScanner.lambda$openScanner$2(AsyncClientScanner.java:268)
> at
> org.apache.hadoop.hbase.util.FutureUtils.lambda$addListener$0(FutureUtils.java:71)
> at
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
> at
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
> at
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
> at
> java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
> at
> org.apache.hadoop.hbase.client.AsyncSingleRequestRpcRetryingCaller.lambda$call$4(AsyncSingleRequestRpcRetryingCaller.java:92)
> at
> org.apache.hadoop.hbase.util.FutureUtils.lambda$addListener$0(FutureUtils.java:71)
> at
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
> at
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
> at
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
> at
> java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
> at
> org.apache.hadoop.hbase.client.AsyncClientScanner.lambda$callOpenScanner$0(AsyncClientScanner.java:187)
> at
> org.apache.hbase.thirdparty.com.google.protobuf.RpcUtil$1.run(RpcUtil.java:56)
> at
> org.apache.hbase.thirdparty.com.google.protobuf.RpcUtil$1.run(RpcUtil.java:47)
> at
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:400)
> at
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:430)
> at
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:425)
> at org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:117)
> at org.apache.hadoop.hbase.ipc.Call.setResponse(Call.java:149)
> at
> org.apache.hadoop.hbase.ipc.RpcConnection.finishCall(RpcConnection.java:396)
> at
> org.apache.hadoop.hbase.ipc.RpcConnection.readResponse(RpcConnection.java:461)
> at
> org.apache.hadoop.hbase.ipc.NettyRpcDuplexHandler.readResponse(NettyRpcDuplexHandler.java:125)
> at
> org.apache.hadoop.hbase.ipc.NettyRpcDuplexHandler.channelRead(NettyRpcDuplexHandler.java:140)
> at
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
> at
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
> at
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
> at
> org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346)
> at
> org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318)
> at
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
> at
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
> at
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
> at
> org.apache.hbase.thirdparty.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:289)
> at
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
> at
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
> at
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
> at
> org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1357)
> at
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
> at
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
> at
> org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:868)
> at
> org.apache.hbase.thirdparty.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
> at
> org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
> at
> org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
> at
> org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
> at
> org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
> at
> org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
> at
> org.apache.hbase.thirdparty.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> at
> org.apache.hbase.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:840)
> {noformat}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)