[jira] [Updated] (HBASE-15136) Explore different queuing behaviors while busy
[ https://issues.apache.org/jira/browse/HBASE-15136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-15136: -- Release Note: Previously the RPC request scheduler in HBase had 2 modes it could operate in: - simple FIFO - "partial" deadline, where deadline constraints are only imposed on long-running scan requests. This patch adds a new type of scheduler to HBase, based on the research around the controlled delay (CoDel) algorithm [1], used in networking to combat bufferbloat, as well as some analysis on generalizing it to generic request queues [2]. The purpose of that work is to prevent long-standing call queues caused by a discrepancy between request rate and available throughput due to kernel/disk IO/networking stalls. The new RPC scheduler can be enabled by setting hbase.ipc.server.callqueue.type=codel in configuration. Several additional params allow configuring the algorithm behavior - hbase.ipc.server.callqueue.codel.target.delay hbase.ipc.server.callqueue.codel.interval hbase.ipc.server.callqueue.codel.lifo.threshold [1] Controlling Queue Delay / A modern AQM is just one piece of the solution to bufferbloat. http://queue.acm.org/detail.cfm?id=2209336 [2] Fail at Scale / Reliability in the face of rapid change. http://queue.acm.org/detail.cfm?id=2839461 was: Previously the RPC request scheduler in HBase had 2 modes it could operate in: - simple FIFO - "partial" deadline, where deadline constraints are only imposed on long-running scan requests. This patch adds a new type of scheduler to HBase, based on the research around the controlled delay (CoDel) algorithm [1], used in networking to combat bufferbloat, as well as some analysis on generalizing it to generic request queues [2]. The purpose of that work is to prevent long-standing call queues caused by a discrepancy between request rate and available throughput due to kernel/disk IO/networking stalls. The new RPC scheduler can be enabled by setting hbase.ipc.server.callqueue.type=codel in configuration. Several additional params allow configuring the algorithm behavior - hbase.ipc.server.callqueue.codel.target.delay hbase.ipc.server.callqueue.codel.interval hbase.ipc.server.callqueue.codel.lifo.threshold [1] Controlling Queue Delay / A modern AQM is just one piece of the solution to bufferbloat. http://queue.acm.org/detail.cfm?id=2209336 [2] Fail at Scale / Reliability in the face of rapid change. http://queue.acm.org/detail.cfm?id=2839461 > Explore different queuing behaviors while busy > -- > > Key: HBASE-15136 > URL: https://issues.apache.org/jira/browse/HBASE-15136 > Project: HBase > Issue Type: New Feature > Components: IPC/RPC, Scheduler >Reporter: Elliott Neil Clark >Assignee: Mikhail Antonov >Priority: Critical > Fix For: 1.3.0, 2.0.0 > > Attachments: HBASE-15136-1.2.v1.patch, HBASE-15136-v2.patch, > deadline_scheduler_v_0_2.patch > > > http://queue.acm.org/detail.cfm?id=2839461 -- This message was sent by Atlassian Jira (v8.20.10#820010)
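For reference, a minimal sketch of enabling and tuning the scheduler programmatically. The numeric values below are illustrative assumptions, not documented defaults.
{code:java}
// Sketch only: enable the CoDel-based call queue and set its tuning knobs.
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.ipc.server.callqueue.type", "codel");
// Target queueing delay and measurement interval, in milliseconds (assumed units/values).
conf.setInt("hbase.ipc.server.callqueue.codel.target.delay", 100);
conf.setInt("hbase.ipc.server.callqueue.codel.interval", 100);
// Queue-fullness fraction above which the queue switches to LIFO ordering (assumed semantics).
conf.setDouble("hbase.ipc.server.callqueue.codel.lifo.threshold", 0.8);
{code}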
[jira] [Updated] (HBASE-20846) Restore procedure locks when master restarts
[ https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-20846: -- Description: Found this one when investigating a ModifyTableProcedure that got stuck while there was a MoveRegionProcedure going on after a master restart. Though this issue can be solved by HBASE-20752, I discovered something else. Before a MoveRegionProcedure can execute, it will hold the table's shared lock. So, when an UnassignProcedure is spawned, it will not check the table's shared lock since it is sure that its parent (MoveRegionProcedure) has acquired the table's lock. {code:java} // If there is parent procedure, it would have already taken xlock, so no need to take // shared lock here. Otherwise, take shared lock. if (!procedure.hasParent() && waitTableQueueSharedLock(procedure, table) == null) { return true; } {code} But that is not the case when the master is restarted. The child procedure (UnassignProcedure) will be executed first after the restart. Though it has a parent (MoveRegionProcedure), apparently the parent does not hold the table's lock. So, since it began to execute without holding the table's shared lock, a ModifyTableProcedure can acquire the table's exclusive lock and execute at the same time, which is not possible if the master was not restarted. This would cause a hang before HBASE-20752; but since HBASE-20752 has been fixed, I wrote a simple UT to repro this case. I think we don't have to check the parent for the table's shared lock. It is a shared lock, right? I think we can acquire it every time we need it. was: Found this one when investigating a ModifyTableProcedure that got stuck while there was a MoveRegionProcedure going on after a master restart. Though this issue can be solved by HBASE-20752, I discovered something else. Before a MoveRegionProcedure can execute, it will hold the table's shared lock. So, when an UnassignProcedure is spawned, it will not check the table's shared lock since it is sure that its parent (MoveRegionProcedure) has acquired the table's lock. {code:java} // If there is parent procedure, it would have already taken xlock, so no need to take // shared lock here. Otherwise, take shared lock. if (!procedure.hasParent() && waitTableQueueSharedLock(procedure, table) == null) { return true; } {code} But that is not the case when the master is restarted. The child procedure (UnassignProcedure) will be executed first after the restart. Though it has a parent (MoveRegionProcedure), apparently the parent does not hold the table's lock. So, since it began to execute without holding the table's shared lock, a ModifyTableProcedure can acquire the table's exclusive lock and execute at the same time, which is not possible if the master was not restarted. This would cause a hang before HBASE-20752; but since HBASE-20752 has been fixed, I wrote a simple UT to repro this case. I think we don't have to check the parent for the table's shared lock. It is a shared lock, right? I think we can acquire it every time we need it. 
> Restore procedure locks when master restarts > > > Key: HBASE-20846 > URL: https://issues.apache.org/jira/browse/HBASE-20846 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.1.0 >Reporter: Allan Yang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0-alpha-1, 2.2.0, 2.1.1 > > Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, > HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, > HBASE-20846-v4.patch, HBASE-20846-v5.patch, HBASE-20846-v6.patch, > HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, > HBASE-20846.patch > > > Found this one when investigating ModifyTableProcedure got stuck while there > was a MoveRegionProcedure going on after master restart. > Though this issue can be solved by HBASE-20752. But I discovered something > else. > Before a MoveRegionProcedure can execute, it will hold the table's shared > lock. so,, when a UnassignProcedure was spwaned, it will not check the > table's shared lock since it is sure that its parent(MoveRegionProcedure) has > aquired the table's lock. > {code:java} > // If there is parent procedure, it would have already taken xlock, so no > need to take > // shared lock here. Otherwise, take shared lock. > if (!procedure.hasParent() > && waitTableQueueSharedLock(procedure, table) == null) { > return true; > } > {code} > But, it is not the case when Master was restarted. The child > procedure(UnassignProcedure) will be executed first after restart. Though it > has a parent(MoveRegionProcedure), but apprently the parent didn't hold the > table's lock. > So, since it began to execute without hold
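A minimal sketch of the change proposed in the description, i.e. always taking the table's shared lock instead of skipping it when the procedure has a parent. This mirrors the proposal only; the committed patch may differ.
{code:java}
// As quoted above: the shared lock is skipped when there is a parent, which is
// unsafe after a master restart because the parent's lock is not restored.
if (!procedure.hasParent()
    && waitTableQueueSharedLock(procedure, table) == null) {
  return true;
}

// Sketch of the proposal: since it is a shared lock, it can simply be taken
// every time it is needed, whether or not the procedure has a parent.
if (waitTableQueueSharedLock(procedure, table) == null) {
  return true;
}
{code}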
[jira] [Updated] (HBASE-20447) Only fail cacheBlock if block collisions aren't related to next block metadata
[ https://issues.apache.org/jira/browse/HBASE-20447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-20447: -- Description: This is the issue I was originally having here: [http://mail-archives.apache.org/mod_mbox/hbase-dev/201802.mbox/%3CCAN+qs_Pav=md_aoj4xji+kcnetubg2xou2ntxv1g6m8-5vn...@mail.gmail.com%3E] When we pread, we don't force the read to read all of the next block header. However, when we get into a race condition where two opener threads try to cache the same block and one thread read all of the next block header and the other one didn't, it will fail the open process. This is especially important in a splitting case where it will potentially fail the split process. Instead, in the caches, we should only fail if the required blocks are different. was: This is the issue I was originally having here: [http://mail-archives.apache.org/mod_mbox/hbase-dev/201802.mbox/%3CCAN+qs_Pav=md_aoj4xji+kcnetubg2xou2ntxv1g6m8-5vn...@mail.gmail.com%3E] When we pread, we don't force the read to read all of the next block header. However, when we get into a race condition where two opener threads try to cache the same block and one thread read all of the next block header and the other one didn't, it will fail the open process. This is especially important in a splitting case where it will potentially fail the split process. Instead, in the caches, we should only fail if the required blocks are different. > Only fail cacheBlock if block collisions aren't related to next block metadata > -- > > Key: HBASE-20447 > URL: https://issues.apache.org/jira/browse/HBASE-20447 > Project: HBase > Issue Type: Bug > Components: BlockCache, BucketCache >Affects Versions: 1.4.3, 2.0.0 >Reporter: Zach York >Assignee: Zach York >Priority: Major > Fix For: 3.0.0-alpha-1, 2.1.0, 1.4.5 > > Attachments: HBASE-20447.branch-1.001.patch, > HBASE-20447.branch-1.002.patch, HBASE-20447.branch-1.003.patch, > HBASE-20447.branch-1.004.patch, HBASE-20447.branch-1.005.patch, > HBASE-20447.branch-1.006.patch, HBASE-20447.master.001.patch, > HBASE-20447.master.002.patch, HBASE-20447.master.003.patch, > HBASE-20447.master.004.patch > > > This is the issue I was originally having here: > [http://mail-archives.apache.org/mod_mbox/hbase-dev/201802.mbox/%3CCAN+qs_Pav=md_aoj4xji+kcnetubg2xou2ntxv1g6m8-5vn...@mail.gmail.com%3E] > > When we pread, we don't force the read to read all of the next block header. > However, when we get into a race condition where two opener threads try to > cache the same block and one thread read all of the next block header and the > other one didn't, it will fail the open process. This is especially important > in a splitting case where it will potentially fail the split process. > Instead, in the caches, we should only fail if the required blocks are > different. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
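A rough illustration of the idea in the description; the helper and its parameters are hypothetical, not the actual BlockCache/BucketCache code. When two opener threads race to cache the same block, treat it as a real collision only if the block bytes themselves differ, ignoring any extra next-block-header bytes that one of the preads may (or may not) have picked up.
{code:java}
// Hypothetical helper, for illustration only.
static boolean isRealCollision(ByteBuffer existing, ByteBuffer incoming, int blockLenWithoutNextHeader) {
  ByteBuffer a = existing.duplicate();
  ByteBuffer b = incoming.duplicate();
  // Compare only the block itself; ignore whatever next-block metadata was read.
  a.position(0).limit(Math.min(a.capacity(), blockLenWithoutNextHeader));
  b.position(0).limit(Math.min(b.capacity(), blockLenWithoutNextHeader));
  return !a.equals(b); // only fail caching when the actual block bytes differ
}
{code}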
[jira] [Updated] (HBASE-26708) Netty "leak detected" and OutOfDirectMemoryError due to direct memory buffering with SASL implementation
[ https://issues.apache.org/jira/browse/HBASE-26708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-26708: -- Description: Under constant data ingestion, using default Netty based RpcServer and RpcClient implementation results in OutOfDirectMemoryError, supposedly caused by leaks detected by Netty's LeakDetector. {code:java} 2022-01-25 17:03:10,084 ERROR [S-EventLoopGroup-1-3] util.ResourceLeakDetector - java:115) org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.expandCumulation(ByteToMessageDecoder.java:538) org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder$1.cumulate(ByteToMessageDecoder.java:97) org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:274) org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795) org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:480) org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) org.apache.hbase.thirdparty.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) org.apache.hbase.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) java.lang.Thread.run(Thread.java:748) {code} {code:java} 2022-01-25 17:03:14,014 ERROR [S-EventLoopGroup-1-3] util.ResourceLeakDetector - apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:507) org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:446) org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276) org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) 
org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795) org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:480) org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) org.apache.hbase.thirdparty.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) org.apache.hbase.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) java.lang.Thread.run(Thread.java:748) {code} And finally handlers are removed from the pipeline due to OutOfDirectMemoryError: {code:java} 2022-01-25 17:36:28,657 WARN [S-EventLoopGroup-1-5] channel.DefaultChannelPipeline - An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means
[jira] [Updated] (HBASE-27464) In memory compaction 'COMPACT' may cause data corruption when adding cells large than maxAlloc(default 256k) size.
[ https://issues.apache.org/jira/browse/HBASE-27464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-27464: -- Summary: In memory compaction 'COMPACT' may cause data corruption when adding cells large than maxAlloc(default 256k) size. (was: In memory compaction 'COMPACT' may cause data corruption when add cell bigger than maxAlloc(default 256k) size.) > In memory compaction 'COMPACT' may cause data corruption when adding cells > large than maxAlloc(default 256k) size. > -- > > Key: HBASE-27464 > URL: https://issues.apache.org/jira/browse/HBASE-27464 > Project: HBase > Issue Type: Bug > Components: in-memory-compaction >Reporter: zhuobin zheng >Priority: Critical > Attachments: image-2022-11-04-15-46-21-645.png > > > When init 'CellChunkImmutableSegment' for 'COMPACT' action, we not force copy > to current MSLab. > When cell size bigger than maxAlloc, cell will stay in previous chunk which > will recycle after segment replace, and we may read wrong data when these > chunk reused by others. > !image-2022-11-04-15-46-21-645.png! > > Timeline: > # add a cell 'A' bigger than 256K > # cell 'A' will copy to a chunk 'A' when first compact > # cell 'A' will retain in chunk 'A' when second compact > # chunk 'A' recycled after segment swap and close -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HBASE-27464) In memory compaction 'COMPACT' may cause data corruption when add cell bigger than maxAlloc(default 256k) size.
[ https://issues.apache.org/jira/browse/HBASE-27464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-27464: -- Summary: In memory compaction 'COMPACT' may cause data corruption when add cell bigger than maxAlloc(default 256k) size. (was: In memory compaction 'COMPACT' may cause data mass when add cell bigger than maxAlloc(default 256k) size.) > In memory compaction 'COMPACT' may cause data corruption when add cell bigger > than maxAlloc(default 256k) size. > --- > > Key: HBASE-27464 > URL: https://issues.apache.org/jira/browse/HBASE-27464 > Project: HBase > Issue Type: Bug > Components: in-memory-compaction >Reporter: zhuobin zheng >Priority: Critical > Attachments: image-2022-11-04-15-46-21-645.png > > > When init 'CellChunkImmutableSegment' for 'COMPACT' action, we not force copy > to current MSLab. > When cell size bigger than maxAlloc, cell will stay in previous chunk which > will recycle after segment replace, and we may read wrong data when these > chunk reused by others. > !image-2022-11-04-15-46-21-645.png! > > Timeline: > # add a cell 'A' bigger than 256K > # cell 'A' will copy to a chunk 'A' when first compact > # cell 'A' will retain in chunk 'A' when second compact > # chunk 'A' recycled after segment swap and close -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27464) In memory compaction 'COMPACT' may cause data mass when add cell bigger than maxAlloc(default 256k) size.
zhuobin zheng created HBASE-27464: - Summary: In memory compaction 'COMPACT' may cause data mass when add cell bigger than maxAlloc(default 256k) size. Key: HBASE-27464 URL: https://issues.apache.org/jira/browse/HBASE-27464 Project: HBase Issue Type: Bug Components: in-memory-compaction Reporter: zhuobin zheng Attachments: image-2022-11-04-15-46-21-645.png When initializing 'CellChunkImmutableSegment' for the 'COMPACT' action, we do not force a copy into the current MSLAB. When the cell size is bigger than maxAlloc, the cell will stay in the previous chunk, which will be recycled after the segment is replaced, and we may read wrong data when these chunks are reused by others. !image-2022-11-04-15-46-21-645.png! Timeline: # add a cell 'A' bigger than 256K # cell 'A' is copied to a chunk 'A' on the first compaction # cell 'A' is retained in chunk 'A' on the second compaction # chunk 'A' is recycled after segment swap and close -- This message was sent by Atlassian Jira (v8.20.10#820010)
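A rough sketch of the direction the description points at: during an in-memory COMPACT, force cells larger than maxAlloc to be copied into the new segment's own MSLAB instead of keeping a reference into a chunk owned by the previous segment. The variable names and the forceCopyOfBigCellInto-style API are used illustratively; the actual fix may differ.
{code:java}
// Illustrative only: when building the CellChunkImmutableSegment for COMPACT,
// copy big cells into this segment's MSLAB so the chunk that holds their data
// lives as long as the new segment, rather than pointing into a chunk that is
// recycled when the old segment is closed.
Cell toAdd = cell;
if (memStoreLAB != null && cellSize > maxAlloc) {
  toAdd = memStoreLAB.forceCopyOfBigCellInto(cell); // assumed MSLAB-style API
}
targetCellSet.add(toAdd);
{code}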
[jira] [Updated] (HBASE-26026) HBase Write may be stuck forever when using CompactingMemStore
[ https://issues.apache.org/jira/browse/HBASE-26026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-26026: -- Description: Sometimes I observed that HBase writes might get stuck in my HBase cluster with {{CompactingMemStore}} enabled. I have simulated the problem by a unit test in my PR. The problem is caused by {{CompactingMemStore.checkAndAddToActiveSize}}: {code:java} 425 private boolean checkAndAddToActiveSize(MutableSegment currActive, Cell cellToAdd, 426 MemStoreSizing memstoreSizing) { 427 if (shouldFlushInMemory(currActive, cellToAdd, memstoreSizing)) { 428 if (currActive.setInMemoryFlushed()) { 429 flushInMemory(currActive); 430 if (setInMemoryCompactionFlag()) { 431 // The thread is dispatched to do in-memory compaction in the background .. } {code} In line 427, {{shouldFlushInMemory}} checks whether {{currActive.getDataSize}} plus the size of {{cellToAdd}} exceeds {{CompactingMemStore.inmemoryFlushSize}}; if true, then {{currActive}} should be flushed and {{currActive.setInMemoryFlushed()}} is invoked in line 428: {code:java} public boolean setInMemoryFlushed() { return flushed.compareAndSet(false, true); } {code} After successfully setting {{currActive.flushed}} to true, line 429 {{flushInMemory(currActive)}} invokes {{CompactingMemStore.pushActiveToPipeline}}: {code:java} protected void pushActiveToPipeline(MutableSegment currActive) { if (!currActive.isEmpty()) { pipeline.pushHead(currActive); resetActive(); } } {code} In the above {{CompactingMemStore.pushActiveToPipeline}} method, if {{currActive.cellSet}} is empty, nothing is done. Due to concurrent writes, and because we first add the cell size to {{currActive.getDataSize}} and only then actually add the cell to {{currActive.cellSet}}, it is possible that {{currActive.getDataSize}} cannot accommodate {{cellToAdd}} while {{currActive.cellSet}} is still empty, because pending writes have not yet added their cells to {{currActive.cellSet}}. So if {{currActive.cellSet}} is empty now, no new active segment is created and new writes still target {{currActive}}; but {{currActive.flushed}} is true, so {{currActive}} can never enter {{flushInMemory(currActive)}} again, and a new active segment can never be created! In the end all writes would be stuck. In my opinion, once {{currActive.flushed}} is set to true, it should not continue to be used as the active segment; and because of concurrent pending writes, only after {{currActive.updatesLock.writeLock()}} is acquired (i.e. {{currActive.waitForUpdates}} is called) in {{CompactingMemStore.inMemoryCompaction}} can we safely say whether {{currActive}} is empty or not. My fix is to remove the {{if (!currActive.isEmpty())}} check here and leave the check to the background {{InMemoryCompactionRunnable}} after {{currActive.waitForUpdates}} is called. An alternative fix is to use a synchronization mechanism in the {{checkAndAddToActiveSize}} method to block all writes, wait for all pending writes to complete (i.e. {{currActive.waitForUpdates}} is called), and if {{currActive}} is still empty, set {{currActive.flushed}} back to false; but I am not inclined to use such heavy synchronization in the write path, and I think we had better keep the lockless implementation of the {{CompactingMemStore.add}} method just as it is now, with {{currActive.waitForUpdates}} left to the background {{InMemoryCompactionRunnable}}. was: Sometimes I observed that HBase Write might be stuck in my hbase cluster which enabling {{CompactingMemStore}}. 
I have simulated the problem by unit test in my PR. The problem is caused by {{CompactingMemStore.checkAndAddToActiveSize}} : {code:java} 425 private boolean checkAndAddToActiveSize(MutableSegment currActive, Cell cellToAdd, 426 MemStoreSizing memstoreSizing) { 427if (shouldFlushInMemory(currActive, cellToAdd, memstoreSizing)) { 428 if (currActive.setInMemoryFlushed()) { 429flushInMemory(currActive); 430if (setInMemoryCompactionFlag()) { 431 // The thread is dispatched to do in-memory compaction in the background .. } {code} In line 427, {{shouldFlushInMemory}} checking if {{currActive.getDataSize}} adding the size of {{cellToAdd}} exceeds {{CompactingMemStore.inmemoryFlushSize}},if true, then {{currActive}} should be flushed, {{currActive.setInMemoryFlushed()}} is invoked in line 428 : {code:java} public boolean setInMemoryFlushed() { return flushed.compareAndSet(false, true); } {code} After sucessfully set {{currActive.flushed}} to true, in above line 429 {{flushInMemory(currActive)}} invokes {{CompactingMemStore.pushActiveToPipeline}} : {code:java} protected void pushActiveToPipeline(MutableSegment currActive) { if (!currActive.isEmpty()) { pipeline.pushHead(currActive);
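A minimal sketch of the fix proposed in the description: drop the isEmpty() short-circuit and let the background in-memory compaction decide, after waitForUpdates(), whether the segment is really empty. This mirrors the proposal only, not necessarily the committed patch.
{code:java}
protected void pushActiveToPipeline(MutableSegment currActive) {
  // Always push the flushed active segment. Whether it is actually empty can
  // only be decided safely after currActive.waitForUpdates() has been called
  // by the background InMemoryCompactionRunnable.
  pipeline.pushHead(currActive);
  resetActive();
}
{code}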
[jira] [Assigned] (HBASE-26580) The message of StoreTooBusy is confused
[ https://issues.apache.org/jira/browse/HBASE-26580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng reassigned HBASE-26580: - Assignee: zhuobin zheng > The message of StoreTooBusy is confused > --- > > Key: HBASE-26580 > URL: https://issues.apache.org/jira/browse/HBASE-26580 > Project: HBase > Issue Type: Task >Reporter: zhuobin zheng >Assignee: zhuobin zheng >Priority: Trivial > > > When check Store limit. We both check parallelPutToStoreThreadLimit and > parallelPreparePutToStoreThreadLimit. > {code:java} > if (store.getCurrentParallelPutCount() > this.parallelPutToStoreThreadLimit > || preparePutCount > this.parallelPreparePutToStoreThreadLimit) { > tooBusyStore = (tooBusyStore == null ? > store.getColumnFamilyName() : > tooBusyStore + "," + store.getColumnFamilyName()); > } {code} > But we only print Above parallelPutToStoreThreadLimit only. > > > {code:java} > if (tooBusyStore != null) { > String msg = > "StoreTooBusy," + this.region.getRegionInfo().getRegionNameAsString() + > ":" + tooBusyStore > + " Above parallelPutToStoreThreadLimit(" + > this.parallelPutToStoreThreadLimit + ")"; > if (LOG.isTraceEnabled()) { > LOG.trace(msg); > } > throw new RegionTooBusyException(msg); > }{code} > It confused me a lot time .. > Just add message of parallelPreparePutToStoreThreadLimit > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26580) The message of StoreTooBusy is confused
[ https://issues.apache.org/jira/browse/HBASE-26580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-26580: -- Priority: Trivial (was: Minor) > The message of StoreTooBusy is confused > --- > > Key: HBASE-26580 > URL: https://issues.apache.org/jira/browse/HBASE-26580 > Project: HBase > Issue Type: Task >Reporter: zhuobin zheng >Priority: Trivial > > > When check Store limit. We both check parallelPutToStoreThreadLimit and > parallelPreparePutToStoreThreadLimit. > {code:java} > if (store.getCurrentParallelPutCount() > this.parallelPutToStoreThreadLimit > || preparePutCount > this.parallelPreparePutToStoreThreadLimit) { > tooBusyStore = (tooBusyStore == null ? > store.getColumnFamilyName() : > tooBusyStore + "," + store.getColumnFamilyName()); > } {code} > But we only print Above parallelPutToStoreThreadLimit only. > > > {code:java} > if (tooBusyStore != null) { > String msg = > "StoreTooBusy," + this.region.getRegionInfo().getRegionNameAsString() + > ":" + tooBusyStore > + " Above parallelPutToStoreThreadLimit(" + > this.parallelPutToStoreThreadLimit + ")"; > if (LOG.isTraceEnabled()) { > LOG.trace(msg); > } > throw new RegionTooBusyException(msg); > }{code} > It confused me a lot time .. > Just add message of parallelPreparePutToStoreThreadLimit > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HBASE-26580) The message of StoreTooBusy is confused
zhuobin zheng created HBASE-26580: - Summary: The message of StoreTooBusy is confused Key: HBASE-26580 URL: https://issues.apache.org/jira/browse/HBASE-26580 Project: HBase Issue Type: Task Reporter: zhuobin zheng When checking the store limit, we check both parallelPutToStoreThreadLimit and parallelPreparePutToStoreThreadLimit. {code:java} if (store.getCurrentParallelPutCount() > this.parallelPutToStoreThreadLimit || preparePutCount > this.parallelPreparePutToStoreThreadLimit) { tooBusyStore = (tooBusyStore == null ? store.getColumnFamilyName() : tooBusyStore + "," + store.getColumnFamilyName()); } {code} But we only print "Above parallelPutToStoreThreadLimit". {code:java} if (tooBusyStore != null) { String msg = "StoreTooBusy," + this.region.getRegionInfo().getRegionNameAsString() + ":" + tooBusyStore + " Above parallelPutToStoreThreadLimit(" + this.parallelPutToStoreThreadLimit + ")"; if (LOG.isTraceEnabled()) { LOG.trace(msg); } throw new RegionTooBusyException(msg); }{code} It confused me for a long time. Just add the parallelPreparePutToStoreThreadLimit to the message. -- This message was sent by Atlassian Jira (v8.20.1#820001)
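A small sketch of the suggested message change, reusing only the fields already shown in the snippets above; the exact wording of an eventual patch may differ.
{code:java}
if (tooBusyStore != null) {
  // Mention both limits, since either check can mark the store as too busy.
  String msg = "StoreTooBusy," + this.region.getRegionInfo().getRegionNameAsString()
      + ":" + tooBusyStore
      + " Above parallelPutToStoreThreadLimit(" + this.parallelPutToStoreThreadLimit
      + ") or parallelPreparePutToStoreThreadLimit(" + this.parallelPreparePutToStoreThreadLimit + ")";
  if (LOG.isTraceEnabled()) {
    LOG.trace(msg);
  }
  throw new RegionTooBusyException(msg);
}
{code}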
[jira] [Updated] (HBASE-26579) Set storage policy of recovered edits when wal storage type is configured
[ https://issues.apache.org/jira/browse/HBASE-26579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-26579: -- Status: Patch Available (was: Open) > Set storage policy of recovered edits when wal storage type is configured > -- > > Key: HBASE-26579 > URL: https://issues.apache.org/jira/browse/HBASE-26579 > Project: HBase > Issue Type: Improvement > Components: Recovery >Reporter: zhuobin zheng >Assignee: zhuobin zheng >Priority: Major > > In our cluster, we has many SSD and a little HDD. (Most table configured > storage policy ONE_SSD, and all wals is configured ALL_SSD) > when all cluster down, It's difficult to recovery cluster. Because HDD Disk > IO bottleneck (Almost all disk io util is 100%). > I think the most hdfs operation when recovery is split wal to recovered edits > dir, And read it. > And it goes better when i stop hbase and set all recovered.edits to ALL_SSD. > So we can get benifit of recovery time if we set recovered.edits dir to > better storage like WAL. > Now i reuse config item hbase.wal.storage.policy to set recovered.edits > storage type. Because I did not find a scenario where they use different > storage Policy -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (HBASE-26579) Set storage policy of recovered edits when wal storage type is configured
[ https://issues.apache.org/jira/browse/HBASE-26579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng reassigned HBASE-26579: - Assignee: zhuobin zheng > Set storage policy of recovered edits when wal storage type is configured > -- > > Key: HBASE-26579 > URL: https://issues.apache.org/jira/browse/HBASE-26579 > Project: HBase > Issue Type: Improvement > Components: Recovery >Reporter: zhuobin zheng >Assignee: zhuobin zheng >Priority: Major > > In our cluster, we has many SSD and a little HDD. (Most table configured > storage policy ONE_SSD, and all wals is configured ALL_SSD) > when all cluster down, It's difficult to recovery cluster. Because HDD Disk > IO bottleneck (Almost all disk io util is 100%). > I think the most hdfs operation when recovery is split wal to recovered edits > dir, And read it. > And it goes better when i stop hbase and set all recovered.edits to ALL_SSD. > So we can get benifit of recovery time if we set recovered.edits dir to > better storage like WAL. > Now i reuse config item hbase.wal.storage.policy to set recovered.edits > storage type. Because I did not find a scenario where they use different > storage Policy -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HBASE-26579) Set storage policy of recovered edits when wal storage type is configured
zhuobin zheng created HBASE-26579: - Summary: Set storage policy of recovered edits when wal storage type is configured Key: HBASE-26579 URL: https://issues.apache.org/jira/browse/HBASE-26579 Project: HBase Issue Type: Improvement Components: Recovery Reporter: zhuobin zheng In our cluster, we have many SSDs and few HDDs. (Most tables are configured with storage policy ONE_SSD, and all WALs are configured ALL_SSD.) When the whole cluster goes down, it's difficult to recover the cluster because of the HDD disk IO bottleneck (almost all disk IO util is 100%). I think most of the HDFS operations during recovery are splitting WALs into the recovered edits dir, and reading them back. And it goes better when I stop HBase and set all recovered.edits to ALL_SSD. So we can improve recovery time if we put the recovered.edits dir on better storage, like the WAL. Now I reuse the config item hbase.wal.storage.policy to set the recovered.edits storage type, because I did not find a scenario where they would use different storage policies. -- This message was sent by Atlassian Jira (v8.20.1#820001)
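A minimal sketch of the idea, assuming the standard Hadoop FileSystem#setStoragePolicy API, an illustrative recovered.edits path, and "NONE" as the unset default for hbase.wal.storage.policy; the actual patch would apply this where the recovered.edits directory is created during WAL splitting.
{code:java}
// Sketch only: apply the WAL storage policy (reused per the description above)
// to the recovered.edits directory before split output is written.
Configuration conf = HBaseConfiguration.create();
String policy = conf.get("hbase.wal.storage.policy", "NONE"); // default value assumed
if (!"NONE".equals(policy)) {
  Path recoveredEdits = new Path("/hbase/data/default/t1/region/recovered.edits"); // illustrative path
  FileSystem fs = recoveredEdits.getFileSystem(conf);
  fs.setStoragePolicy(recoveredEdits, policy); // Hadoop FileSystem API, honored by HDFS
}
{code}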
[jira] [Commented] (HBASE-26482) HMaster may clean wals that is replicating in rare cases
[ https://issues.apache.org/jira/browse/HBASE-26482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17449762#comment-17449762 ] zhuobin zheng commented on HBASE-26482: --- Push MR [(https://github.com/apache/hbase/pull/3887)|https://github.com/apache/hbase/pull/3887] for branch-1. There is a problem. We can't update cversion of root queuesZnode atomically when hbase.zookeeper.useMulti is set false. Now, I only fixed this problem when hbase.zookeeper.useMulti true. (default is true) Another way to totally solve this problem: Check cversion of /hbase/replication/rs and all znodes of /hbase/replication/rs/${servername} when master clean. But this way will cause the code of branch-1 different with master. I don't know which is better. > HMaster may clean wals that is replicating in rare cases > > > Key: HBASE-26482 > URL: https://issues.apache.org/jira/browse/HBASE-26482 > Project: HBase > Issue Type: Bug > Components: Replication >Reporter: zhuobin zheng >Assignee: zhuobin zheng >Priority: Critical > Fix For: 2.5.0, 3.0.0-alpha-2, 2.4.9 > > > In our cluster, i can found some FileNotFoundException when > ReplicationSourceWALReader running for replication recovery queue. > I guss the wal most likely removed by hmaste. And i found something to > support it. > The method getAllWALs: > [https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509 > > |https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509]Use > zk cversion of /hbase/replication/rs as an optimistic lock to control > concurrent ops. > But, zk cversion *only can only reflect the changes of child nodes, but not > the changes of grandchildren.* > So, HMaster may loss some wal from this method in follow situation. > # HMaster do log clean , and invoke getAllWALs to filter log which should > not be deleted. > # HMaster cache current cversion of /hbase/replication/rs as *v0* > # HMaster cache all RS server name, and traverse them, get the WAL in each > Queue > # *RS2* dead after HMaster traverse {*}RS1{*}, and before traverse *RS2* > # *RS1* claim one queue of *RS2,* which named *peerid-RS2* now > # By the way , the cversion of /hbase/replication/rs not changed before all > of *RS2* queue is removed, because the children of /hbase/replication/rs not > change. > # So, Hmaster will lost the wals in *peerid-RS2,* because we have already > traversed *RS1 ,* and ** this queue not exists in *RS2* > The above expression is currently only speculation, not confirmed > Flie Not Found Log. 
> > {code:java} > // code placeholder > 2021-11-22 15:18:39,593 ERROR > [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348] > regionserver.WALEntryStream: Couldn't locate log: > hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704 > 2021-11-22 15:18:39,593 ERROR > [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348] > regionserver.ReplicationSourceWALReader: Failed to read stream of > replication entries > java.io.FileNotFoundException: File does not exist: > hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704 > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1612) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1605) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1620) > at > org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64) > at > org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:321) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:303) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:291) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:427) > at > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openReader(WALEntryStream.java:355) > at > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openNextLog(WALEntryStream.java:303) > at >
[jira] [Commented] (HBASE-26482) HMaster may clean wals that is replicating in rare cases
[ https://issues.apache.org/jira/browse/HBASE-26482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17448898#comment-17448898 ] zhuobin zheng commented on HBASE-26482: --- [~shahrs87] It seems yes. The code in branch-1 also has the same problem. I will test and submit a patch later. > HMaster may clean wals that is replicating in rare cases > > > Key: HBASE-26482 > URL: https://issues.apache.org/jira/browse/HBASE-26482 > Project: HBase > Issue Type: Bug > Components: Replication >Reporter: zhuobin zheng >Assignee: zhuobin zheng >Priority: Critical > Fix For: 2.5.0, 3.0.0-alpha-2, 2.4.9 > > > In our cluster, i can found some FileNotFoundException when > ReplicationSourceWALReader running for replication recovery queue. > I guss the wal most likely removed by hmaste. And i found something to > support it. > The method getAllWALs: > [https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509 > > |https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509]Use > zk cversion of /hbase/replication/rs as an optimistic lock to control > concurrent ops. > But, zk cversion *only can only reflect the changes of child nodes, but not > the changes of grandchildren.* > So, HMaster may loss some wal from this method in follow situation. > # HMaster do log clean , and invoke getAllWALs to filter log which should > not be deleted. > # HMaster cache current cversion of /hbase/replication/rs as *v0* > # HMaster cache all RS server name, and traverse them, get the WAL in each > Queue > # *RS2* dead after HMaster traverse {*}RS1{*}, and before traverse *RS2* > # *RS1* claim one queue of *RS2,* which named *peerid-RS2* now > # By the way , the cversion of /hbase/replication/rs not changed before all > of *RS2* queue is removed, because the children of /hbase/replication/rs not > change. > # So, Hmaster will lost the wals in *peerid-RS2,* because we have already > traversed *RS1 ,* and ** this queue not exists in *RS2* > The above expression is currently only speculation, not confirmed > Flie Not Found Log. 
> > {code:java} > // code placeholder > 2021-11-22 15:18:39,593 ERROR > [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348] > regionserver.WALEntryStream: Couldn't locate log: > hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704 > 2021-11-22 15:18:39,593 ERROR > [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348] > regionserver.ReplicationSourceWALReader: Failed to read stream of > replication entries > java.io.FileNotFoundException: File does not exist: > hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704 > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1612) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1605) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1620) > at > org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64) > at > org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:321) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:303) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:291) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:427) > at > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openReader(WALEntryStream.java:355) > at > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openNextLog(WALEntryStream.java:303) > at > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.checkReader(WALEntryStream.java:294) > at > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:175) > at > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:101) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.readWALEntries(ReplicationSourceWALReader.java:192) > at
[jira] [Commented] (HBASE-26482) HMaster may clean wals that is replicating in rare cases
[ https://issues.apache.org/jira/browse/HBASE-26482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17448075#comment-17448075 ] zhuobin zheng commented on HBASE-26482: --- JIRA: HBASE-12865 Has fixed this problem. HBASE-12865 delete *rsZnode* after all queue claimed. Old method will claim all queue in rs. So we can delete rs Znode after all queue claimed But now, In this method, we only claim one queue, so we can't delete *rsZnode.* I submit a simple patch to fix it by add ops(create znode, delete znode) to multiOp for zk(for update cversion). > HMaster may clean wals that is replicating in rare cases > > > Key: HBASE-26482 > URL: https://issues.apache.org/jira/browse/HBASE-26482 > Project: HBase > Issue Type: Bug > Components: Replication >Reporter: zhuobin zheng >Priority: Critical > > In our cluster, i can found some FileNotFoundException when > ReplicationSourceWALReader running for replication recovery queue. > I guss the wal most likely removed by hmaste. And i found something to > support it. > The method getAllWALs: > [https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509 > > |https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509]Use > zk cversion of /hbase/replication/rs as an optimistic lock to control > concurrent ops. > But, zk cversion *only can only reflect the changes of child nodes, but not > the changes of grandchildren.* > So, HMaster may loss some wal from this method in follow situation. > # HMaster do log clean , and invoke getAllWALs to filter log which should > not be deleted. > # HMaster cache current cversion of /hbase/replication/rs as *v0* > # HMaster cache all RS server name, and traverse them, get the WAL in each > Queue > # *RS2* dead after HMaster traverse {*}RS1{*}, and before traverse *RS2* > # *RS1* claim one queue of *RS2,* which named *peerid-RS2* now > # By the way , the cversion of /hbase/replication/rs not changed before all > of *RS2* queue is removed, because the children of /hbase/replication/rs not > change. > # So, Hmaster will lost the wals in *peerid-RS2,* because we have already > traversed *RS1 ,* and ** this queue not exists in *RS2* > The above expression is currently only speculation, not confirmed > Flie Not Found Log. 
> > {code:java} > // code placeholder > 2021-11-22 15:18:39,593 ERROR > [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348] > regionserver.WALEntryStream: Couldn't locate log: > hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704 > 2021-11-22 15:18:39,593 ERROR > [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348] > regionserver.ReplicationSourceWALReader: Failed to read stream of > replication entries > java.io.FileNotFoundException: File does not exist: > hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704 > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1612) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1605) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1620) > at > org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64) > at > org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:321) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:303) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:291) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:427) > at > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openReader(WALEntryStream.java:355) > at > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openNextLog(WALEntryStream.java:303) > at > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.checkReader(WALEntryStream.java:294) > at > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:175) > at >
[jira] [Assigned] (HBASE-26414) Tracing INSTRUMENTATION_NAME is incorrect
[ https://issues.apache.org/jira/browse/HBASE-26414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng reassigned HBASE-26414: - Assignee: Nick Dimiduk (was: zhuobin zheng) > Tracing INSTRUMENTATION_NAME is incorrect > - > > Key: HBASE-26414 > URL: https://issues.apache.org/jira/browse/HBASE-26414 > Project: HBase > Issue Type: Bug > Components: tracing >Affects Versions: 2.5.0, 3.0.0-alpha-2 >Reporter: Nick Dimiduk >Assignee: Nick Dimiduk >Priority: Blocker > Fix For: 2.5.0, 3.0.0-alpha-2 > > > I believe the value we use for {{TraceUtil#INSTRUMENTATION_NAME}}, > {{"io.opentelemetry.contrib.hbase"}}, is incorrect. According to the java > docs, > {noformat} >* @param instrumentationName The name of the instrumentation library, not > the name of the >* instrument*ed* library (e.g., "io.opentelemetry.contrib.mongodb"). > Must not be null. > {noformat} > This namespace appears to be reserved for implementations shipped by the otel > project, found under > https://github.com/open-telemetry/opentelemetry-java-instrumentation/tree/main/instrumentation > I don't have a suggestion for a suitable name at this time. Will report back. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HBASE-26482) HMaster may clean wals that is replicating in rare cases
[ https://issues.apache.org/jira/browse/HBASE-26482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17447810#comment-17447810 ] zhuobin zheng commented on HBASE-26482: --- One way to solve the problem: We compare cversion of znode /hbase/replication/rs and all znodes of /hbase/replication/rs/${servername}. Because we only focus on regionserver add and replication queue add. We don't care wal add/remove under replicaiton queue. > HMaster may clean wals that is replicating in rare cases > > > Key: HBASE-26482 > URL: https://issues.apache.org/jira/browse/HBASE-26482 > Project: HBase > Issue Type: Bug > Components: Replication >Reporter: zhuobin zheng >Priority: Critical > > In our cluster, i can found some FileNotFoundException when > ReplicationSourceWALReader running for replication recovery queue. > I guss the wal most likely removed by hmaste. And i found something to > support it. > The method getAllWALs: > [https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509 > > |https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509]Use > zk cversion of /hbase/replication/rs as an optimistic lock to control > concurrent ops. > But, zk cversion *only can only reflect the changes of child nodes, but not > the changes of grandchildren.* > So, HMaster may loss some wal from this method in follow situation. > # HMaster do log clean , and invoke getAllWALs to filter log which should > not be deleted. > # HMaster cache current cversion of /hbase/replication/rs as *v0* > # HMaster cache all RS server name, and traverse them, get the WAL in each > Queue > # *RS2* dead after HMaster traverse {*}RS1{*}, and before traverse *RS2* > # *RS1* claim one queue of *RS2,* which named *peerid-RS2* now > # By the way , the cversion of /hbase/replication/rs not changed before all > of *RS2* queue is removed, because the children of /hbase/replication/rs not > change. > # So, Hmaster will lost the wals in *peerid-RS2,* because we have already > traversed *RS1 ,* and ** this queue not exists in *RS2* > The above expression is currently only speculation, not confirmed > Flie Not Found Log. 
> > {code:java} > // code placeholder > 2021-11-22 15:18:39,593 ERROR > [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348] > regionserver.WALEntryStream: Couldn't locate log: > hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704 > 2021-11-22 15:18:39,593 ERROR > [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348] > regionserver.ReplicationSourceWALReader: Failed to read stream of > replication entries > java.io.FileNotFoundException: File does not exist: > hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704 > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1612) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1605) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1620) > at > org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64) > at > org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:321) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:303) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:291) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:427) > at > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openReader(WALEntryStream.java:355) > at > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openNextLog(WALEntryStream.java:303) > at > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.checkReader(WALEntryStream.java:294) > at > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:175) > at > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:101) > at >
[jira] [Updated] (HBASE-26482) HMaster may clean wals that is replicating in rare cases
[ https://issues.apache.org/jira/browse/HBASE-26482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-26482: -- Summary: HMaster may clean wals that is replicating in rare cases (was: HMaster may clean replication wals in rare cases) > HMaster may clean wals that is replicating in rare cases > > > Key: HBASE-26482 > URL: https://issues.apache.org/jira/browse/HBASE-26482 > Project: HBase > Issue Type: Bug > Components: Replication >Reporter: zhuobin zheng >Priority: Critical > > In our cluster, i can found some FileNotFoundException when > ReplicationSourceWALReader running for replication recovery queue. > I guss the wal most likely removed by hmaste. And i found something to > support it. > The method getAllWALs: > [https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509 > > |https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509]Use > zk cversion of /hbase/replication/rs as an optimistic lock to control > concurrent ops. > But, zk cversion *only can only reflect the changes of child nodes, but not > the changes of grandchildren.* > So, HMaster may loss some wal from this method in follow situation. > # HMaster do log clean , and invoke getAllWALs to filter log which should > not be deleted. > # HMaster cache current cversion of /hbase/replication/rs as *v0* > # HMaster cache all RS server name, and traverse them, get the WAL in each > Queue > # *RS2* dead after HMaster traverse {*}RS1{*}, and before traverse *RS2* > # *RS1* claim one queue of *RS2,* which named *peerid-RS2* now > # By the way , the cversion of /hbase/replication/rs not changed before all > of *RS2* queue is removed, because the children of /hbase/replication/rs not > change. > # So, Hmaster will lost the wals in *peerid-RS2,* because we have already > traversed *RS1 ,* and ** this queue not exists in *RS2* > The above expression is currently only speculation, not confirmed > Flie Not Found Log. 
> > {code:java} > // code placeholder > 2021-11-22 15:18:39,593 ERROR > [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348] > regionserver.WALEntryStream: Couldn't locate log: > hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704 > 2021-11-22 15:18:39,593 ERROR > [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348] > regionserver.ReplicationSourceWALReader: Failed to read stream of > replication entries > java.io.FileNotFoundException: File does not exist: > hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704 > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1612) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1605) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1620) > at > org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64) > at > org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:321) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:303) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:291) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:427) > at > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openReader(WALEntryStream.java:355) > at > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openNextLog(WALEntryStream.java:303) > at > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.checkReader(WALEntryStream.java:294) > at > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:175) > at > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:101) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.readWALEntries(ReplicationSourceWALReader.java:192) > at >
[jira] [Created] (HBASE-26482) HMaster may clean replication wals in rare cases
zhuobin zheng created HBASE-26482: - Summary: HMaster may clean replication wals in rare cases Key: HBASE-26482 URL: https://issues.apache.org/jira/browse/HBASE-26482 Project: HBase Issue Type: Bug Components: Replication Reporter: zhuobin zheng In our cluster, I found some FileNotFoundException occurrences while ReplicationSourceWALReader was running for a replication recovery queue. I guess the WALs were most likely removed by HMaster, and I found evidence to support that. The method getAllWALs: [https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509 |https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509] uses the zk cversion of /hbase/replication/rs as an optimistic lock to control concurrent operations. But the zk cversion *only reflects changes to direct child nodes, not changes to grandchildren.* So HMaster may lose some WALs in this method in the following situation: # HMaster runs the log cleaner and invokes getAllWALs to filter out logs that should not be deleted. # HMaster caches the current cversion of /hbase/replication/rs as *v0* # HMaster caches all RS server names and traverses them, reading the WALs in each queue # *RS2* dies after HMaster has traversed {*}RS1{*} and before it traverses *RS2* # *RS1* claims one queue of {*}RS2{*}, which is now named *peerid-RS2* # Note that the cversion of /hbase/replication/rs does not change until all of *RS2*'s queues are removed, because the direct children of /hbase/replication/rs have not changed. # So HMaster misses the WALs in *peerid-RS2*, because *RS1* has already been traversed and this queue no longer exists under *RS2* The above is currently only speculation, not yet confirmed. File Not Found log:
{code:java} // code placeholder 2021-11-22 15:18:39,593 ERROR [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348] regionserver.WALEntryStream: Couldn't locate log: hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704 2021-11-22 15:18:39,593 ERROR [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348] regionserver.ReplicationSourceWALReader: Failed to read stream of replication entries java.io.FileNotFoundException: File does not exist: hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704 at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1612) at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1605) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1620) at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64) at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168) at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:321) at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:303) at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:291) at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:427) at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openReader(WALEntryStream.java:355) at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openNextLog(WALEntryStream.java:303) at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.checkReader(WALEntryStream.java:294) at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:175) at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:101) at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.readWALEntries(ReplicationSourceWALReader.java:192) at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:138) {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
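To make the suspected race easier to follow, here is a minimal, self-contained sketch (not the actual ZKReplicationQueueStorage code) of a getAllWALs-style scan guarded by the cversion of /hbase/replication/rs. The class and method names are illustrative only; the key point is that the cversion check only detects added or removed direct children, so a queue moving from RS2 to RS1 between the two reads goes unnoticed.
{code:java}
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class GetAllWalsSketch {
  private static final String RS_ROOT = "/hbase/replication/rs";

  // Collect every WAL name referenced by any replication queue, retrying when the
  // cversion of the rs root differs between the start and the end of the scan.
  // (Assumes the rs root exists; error handling for nodes that vanish mid-scan is omitted.)
  static Set<String> getAllWals(ZooKeeper zk) throws KeeperException, InterruptedException {
    for (;;) {
      int v0 = zk.exists(RS_ROOT, false).getCversion(); // bumped only when a direct child is added/removed
      Set<String> wals = new HashSet<>();
      List<String> servers = zk.getChildren(RS_ROOT, false);
      for (String rs : servers) {
        for (String queue : zk.getChildren(RS_ROOT + "/" + rs, false)) {
          wals.addAll(zk.getChildren(RS_ROOT + "/" + rs + "/" + queue, false));
        }
      }
      int v1 = zk.exists(RS_ROOT, false).getCversion();
      // The race described above: RS1 claiming a queue of RS2 only touches grandchildren,
      // so v1 can still equal v0 and the stale scan (missing peerid-RS2) is accepted.
      if (v0 == v1) {
        return wals;
      }
      // Otherwise a server node appeared or disappeared; rescan.
    }
  }
}
{code}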
[jira] [Assigned] (HBASE-26414) Tracing INSTRUMENTATION_NAME is incorrect
[ https://issues.apache.org/jira/browse/HBASE-26414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng reassigned HBASE-26414: - Assignee: zhuobin zheng (was: Nick Dimiduk) > Tracing INSTRUMENTATION_NAME is incorrect > - > > Key: HBASE-26414 > URL: https://issues.apache.org/jira/browse/HBASE-26414 > Project: HBase > Issue Type: Bug > Components: tracing >Affects Versions: 2.5.0, 3.0.0-alpha-2 >Reporter: Nick Dimiduk >Assignee: zhuobin zheng >Priority: Blocker > Fix For: 2.5.0, 3.0.0-alpha-2 > > > I believe the value we use for {{TraceUtil#INSTRUMENTATION_NAME}}, > {{"io.opentelemetry.contrib.hbase"}}, is incorrect. According to the java > docs, > {noformat} >* @param instrumentationName The name of the instrumentation library, not > the name of the >* instrument*ed* library (e.g., "io.opentelemetry.contrib.mongodb"). > Must not be null. > {noformat} > This namespace appears to be reserved for implementations shipped by the otel > project, found under > https://github.com/open-telemetry/opentelemetry-java-instrumentation/tree/main/instrumentation > I don't have a suggestion for a suitable name at this time. Will report back. -- This message was sent by Atlassian Jira (v8.20.1#820001)
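For reference, acquiring a tracer under a project-owned instrumentation scope name looks like the sketch below. The name used here is purely hypothetical (the issue above explicitly leaves the final choice open); the point is only that the value should identify the instrumentation code shipped by the project rather than reuse the io.opentelemetry.contrib.* namespace reserved for the otel project's own instrumentations.
{code:java}
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.Tracer;

public class InstrumentationNameSketch {
  // Hypothetical value; not a name proposed in the issue.
  private static final String INSTRUMENTATION_NAME = "org.apache.hbase";

  public static void main(String[] args) {
    Tracer tracer = GlobalOpenTelemetry.getTracer(INSTRUMENTATION_NAME);
    Span span = tracer.spanBuilder("example-operation").startSpan();
    try {
      // traced work would go here
    } finally {
      span.end();
    }
  }
}
{code}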
[jira] [Updated] (HBASE-26467) Wrong Cell Generated by MemStoreLABImpl.forceCopyOfBigCellInto when Cell size bigger than data chunk size
[ https://issues.apache.org/jira/browse/HBASE-26467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-26467: -- Status: Patch Available (was: In Progress) https://github.com/apache/hbase/pull/3858 > Wrong Cell Generated by MemStoreLABImpl.forceCopyOfBigCellInto when Cell size > bigger than data chunk size > -- > > Key: HBASE-26467 > URL: https://issues.apache.org/jira/browse/HBASE-26467 > Project: HBase > Issue Type: Bug >Reporter: zhuobin zheng >Assignee: zhuobin zheng >Priority: Critical > > In our company 2.X cluster. I found some region compaction keeps failling > because some cell can't construct succefully. In fact , we even can't read > these cell. > From follow stack , we can found the bug cause KeyValue can't constructed. > Simple Log and Stack: > {code:java} > // code placeholder > 2021-11-18 16:50:47,708 ERROR [regionserver/:60020-longCompactions-4] > regionserver.CompactSplit: Compaction failed > region=xx_table,3610ff49595a0fc4a824f2a575f37a31,1570874723992.dac703ceb35e8d8703233bebf34ae49f., > storeName=c, priority=-319, startTime=1637225447127 > java.lang.IllegalArgumentException: Invalid tag length at position=4659867, > tagLength=0, > at > org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueTagBytes(KeyValueUtil.java:685) > at > org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueBytes(KeyValueUtil.java:643) > at org.apache.hadoop.hbase.KeyValue.(KeyValue.java:345) > at > org.apache.hadoop.hbase.SizeCachedKeyValue.(SizeCachedKeyValue.java:43) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.getCell(HFileReaderImpl.java:981) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:233) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:418) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:322) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:288) > at > org.apache.hadoop.hbase.regionserver.compactions.Compactor.createScanner(Compactor.java:487) > at > org.apache.hadoop.hbase.regionserver.compactions.Compactor$1.createScanner(Compactor.java:248) > at > org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:318) > at > org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:65) > at > org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:126) > at > org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1468) > at > org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2266) > at > org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:624) > at > org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:666) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:748) {code} > From further observation, I found the following characteristics: > # Cell size more than 2M > # We can reproduce the bug only after in memory compact > # Cell bytes end with \x00\x02\x00\x00 > > In fact, the root reason is method (MemStoreLABImpl.forceCopyOfBigCellInto) > which only invoked when cell bigger than data chunk size construct cell with > wrong length. So there are 4 bytes (chunk head size) append end of the cell > bytes. 
-- This message was sent by Atlassian Jira (v8.20.1#820001)
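The shape of the bug described above can be illustrated with a small, self-contained sketch (this is not the actual MemStoreLABImpl code; names and layout are simplified). A "big" cell gets its own chunk whose first 4 bytes are the chunk header, so the rebuilt cell must use the cell's serialized length, not the chunk capacity; otherwise the 4 header bytes effectively trail the cell data and reads fail with errors like the "Invalid tag length" above.
{code:java}
import java.nio.ByteBuffer;

public class BigCellCopySketch {
  static final int CHUNK_HEADER_SIZE = Integer.BYTES; // 4-byte chunk id at the front of a chunk

  // Copy a serialized cell into a dedicated chunk sized "header + cell".
  static ByteBuffer copyBigCell(byte[] serializedCell, int chunkId) {
    ByteBuffer chunk = ByteBuffer.allocate(CHUNK_HEADER_SIZE + serializedCell.length);
    chunk.putInt(chunkId);        // chunk header
    chunk.put(serializedCell);    // cell payload starts at offset 4

    int cellOffset = CHUNK_HEADER_SIZE;
    int correctLength = serializedCell.length;  // what the rebuilt cell must use
    int buggyLength = chunk.capacity();         // cell size + 4 header bytes -> corrupt cell
    // Rebuilding the cell from (chunk.array(), cellOffset, correctLength) round-trips cleanly;
    // using buggyLength leaves 4 stray bytes at the end, matching the corrupt cells above.
    assert cellOffset + correctLength == buggyLength;
    return chunk;
  }
}
{code}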
[jira] [Work started] (HBASE-26467) Wrong Cell Generated by MemStoreLABImpl.forceCopyOfBigCellInto when Cell size bigger than data chunk size
[ https://issues.apache.org/jira/browse/HBASE-26467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-26467 started by zhuobin zheng. - > Wrong Cell Generated by MemStoreLABImpl.forceCopyOfBigCellInto when Cell size > bigger than data chunk size > -- > > Key: HBASE-26467 > URL: https://issues.apache.org/jira/browse/HBASE-26467 > Project: HBase > Issue Type: Bug >Reporter: zhuobin zheng >Assignee: zhuobin zheng >Priority: Critical > > In our company 2.X cluster. I found some region compaction keeps failling > because some cell can't construct succefully. In fact , we even can't read > these cell. > From follow stack , we can found the bug cause KeyValue can't constructed. > Simple Log and Stack: > {code:java} > // code placeholder > 2021-11-18 16:50:47,708 ERROR [regionserver/:60020-longCompactions-4] > regionserver.CompactSplit: Compaction failed > region=xx_table,3610ff49595a0fc4a824f2a575f37a31,1570874723992.dac703ceb35e8d8703233bebf34ae49f., > storeName=c, priority=-319, startTime=1637225447127 > java.lang.IllegalArgumentException: Invalid tag length at position=4659867, > tagLength=0, > at > org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueTagBytes(KeyValueUtil.java:685) > at > org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueBytes(KeyValueUtil.java:643) > at org.apache.hadoop.hbase.KeyValue.(KeyValue.java:345) > at > org.apache.hadoop.hbase.SizeCachedKeyValue.(SizeCachedKeyValue.java:43) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.getCell(HFileReaderImpl.java:981) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:233) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:418) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:322) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:288) > at > org.apache.hadoop.hbase.regionserver.compactions.Compactor.createScanner(Compactor.java:487) > at > org.apache.hadoop.hbase.regionserver.compactions.Compactor$1.createScanner(Compactor.java:248) > at > org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:318) > at > org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:65) > at > org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:126) > at > org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1468) > at > org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2266) > at > org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:624) > at > org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:666) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:748) {code} > From further observation, I found the following characteristics: > # Cell size more than 2M > # We can reproduce the bug only after in memory compact > # Cell bytes end with \x00\x02\x00\x00 > > In fact, the root reason is method (MemStoreLABImpl.forceCopyOfBigCellInto) > which only invoked when cell bigger than data chunk size construct cell with > wrong length. So there are 4 bytes (chunk head size) append end of the cell > bytes. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26467) Wrong Cell Generated by MemStoreLABImpl.forceCopyOfBigCellInto when Cell size bigger than data chunk size
[ https://issues.apache.org/jira/browse/HBASE-26467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-26467: -- Description: In our company 2.X cluster, I found that compaction for some regions keeps failing because some cells cannot be constructed successfully. In fact, we cannot even read these cells. From the following stack, we can see that the bug causes the KeyValue construction to fail. Simple Log and Stack: {code:java} // code placeholder 2021-11-18 16:50:47,708 ERROR [regionserver/:60020-longCompactions-4] regionserver.CompactSplit: Compaction failed region=xx_table,3610ff49595a0fc4a824f2a575f37a31,1570874723992.dac703ceb35e8d8703233bebf34ae49f., storeName=c, priority=-319, startTime=1637225447127 java.lang.IllegalArgumentException: Invalid tag length at position=4659867, tagLength=0, at org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueTagBytes(KeyValueUtil.java:685) at org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueBytes(KeyValueUtil.java:643) at org.apache.hadoop.hbase.KeyValue.(KeyValue.java:345) at org.apache.hadoop.hbase.SizeCachedKeyValue.(SizeCachedKeyValue.java:43) at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.getCell(HFileReaderImpl.java:981) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:233) at org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:418) at org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:322) at org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:288) at org.apache.hadoop.hbase.regionserver.compactions.Compactor.createScanner(Compactor.java:487) at org.apache.hadoop.hbase.regionserver.compactions.Compactor$1.createScanner(Compactor.java:248) at org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:318) at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:65) at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:126) at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1468) at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2266) at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:624) at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:666) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748) {code} From further observation, I found the following characteristics: # Cell size is more than 2M # We can reproduce the bug only after an in-memory compaction # The cell bytes end with \x00\x02\x00\x00 In fact, the root cause is the method MemStoreLABImpl.forceCopyOfBigCellInto (only invoked when a cell is bigger than the data chunk size), which constructs the cell with the wrong length. As a result, 4 extra bytes (the chunk header size) are appended to the end of the cell bytes. was: In our company 2.X cluster, I found that compaction for some regions keeps failing because some cells cannot be constructed successfully. In fact, we cannot even read these cells. From the following stack, we can see that the bug causes the KeyValue construction to fail.
Simple Log and Stack: {code:java} // code placeholder 2021-11-18 16:50:47,708 ERROR [regionserver/:60020-longCompactions-4] regionserver.CompactSplit: Compaction failed region=xx_table,3610ff49595a0fc4a824f2a575f37a31,1570874723992.dac703ceb35e8d8703233bebf34ae49f., storeName=c, priority=-319, startTime=1637225447127 java.lang.IllegalArgumentException: Invalid tag length at position=4659867, tagLength=0, at org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueTagBytes(KeyValueUtil.java:685) at org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueBytes(KeyValueUtil.java:643) at org.apache.hadoop.hbase.KeyValue.(KeyValue.java:345) at org.apache.hadoop.hbase.SizeCachedKeyValue.(SizeCachedKeyValue.java:43) at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.getCell(HFileReaderImpl.java:981) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:233) at org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:418) at org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:322) at org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:288) at org.apache.hadoop.hbase.regionserver.compactions.Compactor.createScanner(Compactor.java:487) at org.apache.hadoop.hbase.regionserver.compactions.Compactor$1.createScanner(Compactor.java:248)
[jira] [Assigned] (HBASE-26467) Wrong Cell Generated by MemStoreLABImpl.forceCopyOfBigCellInto when Cell size bigger than data chunk size
[ https://issues.apache.org/jira/browse/HBASE-26467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng reassigned HBASE-26467: - Assignee: zhuobin zheng > Wrong Cell Generated by MemStoreLABImpl.forceCopyOfBigCellInto when Cell size > bigger than data chunk size > -- > > Key: HBASE-26467 > URL: https://issues.apache.org/jira/browse/HBASE-26467 > Project: HBase > Issue Type: Bug >Reporter: zhuobin zheng >Assignee: zhuobin zheng >Priority: Critical > > In our company 2.X cluster. I found some region compaction keeps failling > because some cell can't construct succefully. In fact , we even can't read > these cell. > From follow stack , we can found the bug cause KeyValue can't constructed. > Simple Log and Stack: > {code:java} > // code placeholder > 2021-11-18 16:50:47,708 ERROR [regionserver/:60020-longCompactions-4] > regionserver.CompactSplit: Compaction failed > region=xx_table,3610ff49595a0fc4a824f2a575f37a31,1570874723992.dac703ceb35e8d8703233bebf34ae49f., > storeName=c, priority=-319, startTime=1637225447127 > java.lang.IllegalArgumentException: Invalid tag length at position=4659867, > tagLength=0, > at > org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueTagBytes(KeyValueUtil.java:685) > at > org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueBytes(KeyValueUtil.java:643) > at org.apache.hadoop.hbase.KeyValue.(KeyValue.java:345) > at > org.apache.hadoop.hbase.SizeCachedKeyValue.(SizeCachedKeyValue.java:43) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.getCell(HFileReaderImpl.java:981) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:233) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:418) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:322) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:288) > at > org.apache.hadoop.hbase.regionserver.compactions.Compactor.createScanner(Compactor.java:487) > at > org.apache.hadoop.hbase.regionserver.compactions.Compactor$1.createScanner(Compactor.java:248) > at > org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:318) > at > org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:65) > at > org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:126) > at > org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1468) > at > org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2266) > at > org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:624) > at > org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:666) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:748) {code} > From further observation, I found the following characteristics: > # Cell size more than 2M > # We can reproduce the bug only after in memory compact > # Cell bytes end with \x00\x02\x00\x00 > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HBASE-26467) Wrong Cell Generated by MemStoreLABImpl.forceCopyOfBigCellInto when Cell size bigger than data chunk size
zhuobin zheng created HBASE-26467: - Summary: Wrong Cell Generated by MemStoreLABImpl.forceCopyOfBigCellInto when Cell size bigger than data chunk size Key: HBASE-26467 URL: https://issues.apache.org/jira/browse/HBASE-26467 Project: HBase Issue Type: Bug Reporter: zhuobin zheng In our company 2.X cluster. I found some region compaction keeps failling because some cell can't construct succefully. In fact , we even can't read these cell. >From follow stack , we can found the bug cause KeyValue can't constructed. Simple Log and Stack: {code:java} // code placeholder 2021-11-18 16:50:47,708 ERROR [regionserver/:60020-longCompactions-4] regionserver.CompactSplit: Compaction failed region=xx_table,3610ff49595a0fc4a824f2a575f37a31,1570874723992.dac703ceb35e8d8703233bebf34ae49f., storeName=c, priority=-319, startTime=1637225447127 java.lang.IllegalArgumentException: Invalid tag length at position=4659867, tagLength=0, at org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueTagBytes(KeyValueUtil.java:685) at org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueBytes(KeyValueUtil.java:643) at org.apache.hadoop.hbase.KeyValue.(KeyValue.java:345) at org.apache.hadoop.hbase.SizeCachedKeyValue.(SizeCachedKeyValue.java:43) at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.getCell(HFileReaderImpl.java:981) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:233) at org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:418) at org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:322) at org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:288) at org.apache.hadoop.hbase.regionserver.compactions.Compactor.createScanner(Compactor.java:487) at org.apache.hadoop.hbase.regionserver.compactions.Compactor$1.createScanner(Compactor.java:248) at org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:318) at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:65) at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:126) at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1468) at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2266) at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:624) at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:666) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748) {code} >From further observation, I found the following characteristics: # Cell size more than 2M # We can reproduce the bug only after in memory compact # Cell bytes end with \x00\x02\x00\x00 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HBASE-26022) DNS jitter causes hbase client to get stuck
[ https://issues.apache.org/jira/browse/HBASE-26022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368577#comment-17368577 ] zhuobin zheng commented on HBASE-26022: --- In the *master branch*, it seems RpcClient dynamically generates the server principal before creating the saslClient every time, so it is not a problem there. But it does appear to be a problem in branch-1. I will try to fix it later. > DNS jitter causes hbase client to get stuck > --- > > Key: HBASE-26022 > URL: https://issues.apache.org/jira/browse/HBASE-26022 > Project: HBase > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: zhuobin zheng >Assignee: zhuobin zheng >Priority: Major > > In our production hbase cluster, we occasionally encounter the errors below, and > they stall hbase for a long time. After that, hbase requests to this machine fail > forever. > {code:java} > WARN org.apache.hadoop.security.UserGroupInformation: > PriviledgedActionException as:${user@realm} (auth:KERBEROS) > cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by > GSSException: No valid credentials provided (Mechanism level: Server not > found in Kerberos database (7) - LOOKING_UP_SERVER)] > WARN org.apache.hadoop.security.UserGroupInformation: > PriviledgedActionException as:${user@realm} (auth:KERBEROS) > cause:java.io.IOException: Couldn't setup connection for ${user@realm} to > hbase/${ip}@realm > {code} > The main problem is that the real server principal we generated in the KDC is > hbase/*${hostname}*@realm, so hbase/*${ip}*@realm can never be found in the KDC. > When an RpcClientImpl#Connection is constructed, the serverPrincipal field, which never > changes afterwards, is generated via InetAddress.getCanonicalHostName(), which > returns the IP when it fails to resolve the hostname. > Therefore, if DNS jitters while RpcClientImpl#Connection is being constructed, this connection will > never set up the sasl environment. And I do not see any connection-abandon logic in the sasl-failure > code path. > I can think of two solutions to this problem: > # Abandon the connection when sasl fails, so the next request will reconstruct the > connection and regenerate a new server principal. > # Refresh the serverPrincipal field when sasl fails, so the next retry will use a new > server principal. > HBase Version: 1.2.0-cdh5.14.4 -- This message was sent by Atlassian Jira (v8.3.4#803005)
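A minimal sketch of the second option (refreshing the principal instead of caching it once) might look like the following. This is illustrative only, not the actual RpcClientImpl change; the _HOST-template substitution mirrors the common Hadoop convention, and the class and method names here are made up.
{code:java}
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;

public class ServerPrincipalSketch {
  // Recompute the Kerberos server principal from a fresh reverse lookup on every
  // (re)connect attempt. If DNS is flaky at construction time, getCanonicalHostName()
  // can return the bare IP, and hbase/<ip>@REALM will never exist in the KDC.
  static String serverPrincipal(String principalTemplate, InetSocketAddress server)
      throws IOException {
    InetAddress addr = server.getAddress();
    if (addr == null) {
      throw new IOException("Unresolved address " + server);
    }
    String host = addr.getCanonicalHostName().toLowerCase();
    // e.g. "hbase/_HOST@EXAMPLE.COM" -> "hbase/rs-host.example.com@EXAMPLE.COM"
    return principalTemplate.replace("_HOST", host);
  }
}
{code}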
[jira] [Assigned] (HBASE-26022) DNS jitter causes hbase client to get stuck
[ https://issues.apache.org/jira/browse/HBASE-26022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng reassigned HBASE-26022: - Assignee: zhuobin zheng > DNS jitter causes hbase client to get stuck > --- > > Key: HBASE-26022 > URL: https://issues.apache.org/jira/browse/HBASE-26022 > Project: HBase > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: zhuobin zheng >Assignee: zhuobin zheng >Priority: Major > > In our product hbase cluster, we occasionally encounter below errors, and > stuck hbase a long time. Then hbase requests to this machine will fail > forever. > {code:java} > WARN org.apache.hadoop.security.UserGroupInformation: > PriviledgedActionException as:${user@realm} (auth:KERBEROS) > cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by > GSSException: No valid credentials provided (Mechanism level: Server not > found in Kerberos database (7) - LOOKING_UP_SERVER)] > WARN org.apache.hadoop.security.UserGroupInformation: > PriviledgedActionException as:${user@realm} (auth:KERBEROS) > cause:java.io.IOException: Couldn't setup connection for ${user@realm} to > hbase/${ip}@realm > {code} > The main problem is the trully server principal we generated in KDC is > hbase/*${hostname}*@realm, so we must can't find hbase/*${ip}*@realm in KDC. > When RpcClientImpl#Connection construct, the field serverPrincial which never > changed generated by method InetAddress.getCanonicalHostName() which will > return IP when failed to get hostname. > Therefor, once DNS jitter when RpcClientImpl#Connection, this connection will > never setup sasl env. And I'm not see connection abandon logic in sasl failed > code path. > I think of two solutions to this problem: > # Abandon connection when sasl failed. So next request will reconstruct a > connection, and will regenerate a new server principal. > # Refresh serverPrincial field when sasl failed. So next retry will use new > server principal. > HBase Version: 1.2.0-cdh5.14.4 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-26022) DNS jitter causes hbase client to get stuck
[ https://issues.apache.org/jira/browse/HBASE-26022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-26022: -- Description: In our product hbase cluster, we occasionally encounter below errors, and stuck hbase a long time. Then hbase requests to this machine will fail forever. {code:java} WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:${user@realm} (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)] WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:${user@realm} (auth:KERBEROS) cause:java.io.IOException: Couldn't setup connection for ${user@realm} to hbase/${ip}@realm {code} The main problem is the trully server principal we generated in KDC is hbase/*${hostname}*@realm, so we must can't find hbase/*${ip}*@realm in KDC. When RpcClientImpl#Connection construct, the field serverPrincial which never changed generated by method InetAddress.getCanonicalHostName() which will return IP when failed to get hostname. Therefor, once DNS jitter when RpcClientImpl#Connection, this connection will never setup sasl env. And I'm not see connection abandon logic in sasl failed code path. I think of two solutions to this problem: # Abandon connection when sasl failed. So next request will reconstruct a connection, and will regenerate a new server principal. # Refresh serverPrincial field when sasl failed. So next retry will use new server principal. HBase Version: 1.2.0-cdh5.14.4 was: In our product hbase cluster, we occasionally encounter errors > DNS jitter causes hbase client to get stuck > --- > > Key: HBASE-26022 > URL: https://issues.apache.org/jira/browse/HBASE-26022 > Project: HBase > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: zhuobin zheng >Priority: Major > > In our product hbase cluster, we occasionally encounter below errors, and > stuck hbase a long time. Then hbase requests to this machine will fail > forever. > {code:java} > WARN org.apache.hadoop.security.UserGroupInformation: > PriviledgedActionException as:${user@realm} (auth:KERBEROS) > cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by > GSSException: No valid credentials provided (Mechanism level: Server not > found in Kerberos database (7) - LOOKING_UP_SERVER)] > WARN org.apache.hadoop.security.UserGroupInformation: > PriviledgedActionException as:${user@realm} (auth:KERBEROS) > cause:java.io.IOException: Couldn't setup connection for ${user@realm} to > hbase/${ip}@realm > {code} > The main problem is the trully server principal we generated in KDC is > hbase/*${hostname}*@realm, so we must can't find hbase/*${ip}*@realm in KDC. > When RpcClientImpl#Connection construct, the field serverPrincial which never > changed generated by method InetAddress.getCanonicalHostName() which will > return IP when failed to get hostname. > Therefor, once DNS jitter when RpcClientImpl#Connection, this connection will > never setup sasl env. And I'm not see connection abandon logic in sasl failed > code path. > I think of two solutions to this problem: > # Abandon connection when sasl failed. So next request will reconstruct a > connection, and will regenerate a new server principal. > # Refresh serverPrincial field when sasl failed. So next retry will use new > server principal. 
> HBase Version: 1.2.0-cdh5.14.4 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-26022) DNS jitter causes hbase client to get stuck
[ https://issues.apache.org/jira/browse/HBASE-26022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-26022: -- Description: In our product hbase cluster, we occasionally encounter errors > DNS jitter causes hbase client to get stuck > --- > > Key: HBASE-26022 > URL: https://issues.apache.org/jira/browse/HBASE-26022 > Project: HBase > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: zhuobin zheng >Priority: Major > > In our product hbase cluster, we occasionally encounter errors > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HBASE-26022) DNS jitter causes hbase client to get stuck
zhuobin zheng created HBASE-26022: - Summary: DNS jitter causes hbase client to get stuck Key: HBASE-26022 URL: https://issues.apache.org/jira/browse/HBASE-26022 Project: HBase Issue Type: Bug Affects Versions: 1.2.0 Reporter: zhuobin zheng -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-21183) loadIncrementalHFiles sometimes throws FileNotFoundException on retry
[ https://issues.apache.org/jira/browse/HBASE-21183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17297137#comment-17297137 ] zhuobin zheng commented on HBASE-21183: --- This may be caused by https://issues.apache.org/jira/browse/HBASE-19065. # The client requests a bulkload and the server moves the file to /hbase/data/${namespace}/\{table}/\{region}/\{columnfamily}/ # A concurrent flush causes the bulkload to fail # The bulkload client retries and fails because the file no longer exists. > loadIncrementalHFiles sometimes throws FileNotFoundException on retry > - > > Key: HBASE-21183 > URL: https://issues.apache.org/jira/browse/HBASE-21183 > Project: HBase > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: Tim Robertson >Priority: Major > > On a nightly batch job which prepares 100s of well balanced HFiles at around > 2GB each, we see sporadic failures in a bulk load. > I'm unable to paste the logs here (different network) but they show e.g. the > following on a failing day: > {code:java} > Trying to load hfile... /my/input/path/... > Attempt to bulk load region containing ... failed. This is recoverable and > will be retried > Attempt to bulk load region containing ... failed. This is recoverable and > will be retried > Attempt to bulk load region containing ... failed. This is recoverable and > will be retried > Split occurred while grouping HFiles, retry attempt 1 with 3 files remaining > to group or split > Trying to load hfile... > IOException during splitting > java.io.FileNotFoundException: File does not exist: /my/input/path/... > {code} > The exception gets thrown from [this > line|https://github.com/apache/hbase/blob/branch-1.2/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java#L685]. > > I should note that this is a secure cluster (CDH 5.12.x). > I've tried to go through the code, and don't spot an obvious race condition. > I don't spot any changes related to this for the later 1.x versions so > presume this exists in 1.5. > I'm yet to get access to the NameNode audit logs when this occurs to trace > through the rename() calls around these particular files. > I don't see timeouts like HBASE-4030 -- This message was sent by Atlassian Jira (v8.3.4#803005)
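As an illustration of the race sketched in that comment (not a proposed patch; the helper below is hypothetical), a retry path could check whether the source HFile is still where the client left it before re-splitting or re-sending it, since the server may already have moved it into the column-family directory even though the RPC was reported as failed.
{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BulkLoadRetrySketch {
  // Only retry while the source file still exists; if it is gone, the first attempt most
  // likely succeeded server-side (the file was renamed into the region's family dir) and
  // retrying would just hit FileNotFoundException as in the report above.
  static boolean safeToRetry(Configuration conf, Path sourceHFile) throws IOException {
    FileSystem fs = sourceHFile.getFileSystem(conf);
    return fs.exists(sourceHFile);
  }
}
{code}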
[jira] [Updated] (HBASE-25512) May throw StringIndexOutOfBoundsException when construct illegal tablename error
[ https://issues.apache.org/jira/browse/HBASE-25512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-25512: -- Fix Version/s: 2.5.0 1.7.0 3.0.0-alpha-1 > May throw StringIndexOutOfBoundsException when construct illegal tablename > error > > > Key: HBASE-25512 > URL: https://issues.apache.org/jira/browse/HBASE-25512 > Project: HBase > Issue Type: Bug >Affects Versions: 1.2.12, 1.4.13, 2.4.1 >Reporter: zhuobin zheng >Assignee: zhuobin zheng >Priority: Trivial > Fix For: 3.0.0-alpha-1, 1.7.0, 2.5.0 > > > When call Method: > {code:java} > // code placeholder > TableName.isLegalTableQualifierName(final byte[] qualifierName, int start, > int end) > {code} > We want to construct quelifier String to print pretty error message. So we > call method: > {code:java} > // code placeholder > Bytes.toString(final byte[] b, int off, int len) > Bytes.toString(qualifierName, start, end) > {code} > But the param is wrong, we shoud pass *${Length}* instead of *${end index}* > to *Bytes.toString third param*. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HBASE-25512) May throw StringIndexOutOfBoundsException when construct illegal tablename error
[ https://issues.apache.org/jira/browse/HBASE-25512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng reassigned HBASE-25512: - Assignee: zhuobin zheng > May throw StringIndexOutOfBoundsException when construct illegal tablename > error > > > Key: HBASE-25512 > URL: https://issues.apache.org/jira/browse/HBASE-25512 > Project: HBase > Issue Type: Bug >Affects Versions: 1.2.12, 1.4.13, 2.4.1 >Reporter: zhuobin zheng >Assignee: zhuobin zheng >Priority: Trivial > > When call Method: > {code:java} > // code placeholder > TableName.isLegalTableQualifierName(final byte[] qualifierName, int start, > int end) > {code} > We want to construct quelifier String to print pretty error message. So we > call method: > {code:java} > // code placeholder > Bytes.toString(final byte[] b, int off, int len) > Bytes.toString(qualifierName, start, end) > {code} > But the param is wrong, we shoud pass *${Length}* instead of *${end index}* > to *Bytes.toString third param*. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens
[ https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17266971#comment-17266971 ] zhuobin zheng commented on HBASE-25510: --- Thanks for taking a look [~vjasani]. 1. I used JMH for the benchmark, and the benchmark code is provided in the attachments ([^TestTableNameJMH.java]) 2. The errors that showed up in the results were a bug in my benchmark code (an int overflow, so a negative index went out of the bounds of the array). I fixed the bug and re-uploaded a benchmark result: [^optimiz_benchmark_fix]. > Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the > number of tables in the cluster is greater than dozens > -- > > Key: HBASE-25510 > URL: https://issues.apache.org/jira/browse/HBASE-25510 > Project: HBase > Issue Type: Improvement > Components: master, Replication >Affects Versions: 1.2.12, 1.4.13, 2.4.1 >Reporter: zhuobin zheng >Assignee: zhuobin zheng >Priority: Major > Fix For: 3.0.0-alpha-1, 1.7.0, 2.5.0 > > Attachments: TestTableNameJMH.java, optimiz_benchmark, > optimiz_benchmark_fix, origin_benchmark, stucks-profile-info > > > Now, TableName.valueOf tries to find the TableName object in the cache > linearly (code shown below), so it is too slow when we have thousands of > tables on the cluster. > {code:java} > // code placeholder > for (TableName tn : tableCache) { > if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), > bns)) { > return tn; > } > }{code} > I tried storing the objects in a hash table instead, so lookups are faster. The > code looks like this: > {code:java} > // code placeholder > TableName oldTable = tableCache.get(nameAsStr);{code} > > Our cluster has tens of thousands of tables (most of them are KYLIN > tables). > We found that in the following two cases, the TableName.valueOf method > severely restricts our performance. > > Common premise: tens of thousands of tables in the cluster > cause: TableName.valueOf has low performance (because we need to traverse > the whole cache linearly) > > Case 1. Replication > premise 1: one of the tables is written with high qps, small values, and non-batched requests, > producing a lot of WAL entries > premise 2: deserializing a WAL entry includes calling the TableName.valueOf method. > Result: replication gets stuck and a lot of WAL files pile up. > > Case 2. Active Master start-up > NamespaceStateManager init has to init all RegionInfo, and RegionInfo init > calls TableName.valueOf. This can take a while if TableName.valueOf is > slow. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
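The change described above boils down to replacing the linear scan over the cache with a keyed map lookup. The sketch below is a simplified illustration of that idea, not the actual TableName implementation (the real class also caches by namespace and qualifier bytes); class and method names here are illustrative.
{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public final class TableNameInterner {
  // Keyed by the fully qualified name string, so lookup is a single hash probe
  // instead of a linear scan over every cached entry.
  private static final ConcurrentMap<String, TableNameInterner> CACHE = new ConcurrentHashMap<>();

  private final String nameAsString;

  private TableNameInterner(String nameAsString) {
    this.nameAsString = nameAsString;
  }

  public static TableNameInterner valueOf(String nameAsString) {
    TableNameInterner existing = CACHE.get(nameAsString);   // O(1) fast path
    if (existing != null) {
      return existing;
    }
    // computeIfAbsent preserves "same instance for the same name" under concurrency.
    return CACHE.computeIfAbsent(nameAsString, TableNameInterner::new);
  }

  public String getNameAsString() {
    return nameAsString;
  }
}
{code}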
[jira] [Updated] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens
[ https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-25510: -- Attachment: (was: TestTableNameJMH.java) > Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the > number of tables in the cluster is greater than dozens > -- > > Key: HBASE-25510 > URL: https://issues.apache.org/jira/browse/HBASE-25510 > Project: HBase > Issue Type: Improvement > Components: master, Replication >Affects Versions: 1.2.12, 1.4.13, 2.4.1 >Reporter: zhuobin zheng >Assignee: zhuobin zheng >Priority: Major > Fix For: 3.0.0-alpha-1, 1.7.0, 2.5.0 > > Attachments: TestTableNameJMH.java, optimiz_benchmark, > optimiz_benchmark_fix, origin_benchmark, stucks-profile-info > > > Now, TableName.valueOf will try to find TableName Object in cache > linearly(code show as below). So it is too slow when we has thousands of > tables on cluster. > {code:java} > // code placeholder > for (TableName tn : tableCache) { > if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), > bns)) { > return tn; > } > }{code} > I try to store the object in the hash table, so it can look up more quickly. > code like this > {code:java} > // code placeholder > TableName oldTable = tableCache.get(nameAsStr);{code} > > In our cluster which has tens thousands of tables. (Most of that is KYLIN > table). > We found that in the following two cases, the TableName.valueOf method will > severely restrict our performance. > > Common premise: tens of thousands table in cluster > cause: TableName.valueOf with low performance. (because we need to traverse > all caches linearly) > > Case1. Replication > premise1: one of table write with high qps, small value, Non-batch request. > cause too much wal entry > premise2: deserialize WAL Entry includes calling the TableName.valueOf method. > Cause: Replicat Stuck. A lot of WAL files pile up. > > Case2. Active Master Start up > NamespaceStateManager init should init all RegionInfo, and regioninfo init > will call TableName.valueOf. It will cost some time if TableName.valueOf is > slow. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens
[ https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-25510: -- Attachment: TestTableNameJMH.java > Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the > number of tables in the cluster is greater than dozens > -- > > Key: HBASE-25510 > URL: https://issues.apache.org/jira/browse/HBASE-25510 > Project: HBase > Issue Type: Improvement > Components: master, Replication >Affects Versions: 1.2.12, 1.4.13, 2.4.1 >Reporter: zhuobin zheng >Assignee: zhuobin zheng >Priority: Major > Fix For: 3.0.0-alpha-1, 1.7.0, 2.5.0 > > Attachments: TestTableNameJMH.java, optimiz_benchmark, > optimiz_benchmark_fix, origin_benchmark, stucks-profile-info > > > Now, TableName.valueOf will try to find TableName Object in cache > linearly(code show as below). So it is too slow when we has thousands of > tables on cluster. > {code:java} > // code placeholder > for (TableName tn : tableCache) { > if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), > bns)) { > return tn; > } > }{code} > I try to store the object in the hash table, so it can look up more quickly. > code like this > {code:java} > // code placeholder > TableName oldTable = tableCache.get(nameAsStr);{code} > > In our cluster which has tens thousands of tables. (Most of that is KYLIN > table). > We found that in the following two cases, the TableName.valueOf method will > severely restrict our performance. > > Common premise: tens of thousands table in cluster > cause: TableName.valueOf with low performance. (because we need to traverse > all caches linearly) > > Case1. Replication > premise1: one of table write with high qps, small value, Non-batch request. > cause too much wal entry > premise2: deserialize WAL Entry includes calling the TableName.valueOf method. > Cause: Replicat Stuck. A lot of WAL files pile up. > > Case2. Active Master Start up > NamespaceStateManager init should init all RegionInfo, and regioninfo init > will call TableName.valueOf. It will cost some time if TableName.valueOf is > slow. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens
[ https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-25510: -- Attachment: optimiz_benchmark_fix > Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the > number of tables in the cluster is greater than dozens > -- > > Key: HBASE-25510 > URL: https://issues.apache.org/jira/browse/HBASE-25510 > Project: HBase > Issue Type: Improvement > Components: master, Replication >Affects Versions: 1.2.12, 1.4.13, 2.4.1 >Reporter: zhuobin zheng >Assignee: zhuobin zheng >Priority: Major > Fix For: 3.0.0-alpha-1, 1.7.0, 2.5.0 > > Attachments: TestTableNameJMH.java, optimiz_benchmark, > optimiz_benchmark_fix, origin_benchmark, stucks-profile-info > > > Now, TableName.valueOf will try to find TableName Object in cache > linearly(code show as below). So it is too slow when we has thousands of > tables on cluster. > {code:java} > // code placeholder > for (TableName tn : tableCache) { > if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), > bns)) { > return tn; > } > }{code} > I try to store the object in the hash table, so it can look up more quickly. > code like this > {code:java} > // code placeholder > TableName oldTable = tableCache.get(nameAsStr);{code} > > In our cluster which has tens thousands of tables. (Most of that is KYLIN > table). > We found that in the following two cases, the TableName.valueOf method will > severely restrict our performance. > > Common premise: tens of thousands table in cluster > cause: TableName.valueOf with low performance. (because we need to traverse > all caches linearly) > > Case1. Replication > premise1: one of table write with high qps, small value, Non-batch request. > cause too much wal entry > premise2: deserialize WAL Entry includes calling the TableName.valueOf method. > Cause: Replicat Stuck. A lot of WAL files pile up. > > Case2. Active Master Start up > NamespaceStateManager init should init all RegionInfo, and regioninfo init > will call TableName.valueOf. It will cost some time if TableName.valueOf is > slow. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens
[ https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-25510: -- Attachment: TestTableNameJMH.java > Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the > number of tables in the cluster is greater than dozens > -- > > Key: HBASE-25510 > URL: https://issues.apache.org/jira/browse/HBASE-25510 > Project: HBase > Issue Type: Improvement > Components: master, Replication >Affects Versions: 1.2.12, 1.4.13, 2.4.1 >Reporter: zhuobin zheng >Assignee: zhuobin zheng >Priority: Major > Fix For: 3.0.0-alpha-1, 1.7.0, 2.5.0 > > Attachments: TestTableNameJMH.java, optimiz_benchmark, > origin_benchmark, stucks-profile-info > > > Now, TableName.valueOf will try to find TableName Object in cache > linearly(code show as below). So it is too slow when we has thousands of > tables on cluster. > {code:java} > // code placeholder > for (TableName tn : tableCache) { > if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), > bns)) { > return tn; > } > }{code} > I try to store the object in the hash table, so it can look up more quickly. > code like this > {code:java} > // code placeholder > TableName oldTable = tableCache.get(nameAsStr);{code} > > In our cluster which has tens thousands of tables. (Most of that is KYLIN > table). > We found that in the following two cases, the TableName.valueOf method will > severely restrict our performance. > > Common premise: tens of thousands table in cluster > cause: TableName.valueOf with low performance. (because we need to traverse > all caches linearly) > > Case1. Replication > premise1: one of table write with high qps, small value, Non-batch request. > cause too much wal entry > premise2: deserialize WAL Entry includes calling the TableName.valueOf method. > Cause: Replicat Stuck. A lot of WAL files pile up. > > Case2. Active Master Start up > NamespaceStateManager init should init all RegionInfo, and regioninfo init > will call TableName.valueOf. It will cost some time if TableName.valueOf is > slow. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25512) May throw StringIndexOutOfBoundsException when construct illegal tablename error
[ https://issues.apache.org/jira/browse/HBASE-25512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-25512: -- Priority: Trivial (was: Minor) > May throw StringIndexOutOfBoundsException when construct illegal tablename > error > > > Key: HBASE-25512 > URL: https://issues.apache.org/jira/browse/HBASE-25512 > Project: HBase > Issue Type: Bug >Affects Versions: 1.2.12, 1.4.13, 2.4.1 >Reporter: zhuobin zheng >Priority: Trivial > > When call Method: > {code:java} > // code placeholder > TableName.isLegalTableQualifierName(final byte[] qualifierName, int start, > int end) > {code} > We want to construct quelifier String to print pretty error message. So we > call method: > {code:java} > // code placeholder > Bytes.toString(final byte[] b, int off, int len) > Bytes.toString(qualifierName, start, end) > {code} > But the param is wrong, we shoud pass *${Length}* instead of *${end index}* > to *Bytes.toString third param*. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25512) May throw StringIndexOutOfBoundsException when construct illegal tablename error
[ https://issues.apache.org/jira/browse/HBASE-25512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-25512: -- Priority: Minor (was: Trivial) > May throw StringIndexOutOfBoundsException when construct illegal tablename > error > > > Key: HBASE-25512 > URL: https://issues.apache.org/jira/browse/HBASE-25512 > Project: HBase > Issue Type: Bug >Affects Versions: 1.2.12, 1.4.13, 2.4.1 >Reporter: zhuobin zheng >Priority: Minor > > When call Method: > {code:java} > // code placeholder > TableName.isLegalTableQualifierName(final byte[] qualifierName, int start, > int end) > {code} > We want to construct quelifier String to print pretty error message. So we > call method: > {code:java} > // code placeholder > Bytes.toString(final byte[] b, int off, int len) > Bytes.toString(qualifierName, start, end) > {code} > But the param is wrong, we shoud pass *${Length}* instead of *${end index}* > to *Bytes.toString third param*. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25512) May throw StringIndexOutOfBoundsException when construct illegal tablename error
[ https://issues.apache.org/jira/browse/HBASE-25512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-25512: -- External issue URL: https://github.com/apache/hbase/pull/2884 > May throw StringIndexOutOfBoundsException when construct illegal tablename > error > > > Key: HBASE-25512 > URL: https://issues.apache.org/jira/browse/HBASE-25512 > Project: HBase > Issue Type: Bug >Affects Versions: 1.2.12, 1.4.13, 2.4.1 >Reporter: zhuobin zheng >Priority: Trivial > > When call Method: > {code:java} > // code placeholder > TableName.isLegalTableQualifierName(final byte[] qualifierName, int start, > int end) > {code} > We want to construct quelifier String to print pretty error message. So we > call method: > {code:java} > // code placeholder > Bytes.toString(final byte[] b, int off, int len) > Bytes.toString(qualifierName, start, end) > {code} > But the param is wrong, we shoud pass *${Length}* instead of *${end index}* > to *Bytes.toString third param*. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HBASE-25512) May throw StringIndexOutOfBoundsException when construct illegal tablename error
zhuobin zheng created HBASE-25512: - Summary: May throw StringIndexOutOfBoundsException when construct illegal tablename error Key: HBASE-25512 URL: https://issues.apache.org/jira/browse/HBASE-25512 Project: HBase Issue Type: Bug Affects Versions: 2.4.1, 1.4.13, 1.2.12 Reporter: zhuobin zheng When calling the method: {code:java} // code placeholder TableName.isLegalTableQualifierName(final byte[] qualifierName, int start, int end) {code} we want to construct the qualifier String to print a pretty error message, so we call the method: {code:java} // code placeholder Bytes.toString(final byte[] b, int off, int len) Bytes.toString(qualifierName, start, end) {code} But the parameter is wrong: we should pass the *${length}* instead of the *${end index}* as the *third param of Bytes.toString*. -- This message was sent by Atlassian Jira (v8.3.4#803005)
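A tiny self-contained illustration of the off-by-length issue described above (using java.lang.String in place of HBase's Bytes.toString(byte[], int, int), whose third argument is a length):
{code:java}
import java.nio.charset.StandardCharsets;

public class QualifierMessageSketch {
  // Build the qualifier string for the error message from a (start, end) index pair.
  static String qualifierAsString(byte[] qualifierName, int start, int end) {
    // Buggy form from the report: passing "end" where a length is expected can run past
    // the end of the array and throw StringIndexOutOfBoundsException while the error
    // message is being built. The correct third argument is the length, i.e. end - start.
    return new String(qualifierName, start, end - start, StandardCharsets.UTF_8);
  }
}
{code}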
[jira] [Updated] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens
[ https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-25510: -- Attachment: stucks-profile-info > Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the > number of tables in the cluster is greater than dozens > -- > > Key: HBASE-25510 > URL: https://issues.apache.org/jira/browse/HBASE-25510 > Project: HBase > Issue Type: Improvement > Components: master, Replication >Affects Versions: 1.2.12, 1.4.13, 2.4.1 >Reporter: zhuobin zheng >Priority: Major > Attachments: optimiz_benchmark, origin_benchmark, stucks-profile-info > > > Now, TableName.valueOf will try to find TableName Object in cache > linearly(code show as below). So it is too slow when we has thousands of > tables on cluster. > {code:java} > // code placeholder > for (TableName tn : tableCache) { > if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), > bns)) { > return tn; > } > }{code} > I try to store the object in the hash table, so it can look up more quickly. > code like this > {code:java} > // code placeholder > TableName oldTable = tableCache.get(nameAsStr);{code} > > In our cluster which has tens thousands of tables. (Most of that is KYLIN > table). > We found that in the following two cases, the TableName.valueOf method will > severely restrict our performance. > > Common premise: tens of thousands table in cluster > cause: TableName.valueOf with low performance. (because we need to traverse > all caches linearly) > > Case1. Replication > premise1: one of table write with high qps, small value, Non-batch request. > cause too much wal entry > premise2: deserialize WAL Entry includes calling the TableName.valueOf method. > Cause: Replicat Stuck. A lot of WAL files pile up. > > Case2. Active Master Start up > NamespaceStateManager init should init all RegionInfo, and regioninfo init > will call TableName.valueOf. It will cost some time if TableName.valueOf is > slow. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens
[ https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17265688#comment-17265688 ] zhuobin zheng commented on HBASE-25510: --- Added the MR links and the benchmark attachments. Benchmark result explanation: # testStr means TableName.valueOf(String name) was called # testBB means TableName.valueOf(ByteBuffer namespace, ByteBuffer qualifier) was called # The number after testStr and testBB is the number of distinct TableNames, e.g. 1000 means 1000 different table names. Origin: {code:java}
// code placeholder
Benchmark                        Mode  Cnt       Score       Error   Units
TestTableNameJMH.testBB1        thrpt   10   36132.014 ±  1628.381  ops/ms
TestTableNameJMH.testBB10       thrpt   10   14056.243 ±   638.379  ops/ms
TestTableNameJMH.testBB100      thrpt   10    2215.671 ±    49.759  ops/ms
TestTableNameJMH.testBB1000     thrpt   10     224.802 ±     4.253  ops/ms
TestTableNameJMH.testBB10000    thrpt   10      22.476 ±     4.729  ops/ms
TestTableNameJMH.testBB100000   thrpt   10       1.931 ±     0.578  ops/ms
TestTableNameJMH.testStr1       thrpt   10  147905.572 ± 20777.681  ops/ms
TestTableNameJMH.testStr10      thrpt   10   44597.261 ±  6346.679  ops/ms
TestTableNameJMH.testStr100     thrpt   10    5464.205 ±  1442.556  ops/ms
TestTableNameJMH.testStr1000    thrpt   10     360.183 ±   127.615  ops/ms
TestTableNameJMH.testStr10000   thrpt   10      45.338 ±     3.545  ops/ms
TestTableNameJMH.testStr100000  thrpt   10       1.927 ±     0.831  ops/ms
{code} After optimization: {code:java}
// code placeholder
Benchmark                        Mode  Cnt       Score       Error   Units
TestTableNameJMH.testBB1        thrpt   10   21585.408 ±  2519.495  ops/ms
TestTableNameJMH.testBB10       thrpt   10   23474.278 ±   175.576  ops/ms
TestTableNameJMH.testBB100      thrpt   10   20600.624 ±  4035.725  ops/ms
TestTableNameJMH.testBB1000     thrpt   10   18349.054 ±   313.875  ops/ms
TestTableNameJMH.testBB10000    thrpt   10   15981.688 ±   836.096  ops/ms
TestTableNameJMH.testBB100000   thrpt   10   14276.288 ±   201.779  ops/ms
TestTableNameJMH.testStr1       thrpt   10  239837.152 ± 10767.013  ops/ms
TestTableNameJMH.testStr10      thrpt    4  236578.812 ± 57640.770  ops/ms
TestTableNameJMH.testStr100     thrpt    5  227980.174 ± 44822.292  ops/ms
TestTableNameJMH.testStr1000    thrpt   10  131935.073 ±  4495.644  ops/ms
TestTableNameJMH.testStr10000   thrpt   10   81979.448 ±  3230.575  ops/ms
TestTableNameJMH.testStr100000  thrpt   10   61054.516 ± 10613.181  ops/ms
{code} > Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the > number of tables in the cluster is greater than dozens > -- > > Key: HBASE-25510 > URL: https://issues.apache.org/jira/browse/HBASE-25510 > Project: HBase > Issue Type: Improvement > Components: master, Replication > Affects Versions: 1.2.12, 1.4.13, 2.4.1 > Reporter: zhuobin zheng > Priority: Major > Attachments: optimiz_benchmark, origin_benchmark > > > Currently, TableName.valueOf looks the TableName object up in its cache linearly (code shown below), so it is too slow when the cluster has thousands of tables. > {code:java} > // code placeholder > for (TableName tn : tableCache) { > if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), bns)) { > return tn; > } > }{code} > I propose storing the objects in a hash table so the lookup is faster, like this: > {code:java} > // code placeholder > TableName oldTable = tableCache.get(nameAsStr);{code} > > Our cluster has tens of thousands of tables (most of them KYLIN tables), and we found that in the following two cases the TableName.valueOf method severely limits performance. > > Common premise: tens of thousands of tables in the cluster. > Cause: TableName.valueOf is slow because it traverses the whole cache linearly. > > Case 1. Replication > Premise 1: one table is written with high QPS, small values, and non-batched requests, which produces a very large number of WAL entries. > Premise 2: deserializing a WAL entry calls TableName.valueOf. > Result: replication gets stuck and WAL files pile up. > > Case 2. Active master startup > NamespaceStateManager initialization has to initialize every RegionInfo, and RegionInfo initialization calls TableName.valueOf, so master startup takes noticeably longer when TableName.valueOf is slow. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
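For readers unfamiliar with the output above, a JMH harness in the spirit of TestTableNameJMH might look like the sketch below. The class name comes from the benchmark output, but the fields, the table-name generation and the @Param values are assumptions; the attached origin_benchmark and optimiz_benchmark files are the authoritative versions.

{code:java}
import java.util.concurrent.ThreadLocalRandom;
import org.apache.hadoop.hbase.TableName;
import org.openjdk.jmh.annotations.*;

// Sketch of a JMH throughput benchmark comparable to the testStr* runs above:
// each invocation resolves one of N pre-generated table names through
// TableName.valueOf(String), so the score reflects cache-lookup cost.
@State(Scope.Benchmark)
@BenchmarkMode(Mode.Throughput)
public class TestTableNameJMH {

  // Number of distinct table names, matching the 1/10/100/... suffixes above.
  @Param({"1", "10", "100", "1000"})
  int tableCount;

  String[] names;

  @Setup
  public void setup() {
    names = new String[tableCount];
    for (int i = 0; i < tableCount; i++) {
      names[i] = "ns:table_" + i;
      TableName.valueOf(names[i]); // pre-populate the cache, as in steady state
    }
  }

  @Benchmark
  public TableName testStr() {
    // Pick a random pre-registered name so every call exercises the cache path.
    return TableName.valueOf(names[ThreadLocalRandom.current().nextInt(names.length)]);
  }
}
{code}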
[jira] [Updated] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens
[ https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-25510: -- Attachment: optimiz_benchmark origin_benchmark > Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the > number of tables in the cluster is greater than dozens > -- > > Key: HBASE-25510 > URL: https://issues.apache.org/jira/browse/HBASE-25510 > Project: HBase > Issue Type: Improvement > Components: master, Replication >Affects Versions: 1.2.12, 1.4.13, 2.4.1 >Reporter: zhuobin zheng >Priority: Major > Attachments: optimiz_benchmark, origin_benchmark > > > Now, TableName.valueOf will try to find TableName Object in cache > linearly(code show as below). So it is too slow when we has thousands of > tables on cluster. > {code:java} > // code placeholder > for (TableName tn : tableCache) { > if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), > bns)) { > return tn; > } > }{code} > I try to store the object in the hash table, so it can look up more quickly. > code like this > {code:java} > // code placeholder > TableName oldTable = tableCache.get(nameAsStr);{code} > > In our cluster which has tens thousands of tables. (Most of that is KYLIN > table). > We found that in the following two cases, the TableName.valueOf method will > severely restrict our performance. > > Common premise: tens of thousands table in cluster > cause: TableName.valueOf with low performance. (because we need to traverse > all caches linearly) > > Case1. Replication > premise1: one of table write with high qps, small value, Non-batch request. > cause too much wal entry > premise2: deserialize WAL Entry includes calling the TableName.valueOf method. > Cause: Replicat Stuck. A lot of WAL files pile up. > > Case2. Active Master Start up > NamespaceStateManager init should init all RegionInfo, and regioninfo init > will call TableName.valueOf. It will cost some time if TableName.valueOf is > slow. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens
[ https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-25510: -- External issue URL: https://github.com/apache/hbase/pull/2885 > Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the > number of tables in the cluster is greater than dozens > -- > > Key: HBASE-25510 > URL: https://issues.apache.org/jira/browse/HBASE-25510 > Project: HBase > Issue Type: Improvement > Components: master, Replication >Affects Versions: 1.2.12, 1.4.13, 2.4.1 >Reporter: zhuobin zheng >Priority: Major > > Now, TableName.valueOf will try to find TableName Object in cache > linearly(code show as below). So it is too slow when we has thousands of > tables on cluster. > {code:java} > // code placeholder > for (TableName tn : tableCache) { > if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), > bns)) { > return tn; > } > }{code} > I try to store the object in the hash table, so it can look up more quickly. > code like this > {code:java} > // code placeholder > TableName oldTable = tableCache.get(nameAsStr);{code} > > In our cluster which has tens thousands of tables. (Most of that is KYLIN > table). > We found that in the following two cases, the TableName.valueOf method will > severely restrict our performance. > > Common premise: tens of thousands table in cluster > cause: TableName.valueOf with low performance. (because we need to traverse > all caches linearly) > > Case1. Replication > premise1: one of table write with high qps, small value, Non-batch request. > cause too much wal entry > premise2: deserialize WAL Entry includes calling the TableName.valueOf method. > Cause: Replicat Stuck. A lot of WAL files pile up. > > Case2. Active Master Start up > NamespaceStateManager init should init all RegionInfo, and regioninfo init > will call TableName.valueOf. It will cost some time if TableName.valueOf is > slow. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens
[ https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-25510: -- Description: Now, TableName.valueOf will try to find TableName Object in cache linearly(code show as below). So it is too slow when we has thousands of tables on cluster. {code:java} // code placeholder for (TableName tn : tableCache) { if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), bns)) { return tn; } }{code} I try to store the object in the hash table, so it can look up more quickly. code like this {code:java} // code placeholder TableName oldTable = tableCache.get(nameAsStr);{code} In our cluster which has tens thousands of tables. (Most of that is KYLIN table). We found that in the following two cases, the TableName.valueOf method will severely restrict our performance. Common premise: tens of thousands table in cluster cause: TableName.valueOf with low performance. (because we need to traverse all caches linearly) Case1. Replication premise1: one of table write with high qps, small value, Non-batch request. cause too much wal entry premise2: deserialize WAL Entry includes calling the TableName.valueOf method. Cause: Replicat Stuck. A lot of WAL files pile up. Case2. Active Master Start up NamespaceStateManager init should init all RegionInfo, and regioninfo init will call TableName.valueOf. It will cost some time if TableName.valueOf is slow. was: There are tens of thousands of tables on our cluster (Most of that is KYLIN table). We found that in the following two cases, the TableName.valueOf method will severely restrict our performance. Common premise: tens of thousands table in cluster cause: TableName.valueOf with low performance. (because we need to traverse all caches linearly) Case1. Replication premise: one of table write with high qps, small value, Non-batch request. cause: There are too much wal entry in WAL. So we need to deserialize too many WAL Entry which includes calling the TableName.valueOf method to instantiate the TableName object. Case2. Active Master Start up > Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the > number of tables in the cluster is greater than dozens > -- > > Key: HBASE-25510 > URL: https://issues.apache.org/jira/browse/HBASE-25510 > Project: HBase > Issue Type: Improvement > Components: master, Replication >Affects Versions: 1.2.12, 1.4.13, 2.4.1 >Reporter: zhuobin zheng >Priority: Major > > Now, TableName.valueOf will try to find TableName Object in cache > linearly(code show as below). So it is too slow when we has thousands of > tables on cluster. > {code:java} > // code placeholder > for (TableName tn : tableCache) { > if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), > bns)) { > return tn; > } > }{code} > I try to store the object in the hash table, so it can look up more quickly. > code like this > {code:java} > // code placeholder > TableName oldTable = tableCache.get(nameAsStr);{code} > > In our cluster which has tens thousands of tables. (Most of that is KYLIN > table). > We found that in the following two cases, the TableName.valueOf method will > severely restrict our performance. > > Common premise: tens of thousands table in cluster > cause: TableName.valueOf with low performance. (because we need to traverse > all caches linearly) > > Case1. Replication > premise1: one of table write with high qps, small value, Non-batch request. 
> cause too much wal entry > premise2: deserialize WAL Entry includes calling the TableName.valueOf method. > Cause: Replicat Stuck. A lot of WAL files pile up. > > Case2. Active Master Start up > NamespaceStateManager init should init all RegionInfo, and regioninfo init > will call TableName.valueOf. It will cost some time if TableName.valueOf is > slow. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens
zhuobin zheng created HBASE-25510: - Summary: Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens Key: HBASE-25510 URL: https://issues.apache.org/jira/browse/HBASE-25510 Project: HBase Issue Type: Improvement Components: master, Replication Affects Versions: 2.4.1, 1.4.13, 1.2.12 Reporter: zhuobin zheng There are tens of thousands of tables in our cluster (most of them KYLIN tables). We found that in the following two cases the TableName.valueOf method severely restricts our performance. Common premise: tens of thousands of tables in the cluster. Cause: TableName.valueOf is slow because it traverses the whole cache linearly. Case 1. Replication. Premise: one table is written with high QPS, small values, and non-batched requests. Cause: there are too many entries in the WAL, so we have to deserialize a huge number of WAL entries, and each deserialization calls TableName.valueOf to instantiate the TableName object. Case 2. Active master startup. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-20673) Reduce the number of Cell implementations; the profusion is distracting to users and JIT
[ https://issues.apache.org/jira/browse/HBASE-20673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17004385#comment-17004385 ] zhuobin zheng commented on HBASE-20673: --- I Can't add attachment. So I add some info in this comment. version: cdh5-1.2.0_5.14.4 - Profiler Info: ns percent samples top -- --- --- --- 219079399723 16.81% 21956 itable stub 187885376151 14.42% 18830 itable stub 168718141522 12.95% 16909 itable stub 149243475899 11.45% 14957 itable stub 108522505239 8.33% 10876 itable stub 54090604368 4.15% 5421 itable stub 45659351417 3.50% 4576 org.apache.hadoop.hbase.CellComparator.compareRows_[j] 41398429360 3.18% 4149 itable stub 32259010132 2.48% 3233 itable stub 30253160585 2.32% 3032 itable stub 10897498052 0.84% 1092 HeapRegion::block_size(HeapWord const*) const 10467104761 0.80% 1049 org.apache.hadoop.hbase.KeyValue.getFamilyLength_[j] 10176340086 0.78% 1020 G1ParScanThreadState::trim_queue() 9867105912 0.76% 989 G1ParScanThreadState::copy_to_survivor_space(InCSetState, oopDesc*, markOopDesc*) 9768378849 0.75% 979 itable stub - Jstack Info: "RpcServer.RW.fifo.Q.write.handler=73,queue=1,port=60020" #133 daemon prio=5 os_prio=0 tid=0x7faad0097800 nid=0x403d runnable [0x7faaceeec000] java.lang.Thread.State: RUNNABLE at org.apache.hadoop.hbase.CellComparator.compareRows(CellComparator.java:186) at org.apache.hadoop.hbase.CellComparator.compare(CellComparator.java:63) at org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:2020) at org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:1897) at java.util.concurrent.ConcurrentSkipListMap.cpr(ConcurrentSkipListMap.java:655) at java.util.concurrent.ConcurrentSkipListMap.doPut(ConcurrentSkipListMap.java:899) at java.util.concurrent.ConcurrentSkipListMap.put(ConcurrentSkipListMap.java:1581) at org.apache.hadoop.hbase.regionserver.CellSkipListSet.add(CellSkipListSet.java:134) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.addToCellSet(DefaultMemStore.java:242) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.internalAdd(DefaultMemStore.java:276) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.add(DefaultMemStore.java:233) at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:686) at org.apache.hadoop.hbase.regionserver.HRegion.applyFamilyMapToMemstore(HRegion.java:3807) at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3280) at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2944) at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2886) at org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:765) at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:716) at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2146) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33656) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2191) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112) at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:183) at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:163) "RpcServer.RW.fifo.Q.write.handler=71,queue=8,port=60020" #131 daemon prio=5 os_prio=0 tid=0x7faad0093800 nid=0x403b runnable [0x7faacf0ee000] java.lang.Thread.State: RUNNABLE at org.apache.hadoop.hbase.CellComparator.compareColumns(CellComparator.java:157) at 
org.apache.hadoop.hbase.CellComparator.compareWithoutRow(CellComparator.java:224) at org.apache.hadoop.hbase.CellComparator.compare(CellComparator.java:66) at org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:2020) at org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:1897) at java.util.concurrent.ConcurrentSkipListMap.cpr(ConcurrentSkipListMap.java:655) at java.util.concurrent.ConcurrentSkipListMap.doPut(ConcurrentSkipListMap.java:899) at java.util.concurrent.ConcurrentSkipListMap.put(ConcurrentSkipListMap.java:1581) at org.apache.hadoop.hbase.regionserver.CellSkipListSet.add(CellSkipListSet.java:134) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.addToCellSet(DefaultMemStore.java:242) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.internalAdd(DefaultMemStore.java:276) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.add(DefaultMemStore.java:233) at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:686) at
[jira] [Commented] (HBASE-20673) Reduce the number of Cell implementations; the profusion is distracting to users and JIT
[ https://issues.apache.org/jira/browse/HBASE-20673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17004384#comment-17004384 ] zhuobin zheng commented on HBASE-20673: --- Hi, [~stack] I also encountered a similar problem in my production environment: *too many KeyValue implementation types confused the JIT*. This results in a large amount of wasted CPU, with a single server fully saturated (48 cores, about 4800% CPU), and a significant drop in read and write speeds. Profiler analysis shows that itable stubs take a large share of CPU time (almost 70%). Jstack shows that a large number of read and write threads are stuck at a few specific comparison points in CellComparator: [https://github.com/apache/hbase/blob/branch-1.2/hbase-common/src/main/java/org/apache/hadoop/hbase/CellComparator.java#L186] [https://github.com/apache/hbase/blob/branch-1.2/hbase-common/src/main/java/org/apache/hadoop/hbase/CellComparator.java#L157] The jstack output confused me at first, because those lines of code do little more than resolve the actual call target. But combined with the profiler result above (70% of CPU in itable stubs), I now think there are two possible reasons for the excessive CPU usage. # Too many Cell implementations confuse the JIT, so the JVM falls back from optimized interface calls to plain itable scans. # KeyValue implements too many interfaces, which makes the itable long. {code:java} // code placeholder public class KeyValue implements Cell, HeapSize, Cloneable, SettableSequenceId, SettableTimestamp{code} I think this situation occurs when the two types KeyValue / NoTagKeyValue are fairly evenly mixed. Unfortunately, although it keeps appearing in the production environment, I cannot reproduce it in the test environment, so I cannot provide a better test case. Next I will try to modify part of the code so that the vast majority of Cell instances are of a single type, deploy it to the production environment, and see whether it resolves the 100% CPU problem. > Reduce the number of Cell implementations; the profusion is distracting to > users and JIT > > > Key: HBASE-20673 > URL: https://issues.apache.org/jira/browse/HBASE-20673 > Project: HBase > Issue Type: Sub-task > Components: Performance > Reporter: Michael Stack > Assignee: Michael Stack > Priority: Major > Attachments: 0001-current.patch, 0001-current.patch, hits.20673.png > > > We have a wild blossom of Cell implementations in hbase. Purge the bulk of > them. Make it so we do one type well. JIT gets confused if it has an abstract > or an interface and then the instantiated classes are myriad (megamorphic). -- This message was sent by Atlassian Jira (v8.3.4#803005)
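To illustrate the megamorphic-dispatch effect being described here (a toy example, not HBase code; the SimpleCell interface and its three implementations are invented), the program below mixes three concrete types at a single hot interface call site, which is the situation where HotSpot typically stops inlining and routes calls through itable stubs:

{code:java}
import java.nio.ByteBuffer;

// Toy illustration of monomorphic vs. megamorphic interface dispatch.
// When only one or two concrete types reach the hot call site the JIT can
// inline the call; once three or more types are mixed there, HotSpot gives up
// on inline caching and dispatches through the interface table -- the
// "itable stub" frames seen in the profiler output above.
interface SimpleCell {
  int rowLength();
}

final class ArrayCell implements SimpleCell {
  private final byte[] row;
  ArrayCell(byte[] row) { this.row = row; }
  @Override public int rowLength() { return row.length; }
}

final class BufferCell implements SimpleCell {
  private final ByteBuffer row;
  BufferCell(ByteBuffer row) { this.row = row; }
  @Override public int rowLength() { return row.remaining(); }
}

final class TaggedCell implements SimpleCell {
  private final byte[] row;
  private final byte[] tags;
  TaggedCell(byte[] row, byte[] tags) { this.row = row; this.tags = tags; }
  @Override public int rowLength() { return row.length; }
}

public class MegamorphicDemo {
  // Hot call site: its cost depends on how many concrete SimpleCell types
  // have been observed flowing through it.
  static long sumRowLengths(SimpleCell[] cells) {
    long total = 0;
    for (SimpleCell c : cells) {
      total += c.rowLength();   // 1-2 types: inlined; 3+ types: itable dispatch
    }
    return total;
  }

  public static void main(String[] args) {
    SimpleCell[] mixed = new SimpleCell[300_000];
    for (int i = 0; i < mixed.length; i++) {
      switch (i % 3) {
        case 0:  mixed[i] = new ArrayCell(new byte[8]); break;
        case 1:  mixed[i] = new BufferCell(ByteBuffer.allocate(8)); break;
        default: mixed[i] = new TaggedCell(new byte[8], new byte[2]); break;
      }
    }
    System.out.println(sumRowLengths(mixed));
  }
}
{code}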
[jira] [Updated] (HBASE-23598) There are too much small WAL File
[ https://issues.apache.org/jira/browse/HBASE-23598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-23598: -- Attachment: HBASE-23598.patch > There are too much small WAL File > - > > Key: HBASE-23598 > URL: https://issues.apache.org/jira/browse/HBASE-23598 > Project: HBase > Issue Type: Improvement > Components: wal >Affects Versions: 1.3.6, 2.2.2 > Environment: hbase version: cdh5-1.2.0_5.14.4 > hbase.wal.provider: multiwal > hbase.wal.regiongrouping.numgroups: 4 > The wals file shows 100+ wal files in wal-3 , and some of them has very small > size >Reporter: zhuobin zheng >Priority: Major > Attachments: HBASE-23598.patch, wals > > Original Estimate: 168h > Remaining Estimate: 168h > > I found 10W + WAL files in my 400-scale hbase cluster. Too many WAL files > will cause the cluster and recover very slowly when cluster crash completely > . (In the split log step) (because too many WAL files will cause too many ZK > requests). By default, WAL files start to roll when they reach HDFS Block > Size (256M In My Case) * 0.95. But I found that there are many small files > (0-100M) in the WAL directory. When I look at the code , I found that when I > configured multiwal (I configured 4 WALs for each RS), as long as a single > WAL file reached HDFS Block Size (256M In My Case) * 0.95, all WAL files > would scroll, so it caused a lot of WAL small files. > I tried to modify the code to solve the problem (making each WAL scroll > independently). Although this change is very small, I am not sure if such a > change will cause other problems, currently being tested ... -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-23598) There are too much small WAL File
[ https://issues.apache.org/jira/browse/HBASE-23598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuobin zheng updated HBASE-23598: -- Attachment: wals Component/s: wal Affects Version/s: 2.2.2 Description: I found 100,000+ WAL files in my ~400-node HBase cluster. Too many WAL files make the cluster recover very slowly after a complete crash (in the split-log step), because they generate too many ZooKeeper requests. By default a WAL file starts to roll when it reaches HDFS block size (256 MB in my case) * 0.95, but I found many small files (0-100 MB) in the WAL directory. Looking at the code, I found that with multiwal configured (4 WALs per RS in my case), as soon as a single WAL file reaches HDFS block size * 0.95, all WAL files roll, which produces a lot of small WAL files. I tried to modify the code to fix this (making each WAL roll independently). Although the change is very small, I am not sure whether it will cause other problems; it is currently being tested ... Environment: hbase version: cdh5-1.2.0_5.14.4 hbase.wal.provider: multiwal hbase.wal.regiongrouping.numgroups: 4 The attached wals file shows 100+ WAL files in the wal-3 group, and some of them are very small Summary: There are too much small WAL File (was: There are too much small WAL) Remaining Estimate: 168h Original Estimate: 168h > There are too much small WAL File > - > > Key: HBASE-23598 > URL: https://issues.apache.org/jira/browse/HBASE-23598 > Project: HBase > Issue Type: Improvement > Components: wal > Affects Versions: 1.3.6, 2.2.2 > Environment: hbase version: cdh5-1.2.0_5.14.4 > hbase.wal.provider: multiwal > hbase.wal.regiongrouping.numgroups: 4 > The attached wals file shows 100+ WAL files in the wal-3 group, and some of them are very small > Reporter: zhuobin zheng > Priority: Major > Attachments: wals > > Original Estimate: 168h > Remaining Estimate: 168h > > I found 100,000+ WAL files in my ~400-node HBase cluster. Too many WAL files make the cluster recover very slowly after a complete crash (in the split-log step), because they generate too many ZooKeeper requests. By default a WAL file starts to roll when it reaches HDFS block size (256 MB in my case) * 0.95, but I found many small files (0-100 MB) in the WAL directory. Looking at the code, I found that with multiwal configured (4 WALs per RS in my case), as soon as a single WAL file reaches HDFS block size * 0.95, all WAL files roll, which produces a lot of small WAL files. > I tried to modify the code to fix this (making each WAL roll independently). Although the change is very small, I am not sure whether it will cause other problems; it is currently being tested ... -- This message was sent by Atlassian Jira (v8.3.4#803005)
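A minimal sketch of the behavior change described above, under the multiwal provider, is shown below. The WalRollSketch class, its Wal interface and the method names are illustrative only (the real logic lives elsewhere in the WAL provider and log-rolling code); it simply contrasts rolling every WAL in the group when one crosses the size threshold with letting each WAL roll independently:

{code:java}
import java.util.List;

// Illustrative sketch of per-WAL rolling for a multiwal group.
// The threshold mirrors the description above: HDFS block size * 0.95.
final class WalRollSketch {

  interface Wal {
    long currentSize();   // bytes written to the active file
    void roll();          // close the current file and start a new one
  }

  static final long BLOCK_SIZE = 256L * 1024 * 1024;
  static final long ROLL_THRESHOLD = (long) (BLOCK_SIZE * 0.95);

  // Observed behavior (simplified): one WAL crossing the threshold triggers a
  // roll of every WAL in the group, producing many small files.
  static void rollAllIfAnyFull(List<Wal> group) {
    boolean anyFull = group.stream().anyMatch(w -> w.currentSize() >= ROLL_THRESHOLD);
    if (anyFull) {
      group.forEach(Wal::roll);
    }
  }

  // Proposed behavior (simplified): each WAL decides for itself, so a file
  // only rolls once it is actually close to the block size.
  static void rollIndependently(List<Wal> group) {
    for (Wal w : group) {
      if (w.currentSize() >= ROLL_THRESHOLD) {
        w.roll();
      }
    }
  }
}
{code}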
[jira] [Created] (HBASE-23598) There are too much small WAL
zhuobin zheng created HBASE-23598: - Summary: There are too much small WAL Key: HBASE-23598 URL: https://issues.apache.org/jira/browse/HBASE-23598 Project: HBase Issue Type: Improvement Affects Versions: 1.3.6 Reporter: zhuobin zheng -- This message was sent by Atlassian Jira (v8.3.4#803005)