[jira] [Updated] (HBASE-15136) Explore different queuing behaviors while busy

2024-02-08 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-15136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-15136:
--
Release Note: 
Previously the RPC request scheduler in HBase had 2 modes it could operate in:

 - simple FIFO
 - "partial" deadline, where deadline constraints are only imposed on 
long-running scan requests.

This patch adds a new type of scheduler to HBase, based on the research around the controlled delay (CoDel) algorithm [1], used in networking to combat bufferbloat, as well as some analysis on generalizing it to generic request queues [2]. The purpose of this work is to prevent long-standing call queues caused by a discrepancy between the request rate and the available throughput, e.g. due to kernel/disk IO/networking stalls.

The new RPC scheduler can be enabled by setting hbase.ipc.server.callqueue.type=codel in the configuration. Several additional parameters allow tuning the algorithm's behavior:

hbase.ipc.server.callqueue.codel.target.delay
hbase.ipc.server.callqueue.codel.interval
hbase.ipc.server.callqueue.codel.lifo.threshold

[1] Controlling Queue Delay / A modern AQM is just one piece of the solution to 
bufferbloat. http://queue.acm.org/detail.cfm?id=2209336
[2] Fail at Scale / Reliability in the face of rapid change. 
http://queue.acm.org/detail.cfm?id=2839461
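For reference, a minimal sketch of selecting and tuning the scheduler programmatically (the property names are the ones listed above; the values shown are illustrative, not recommendations):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Select the CoDel-based call queue and tune its behavior.
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.ipc.server.callqueue.type", "codel");
conf.setInt("hbase.ipc.server.callqueue.codel.target.delay", 100);     // ms
conf.setInt("hbase.ipc.server.callqueue.codel.interval", 100);         // ms
conf.setDouble("hbase.ipc.server.callqueue.codel.lifo.threshold", 0.8);
{code}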

  was:
Previously RPC request scheduler in HBase had 2 modes in could operate in:

 - simple FIFO
 - "partial" deadline, where deadline constraints are only imposed on 
long-running scan requests.

This patch adds new type of scheduler to HBase, based on the research around 
controlled delay (CoDel) algorithm [1], used in networking to combat 
bufferbloat, as well as some analysis on generalizing it to generic request 
queues [2]. The purpose of that work is to prevent long standing call queues 
caused by discrepancy between request rate and available throughput, caused by 
kernel/disk IO/networking stalls.

New RPC scheduler could be enabled by setting 
hbase.ipc.server.callqueue.type=codel in configuration. Several additional 
params allow to configure algorithm behavior - 

hbase.ipc.server.callqueue.codel.target.delay
hbase.ipc.server.callqueue.codel.interval
hbase.ipc.server.callqueue.codel.lifo.threshold

[1] Controlling Queue Delay / A modern AQM is just one piece of the solution to 
bufferbloat. http://queue.acm.org/detail.cfm?id=2209336
[2] Fail at Scale / Reliability in the face of rapid change. 
http://queue.acm.org/detail.cfm?id=2839461


> Explore different queuing behaviors while busy
> --
>
> Key: HBASE-15136
> URL: https://issues.apache.org/jira/browse/HBASE-15136
> Project: HBase
>  Issue Type: New Feature
>  Components: IPC/RPC, Scheduler
>Reporter: Elliott Neil Clark
>Assignee: Mikhail Antonov
>Priority: Critical
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HBASE-15136-1.2.v1.patch, HBASE-15136-v2.patch, 
> deadline_scheduler_v_0_2.patch
>
>
> http://queue.acm.org/detail.cfm?id=2839461



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-20846) Restore procedure locks when master restarts

2023-10-19 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-20846:
--
Description: 
Found this one when investigating a ModifyTableProcedure that got stuck while a MoveRegionProcedure was going on after a master restart.
Though this issue can be solved by HBASE-20752, I discovered something else.
Before a MoveRegionProcedure can execute, it will hold the table's shared lock. So, when an UnassignProcedure is spawned, it will not check the table's shared lock, since it is sure that its parent (MoveRegionProcedure) has acquired the table's lock.
{code:java}
// If there is parent procedure, it would have already taken xlock, so no need to take
// shared lock here. Otherwise, take shared lock.
if (!procedure.hasParent()
    && waitTableQueueSharedLock(procedure, table) == null) {
  return true;
}
{code}
But that is not the case when the Master is restarted. The child procedure (UnassignProcedure) will be executed first after the restart. Although it has a parent (MoveRegionProcedure), the parent apparently did not hold the table's lock.
So, since it began to execute without holding the table's shared lock, a ModifyTableProcedure can acquire the table's exclusive lock and execute at the same time, which is not possible if the master was not restarted.
This would cause a hang before HBASE-20752. But since HBASE-20752 has been fixed, I wrote a simple UT to reproduce this case.

I think we don't have to check the parent for the table's shared lock. It is a shared lock, right? I think we can acquire it every time we need it.
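A minimal sketch of that suggestion, assuming we simply drop the hasParent() short-circuit from the check shown above (an illustration of the idea only, not the committed fix for HBASE-20846):
{code:java}
// Always take the table's shared lock, whether or not the procedure has a
// parent; after a master restart the parent may not actually hold it.
if (waitTableQueueSharedLock(procedure, table) == null) {
  return true;
}
{code}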
 

  was:
Found this one when investigating ModifyTableProcedure got stuck while there 
was a MoveRegionProcedure going on after master restart.
Though this issue can be solved by HBASE-20752. But I discovered something else.
Before a MoveRegionProcedure can execute, it will hold the table's shared lock. 
so,, when a UnassignProcedure was spwaned, it will not check the table's shared 
lock since it is sure that its parent(MoveRegionProcedure) has aquired the 
table's lock.
{code:java}
// If there is parent procedure, it would have already taken xlock, so no need 
to take
  // shared lock here. Otherwise, take shared lock.
  if (!procedure.hasParent()
  && waitTableQueueSharedLock(procedure, table) == null) {
  return true;
  }
{code}

But, it is not the case when Master was restarted. The child 
procedure(UnassignProcedure) will be executed first after restart. Though it 
has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
table's lock.
So, since it began to execute without hold the table's shared lock. A 
ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
same time. Which is not possible if the master was not restarted.
This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, I 
wrote a simple UT to repo this case.

I think we don't have to check the parent for table's shared lock. It is a 
shared lock, right? I think we can acquire it every time we need it.


> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.2.0, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846-v4.patch, HBASE-20846-v5.patch, HBASE-20846-v6.patch, 
> HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, 
> HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold 

[jira] [Updated] (HBASE-20447) Only fail cacheBlock if block collisions aren't related to next block metadata

2023-09-13 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-20447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-20447:
--
Description: 
This is the issue I was originally having here: 
[http://mail-archives.apache.org/mod_mbox/hbase-dev/201802.mbox/%3CCAN+qs_Pav=md_aoj4xji+kcnetubg2xou2ntxv1g6m8-5vn...@mail.gmail.com%3E]

 

When we pread, we don't force the read to read all of the next block header. However, when we get into a race condition where two opener threads try to cache the same block and one thread reads all of the next block header while the other one doesn't, it will fail the open process. This is especially important in the splitting case, where it will potentially fail the split process. Instead, in the caches, we should only fail if the required blocks are different.
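A purely hypothetical illustration of that rule (the type and fields below are invented for clarity and are not the actual HBase HFileBlock/BlockCache API):
{code:java}
import java.util.Arrays;

// Hypothetical cached-block shape: the block payload plus the next block's
// header, which a pread may or may not have pulled in.
final class CachedBlock {
  byte[] payload;          // the block's own bytes
  byte[] nextBlockHeader;  // null if the read stopped at the block boundary

  // Only treat a cache collision as fatal if the payloads really differ;
  // a mismatch limited to next-block metadata is benign.
  boolean conflictsWith(CachedBlock other) {
    return !Arrays.equals(payload, other.payload);
  }
}
{code}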
 

  was:
This is the issue I was originally having here: 
[http://mail-archives.apache.org/mod_mbox/hbase-dev/201802.mbox/%3CCAN+qs_Pav=md_aoj4xji+kcnetubg2xou2ntxv1g6m8-5vn...@mail.gmail.com%3E]

 

When we pread, we don't force the read to read all of the next block header.
However, when we get into a race condition where two opener threads try to
cache the same block and one thread read all of the next block header and the 
other one didn't, it will fail the open process. This is especially important
in a splitting case where it will potentially fail the split process.
Instead, in the caches, we should only fail if the required blocks are 
different.


> Only fail cacheBlock if block collisions aren't related to next block metadata
> --
>
> Key: HBASE-20447
> URL: https://issues.apache.org/jira/browse/HBASE-20447
> Project: HBase
>  Issue Type: Bug
>  Components: BlockCache, BucketCache
>Affects Versions: 1.4.3, 2.0.0
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.1.0, 1.4.5
>
> Attachments: HBASE-20447.branch-1.001.patch, 
> HBASE-20447.branch-1.002.patch, HBASE-20447.branch-1.003.patch, 
> HBASE-20447.branch-1.004.patch, HBASE-20447.branch-1.005.patch, 
> HBASE-20447.branch-1.006.patch, HBASE-20447.master.001.patch, 
> HBASE-20447.master.002.patch, HBASE-20447.master.003.patch, 
> HBASE-20447.master.004.patch
>
>
> This is the issue I was originally having here: 
> [http://mail-archives.apache.org/mod_mbox/hbase-dev/201802.mbox/%3CCAN+qs_Pav=md_aoj4xji+kcnetubg2xou2ntxv1g6m8-5vn...@mail.gmail.com%3E]
>  
> When we pread, we don't force the read to read all of the next block header.
> However, when we get into a race condition where two opener threads try to
> cache the same block and one thread read all of the next block header and the 
> other one didn't, it will fail the open process. This is especially important
> in a splitting case where it will potentially fail the split process.
> Instead, in the caches, we should only fail if the required blocks are 
> different.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-26708) Netty "leak detected" and OutOfDirectMemoryError due to direct memory buffering with SASL implementation

2023-02-25 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-26708:
--
Description: 
Under constant data ingestion, using the default Netty-based RpcServer and RpcClient implementations results in OutOfDirectMemoryError, apparently caused by the leaks reported by Netty's ResourceLeakDetector.
{code:java}
2022-01-25 17:03:10,084 ERROR [S-EventLoopGroup-1-3] util.ResourceLeakDetector 
- java:115)
  
org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.expandCumulation(ByteToMessageDecoder.java:538)
  
org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder$1.cumulate(ByteToMessageDecoder.java:97)
  
org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:274)
  
org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
  
org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
  
org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
  
org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
  
org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
  
org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
  
org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
  
org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795)
  
org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:480)
  
org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
  
org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
  
org.apache.hbase.thirdparty.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
  
org.apache.hbase.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
  java.lang.Thread.run(Thread.java:748)
 {code}
{code:java}
2022-01-25 17:03:14,014 ERROR [S-EventLoopGroup-1-3] util.ResourceLeakDetector 
- 
apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:507)
  
org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:446)
  
org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
  
org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
  
org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
  
org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
  
org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
  
org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
  
org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
  
org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
  
org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795)
  
org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:480)
  
org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
  
org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
  
org.apache.hbase.thirdparty.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
  
org.apache.hbase.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
  java.lang.Thread.run(Thread.java:748)
 {code}
And finally handlers are removed from the pipeline due to 
OutOfDirectMemoryError:
{code:java}
2022-01-25 17:36:28,657 WARN  [S-EventLoopGroup-1-5] 
channel.DefaultChannelPipeline - An exceptionCaught() event was fired, and it 
reached at the tail of the pipeline. It usually means 

[jira] [Updated] (HBASE-27464) In memory compaction 'COMPACT' may cause data corruption when adding cells large than maxAlloc(default 256k) size.

2022-11-04 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-27464:
--
Summary: In memory compaction 'COMPACT' may cause data corruption when 
adding cells large than maxAlloc(default 256k) size.  (was: In memory 
compaction 'COMPACT' may cause data corruption when add cell bigger than 
maxAlloc(default 256k) size.)

> In memory compaction 'COMPACT' may cause data corruption when adding cells 
> large than maxAlloc(default 256k) size.
> --
>
> Key: HBASE-27464
> URL: https://issues.apache.org/jira/browse/HBASE-27464
> Project: HBase
>  Issue Type: Bug
>  Components: in-memory-compaction
>Reporter: zhuobin zheng
>Priority: Critical
> Attachments: image-2022-11-04-15-46-21-645.png
>
>
> When init 'CellChunkImmutableSegment' for 'COMPACT' action, we not force copy 
> to current MSLab. 
> When cell size bigger than maxAlloc, cell will stay in previous chunk which 
> will recycle after segment replace, and we may read wrong data when these 
> chunk reused by others.
> !image-2022-11-04-15-46-21-645.png!
>  
> Timeline:
>  # add a cell 'A' bigger than 256K
>  # cell 'A' will copy to a chunk 'A' when first compact
>  # cell 'A' will retain in chunk 'A' when second compact
>  # chunk 'A' recycled after segment swap and close



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-27464) In memory compaction 'COMPACT' may cause data corruption when add cell bigger than maxAlloc(default 256k) size.

2022-11-04 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-27464:
--
Summary: In memory compaction 'COMPACT' may cause data corruption when add 
cell bigger than maxAlloc(default 256k) size.  (was: In memory compaction 
'COMPACT' may cause data mass when add cell bigger than maxAlloc(default 256k) 
size.)

> In memory compaction 'COMPACT' may cause data corruption when add cell bigger 
> than maxAlloc(default 256k) size.
> ---
>
> Key: HBASE-27464
> URL: https://issues.apache.org/jira/browse/HBASE-27464
> Project: HBase
>  Issue Type: Bug
>  Components: in-memory-compaction
>Reporter: zhuobin zheng
>Priority: Critical
> Attachments: image-2022-11-04-15-46-21-645.png
>
>
> When init 'CellChunkImmutableSegment' for 'COMPACT' action, we not force copy 
> to current MSLab. 
> When cell size bigger than maxAlloc, cell will stay in previous chunk which 
> will recycle after segment replace, and we may read wrong data when these 
> chunk reused by others.
> !image-2022-11-04-15-46-21-645.png!
>  
> Timeline:
>  # add a cell 'A' bigger than 256K
>  # cell 'A' will copy to a chunk 'A' when first compact
>  # cell 'A' will retain in chunk 'A' when second compact
>  # chunk 'A' recycled after segment swap and close



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-27464) In memory compaction 'COMPACT' may cause data mass when add cell bigger than maxAlloc(default 256k) size.

2022-11-04 Thread zhuobin zheng (Jira)
zhuobin zheng created HBASE-27464:
-

 Summary: In memory compaction 'COMPACT' may cause data mass when 
add cell bigger than maxAlloc(default 256k) size.
 Key: HBASE-27464
 URL: https://issues.apache.org/jira/browse/HBASE-27464
 Project: HBase
  Issue Type: Bug
  Components: in-memory-compaction
Reporter: zhuobin zheng
 Attachments: image-2022-11-04-15-46-21-645.png

When initializing 'CellChunkImmutableSegment' for the 'COMPACT' action, we do not force a copy into the current MSLAB.

When a cell is bigger than maxAlloc, the cell stays in the previous chunk, which is recycled after the segment is replaced, so we may read wrong data once those chunks are reused by others.

!image-2022-11-04-15-46-21-645.png!

 

Timeline:
 # add a cell 'A' bigger than 256K
 # cell 'A' is copied to a chunk 'A' on the first compaction
 # cell 'A' is retained in chunk 'A' on the second compaction
 # chunk 'A' is recycled after the segment swap and close
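A hedged sketch of the direction this suggests, with a hypothetical helper whose method names only approximate the MemStoreLAB API (this is not the committed fix):
{code:java}
// When rebuilding the immutable segment for an in-memory COMPACT, a cell
// larger than maxAlloc should be copied into the segment's own MSLAB instead
// of keeping a reference into a chunk that will be recycled after the swap.
Cell relocateIfOversized(MemStoreLAB targetLab, Cell cell, int maxAlloc) {
  if (cell.getSerializedSize() > maxAlloc) {
    // force-copy the oversized cell so it cannot point into a recycled chunk
    return targetLab.forceCopyOfBigCellInto(cell);
  }
  return targetLab.copyCellInto(cell);
}
{code}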



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-26026) HBase Write may be stuck forever when using CompactingMemStore

2022-10-13 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-26026:
--
Description: 
Sometimes I observed that HBase writes might get stuck in my HBase cluster, which enables {{CompactingMemStore}}. I have simulated the problem with a unit test in my PR.
The problem is caused by {{CompactingMemStore.checkAndAddToActiveSize}}:
{code:java}
425   private boolean checkAndAddToActiveSize(MutableSegment currActive, Cell cellToAdd,
426       MemStoreSizing memstoreSizing) {
427     if (shouldFlushInMemory(currActive, cellToAdd, memstoreSizing)) {
428       if (currActive.setInMemoryFlushed()) {
429         flushInMemory(currActive);
430         if (setInMemoryCompactionFlag()) {
431           // The thread is dispatched to do in-memory compaction in the background
              ..
          }
{code}
In line 427, {{shouldFlushInMemory}} checks whether {{currActive.getDataSize}} plus the size of {{cellToAdd}} exceeds {{CompactingMemStore.inmemoryFlushSize}}; if true, then {{currActive}} should be flushed, and {{currActive.setInMemoryFlushed()}} is invoked in line 428:
{code:java}
public boolean setInMemoryFlushed() {
return flushed.compareAndSet(false, true);
  }
{code}
After successfully setting {{currActive.flushed}} to true, in line 429 above {{flushInMemory(currActive)}} invokes {{CompactingMemStore.pushActiveToPipeline}}:
{code:java}
 protected void pushActiveToPipeline(MutableSegment currActive) {
if (!currActive.isEmpty()) {
  pipeline.pushHead(currActive);
  resetActive();
}
  }
{code}
In the above {{CompactingMemStore.pushActiveToPipeline}} method, if {{currActive.cellSet}} is empty, then nothing is done. Due to concurrent writes, and because we first add the cell size to {{currActive.getDataSize}} and only then actually add the cell to {{currActive.cellSet}}, it is possible that {{currActive.getDataSize}} cannot accommodate {{cellToAdd}} while {{currActive.cellSet}} is still empty, because pending writes have not yet added their cells to {{currActive.cellSet}}.
So if {{currActive.cellSet}} is empty at that point, no new {{ActiveSegment}} is created and new writes still target {{currActive}}; but {{currActive.flushed}} is true, so {{currActive}} can never enter {{flushInMemory(currActive)}} again, and a new {{ActiveSegment}} can never be created! In the end all writes would be stuck.

In my opinion, once {{currActive.flushed}} is set to true, it should not continue to be used as the {{ActiveSegment}}; and because of concurrent pending writes, only after {{currActive.updatesLock.writeLock()}} is acquired (i.e. {{currActive.waitForUpdates}} is called) in {{CompactingMemStore.inMemoryCompaction}} can we safely tell whether {{currActive}} is empty or not.

My fix is to remove the {{if (!currActive.isEmpty())}} check here and leave the check to the background {{InMemoryCompactionRunnable}}, after {{currActive.waitForUpdates}} is called. An alternative fix is to use a synchronization mechanism in the {{checkAndAddToActiveSize}} method to block all writes, wait for all pending writes to complete (i.e. until {{currActive.waitForUpdates}} is called), and if {{currActive}} is still empty, set {{currActive.flushed}} back to false. But I am not inclined to use such heavy synchronization in the write path; I think we had better keep the lockless implementation of the {{CompactingMemStore.add}} method just as it is now, and {{currActive.waitForUpdates}} had better be left in the background {{InMemoryCompactionRunnable}}.
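A minimal sketch of the first option, assuming the emptiness decision moves to the background compaction after {{currActive.waitForUpdates}} (illustration only, not the committed patch):
{code:java}
protected void pushActiveToPipeline(MutableSegment currActive) {
  // No isEmpty() check here: concurrent writers may still be adding cells at
  // this point, so the check would race with them. The background
  // InMemoryCompactionRunnable can re-check emptiness after waitForUpdates()
  // has drained the pending writers.
  pipeline.pushHead(currActive);
  resetActive();
}
{code}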

  was:
Sometimes I observed that HBase Write might be stuck  in my hbase cluster which 
enabling {{CompactingMemStore}}.  I have simulated the problem  by unit test in 
my PR. 
The problem is caused by {{CompactingMemStore.checkAndAddToActiveSize}} : 
{code:java}
425   private boolean checkAndAddToActiveSize(MutableSegment currActive, Cell 
cellToAdd,
426  MemStoreSizing memstoreSizing) {
427if (shouldFlushInMemory(currActive, cellToAdd, memstoreSizing)) {
428  if (currActive.setInMemoryFlushed()) {
429flushInMemory(currActive);
430if (setInMemoryCompactionFlag()) {
431 // The thread is dispatched to do in-memory compaction in the 
background
  ..
 }
{code}
In line 427, {{shouldFlushInMemory}} checking if  {{currActive.getDataSize}} 
adding the size of {{cellToAdd}} exceeds 
{{CompactingMemStore.inmemoryFlushSize}},if true,  then  {{currActive}} should 
be flushed, {{currActive.setInMemoryFlushed()}} is invoked in  line 428 :
{code:java}
public boolean setInMemoryFlushed() {
return flushed.compareAndSet(false, true);
  }
{code}
After sucessfully set {{currActive.flushed}} to true, in above line 429 
{{flushInMemory(currActive)}} invokes 
{{CompactingMemStore.pushActiveToPipeline}} :
{code:java}
 protected void pushActiveToPipeline(MutableSegment currActive) {
if (!currActive.isEmpty()) {
  pipeline.pushHead(currActive);
   

[jira] [Assigned] (HBASE-26580) The message of StoreTooBusy is confused

2021-12-15 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng reassigned HBASE-26580:
-

Assignee: zhuobin zheng

> The message of StoreTooBusy is confused
> ---
>
> Key: HBASE-26580
> URL: https://issues.apache.org/jira/browse/HBASE-26580
> Project: HBase
>  Issue Type: Task
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Trivial
>
>  
> When check Store limit. We both check parallelPutToStoreThreadLimit and 
> parallelPreparePutToStoreThreadLimit. 
> {code:java}
> if (store.getCurrentParallelPutCount() > this.parallelPutToStoreThreadLimit
> || preparePutCount > this.parallelPreparePutToStoreThreadLimit) {
>   tooBusyStore = (tooBusyStore == null ?
>   store.getColumnFamilyName() :
>   tooBusyStore + "," + store.getColumnFamilyName());
> } {code}
> But we only print Above parallelPutToStoreThreadLimit only. 
>  
>  
> {code:java}
> if (tooBusyStore != null) {
>   String msg =
>   "StoreTooBusy," + this.region.getRegionInfo().getRegionNameAsString() + 
> ":" + tooBusyStore
>   + " Above parallelPutToStoreThreadLimit(" + 
> this.parallelPutToStoreThreadLimit + ")";
>   if (LOG.isTraceEnabled()) {
> LOG.trace(msg);
>   }
>   throw new RegionTooBusyException(msg);
> }{code}
> It confused me a lot time .. 
> Just add message of parallelPreparePutToStoreThreadLimit
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HBASE-26580) The message of StoreTooBusy is confused

2021-12-15 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-26580:
--
Priority: Trivial  (was: Minor)

> The message of StoreTooBusy is confused
> ---
>
> Key: HBASE-26580
> URL: https://issues.apache.org/jira/browse/HBASE-26580
> Project: HBase
>  Issue Type: Task
>Reporter: zhuobin zheng
>Priority: Trivial
>
>  
> When check Store limit. We both check parallelPutToStoreThreadLimit and 
> parallelPreparePutToStoreThreadLimit. 
> {code:java}
> if (store.getCurrentParallelPutCount() > this.parallelPutToStoreThreadLimit
> || preparePutCount > this.parallelPreparePutToStoreThreadLimit) {
>   tooBusyStore = (tooBusyStore == null ?
>   store.getColumnFamilyName() :
>   tooBusyStore + "," + store.getColumnFamilyName());
> } {code}
> But we only print Above parallelPutToStoreThreadLimit only. 
>  
>  
> {code:java}
> if (tooBusyStore != null) {
>   String msg =
>   "StoreTooBusy," + this.region.getRegionInfo().getRegionNameAsString() + 
> ":" + tooBusyStore
>   + " Above parallelPutToStoreThreadLimit(" + 
> this.parallelPutToStoreThreadLimit + ")";
>   if (LOG.isTraceEnabled()) {
> LOG.trace(msg);
>   }
>   throw new RegionTooBusyException(msg);
> }{code}
> It confused me a lot time .. 
> Just add message of parallelPreparePutToStoreThreadLimit
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (HBASE-26580) The message of StoreTooBusy is confused

2021-12-15 Thread zhuobin zheng (Jira)
zhuobin zheng created HBASE-26580:
-

 Summary: The message of StoreTooBusy is confused
 Key: HBASE-26580
 URL: https://issues.apache.org/jira/browse/HBASE-26580
 Project: HBase
  Issue Type: Task
Reporter: zhuobin zheng


 

When checking the Store limit, we check both parallelPutToStoreThreadLimit and parallelPreparePutToStoreThreadLimit.
{code:java}
if (store.getCurrentParallelPutCount() > this.parallelPutToStoreThreadLimit
|| preparePutCount > this.parallelPreparePutToStoreThreadLimit) {
  tooBusyStore = (tooBusyStore == null ?
  store.getColumnFamilyName() :
  tooBusyStore + "," + store.getColumnFamilyName());
} {code}
But we only print "Above parallelPutToStoreThreadLimit".

 

 
{code:java}
if (tooBusyStore != null) {
  String msg =
  "StoreTooBusy," + this.region.getRegionInfo().getRegionNameAsString() + 
":" + tooBusyStore
  + " Above parallelPutToStoreThreadLimit(" + 
this.parallelPutToStoreThreadLimit + ")";
  if (LOG.isTraceEnabled()) {
LOG.trace(msg);
  }
  throw new RegionTooBusyException(msg);
}{code}
It confused me for a long time.

Just add the parallelPreparePutToStoreThreadLimit information to the message.
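A hedged sketch of what the improved message might look like, reusing only the fields already shown in the snippet above (illustrative, not the committed change):
{code:java}
// Mention both limits so the exception explains which check may have fired.
String msg =
    "StoreTooBusy," + this.region.getRegionInfo().getRegionNameAsString() + ":" + tooBusyStore
        + " Above parallelPutToStoreThreadLimit(" + this.parallelPutToStoreThreadLimit
        + ") or parallelPreparePutToStoreThreadLimit(" + this.parallelPreparePutToStoreThreadLimit + ")";
{code}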

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HBASE-26579) Set storage policy of recovered edits when wal storage type is configured

2021-12-15 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-26579:
--
Status: Patch Available  (was: Open)

> Set storage policy of recovered edits  when wal storage type is configured
> --
>
> Key: HBASE-26579
> URL: https://issues.apache.org/jira/browse/HBASE-26579
> Project: HBase
>  Issue Type: Improvement
>  Components: Recovery
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Major
>
> In our cluster, we has many SSD and a little HDD.  (Most table configured 
> storage policy ONE_SSD, and all wals is configured ALL_SSD)
> when all cluster down, It's difficult to recovery cluster. Because HDD Disk 
> IO bottleneck (Almost all disk io util is 100%).
> I think the most hdfs operation when recovery is split wal to recovered edits 
> dir, And read it.
> And it goes better when i stop hbase and set all recovered.edits to ALL_SSD.
> So we can get benifit of recovery time if we set recovered.edits dir to 
> better storage like WAL.
> Now i reuse config item  hbase.wal.storage.policy to set recovered.edits 
> storage type. Because I did not find a scenario where they use different 
> storage Policy



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (HBASE-26579) Set storage policy of recovered edits when wal storage type is configured

2021-12-15 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng reassigned HBASE-26579:
-

Assignee: zhuobin zheng

> Set storage policy of recovered edits  when wal storage type is configured
> --
>
> Key: HBASE-26579
> URL: https://issues.apache.org/jira/browse/HBASE-26579
> Project: HBase
>  Issue Type: Improvement
>  Components: Recovery
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Major
>
> In our cluster, we has many SSD and a little HDD.  (Most table configured 
> storage policy ONE_SSD, and all wals is configured ALL_SSD)
> when all cluster down, It's difficult to recovery cluster. Because HDD Disk 
> IO bottleneck (Almost all disk io util is 100%).
> I think the most hdfs operation when recovery is split wal to recovered edits 
> dir, And read it.
> And it goes better when i stop hbase and set all recovered.edits to ALL_SSD.
> So we can get benifit of recovery time if we set recovered.edits dir to 
> better storage like WAL.
> Now i reuse config item  hbase.wal.storage.policy to set recovered.edits 
> storage type. Because I did not find a scenario where they use different 
> storage Policy



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (HBASE-26579) Set storage policy of recovered edits when wal storage type is configured

2021-12-15 Thread zhuobin zheng (Jira)
zhuobin zheng created HBASE-26579:
-

 Summary: Set storage policy of recovered edits  when wal storage 
type is configured
 Key: HBASE-26579
 URL: https://issues.apache.org/jira/browse/HBASE-26579
 Project: HBase
  Issue Type: Improvement
  Components: Recovery
Reporter: zhuobin zheng


In our cluster, we have many SSDs and only a few HDDs. (Most tables are configured with the ONE_SSD storage policy, and all WALs are configured ALL_SSD.)

When the whole cluster goes down, it is difficult to recover it, because of the HDD disk IO bottleneck (almost all disk IO util is at 100%).

I think the main HDFS operation during recovery is splitting the WALs into the recovered.edits dir, and reading them back.

And it went better when I stopped HBase and set all recovered.edits to ALL_SSD.

So we can save recovery time if we put the recovered.edits dir on better storage, like the WALs.

For now I reuse the config item hbase.wal.storage.policy to set the recovered.edits storage type, because I did not find a scenario where they would use different storage policies.
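A minimal illustration of the configuration involved, using the ALL_SSD policy mentioned above (with this proposal the same key would also govern the recovered.edits directory):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// hbase.wal.storage.policy already controls the WAL storage policy; per this
// issue it would be reused for the recovered.edits directory as well.
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.wal.storage.policy", "ALL_SSD");
{code}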



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HBASE-26482) HMaster may clean wals that is replicating in rare cases

2021-11-26 Thread zhuobin zheng (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17449762#comment-17449762
 ] 

zhuobin zheng commented on HBASE-26482:
---

Pushed an MR [(https://github.com/apache/hbase/pull/3887)|https://github.com/apache/hbase/pull/3887] for branch-1.

There is a problem: we can't update the cversion of the root queues znode atomically when hbase.zookeeper.useMulti is set to false.

For now, I only fixed this problem when hbase.zookeeper.useMulti is true (the default is true).

Another way to totally solve this problem: check the cversion of /hbase/replication/rs and of all znodes under /hbase/replication/rs/${servername} when the master cleans.

But this way the code in branch-1 would differ from master.

I don't know which is better.

> HMaster may clean wals that is replicating in rare cases
> 
>
> Key: HBASE-26482
> URL: https://issues.apache.org/jira/browse/HBASE-26482
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Critical
> Fix For: 2.5.0, 3.0.0-alpha-2, 2.4.9
>
>
> In our cluster, i can found some FileNotFoundException when 
> ReplicationSourceWALReader running for replication recovery queue.
> I guss the wal most likely removed by hmaste. And i found something to 
> support it.
> The method getAllWALs: 
> [https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509
>    
> |https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509]Use
>  zk cversion of /hbase/replication/rs as an optimistic lock to control 
> concurrent ops.
> But, zk cversion *only can only reflect the changes of child nodes, but not 
> the changes of grandchildren.*
> So, HMaster may loss some wal from this method in follow situation.
>  # HMaster do log clean , and invoke getAllWALs to filter log which should 
> not be deleted.
>  # HMaster cache current cversion of /hbase/replication/rs  as *v0*
>  # HMaster cache all RS server name, and traverse them, get the WAL in each 
> Queue
>  # *RS2* dead after HMaster traverse {*}RS1{*}, and before traverse *RS2*
>  # *RS1* claim one queue of *RS2,* which named *peerid-RS2* now
>  # By the way , the cversion of /hbase/replication/rs not changed before all 
> of *RS2* queue is removed, because the children of /hbase/replication/rs not 
> change.
>  # So, Hmaster will lost the wals in *peerid-RS2,* because we have already 
> traversed *RS1 ,* and ** this queue not exists in *RS2*
> The above expression is currently only speculation, not confirmed
> Flie Not Found Log.
>  
> {code:java}
> // code placeholder
> 2021-11-22 15:18:39,593 ERROR 
> [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348]
>  regionserver.WALEntryStream: Couldn't locate log: 
> hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704
> 2021-11-22 15:18:39,593 ERROR 
> [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348]
>  regionserver.ReplicationSourceWALReader: Failed to read stream of 
> replication entries
> java.io.FileNotFoundException: File does not exist: 
> hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1612)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1605)
>         at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1620)
>         at 
> org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
>         at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:321)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:303)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:291)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:427)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openReader(WALEntryStream.java:355)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openNextLog(WALEntryStream.java:303)
>         at 
> 

[jira] [Commented] (HBASE-26482) HMaster may clean wals that is replicating in rare cases

2021-11-24 Thread zhuobin zheng (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17448898#comment-17448898
 ] 

zhuobin zheng commented on HBASE-26482:
---

[~shahrs87] It seems yes. The code in branch-1 also has the same problem.

I will test and submit a patch later.

> HMaster may clean wals that is replicating in rare cases
> 
>
> Key: HBASE-26482
> URL: https://issues.apache.org/jira/browse/HBASE-26482
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Critical
> Fix For: 2.5.0, 3.0.0-alpha-2, 2.4.9
>
>
> In our cluster, i can found some FileNotFoundException when 
> ReplicationSourceWALReader running for replication recovery queue.
> I guss the wal most likely removed by hmaste. And i found something to 
> support it.
> The method getAllWALs: 
> [https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509
>    
> |https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509]Use
>  zk cversion of /hbase/replication/rs as an optimistic lock to control 
> concurrent ops.
> But, zk cversion *only can only reflect the changes of child nodes, but not 
> the changes of grandchildren.*
> So, HMaster may loss some wal from this method in follow situation.
>  # HMaster do log clean , and invoke getAllWALs to filter log which should 
> not be deleted.
>  # HMaster cache current cversion of /hbase/replication/rs  as *v0*
>  # HMaster cache all RS server name, and traverse them, get the WAL in each 
> Queue
>  # *RS2* dead after HMaster traverse {*}RS1{*}, and before traverse *RS2*
>  # *RS1* claim one queue of *RS2,* which named *peerid-RS2* now
>  # By the way , the cversion of /hbase/replication/rs not changed before all 
> of *RS2* queue is removed, because the children of /hbase/replication/rs not 
> change.
>  # So, Hmaster will lost the wals in *peerid-RS2,* because we have already 
> traversed *RS1 ,* and ** this queue not exists in *RS2*
> The above expression is currently only speculation, not confirmed
> Flie Not Found Log.
>  
> {code:java}
> // code placeholder
> 2021-11-22 15:18:39,593 ERROR 
> [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348]
>  regionserver.WALEntryStream: Couldn't locate log: 
> hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704
> 2021-11-22 15:18:39,593 ERROR 
> [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348]
>  regionserver.ReplicationSourceWALReader: Failed to read stream of 
> replication entries
> java.io.FileNotFoundException: File does not exist: 
> hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1612)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1605)
>         at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1620)
>         at 
> org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
>         at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:321)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:303)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:291)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:427)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openReader(WALEntryStream.java:355)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openNextLog(WALEntryStream.java:303)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.checkReader(WALEntryStream.java:294)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:175)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:101)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.readWALEntries(ReplicationSourceWALReader.java:192)
>         at 

[jira] [Commented] (HBASE-26482) HMaster may clean wals that is replicating in rare cases

2021-11-23 Thread zhuobin zheng (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17448075#comment-17448075
 ] 

zhuobin zheng commented on HBASE-26482:
---

JIRA HBASE-12865 had fixed this problem.

HBASE-12865 deletes *rsZnode* after all queues are claimed. The old method would claim every queue of an RS, so the rs znode could be deleted once all queues were claimed.

But now, in this method, we only claim one queue, so we can't delete *rsZnode*.

I submitted a simple patch to fix it by adding ops (create znode, delete znode) to a zk multi op (to update the cversion).

> HMaster may clean wals that is replicating in rare cases
> 
>
> Key: HBASE-26482
> URL: https://issues.apache.org/jira/browse/HBASE-26482
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: zhuobin zheng
>Priority: Critical
>
> In our cluster, i can found some FileNotFoundException when 
> ReplicationSourceWALReader running for replication recovery queue.
> I guss the wal most likely removed by hmaste. And i found something to 
> support it.
> The method getAllWALs: 
> [https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509
>    
> |https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509]Use
>  zk cversion of /hbase/replication/rs as an optimistic lock to control 
> concurrent ops.
> But, zk cversion *only can only reflect the changes of child nodes, but not 
> the changes of grandchildren.*
> So, HMaster may loss some wal from this method in follow situation.
>  # HMaster do log clean , and invoke getAllWALs to filter log which should 
> not be deleted.
>  # HMaster cache current cversion of /hbase/replication/rs  as *v0*
>  # HMaster cache all RS server name, and traverse them, get the WAL in each 
> Queue
>  # *RS2* dead after HMaster traverse {*}RS1{*}, and before traverse *RS2*
>  # *RS1* claim one queue of *RS2,* which named *peerid-RS2* now
>  # By the way , the cversion of /hbase/replication/rs not changed before all 
> of *RS2* queue is removed, because the children of /hbase/replication/rs not 
> change.
>  # So, Hmaster will lost the wals in *peerid-RS2,* because we have already 
> traversed *RS1 ,* and ** this queue not exists in *RS2*
> The above expression is currently only speculation, not confirmed
> Flie Not Found Log.
>  
> {code:java}
> // code placeholder
> 2021-11-22 15:18:39,593 ERROR 
> [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348]
>  regionserver.WALEntryStream: Couldn't locate log: 
> hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704
> 2021-11-22 15:18:39,593 ERROR 
> [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348]
>  regionserver.ReplicationSourceWALReader: Failed to read stream of 
> replication entries
> java.io.FileNotFoundException: File does not exist: 
> hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1612)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1605)
>         at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1620)
>         at 
> org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
>         at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:321)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:303)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:291)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:427)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openReader(WALEntryStream.java:355)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openNextLog(WALEntryStream.java:303)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.checkReader(WALEntryStream.java:294)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:175)
>         at 
> 

[jira] [Assigned] (HBASE-26414) Tracing INSTRUMENTATION_NAME is incorrect

2021-11-23 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng reassigned HBASE-26414:
-

Assignee: Nick Dimiduk  (was: zhuobin zheng)

> Tracing INSTRUMENTATION_NAME is incorrect
> -
>
> Key: HBASE-26414
> URL: https://issues.apache.org/jira/browse/HBASE-26414
> Project: HBase
>  Issue Type: Bug
>  Components: tracing
>Affects Versions: 2.5.0, 3.0.0-alpha-2
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
>Priority: Blocker
> Fix For: 2.5.0, 3.0.0-alpha-2
>
>
> I believe the value we use for {{TraceUtil#INSTRUMENTATION_NAME}}, 
> {{"io.opentelemetry.contrib.hbase"}}, is incorrect. According to the java 
> docs,
> {noformat}
>* @param instrumentationName The name of the instrumentation library, not 
> the name of the
>* instrument*ed* library (e.g., "io.opentelemetry.contrib.mongodb"). 
> Must not be null.
> {noformat}
> This namespace appears to be reserved for implementations shipped by the otel 
> project, found under 
> https://github.com/open-telemetry/opentelemetry-java-instrumentation/tree/main/instrumentation
> I don't have a suggestion for a suitable name at this time. Will report back.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HBASE-26482) HMaster may clean wals that is replicating in rare cases

2021-11-22 Thread zhuobin zheng (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17447810#comment-17447810
 ] 

zhuobin zheng commented on HBASE-26482:
---

One way to solve the problem: we compare the cversion of the znode /hbase/replication/rs and of all znodes under /hbase/replication/rs/${servername}.

This works because we only care about region server additions and replication queue additions; we don't care about WAL add/remove under a replication queue.
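A hedged sketch of the idea in this comment (illustrative only, not the actual patch): snapshot the cversion of the rs root znode together with the cversion of every per-server node, so a queue claimed while the root stays unchanged is still detected.
{code:java}
import java.util.ArrayList;
import java.util.List;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

List<Integer> snapshotCversions(ZooKeeper zk) throws Exception {
  String root = "/hbase/replication/rs";
  List<Integer> versions = new ArrayList<>();
  // cversion of the root only changes when a region server znode is added/removed
  versions.add(zk.exists(root, false).getCversion());
  for (String server : zk.getChildren(root, false)) {
    Stat stat = zk.exists(root + "/" + server, false);
    if (stat != null) {
      // per-server cversion changes when one of its queues is added or claimed
      versions.add(stat.getCversion());
    }
  }
  return versions;
}
{code}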

> HMaster may clean wals that is replicating in rare cases
> 
>
> Key: HBASE-26482
> URL: https://issues.apache.org/jira/browse/HBASE-26482
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: zhuobin zheng
>Priority: Critical
>
> In our cluster, i can found some FileNotFoundException when 
> ReplicationSourceWALReader running for replication recovery queue.
> I guss the wal most likely removed by hmaste. And i found something to 
> support it.
> The method getAllWALs: 
> [https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509
>    
> |https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509]Use
>  zk cversion of /hbase/replication/rs as an optimistic lock to control 
> concurrent ops.
> But, zk cversion *only can only reflect the changes of child nodes, but not 
> the changes of grandchildren.*
> So, HMaster may loss some wal from this method in follow situation.
>  # HMaster do log clean , and invoke getAllWALs to filter log which should 
> not be deleted.
>  # HMaster cache current cversion of /hbase/replication/rs  as *v0*
>  # HMaster cache all RS server name, and traverse them, get the WAL in each 
> Queue
>  # *RS2* dead after HMaster traverse {*}RS1{*}, and before traverse *RS2*
>  # *RS1* claim one queue of *RS2,* which named *peerid-RS2* now
>  # By the way , the cversion of /hbase/replication/rs not changed before all 
> of *RS2* queue is removed, because the children of /hbase/replication/rs not 
> change.
>  # So, Hmaster will lost the wals in *peerid-RS2,* because we have already 
> traversed *RS1 ,* and ** this queue not exists in *RS2*
> The above expression is currently only speculation, not confirmed
> Flie Not Found Log.
>  
> {code:java}
> // code placeholder
> 2021-11-22 15:18:39,593 ERROR 
> [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348]
>  regionserver.WALEntryStream: Couldn't locate log: 
> hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704
> 2021-11-22 15:18:39,593 ERROR 
> [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348]
>  regionserver.ReplicationSourceWALReader: Failed to read stream of 
> replication entries
> java.io.FileNotFoundException: File does not exist: 
> hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1612)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1605)
>         at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1620)
>         at 
> org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
>         at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:321)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:303)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:291)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:427)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openReader(WALEntryStream.java:355)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openNextLog(WALEntryStream.java:303)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.checkReader(WALEntryStream.java:294)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:175)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:101)
>         at 
> 

[jira] [Updated] (HBASE-26482) HMaster may clean wals that is replicating in rare cases

2021-11-22 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-26482:
--
Summary: HMaster may clean wals that is replicating in rare cases  (was: 
HMaster may clean replication wals in rare cases)

> HMaster may clean wals that is replicating in rare cases
> 
>
> Key: HBASE-26482
> URL: https://issues.apache.org/jira/browse/HBASE-26482
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: zhuobin zheng
>Priority: Critical
>
> In our cluster, I found some FileNotFoundExceptions when the
> ReplicationSourceWALReader was running for a replication recovery queue.
> I guess the WALs were most likely removed by the HMaster, and I found
> something to support it.
> The method getAllWALs: 
> [https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509
>    
> |https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509]
> uses the zk cversion of /hbase/replication/rs as an optimistic lock to control
> concurrent ops.
> But the zk cversion *can only reflect changes of the child nodes, not changes
> of the grandchildren.*
> So the HMaster may lose some WALs from this method in the following situation:
>  # HMaster runs log cleaning and invokes getAllWALs to filter out the logs
> which should not be deleted.
>  # HMaster caches the current cversion of /hbase/replication/rs as *v0*
>  # HMaster caches all RS server names and traverses them, getting the WALs in
> each queue
>  # *RS2* dies after HMaster has traversed {*}RS1{*} and before it traverses *RS2*
>  # *RS1* claims one queue of *RS2,* which is now named *peerid-RS2*
>  # Note that the cversion of /hbase/replication/rs does not change until all of
> *RS2*'s queues are removed, because the children of /hbase/replication/rs do
> not change.
>  # So the HMaster will lose the WALs in *peerid-RS2,* because *RS1* has already
> been traversed and this queue does not exist under *RS2*
> The above is currently only speculation, not confirmed.
> File Not Found log:
>  
> {code:java}
> // code placeholder
> 2021-11-22 15:18:39,593 ERROR 
> [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348]
>  regionserver.WALEntryStream: Couldn't locate log: 
> hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704
> 2021-11-22 15:18:39,593 ERROR 
> [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348]
>  regionserver.ReplicationSourceWALReader: Failed to read stream of 
> replication entries
> java.io.FileNotFoundException: File does not exist: 
> hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1612)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1605)
>         at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1620)
>         at 
> org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
>         at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:321)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:303)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:291)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:427)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openReader(WALEntryStream.java:355)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openNextLog(WALEntryStream.java:303)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.checkReader(WALEntryStream.java:294)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:175)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:101)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.readWALEntries(ReplicationSourceWALReader.java:192)
>         at 
> 

[jira] [Created] (HBASE-26482) HMaster may clean replication wals in rare cases

2021-11-22 Thread zhuobin zheng (Jira)
zhuobin zheng created HBASE-26482:
-

 Summary: HMaster may clean replication wals in rare cases
 Key: HBASE-26482
 URL: https://issues.apache.org/jira/browse/HBASE-26482
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: zhuobin zheng


In our cluster, I found some FileNotFoundExceptions when the 
ReplicationSourceWALReader was running for a replication recovery queue.

I guess the WALs were most likely removed by the HMaster, and I found something 
to support it.

The method getAllWALs: 
[https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509
   
|https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509]
uses the zk cversion of /hbase/replication/rs as an optimistic lock to control 
concurrent ops.

But the zk cversion *can only reflect changes of the child nodes, not changes of 
the grandchildren.*

So the HMaster may lose some WALs from this method in the following situation:
 # HMaster runs log cleaning and invokes getAllWALs to filter out the logs which 
should not be deleted.
 # HMaster caches the current cversion of /hbase/replication/rs as *v0*
 # HMaster caches all RS server names and traverses them, getting the WALs in each 
queue
 # *RS2* dies after HMaster has traversed {*}RS1{*} and before it traverses *RS2*
 # *RS1* claims one queue of *RS2,* which is now named *peerid-RS2*
 # Note that the cversion of /hbase/replication/rs does not change until all of 
*RS2*'s queues are removed, because the children of /hbase/replication/rs do not 
change.
 # So the HMaster will lose the WALs in *peerid-RS2,* because *RS1* has already 
been traversed and this queue does not exist under *RS2*

The above is currently only speculation, not confirmed; a sketch of the cversion 
check follows the log below.

File Not Found log:

 
{code:java}
// code placeholder
2021-11-22 15:18:39,593 ERROR 
[ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348]
 regionserver.WALEntryStream: Couldn't locate log: 
hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704
2021-11-22 15:18:39,593 ERROR 
[ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348]
 regionserver.ReplicationSourceWALReader: Failed to read stream of replication 
entries
java.io.FileNotFoundException: File does not exist: 
hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1612)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1605)
        at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1620)
        at 
org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
        at 
org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168)
        at 
org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:321)
        at 
org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:303)
        at 
org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:291)
        at 
org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:427)
        at 
org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openReader(WALEntryStream.java:355)
        at 
org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openNextLog(WALEntryStream.java:303)
        at 
org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.checkReader(WALEntryStream.java:294)
        at 
org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:175)
        at 
org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:101)
        at 
org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.readWALEntries(ReplicationSourceWALReader.java:192)
        at 
org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:138)
 {code}
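
For illustration, here is a minimal sketch of the cversion-based optimistic read 
described above. It uses the plain ZooKeeper client and hypothetical helper names; 
it is not the actual ZKReplicationQueueStorage code.
{code:java}
import java.util.HashSet;
import java.util.Set;

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;

public class GetAllWalsSketch {
  private static final String RS_ROOT = "/hbase/replication/rs";

  // Snapshot all WAL names under every region server's queues, guarded by the
  // cversion of the rs root used as an optimistic "lock".
  static Set<String> getAllWals(ZooKeeper zk) throws KeeperException, InterruptedException {
    for (int retry = 0; retry < 10; retry++) {
      int v0 = zk.exists(RS_ROOT, false).getCversion();
      Set<String> wals = new HashSet<>();
      for (String rs : zk.getChildren(RS_ROOT, false)) {                  // children: rs znodes
        for (String queue : zk.getChildren(RS_ROOT + "/" + rs, false)) {  // grandchildren: queues
          wals.addAll(zk.getChildren(RS_ROOT + "/" + rs + "/" + queue, false));
        }
      }
      int v1 = zk.exists(RS_ROOT, false).getCversion();
      // cversion counts only creation/deletion of *direct* children (the rs znodes).
      // A queue claimed from RS2 by RS1 is a grandchild change, so v0 == v1 still holds
      // and the snapshot above can miss the WALs of the claimed queue.
      if (v0 == v1) {
        return wals;
      }
    }
    throw new IllegalStateException("cversion kept changing, giving up");
  }
}
{code}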
 

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (HBASE-26414) Tracing INSTRUMENTATION_NAME is incorrect

2021-11-22 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng reassigned HBASE-26414:
-

Assignee: zhuobin zheng  (was: Nick Dimiduk)

> Tracing INSTRUMENTATION_NAME is incorrect
> -
>
> Key: HBASE-26414
> URL: https://issues.apache.org/jira/browse/HBASE-26414
> Project: HBase
>  Issue Type: Bug
>  Components: tracing
>Affects Versions: 2.5.0, 3.0.0-alpha-2
>Reporter: Nick Dimiduk
>Assignee: zhuobin zheng
>Priority: Blocker
> Fix For: 2.5.0, 3.0.0-alpha-2
>
>
> I believe the value we use for {{TraceUtil#INSTRUMENTATION_NAME}}, 
> {{"io.opentelemetry.contrib.hbase"}}, is incorrect. According to the java 
> docs,
> {noformat}
>* @param instrumentationName The name of the instrumentation library, not 
> the name of the
>* instrument*ed* library (e.g., "io.opentelemetry.contrib.mongodb"). 
> Must not be null.
> {noformat}
> This namespace appears to be reserved for implementations shipped by the otel 
> project, found under 
> https://github.com/open-telemetry/opentelemetry-java-instrumentation/tree/main/instrumentation
> I don't have a suggestion for a suitable name at this time. Will report back.
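
For reference, this is how the name flows into the OpenTelemetry API; the name used 
below is only a placeholder to illustrate the parameter, not a proposal for the final 
value.
{code:java}
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.Tracer;

public class InstrumentationNameSketch {
  public static void main(String[] args) {
    // The argument names the *instrumentation* library (the code emitting the spans),
    // not the instrument*ed* library, per the javadoc quoted above.
    // "org.example.hbase.tracing" is a made-up placeholder.
    Tracer tracer = GlobalOpenTelemetry.getTracer("org.example.hbase.tracing");
    Span span = tracer.spanBuilder("example-operation").startSpan();
    span.end();
  }
}
{code}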



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HBASE-26467) Wrong Cell Generated by MemStoreLABImpl.forceCopyOfBigCellInto when Cell size bigger than data chunk size

2021-11-18 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-26467:
--
Status: Patch Available  (was: In Progress)

https://github.com/apache/hbase/pull/3858

> Wrong Cell Generated by MemStoreLABImpl.forceCopyOfBigCellInto when Cell size 
> bigger than data chunk size 
> --
>
> Key: HBASE-26467
> URL: https://issues.apache.org/jira/browse/HBASE-26467
> Project: HBase
>  Issue Type: Bug
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Critical
>
> In our company's 2.x cluster, I found that some region compactions keep failing
> because some cells cannot be constructed successfully. In fact, we cannot even
> read these cells.
> From the following stack, we can see that the bug makes the KeyValue constructor fail.
> Simple Log and Stack: 
> {code:java}
> // code placeholder
> 2021-11-18 16:50:47,708 ERROR [regionserver/:60020-longCompactions-4] 
> regionserver.CompactSplit: Compaction failed 
> region=xx_table,3610ff49595a0fc4a824f2a575f37a31,1570874723992.dac703ceb35e8d8703233bebf34ae49f.,
>  storeName=c, priority=-319, startTime=1637225447127 
> java.lang.IllegalArgumentException: Invalid tag length at position=4659867, 
> tagLength=0,         
> at 
> org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueTagBytes(KeyValueUtil.java:685)
>         at 
> org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueBytes(KeyValueUtil.java:643)
>         at org.apache.hadoop.hbase.KeyValue.(KeyValue.java:345)
>         at 
> org.apache.hadoop.hbase.SizeCachedKeyValue.(SizeCachedKeyValue.java:43)
>         at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.getCell(HFileReaderImpl.java:981)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:233)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:418)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:322)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:288)
>         at 
> org.apache.hadoop.hbase.regionserver.compactions.Compactor.createScanner(Compactor.java:487)
>         at 
> org.apache.hadoop.hbase.regionserver.compactions.Compactor$1.createScanner(Compactor.java:248)
>         at 
> org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:318)
>         at 
> org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:65)
>         at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:126)
>         at 
> org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1468)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2266)
>         at 
> org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:624)
>         at 
> org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:666)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:748) {code}
> From further observation, I found the following characteristics:
>  # The cell size is more than 2 MB
>  # We can reproduce the bug only after an in-memory compaction
>  # The cell bytes end with \x00\x02\x00\x00
>  
> In fact, the root cause is that the method MemStoreLABImpl.forceCopyOfBigCellInto,
> which is only invoked when a cell is bigger than the data chunk size, constructs
> the cell with the wrong length, so 4 extra bytes (the chunk header size) are
> appended to the end of the cell bytes.
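
A simplified sketch of the length mistake described above; the names and sizes are 
illustrative, and this is not the actual MemStoreLABImpl code.
{code:java}
public class BigCellCopySketch {
  static final int CHUNK_HEADER_SIZE = 4; // bytes reserved at the start of a chunk

  public static void main(String[] args) {
    byte[] serializedCell = new byte[] {10, 20, 30, 40, 50};

    // A cell bigger than the normal chunk size gets its own dedicated chunk.
    byte[] chunk = new byte[CHUNK_HEADER_SIZE + serializedCell.length];
    System.arraycopy(serializedCell, 0, chunk, CHUNK_HEADER_SIZE, serializedCell.length);

    int offset = CHUNK_HEADER_SIZE;
    int correctLength = serializedCell.length;                    // 5
    int buggyLength = CHUNK_HEADER_SIZE + serializedCell.length;  // 9: end offset used as a length

    // A cell advertised as (chunk, offset, buggyLength) claims 4 bytes more payload than
    // was copied, which is the "4 garbage bytes at the end" symptom that later fails
    // tag-length validation during compaction.
    System.out.println("offset=" + offset + ", correct=" + correctLength + ", buggy=" + buggyLength);
  }
}
{code}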



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work started] (HBASE-26467) Wrong Cell Generated by MemStoreLABImpl.forceCopyOfBigCellInto when Cell size bigger than data chunk size

2021-11-18 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-26467 started by zhuobin zheng.
-
> Wrong Cell Generated by MemStoreLABImpl.forceCopyOfBigCellInto when Cell size 
> bigger than data chunk size 
> --
>
> Key: HBASE-26467
> URL: https://issues.apache.org/jira/browse/HBASE-26467
> Project: HBase
>  Issue Type: Bug
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Critical
>
> In our company's 2.x cluster, I found that some region compactions keep failing
> because some cells cannot be constructed successfully. In fact, we cannot even
> read these cells.
> From the following stack, we can see that the bug makes the KeyValue constructor fail.
> Simple Log and Stack: 
> {code:java}
> // code placeholder
> 2021-11-18 16:50:47,708 ERROR [regionserver/:60020-longCompactions-4] 
> regionserver.CompactSplit: Compaction failed 
> region=xx_table,3610ff49595a0fc4a824f2a575f37a31,1570874723992.dac703ceb35e8d8703233bebf34ae49f.,
>  storeName=c, priority=-319, startTime=1637225447127 
> java.lang.IllegalArgumentException: Invalid tag length at position=4659867, 
> tagLength=0,         
> at 
> org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueTagBytes(KeyValueUtil.java:685)
>         at 
> org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueBytes(KeyValueUtil.java:643)
>         at org.apache.hadoop.hbase.KeyValue.(KeyValue.java:345)
>         at 
> org.apache.hadoop.hbase.SizeCachedKeyValue.(SizeCachedKeyValue.java:43)
>         at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.getCell(HFileReaderImpl.java:981)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:233)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:418)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:322)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:288)
>         at 
> org.apache.hadoop.hbase.regionserver.compactions.Compactor.createScanner(Compactor.java:487)
>         at 
> org.apache.hadoop.hbase.regionserver.compactions.Compactor$1.createScanner(Compactor.java:248)
>         at 
> org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:318)
>         at 
> org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:65)
>         at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:126)
>         at 
> org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1468)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2266)
>         at 
> org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:624)
>         at 
> org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:666)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:748) {code}
> From further observation, I found the following characteristics:
>  # The cell size is more than 2 MB
>  # We can reproduce the bug only after an in-memory compaction
>  # The cell bytes end with \x00\x02\x00\x00
>  
> In fact, the root cause is that the method MemStoreLABImpl.forceCopyOfBigCellInto,
> which is only invoked when a cell is bigger than the data chunk size, constructs
> the cell with the wrong length, so 4 extra bytes (the chunk header size) are
> appended to the end of the cell bytes.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HBASE-26467) Wrong Cell Generated by MemStoreLABImpl.forceCopyOfBigCellInto when Cell size bigger than data chunk size

2021-11-18 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-26467:
--
Description: 
In our company's 2.x cluster, I found that some region compactions keep failing 
because some cells cannot be constructed successfully. In fact, we cannot even 
read these cells.

From the following stack, we can see that the bug makes the KeyValue constructor fail.

Simple Log and Stack: 
{code:java}
// code placeholder
2021-11-18 16:50:47,708 ERROR [regionserver/:60020-longCompactions-4] 
regionserver.CompactSplit: Compaction failed 
region=xx_table,3610ff49595a0fc4a824f2a575f37a31,1570874723992.dac703ceb35e8d8703233bebf34ae49f.,
 storeName=c, priority=-319, startTime=1637225447127 
java.lang.IllegalArgumentException: Invalid tag length at position=4659867, 
tagLength=0,         
at 
org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueTagBytes(KeyValueUtil.java:685)
        at 
org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueBytes(KeyValueUtil.java:643)
        at org.apache.hadoop.hbase.KeyValue.(KeyValue.java:345)
        at 
org.apache.hadoop.hbase.SizeCachedKeyValue.(SizeCachedKeyValue.java:43)
        at 
org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.getCell(HFileReaderImpl.java:981)
        at 
org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:233)
        at 
org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:418)
        at 
org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:322)
        at 
org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:288)
        at 
org.apache.hadoop.hbase.regionserver.compactions.Compactor.createScanner(Compactor.java:487)
        at 
org.apache.hadoop.hbase.regionserver.compactions.Compactor$1.createScanner(Compactor.java:248)
        at 
org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:318)
        at 
org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:65)
        at 
org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:126)
        at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1468)
        at 
org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2266)
        at 
org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:624)
        at 
org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:666)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748) {code}
From further observation, I found the following characteristics:
 # The cell size is more than 2 MB
 # We can reproduce the bug only after an in-memory compaction
 # The cell bytes end with \x00\x02\x00\x00

 

In fact, the root cause is that the method MemStoreLABImpl.forceCopyOfBigCellInto, 
which is only invoked when a cell is bigger than the data chunk size, constructs 
the cell with the wrong length, so 4 extra bytes (the chunk header size) are 
appended to the end of the cell bytes.

  was:
In our company's 2.x cluster, I found that some region compactions keep failing 
because some cells cannot be constructed successfully. In fact, we cannot even 
read these cells.

From the following stack, we can see that the bug makes the KeyValue constructor fail.

Simple Log and Stack: 
{code:java}
// code placeholder
2021-11-18 16:50:47,708 ERROR [regionserver/:60020-longCompactions-4] 
regionserver.CompactSplit: Compaction failed 
region=xx_table,3610ff49595a0fc4a824f2a575f37a31,1570874723992.dac703ceb35e8d8703233bebf34ae49f.,
 storeName=c, priority=-319, startTime=1637225447127 
java.lang.IllegalArgumentException: Invalid tag length at position=4659867, 
tagLength=0,         
at 
org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueTagBytes(KeyValueUtil.java:685)
        at 
org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueBytes(KeyValueUtil.java:643)
        at org.apache.hadoop.hbase.KeyValue.(KeyValue.java:345)
        at 
org.apache.hadoop.hbase.SizeCachedKeyValue.(SizeCachedKeyValue.java:43)
        at 
org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.getCell(HFileReaderImpl.java:981)
        at 
org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:233)
        at 
org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:418)
        at 
org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:322)
        at 
org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:288)
        at 
org.apache.hadoop.hbase.regionserver.compactions.Compactor.createScanner(Compactor.java:487)
        at 
org.apache.hadoop.hbase.regionserver.compactions.Compactor$1.createScanner(Compactor.java:248)
   

[jira] [Assigned] (HBASE-26467) Wrong Cell Generated by MemStoreLABImpl.forceCopyOfBigCellInto when Cell size bigger than data chunk size

2021-11-18 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng reassigned HBASE-26467:
-

Assignee: zhuobin zheng

> Wrong Cell Generated by MemStoreLABImpl.forceCopyOfBigCellInto when Cell size 
> bigger than data chunk size 
> --
>
> Key: HBASE-26467
> URL: https://issues.apache.org/jira/browse/HBASE-26467
> Project: HBase
>  Issue Type: Bug
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Critical
>
> In our company's 2.x cluster, I found that some region compactions keep failing
> because some cells cannot be constructed successfully. In fact, we cannot even
> read these cells.
> From the following stack, we can see that the bug makes the KeyValue constructor fail.
> Simple Log and Stack: 
> {code:java}
> // code placeholder
> 2021-11-18 16:50:47,708 ERROR [regionserver/:60020-longCompactions-4] 
> regionserver.CompactSplit: Compaction failed 
> region=xx_table,3610ff49595a0fc4a824f2a575f37a31,1570874723992.dac703ceb35e8d8703233bebf34ae49f.,
>  storeName=c, priority=-319, startTime=1637225447127 
> java.lang.IllegalArgumentException: Invalid tag length at position=4659867, 
> tagLength=0,         
> at 
> org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueTagBytes(KeyValueUtil.java:685)
>         at 
> org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueBytes(KeyValueUtil.java:643)
>         at org.apache.hadoop.hbase.KeyValue.(KeyValue.java:345)
>         at 
> org.apache.hadoop.hbase.SizeCachedKeyValue.(SizeCachedKeyValue.java:43)
>         at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.getCell(HFileReaderImpl.java:981)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:233)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:418)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:322)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:288)
>         at 
> org.apache.hadoop.hbase.regionserver.compactions.Compactor.createScanner(Compactor.java:487)
>         at 
> org.apache.hadoop.hbase.regionserver.compactions.Compactor$1.createScanner(Compactor.java:248)
>         at 
> org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:318)
>         at 
> org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:65)
>         at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:126)
>         at 
> org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1468)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2266)
>         at 
> org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:624)
>         at 
> org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:666)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:748) {code}
> From further observation, I found the following characteristics:
>  # The cell size is more than 2 MB
>  # We can reproduce the bug only after an in-memory compaction
>  # The cell bytes end with \x00\x02\x00\x00
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (HBASE-26467) Wrong Cell Generated by MemStoreLABImpl.forceCopyOfBigCellInto when Cell size bigger than data chunk size

2021-11-18 Thread zhuobin zheng (Jira)
zhuobin zheng created HBASE-26467:
-

 Summary: Wrong Cell Generated by 
MemStoreLABImpl.forceCopyOfBigCellInto when Cell size bigger than data chunk 
size 
 Key: HBASE-26467
 URL: https://issues.apache.org/jira/browse/HBASE-26467
 Project: HBase
  Issue Type: Bug
Reporter: zhuobin zheng


In our company's 2.x cluster, I found that some region compactions keep failing 
because some cells cannot be constructed successfully. In fact, we cannot even 
read these cells.

From the following stack, we can see that the bug makes the KeyValue constructor fail.

Simple Log and Stack: 
{code:java}
// code placeholder
2021-11-18 16:50:47,708 ERROR [regionserver/:60020-longCompactions-4] 
regionserver.CompactSplit: Compaction failed 
region=xx_table,3610ff49595a0fc4a824f2a575f37a31,1570874723992.dac703ceb35e8d8703233bebf34ae49f.,
 storeName=c, priority=-319, startTime=1637225447127 
java.lang.IllegalArgumentException: Invalid tag length at position=4659867, 
tagLength=0,         
at 
org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueTagBytes(KeyValueUtil.java:685)
        at 
org.apache.hadoop.hbase.KeyValueUtil.checkKeyValueBytes(KeyValueUtil.java:643)
        at org.apache.hadoop.hbase.KeyValue.(KeyValue.java:345)
        at 
org.apache.hadoop.hbase.SizeCachedKeyValue.(SizeCachedKeyValue.java:43)
        at 
org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.getCell(HFileReaderImpl.java:981)
        at 
org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:233)
        at 
org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:418)
        at 
org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:322)
        at 
org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:288)
        at 
org.apache.hadoop.hbase.regionserver.compactions.Compactor.createScanner(Compactor.java:487)
        at 
org.apache.hadoop.hbase.regionserver.compactions.Compactor$1.createScanner(Compactor.java:248)
        at 
org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:318)
        at 
org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:65)
        at 
org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:126)
        at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1468)
        at 
org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2266)
        at 
org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:624)
        at 
org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:666)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748) {code}
From further observation, I found the following characteristics:
 # The cell size is more than 2 MB
 # We can reproduce the bug only after an in-memory compaction
 # The cell bytes end with \x00\x02\x00\x00

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HBASE-26022) DNS jitter causes hbase client to get stuck

2021-06-23 Thread zhuobin zheng (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368577#comment-17368577
 ] 

zhuobin zheng commented on HBASE-26022:
---

On the *master branch*, it seems RpcClient dynamically generates the server 
principal right before creating the saslClient every time, so this is not a 
problem there. But it does seem to be a problem on branch-1. I will try to fix it later.

 

 

> DNS jitter causes hbase client to get stuck
> ---
>
> Key: HBASE-26022
> URL: https://issues.apache.org/jira/browse/HBASE-26022
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Major
>
> In our production HBase cluster, we occasionally encounter the errors below,
> which leave HBase stuck for a long time; HBase requests to this machine then
> fail forever.
> {code:java}
> WARN org.apache.hadoop.security.UserGroupInformation: 
> PriviledgedActionException as:${user@realm} (auth:KERBEROS) 
> cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Server not 
> found in Kerberos database (7) - LOOKING_UP_SERVER)]
> WARN org.apache.hadoop.security.UserGroupInformation: 
> PriviledgedActionException as:${user@realm} (auth:KERBEROS) 
> cause:java.io.IOException: Couldn't setup connection for ${user@realm} to 
> hbase/${ip}@realm
> {code}
> The main problem is that the true server principal we generated in the KDC is
> hbase/*${hostname}*@realm, so hbase/*${ip}*@realm can never be found in the KDC.
> When RpcClientImpl#Connection is constructed, the serverPrincipal field, which
> never changes afterwards, is generated via InetAddress.getCanonicalHostName(),
> which returns the IP when it fails to resolve the hostname.
> Therefore, if DNS jitters while RpcClientImpl#Connection is being constructed,
> this connection will never set up the SASL environment, and I don't see any
> connection-abandon logic in the SASL failure code path.
> I can think of two solutions to this problem: 
>  # Abandon the connection when SASL fails, so the next request reconstructs a
> connection and regenerates the server principal.
>  # Refresh the serverPrincipal field when SASL fails, so the next retry uses a
> new server principal.
> HBase Version: 1.2.0-cdh5.14.4
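
A minimal sketch of the failure mode and of fix #2, with hypothetical helper names 
(this is not the actual RpcClientImpl code): the principal is rebuilt before every 
SASL attempt instead of being cached once when the connection object is constructed.
{code:java}
import java.net.InetAddress;

public class ServerPrincipalSketch {

  // getCanonicalHostName() falls back to the textual IP when reverse DNS fails, which
  // yields a principal like "hbase/10.0.0.1@REALM" that the KDC does not know.
  static String serverPrincipal(InetAddress serverAddr, String realm) {
    return "hbase/" + serverAddr.getCanonicalHostName() + "@" + realm;
  }

  static void connectWithSasl(InetAddress serverAddr, String realm) throws Exception {
    Exception last = null;
    for (int attempt = 0; attempt < 3; attempt++) {
      // Recompute on every attempt so one moment of DNS jitter is not cached forever.
      String principal = serverPrincipal(serverAddr, realm);
      try {
        setupSaslConnection(principal);   // placeholder for the real GSSAPI/Kerberos handshake
        return;
      } catch (Exception e) {
        last = e;
        // Alternatively (fix #1): abandon this connection so the next request builds a
        // fresh one, which regenerates the principal as well.
      }
    }
    throw last;
  }

  private static void setupSaslConnection(String principal) {
    // stand-in for the SASL negotiation against the given principal
  }
}
{code}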



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HBASE-26022) DNS jitter causes hbase client to get stuck

2021-06-22 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng reassigned HBASE-26022:
-

Assignee: zhuobin zheng

> DNS jitter causes hbase client to get stuck
> ---
>
> Key: HBASE-26022
> URL: https://issues.apache.org/jira/browse/HBASE-26022
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Major
>
> In our production HBase cluster, we occasionally encounter the errors below,
> which leave HBase stuck for a long time; HBase requests to this machine then
> fail forever.
> {code:java}
> WARN org.apache.hadoop.security.UserGroupInformation: 
> PriviledgedActionException as:${user@realm} (auth:KERBEROS) 
> cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Server not 
> found in Kerberos database (7) - LOOKING_UP_SERVER)]
> WARN org.apache.hadoop.security.UserGroupInformation: 
> PriviledgedActionException as:${user@realm} (auth:KERBEROS) 
> cause:java.io.IOException: Couldn't setup connection for ${user@realm} to 
> hbase/${ip}@realm
> {code}
> The main problem is that the true server principal we generated in the KDC is
> hbase/*${hostname}*@realm, so hbase/*${ip}*@realm can never be found in the KDC.
> When RpcClientImpl#Connection is constructed, the serverPrincipal field, which
> never changes afterwards, is generated via InetAddress.getCanonicalHostName(),
> which returns the IP when it fails to resolve the hostname.
> Therefore, if DNS jitters while RpcClientImpl#Connection is being constructed,
> this connection will never set up the SASL environment, and I don't see any
> connection-abandon logic in the SASL failure code path.
> I can think of two solutions to this problem: 
>  # Abandon the connection when SASL fails, so the next request reconstructs a
> connection and regenerates the server principal.
>  # Refresh the serverPrincipal field when SASL fails, so the next retry uses a
> new server principal.
> HBase Version: 1.2.0-cdh5.14.4



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26022) DNS jitter causes hbase client to get stuck

2021-06-22 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-26022:
--
Description: 
In our production HBase cluster, we occasionally encounter the errors below, which 
leave HBase stuck for a long time; HBase requests to this machine then fail forever.
{code:java}
WARN org.apache.hadoop.security.UserGroupInformation: 
PriviledgedActionException as:${user@realm} (auth:KERBEROS) 
cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by 
GSSException: No valid credentials provided (Mechanism level: Server not found 
in Kerberos database (7) - LOOKING_UP_SERVER)]
WARN org.apache.hadoop.security.UserGroupInformation: 
PriviledgedActionException as:${user@realm} (auth:KERBEROS) 
cause:java.io.IOException: Couldn't setup connection for ${user@realm} to 
hbase/${ip}@realm
{code}
The main problem is that the true server principal we generated in the KDC is 
hbase/*${hostname}*@realm, so hbase/*${ip}*@realm can never be found in the KDC.

When RpcClientImpl#Connection is constructed, the serverPrincipal field, which 
never changes afterwards, is generated via InetAddress.getCanonicalHostName(), 
which returns the IP when it fails to resolve the hostname.

Therefore, if DNS jitters while RpcClientImpl#Connection is being constructed, 
this connection will never set up the SASL environment, and I don't see any 
connection-abandon logic in the SASL failure code path.

I can think of two solutions to this problem: 
 # Abandon the connection when SASL fails, so the next request reconstructs a 
connection and regenerates the server principal.
 # Refresh the serverPrincipal field when SASL fails, so the next retry uses a 
new server principal.

HBase Version: 1.2.0-cdh5.14.4

  was:
In our product hbase cluster, we occasionally encounter  errors

 


> DNS jitter causes hbase client to get stuck
> ---
>
> Key: HBASE-26022
> URL: https://issues.apache.org/jira/browse/HBASE-26022
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: zhuobin zheng
>Priority: Major
>
> In our production HBase cluster, we occasionally encounter the errors below,
> which leave HBase stuck for a long time; HBase requests to this machine then
> fail forever.
> {code:java}
> WARN org.apache.hadoop.security.UserGroupInformation: 
> PriviledgedActionException as:${user@realm} (auth:KERBEROS) 
> cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Server not 
> found in Kerberos database (7) - LOOKING_UP_SERVER)]
> WARN org.apache.hadoop.security.UserGroupInformation: 
> PriviledgedActionException as:${user@realm} (auth:KERBEROS) 
> cause:java.io.IOException: Couldn't setup connection for ${user@realm} to 
> hbase/${ip}@realm
> {code}
> The main problem is that the true server principal we generated in the KDC is
> hbase/*${hostname}*@realm, so hbase/*${ip}*@realm can never be found in the KDC.
> When RpcClientImpl#Connection is constructed, the serverPrincipal field, which
> never changes afterwards, is generated via InetAddress.getCanonicalHostName(),
> which returns the IP when it fails to resolve the hostname.
> Therefore, if DNS jitters while RpcClientImpl#Connection is being constructed,
> this connection will never set up the SASL environment, and I don't see any
> connection-abandon logic in the SASL failure code path.
> I can think of two solutions to this problem: 
>  # Abandon the connection when SASL fails, so the next request reconstructs a
> connection and regenerates the server principal.
>  # Refresh the serverPrincipal field when SASL fails, so the next retry uses a
> new server principal.
> HBase Version: 1.2.0-cdh5.14.4



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26022) DNS jitter causes hbase client to get stuck

2021-06-22 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-26022:
--
Description: 
In our product hbase cluster, we occasionally encounter  errors

 

> DNS jitter causes hbase client to get stuck
> ---
>
> Key: HBASE-26022
> URL: https://issues.apache.org/jira/browse/HBASE-26022
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: zhuobin zheng
>Priority: Major
>
> In our product hbase cluster, we occasionally encounter  errors
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-26022) DNS jitter causes hbase client to get stuck

2021-06-22 Thread zhuobin zheng (Jira)
zhuobin zheng created HBASE-26022:
-

 Summary: DNS jitter causes hbase client to get stuck
 Key: HBASE-26022
 URL: https://issues.apache.org/jira/browse/HBASE-26022
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: zhuobin zheng






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-21183) loadIncrementalHFiles sometimes throws FileNotFoundException on retry

2021-03-07 Thread zhuobin zheng (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17297137#comment-17297137
 ] 

zhuobin zheng commented on HBASE-21183:
---

This may be caused by https://issues.apache.org/jira/browse/HBASE-19065: 
 # the client requests a bulkload and the server moves the file to 
/hbase/data/${namespace}/\{table}/\{region}/\{columnfamily}/
 # a concurrent flush causes the bulkload to fail
 # the bulkload client retries and fails because the file no longer exists.

> loadIncrementalHFiles sometimes throws FileNotFoundException on retry
> -
>
> Key: HBASE-21183
> URL: https://issues.apache.org/jira/browse/HBASE-21183
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Tim Robertson
>Priority: Major
>
> On a nightly batch job which prepares 100s of well balanced HFiles at around 
> 2GB each, we see sporadic failures in a bulk load. 
> I'm unable to paste the logs here (different network) but they show e.g. the 
> following on a failing day:
> {code:java}
> Trying to load hfile... /my/input/path/...
> Attempt to bulk load region containing ... failed. This is recoverable and 
> will be retried
> Attempt to bulk load region containing ... failed. This is recoverable and 
> will be retried
> Attempt to bulk load region containing ... failed. This is recoverable and 
> will be retried
> Split occurred while grouping HFiles, retry attempt 1 with 3 files remaining 
> to group or split
> Trying to load hfile...
> IOException during splitting
> java.io.FileNotFoundException: File does not exist: /my/input/path/...
> {code}
> The exception gets thrown from [this 
> line|https://github.com/apache/hbase/blob/branch-1.2/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java#L685].
>   
>  I should note that this is a secure cluster (CDH 5.12.x).
> I've tried to go through the code, and don't spot an obvious race condition. 
> I don't spot any changes related to this for the later 1.x versions so 
> presume this exists in 1.5.
> I'm yet to get access to the NameNode audit logs when this occurs to trace 
> through the rename() calls around these particular files.
> I don't see timeouts like HBASE-4030



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25512) May throw StringIndexOutOfBoundsException when construct illegal tablename error

2021-01-17 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-25512:
--
Fix Version/s: 2.5.0
   1.7.0
   3.0.0-alpha-1

> May throw StringIndexOutOfBoundsException when construct illegal tablename 
> error
> 
>
> Key: HBASE-25512
> URL: https://issues.apache.org/jira/browse/HBASE-25512
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.12, 1.4.13, 2.4.1
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Trivial
> Fix For: 3.0.0-alpha-1, 1.7.0, 2.5.0
>
>
> When calling the method:
> {code:java}
> // code placeholder
> TableName.isLegalTableQualifierName(final byte[] qualifierName, int start, 
> int end)
> {code}
> we want to construct the qualifier String to print a pretty error message, so we
> call the method: 
> {code:java}
> // code placeholder
> Bytes.toString(final byte[] b, int off, int len)
> Bytes.toString(qualifierName, start, end)
> {code}
> But the parameter is wrong: we should pass the *${length}* instead of the
> *${end index}* as the *third parameter of Bytes.toString*.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HBASE-25512) May throw StringIndexOutOfBoundsException when construct illegal tablename error

2021-01-17 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng reassigned HBASE-25512:
-

Assignee: zhuobin zheng

> May throw StringIndexOutOfBoundsException when construct illegal tablename 
> error
> 
>
> Key: HBASE-25512
> URL: https://issues.apache.org/jira/browse/HBASE-25512
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.12, 1.4.13, 2.4.1
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Trivial
>
> When calling the method:
> {code:java}
> // code placeholder
> TableName.isLegalTableQualifierName(final byte[] qualifierName, int start, 
> int end)
> {code}
> we want to construct the qualifier String to print a pretty error message, so we
> call the method: 
> {code:java}
> // code placeholder
> Bytes.toString(final byte[] b, int off, int len)
> Bytes.toString(qualifierName, start, end)
> {code}
> But the parameter is wrong: we should pass the *${length}* instead of the
> *${end index}* as the *third parameter of Bytes.toString*.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens

2021-01-17 Thread zhuobin zheng (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17266971#comment-17266971
 ] 

zhuobin zheng commented on HBASE-25510:
---

Thanks for taking a look, [~vjasani].
1. I used JMH to benchmark, and the benchmark code is provided in the 
attachments ([^TestTableNameJMH.java]).

2. The errors shown in the results were a bug in my benchmark code (an int 
overflow, so a negative index went out of the array bounds).

I fixed the bug and re-uploaded the benchmark result: [^optimiz_benchmark_fix].

 

> Optimize TableName.valueOf from O(n) to O(1).  We can get benefits when the 
> number of tables in the cluster is greater than dozens
> --
>
> Key: HBASE-25510
> URL: https://issues.apache.org/jira/browse/HBASE-25510
> Project: HBase
>  Issue Type: Improvement
>  Components: master, Replication
>Affects Versions: 1.2.12, 1.4.13, 2.4.1
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Major
> Fix For: 3.0.0-alpha-1, 1.7.0, 2.5.0
>
> Attachments: TestTableNameJMH.java, optimiz_benchmark, 
> optimiz_benchmark_fix, origin_benchmark, stucks-profile-info
>
>
> Now, TableName.valueOf tries to find the TableName object in the cache
> linearly (code shown below), so it is too slow when we have thousands of
> tables on the cluster.
> {code:java}
> // code placeholder
> for (TableName tn : tableCache) {
>   if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), 
> bns)) {
> return tn;
>   }
> }{code}
> I tried storing the objects in a hash table instead, so the lookup is much
> faster; the code looks like this:
> {code:java}
> // code placeholder
> TableName oldTable = tableCache.get(nameAsStr);{code}
>  
> In our cluster, which has tens of thousands of tables (most of them KYLIN
> tables), we found that in the following two cases the TableName.valueOf
> method severely restricts performance.
>   
>  Common premise: tens of thousands of tables in the cluster
>  Cause: TableName.valueOf is slow (because we need to traverse the whole
> cache linearly)
>   
>  Case 1: Replication
>  Premise 1: one table is written with high QPS, small values, and non-batch
> requests, producing a great many WAL entries
> Premise 2: deserializing a WAL entry includes calling the TableName.valueOf method.
> Effect: replication gets stuck and a lot of WAL files pile up.
>  
> Case 2: Active master startup
> NamespaceStateManager init has to initialize every RegionInfo, and RegionInfo
> init calls TableName.valueOf, so startup takes noticeably longer when
> TableName.valueOf is slow.
>   
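
A minimal sketch of the O(1) lookup idea (a hypothetical class, not the actual 
TableName code): cache instances in a concurrent hash map keyed by the full name 
instead of scanning the cache linearly.
{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public final class TableNameCacheSketch {
  private static final ConcurrentMap<String, TableNameCacheSketch> CACHE = new ConcurrentHashMap<>();

  private final String nameAsString;

  private TableNameCacheSketch(String nameAsString) {
    this.nameAsString = nameAsString;
  }

  /** O(1) average-case lookup instead of traversing every cached table name. */
  public static TableNameCacheSketch valueOf(String namespace, String qualifier) {
    String nameAsStr = namespace + ":" + qualifier;
    return CACHE.computeIfAbsent(nameAsStr, TableNameCacheSketch::new);
  }

  @Override
  public String toString() {
    return nameAsString;
  }
}
{code}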



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens

2021-01-17 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-25510:
--
Attachment: (was: TestTableNameJMH.java)

> Optimize TableName.valueOf from O(n) to O(1).  We can get benefits when the 
> number of tables in the cluster is greater than dozens
> --
>
> Key: HBASE-25510
> URL: https://issues.apache.org/jira/browse/HBASE-25510
> Project: HBase
>  Issue Type: Improvement
>  Components: master, Replication
>Affects Versions: 1.2.12, 1.4.13, 2.4.1
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Major
> Fix For: 3.0.0-alpha-1, 1.7.0, 2.5.0
>
> Attachments: TestTableNameJMH.java, optimiz_benchmark, 
> optimiz_benchmark_fix, origin_benchmark, stucks-profile-info
>
>
> Now, TableName.valueOf tries to find the TableName object in the cache
> linearly (code shown below), so it is too slow when we have thousands of
> tables on the cluster.
> {code:java}
> // code placeholder
> for (TableName tn : tableCache) {
>   if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), 
> bns)) {
> return tn;
>   }
> }{code}
> I tried storing the objects in a hash table instead, so the lookup is much
> faster; the code looks like this:
> {code:java}
> // code placeholder
> TableName oldTable = tableCache.get(nameAsStr);{code}
>  
> In our cluster, which has tens of thousands of tables (most of them KYLIN
> tables), we found that in the following two cases the TableName.valueOf
> method severely restricts performance.
>   
>  Common premise: tens of thousands of tables in the cluster
>  Cause: TableName.valueOf is slow (because we need to traverse the whole
> cache linearly)
>   
>  Case 1: Replication
>  Premise 1: one table is written with high QPS, small values, and non-batch
> requests, producing a great many WAL entries
> Premise 2: deserializing a WAL entry includes calling the TableName.valueOf method.
> Effect: replication gets stuck and a lot of WAL files pile up.
>  
> Case 2: Active master startup
> NamespaceStateManager init has to initialize every RegionInfo, and RegionInfo
> init calls TableName.valueOf, so startup takes noticeably longer when
> TableName.valueOf is slow.
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens

2021-01-17 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-25510:
--
Attachment: TestTableNameJMH.java

> Optimize TableName.valueOf from O(n) to O(1).  We can get benefits when the 
> number of tables in the cluster is greater than dozens
> --
>
> Key: HBASE-25510
> URL: https://issues.apache.org/jira/browse/HBASE-25510
> Project: HBase
>  Issue Type: Improvement
>  Components: master, Replication
>Affects Versions: 1.2.12, 1.4.13, 2.4.1
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Major
> Fix For: 3.0.0-alpha-1, 1.7.0, 2.5.0
>
> Attachments: TestTableNameJMH.java, optimiz_benchmark, 
> optimiz_benchmark_fix, origin_benchmark, stucks-profile-info
>
>
> Now, TableName.valueOf tries to find the TableName object in the cache
> linearly (code shown below), so it is too slow when we have thousands of
> tables on the cluster.
> {code:java}
> // code placeholder
> for (TableName tn : tableCache) {
>   if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), 
> bns)) {
> return tn;
>   }
> }{code}
> I tried storing the objects in a hash table instead, so the lookup is much
> faster; the code looks like this:
> {code:java}
> // code placeholder
> TableName oldTable = tableCache.get(nameAsStr);{code}
>  
> In our cluster, which has tens of thousands of tables (most of them KYLIN
> tables), we found that in the following two cases the TableName.valueOf
> method severely restricts performance.
>   
>  Common premise: tens of thousands of tables in the cluster
>  Cause: TableName.valueOf is slow (because we need to traverse the whole
> cache linearly)
>   
>  Case 1: Replication
>  Premise 1: one table is written with high QPS, small values, and non-batch
> requests, producing a great many WAL entries
> Premise 2: deserializing a WAL entry includes calling the TableName.valueOf method.
> Effect: replication gets stuck and a lot of WAL files pile up.
>  
> Case 2: Active master startup
> NamespaceStateManager init has to initialize every RegionInfo, and RegionInfo
> init calls TableName.valueOf, so startup takes noticeably longer when
> TableName.valueOf is slow.
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens

2021-01-17 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-25510:
--
Attachment: optimiz_benchmark_fix

> Optimize TableName.valueOf from O(n) to O(1).  We can get benefits when the 
> number of tables in the cluster is greater than dozens
> --
>
> Key: HBASE-25510
> URL: https://issues.apache.org/jira/browse/HBASE-25510
> Project: HBase
>  Issue Type: Improvement
>  Components: master, Replication
>Affects Versions: 1.2.12, 1.4.13, 2.4.1
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Major
> Fix For: 3.0.0-alpha-1, 1.7.0, 2.5.0
>
> Attachments: TestTableNameJMH.java, optimiz_benchmark, 
> optimiz_benchmark_fix, origin_benchmark, stucks-profile-info
>
>
> Now, TableName.valueOf tries to find the TableName object in the cache
> linearly (code shown below), so it is too slow when we have thousands of
> tables on the cluster.
> {code:java}
> // code placeholder
> for (TableName tn : tableCache) {
>   if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), 
> bns)) {
> return tn;
>   }
> }{code}
> I tried storing the objects in a hash table instead, so the lookup is much
> faster; the code looks like this:
> {code:java}
> // code placeholder
> TableName oldTable = tableCache.get(nameAsStr);{code}
>  
> In our cluster, which has tens of thousands of tables (most of them KYLIN
> tables), we found that in the following two cases the TableName.valueOf
> method severely restricts performance.
>   
>  Common premise: tens of thousands of tables in the cluster
>  Cause: TableName.valueOf is slow (because we need to traverse the whole
> cache linearly)
>   
>  Case 1: Replication
>  Premise 1: one table is written with high QPS, small values, and non-batch
> requests, producing a great many WAL entries
> Premise 2: deserializing a WAL entry includes calling the TableName.valueOf method.
> Effect: replication gets stuck and a lot of WAL files pile up.
>  
> Case 2: Active master startup
> NamespaceStateManager init has to initialize every RegionInfo, and RegionInfo
> init calls TableName.valueOf, so startup takes noticeably longer when
> TableName.valueOf is slow.
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens

2021-01-17 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-25510:
--
Attachment: TestTableNameJMH.java

> Optimize TableName.valueOf from O(n) to O(1).  We can get benefits when the 
> number of tables in the cluster is greater than dozens
> --
>
> Key: HBASE-25510
> URL: https://issues.apache.org/jira/browse/HBASE-25510
> Project: HBase
>  Issue Type: Improvement
>  Components: master, Replication
>Affects Versions: 1.2.12, 1.4.13, 2.4.1
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Major
> Fix For: 3.0.0-alpha-1, 1.7.0, 2.5.0
>
> Attachments: TestTableNameJMH.java, optimiz_benchmark, 
> origin_benchmark, stucks-profile-info
>
>
> Now, TableName.valueOf tries to find the TableName object in the cache
> linearly (code shown below), so it is too slow when we have thousands of
> tables on the cluster.
> {code:java}
> // code placeholder
> for (TableName tn : tableCache) {
>   if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), 
> bns)) {
> return tn;
>   }
> }{code}
> I tried storing the objects in a hash table instead, so the lookup is much
> faster; the code looks like this:
> {code:java}
> // code placeholder
> TableName oldTable = tableCache.get(nameAsStr);{code}
>  
> In our cluster, which has tens of thousands of tables (most of them KYLIN
> tables), we found that in the following two cases the TableName.valueOf
> method severely restricts performance.
>   
>  Common premise: tens of thousands of tables in the cluster
>  Cause: TableName.valueOf is slow (because we need to traverse the whole
> cache linearly)
>   
>  Case 1: Replication
>  Premise 1: one table is written with high QPS, small values, and non-batch
> requests, producing a great many WAL entries
> Premise 2: deserializing a WAL entry includes calling the TableName.valueOf method.
> Effect: replication gets stuck and a lot of WAL files pile up.
>  
> Case 2: Active master startup
> NamespaceStateManager init has to initialize every RegionInfo, and RegionInfo
> init calls TableName.valueOf, so startup takes noticeably longer when
> TableName.valueOf is slow.
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25512) May throw StringIndexOutOfBoundsException when construct illegal tablename error

2021-01-15 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-25512:
--
Priority: Trivial  (was: Minor)

> May throw StringIndexOutOfBoundsException when construct illegal tablename 
> error
> 
>
> Key: HBASE-25512
> URL: https://issues.apache.org/jira/browse/HBASE-25512
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.12, 1.4.13, 2.4.1
>Reporter: zhuobin zheng
>Priority: Trivial
>
> When calling the method
> {code:java}
> TableName.isLegalTableQualifierName(final byte[] qualifierName, int start, int end)
> {code}
> we want to build the qualifier String so that we can print a pretty error message, so we call
> {code:java}
> Bytes.toString(final byte[] b, int off, int len)
> Bytes.toString(qualifierName, start, end)
> {code}
> But the parameters are wrong: we should pass the *length* instead of the *end index* as the third argument to *Bytes.toString*.
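As a hedged illustration of the fix implied above (the byte array, offsets, and class name here are made up for the example), the third argument to Bytes.toString is a length, so it should be end - start rather than end:

{code:java}
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.hbase.util.Bytes;

public class QualifierErrorMessageExample {
  public static void main(String[] args) {
    byte[] qualifierName = "ns:badname".getBytes(StandardCharsets.UTF_8);
    int start = 3;                   // start of the qualifier part
    int end = qualifierName.length;  // exclusive end index

    // Buggy form: passes the end index where a length is expected, so the call tries to
    // read past the end of the array and can throw StringIndexOutOfBoundsException.
    // String bad = Bytes.toString(qualifierName, start, end);

    // Corrected form: the third argument is the number of bytes to convert.
    String good = Bytes.toString(qualifierName, start, end - start);
    System.out.println(good); // prints "badname"
  }
}
{code}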



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25512) May throw StringIndexOutOfBoundsException when construct illegal tablename error

2021-01-14 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-25512:
--
Priority: Minor  (was: Trivial)

> May throw StringIndexOutOfBoundsException when construct illegal tablename 
> error
> 
>
> Key: HBASE-25512
> URL: https://issues.apache.org/jira/browse/HBASE-25512
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.12, 1.4.13, 2.4.1
>Reporter: zhuobin zheng
>Priority: Minor
>
> When calling the method
> {code:java}
> TableName.isLegalTableQualifierName(final byte[] qualifierName, int start, int end)
> {code}
> we want to build the qualifier String so that we can print a pretty error message, so we call
> {code:java}
> Bytes.toString(final byte[] b, int off, int len)
> Bytes.toString(qualifierName, start, end)
> {code}
> But the parameters are wrong: we should pass the *length* instead of the *end index* as the third argument to *Bytes.toString*.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25512) May throw StringIndexOutOfBoundsException when construct illegal tablename error

2021-01-14 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-25512:
--
External issue URL: https://github.com/apache/hbase/pull/2884

> May throw StringIndexOutOfBoundsException when construct illegal tablename 
> error
> 
>
> Key: HBASE-25512
> URL: https://issues.apache.org/jira/browse/HBASE-25512
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.12, 1.4.13, 2.4.1
>Reporter: zhuobin zheng
>Priority: Trivial
>
> When calling the method
> {code:java}
> TableName.isLegalTableQualifierName(final byte[] qualifierName, int start, int end)
> {code}
> we want to build the qualifier String so that we can print a pretty error message, so we call
> {code:java}
> Bytes.toString(final byte[] b, int off, int len)
> Bytes.toString(qualifierName, start, end)
> {code}
> But the parameters are wrong: we should pass the *length* instead of the *end index* as the third argument to *Bytes.toString*.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-25512) May throw StringIndexOutOfBoundsException when construct illegal tablename error

2021-01-14 Thread zhuobin zheng (Jira)
zhuobin zheng created HBASE-25512:
-

 Summary: May throw StringIndexOutOfBoundsException when construct 
illegal tablename error
 Key: HBASE-25512
 URL: https://issues.apache.org/jira/browse/HBASE-25512
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.4.1, 1.4.13, 1.2.12
Reporter: zhuobin zheng


When calling the method
{code:java}
TableName.isLegalTableQualifierName(final byte[] qualifierName, int start, int end)
{code}
we want to build the qualifier String so that we can print a pretty error message, so we call
{code:java}
Bytes.toString(final byte[] b, int off, int len)
Bytes.toString(qualifierName, start, end)
{code}
But the parameters are wrong: we should pass the *length* instead of the *end index* as the third argument to *Bytes.toString*.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens

2021-01-14 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-25510:
--
Attachment: stucks-profile-info

> Optimize TableName.valueOf from O(n) to O(1).  We can get benefits when the 
> number of tables in the cluster is greater than dozens
> --
>
> Key: HBASE-25510
> URL: https://issues.apache.org/jira/browse/HBASE-25510
> Project: HBase
>  Issue Type: Improvement
>  Components: master, Replication
>Affects Versions: 1.2.12, 1.4.13, 2.4.1
>Reporter: zhuobin zheng
>Priority: Major
> Attachments: optimiz_benchmark, origin_benchmark, stucks-profile-info
>
>
> Currently, TableName.valueOf tries to find the TableName object in the cache by scanning it linearly (code shown below), so it is too slow when there are thousands of tables in the cluster.
> {code:java}
> for (TableName tn : tableCache) {
>   if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), bns)) {
>     return tn;
>   }
> }{code}
> I propose storing the objects in a hash table instead so that lookups are fast; the code looks like this:
> {code:java}
> TableName oldTable = tableCache.get(nameAsStr);{code}
>
> Our cluster has tens of thousands of tables (most of them are KYLIN tables).
> We found that in the following two cases the TableName.valueOf method severely restricts performance.
>
> Common premise: tens of thousands of tables in the cluster.
> Cause: TableName.valueOf performs poorly, because the whole cache is traversed linearly.
>
> Case 1. Replication
> Premise 1: one table is written with high QPS, small values, and non-batched requests, which produces a very large number of WAL entries.
> Premise 2: deserializing a WAL entry involves calling TableName.valueOf.
> Result: replication gets stuck and a lot of WAL files pile up.
>
> Case 2. Active master startup
> NamespaceStateManager initialization has to initialize every RegionInfo, and RegionInfo initialization calls TableName.valueOf, so it takes considerable time when TableName.valueOf is slow.
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens

2021-01-14 Thread zhuobin zheng (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17265688#comment-17265688
 ] 

zhuobin zheng commented on HBASE-25510:
---

Added the PR link and the benchmark attachments.

Benchmark result explanation:
 # testStr calls TableName.valueOf(String name)
 # testBB calls TableName.valueOf(ByteBuffer namespace, ByteBuffer qualifier)
 # The number after testStr and testBB is the number of distinct TableNames; e.g. 1000 means 1000 different table names.

Origin:

 
{code:java}
Benchmark                        Mode  Cnt       Score       Error   Units
TestTableNameJMH.testBB1        thrpt   10   36132.014 ±  1628.381  ops/ms
TestTableNameJMH.testBB10       thrpt   10   14056.243 ±   638.379  ops/ms
TestTableNameJMH.testBB100      thrpt   10    2215.671 ±    49.759  ops/ms
TestTableNameJMH.testBB1000     thrpt   10     224.802 ±     4.253  ops/ms
TestTableNameJMH.testBB10000    thrpt   10      22.476 ±     4.729  ops/ms
TestTableNameJMH.testBB100000   thrpt   10       1.931 ±     0.578  ops/ms
TestTableNameJMH.testStr1       thrpt   10  147905.572 ± 20777.681  ops/ms
TestTableNameJMH.testStr10      thrpt   10   44597.261 ±  6346.679  ops/ms
TestTableNameJMH.testStr100     thrpt   10    5464.205 ±  1442.556  ops/ms
TestTableNameJMH.testStr1000    thrpt   10     360.183 ±   127.615  ops/ms
TestTableNameJMH.testStr10000   thrpt   10      45.338 ±     3.545  ops/ms
TestTableNameJMH.testStr100000  thrpt   10       1.927 ±     0.831  ops/ms


{code}
After Optimize:
{code:java}
Benchmark                        Mode  Cnt       Score       Error   Units
TestTableNameJMH.testBB1        thrpt   10   21585.408 ±  2519.495  ops/ms
TestTableNameJMH.testBB10       thrpt   10   23474.278 ±   175.576  ops/ms
TestTableNameJMH.testBB100      thrpt   10   20600.624 ±  4035.725  ops/ms
TestTableNameJMH.testBB1000     thrpt   10   18349.054 ±   313.875  ops/ms
TestTableNameJMH.testBB10000    thrpt   10   15981.688 ±   836.096  ops/ms
TestTableNameJMH.testBB100000   thrpt   10   14276.288 ±   201.779  ops/ms
TestTableNameJMH.testStr1       thrpt   10  239837.152 ± 10767.013  ops/ms
TestTableNameJMH.testStr10      thrpt    4  236578.812 ± 57640.770  ops/ms
TestTableNameJMH.testStr100     thrpt    5  227980.174 ± 44822.292  ops/ms
TestTableNameJMH.testStr1000    thrpt   10  131935.073 ±  4495.644  ops/ms
TestTableNameJMH.testStr10000   thrpt   10   81979.448 ±  3230.575  ops/ms
TestTableNameJMH.testStr100000  thrpt   10   61054.516 ± 10613.181  ops/ms
{code}
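Since the TestTableNameJMH.java attachment itself is not inlined here, the following is only a rough sketch of what a JMH benchmark along these lines might look like; the class name, the @Param values, and the round-robin setup are assumptions and may differ from the actual attachment.

{code:java}
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.hbase.TableName;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

// Rough sketch only; the real TestTableNameJMH.java attachment may be structured differently.
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
public class TableNameLookupSketch {

  // Assumed knob: number of distinct table names already sitting in the cache.
  @Param({"1", "1000", "100000"})
  public int tableCount;

  private String[] names;
  private int next;

  @Setup
  public void setUp() {
    names = new String[tableCount];
    for (int i = 0; i < tableCount; i++) {
      names[i] = "ns:table_" + i;
      TableName.valueOf(names[i]); // pre-populate the cache so the benchmark measures lookups
    }
  }

  @Benchmark
  public TableName testStr() {
    // Round-robin over the pre-created names, exercising TableName.valueOf(String).
    next = (next + 1) % names.length;
    return TableName.valueOf(names[next]);
  }
}
{code}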
 

> Optimize TableName.valueOf from O(n) to O(1).  We can get benefits when the 
> number of tables in the cluster is greater than dozens
> --
>
> Key: HBASE-25510
> URL: https://issues.apache.org/jira/browse/HBASE-25510
> Project: HBase
>  Issue Type: Improvement
>  Components: master, Replication
>Affects Versions: 1.2.12, 1.4.13, 2.4.1
>Reporter: zhuobin zheng
>Priority: Major
> Attachments: optimiz_benchmark, origin_benchmark
>
>
> Currently, TableName.valueOf tries to find the TableName object in the cache by scanning it linearly (code shown below), so it is too slow when there are thousands of tables in the cluster.
> {code:java}
> for (TableName tn : tableCache) {
>   if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), bns)) {
>     return tn;
>   }
> }{code}
> I propose storing the objects in a hash table instead so that lookups are fast; the code looks like this:
> {code:java}
> TableName oldTable = tableCache.get(nameAsStr);{code}
>
> Our cluster has tens of thousands of tables (most of them are KYLIN tables).
> We found that in the following two cases the TableName.valueOf method severely restricts performance.
>
> Common premise: tens of thousands of tables in the cluster.
> Cause: TableName.valueOf performs poorly, because the whole cache is traversed linearly.
>
> Case 1. Replication
> Premise 1: one table is written with high QPS, small values, and non-batched requests, which produces a very large number of WAL entries.
> Premise 2: deserializing a WAL entry involves calling TableName.valueOf.
> Result: replication gets stuck and a lot of WAL files pile up.
>
> Case 2. Active master startup
> NamespaceStateManager initialization has to initialize every RegionInfo, and RegionInfo initialization calls TableName.valueOf, so it takes considerable time when TableName.valueOf is slow.
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens

2021-01-14 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-25510:
--
Attachment: optimiz_benchmark
origin_benchmark

> Optimize TableName.valueOf from O(n) to O(1).  We can get benefits when the 
> number of tables in the cluster is greater than dozens
> --
>
> Key: HBASE-25510
> URL: https://issues.apache.org/jira/browse/HBASE-25510
> Project: HBase
>  Issue Type: Improvement
>  Components: master, Replication
>Affects Versions: 1.2.12, 1.4.13, 2.4.1
>Reporter: zhuobin zheng
>Priority: Major
> Attachments: optimiz_benchmark, origin_benchmark
>
>
> Currently, TableName.valueOf tries to find the TableName object in the cache by scanning it linearly (code shown below), so it is too slow when there are thousands of tables in the cluster.
> {code:java}
> for (TableName tn : tableCache) {
>   if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), bns)) {
>     return tn;
>   }
> }{code}
> I propose storing the objects in a hash table instead so that lookups are fast; the code looks like this:
> {code:java}
> TableName oldTable = tableCache.get(nameAsStr);{code}
>
> Our cluster has tens of thousands of tables (most of them are KYLIN tables).
> We found that in the following two cases the TableName.valueOf method severely restricts performance.
>
> Common premise: tens of thousands of tables in the cluster.
> Cause: TableName.valueOf performs poorly, because the whole cache is traversed linearly.
>
> Case 1. Replication
> Premise 1: one table is written with high QPS, small values, and non-batched requests, which produces a very large number of WAL entries.
> Premise 2: deserializing a WAL entry involves calling TableName.valueOf.
> Result: replication gets stuck and a lot of WAL files pile up.
>
> Case 2. Active master startup
> NamespaceStateManager initialization has to initialize every RegionInfo, and RegionInfo initialization calls TableName.valueOf, so it takes considerable time when TableName.valueOf is slow.
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens

2021-01-14 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-25510:
--
External issue URL: https://github.com/apache/hbase/pull/2885

> Optimize TableName.valueOf from O(n) to O(1).  We can get benefits when the 
> number of tables in the cluster is greater than dozens
> --
>
> Key: HBASE-25510
> URL: https://issues.apache.org/jira/browse/HBASE-25510
> Project: HBase
>  Issue Type: Improvement
>  Components: master, Replication
>Affects Versions: 1.2.12, 1.4.13, 2.4.1
>Reporter: zhuobin zheng
>Priority: Major
>
> Currently, TableName.valueOf tries to find the TableName object in the cache by scanning it linearly (code shown below), so it is too slow when there are thousands of tables in the cluster.
> {code:java}
> for (TableName tn : tableCache) {
>   if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), bns)) {
>     return tn;
>   }
> }{code}
> I propose storing the objects in a hash table instead so that lookups are fast; the code looks like this:
> {code:java}
> TableName oldTable = tableCache.get(nameAsStr);{code}
>
> Our cluster has tens of thousands of tables (most of them are KYLIN tables).
> We found that in the following two cases the TableName.valueOf method severely restricts performance.
>
> Common premise: tens of thousands of tables in the cluster.
> Cause: TableName.valueOf performs poorly, because the whole cache is traversed linearly.
>
> Case 1. Replication
> Premise 1: one table is written with high QPS, small values, and non-batched requests, which produces a very large number of WAL entries.
> Premise 2: deserializing a WAL entry involves calling TableName.valueOf.
> Result: replication gets stuck and a lot of WAL files pile up.
>
> Case 2. Active master startup
> NamespaceStateManager initialization has to initialize every RegionInfo, and RegionInfo initialization calls TableName.valueOf, so it takes considerable time when TableName.valueOf is slow.
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens

2021-01-14 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-25510:
--
Description: 
Currently, TableName.valueOf tries to find the TableName object in the cache by scanning it linearly (code shown below), so it is too slow when there are thousands of tables in the cluster.
{code:java}
for (TableName tn : tableCache) {
  if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), bns)) {
    return tn;
  }
}{code}
I propose storing the objects in a hash table instead so that lookups are fast; the code looks like this:
{code:java}
TableName oldTable = tableCache.get(nameAsStr);{code}

Our cluster has tens of thousands of tables (most of them are KYLIN tables).
We found that in the following two cases the TableName.valueOf method severely restricts performance.

Common premise: tens of thousands of tables in the cluster.
Cause: TableName.valueOf performs poorly, because the whole cache is traversed linearly.

Case 1. Replication
Premise 1: one table is written with high QPS, small values, and non-batched requests, which produces a very large number of WAL entries.
Premise 2: deserializing a WAL entry involves calling TableName.valueOf.
Result: replication gets stuck and a lot of WAL files pile up.

Case 2. Active master startup
NamespaceStateManager initialization has to initialize every RegionInfo, and RegionInfo initialization calls TableName.valueOf, so it takes considerable time when TableName.valueOf is slow.
  

  was:
There are tens of thousands of tables on our cluster (most of them are KYLIN tables).
We found that in the following two cases the TableName.valueOf method severely restricts performance.

Common premise: tens of thousands of tables in the cluster.
Cause: TableName.valueOf performs poorly, because the whole cache is traversed linearly.

Case 1. Replication
Premise: one table is written with high QPS, small values, and non-batched requests.
Cause: there are too many entries in the WAL, so we have to deserialize a huge number of WAL entries, each of which calls TableName.valueOf to instantiate the TableName object.

Case 2. Active master startup
 
 


> Optimize TableName.valueOf from O(n) to O(1).  We can get benefits when the 
> number of tables in the cluster is greater than dozens
> --
>
> Key: HBASE-25510
> URL: https://issues.apache.org/jira/browse/HBASE-25510
> Project: HBase
>  Issue Type: Improvement
>  Components: master, Replication
>Affects Versions: 1.2.12, 1.4.13, 2.4.1
>Reporter: zhuobin zheng
>Priority: Major
>
> Currently, TableName.valueOf tries to find the TableName object in the cache by scanning it linearly (code shown below), so it is too slow when there are thousands of tables in the cluster.
> {code:java}
> for (TableName tn : tableCache) {
>   if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), bns)) {
>     return tn;
>   }
> }{code}
> I propose storing the objects in a hash table instead so that lookups are fast; the code looks like this:
> {code:java}
> TableName oldTable = tableCache.get(nameAsStr);{code}
>
> Our cluster has tens of thousands of tables (most of them are KYLIN tables).
> We found that in the following two cases the TableName.valueOf method severely restricts performance.
>
> Common premise: tens of thousands of tables in the cluster.
> Cause: TableName.valueOf performs poorly, because the whole cache is traversed linearly.
>
> Case 1. Replication
> Premise 1: one table is written with high QPS, small values, and non-batched requests, which produces a very large number of WAL entries.
> Premise 2: deserializing a WAL entry involves calling TableName.valueOf.
> Result: replication gets stuck and a lot of WAL files pile up.
>
> Case 2. Active master startup
> NamespaceStateManager initialization has to initialize every RegionInfo, and RegionInfo initialization calls TableName.valueOf, so it takes considerable time when TableName.valueOf is slow.
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-25510) Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens

2021-01-14 Thread zhuobin zheng (Jira)
zhuobin zheng created HBASE-25510:
-

 Summary: Optimize TableName.valueOf from O(n) to O(1).  We can get 
benefits when the number of tables in the cluster is greater than dozens
 Key: HBASE-25510
 URL: https://issues.apache.org/jira/browse/HBASE-25510
 Project: HBase
  Issue Type: Improvement
  Components: master, Replication
Affects Versions: 2.4.1, 1.4.13, 1.2.12
Reporter: zhuobin zheng


There are tens of thousands of tables on our cluster (most of them are KYLIN tables).
We found that in the following two cases the TableName.valueOf method severely restricts performance.

Common premise: tens of thousands of tables in the cluster.
Cause: TableName.valueOf performs poorly, because the whole cache is traversed linearly.

Case 1. Replication
Premise: one table is written with high QPS, small values, and non-batched requests.
Cause: there are too many entries in the WAL, so we have to deserialize a huge number of WAL entries, each of which calls TableName.valueOf to instantiate the TableName object.

Case 2. Active master startup
 
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-20673) Reduce the number of Cell implementations; the profusion is distracting to users and JIT

2019-12-27 Thread zhuobin zheng (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-20673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17004385#comment-17004385
 ] 

zhuobin zheng commented on HBASE-20673:
---

I can't add an attachment, so I'm adding some info in this comment.

version: cdh5-1.2.0_5.14.4

 

-
Profiler Info:
 ns percent samples top
 -- --- --- ---
219079399723 16.81% 21956 itable stub
187885376151 14.42% 18830 itable stub
168718141522 12.95% 16909 itable stub
149243475899 11.45% 14957 itable stub
108522505239 8.33% 10876 itable stub
 54090604368 4.15% 5421 itable stub
 45659351417 3.50% 4576 org.apache.hadoop.hbase.CellComparator.compareRows_[j]
 41398429360 3.18% 4149 itable stub
 32259010132 2.48% 3233 itable stub
 30253160585 2.32% 3032 itable stub
 10897498052 0.84% 1092 HeapRegion::block_size(HeapWord const*) const
 10467104761 0.80% 1049 org.apache.hadoop.hbase.KeyValue.getFamilyLength_[j]
 10176340086 0.78% 1020 G1ParScanThreadState::trim_queue()
 9867105912 0.76% 989 G1ParScanThreadState::copy_to_survivor_space(InCSetState, 
oopDesc*, markOopDesc*)
 9768378849 0.75% 979 itable stub

 

-
Jstack Info:

"RpcServer.RW.fifo.Q.write.handler=73,queue=1,port=60020" #133 daemon prio=5 
os_prio=0 tid=0x7faad0097800 nid=0x403d runnable [0x7faaceeec000]
 java.lang.Thread.State: RUNNABLE
 at org.apache.hadoop.hbase.CellComparator.compareRows(CellComparator.java:186)
 at org.apache.hadoop.hbase.CellComparator.compare(CellComparator.java:63)
 at org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:2020)
 at org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:1897)
 at 
java.util.concurrent.ConcurrentSkipListMap.cpr(ConcurrentSkipListMap.java:655)
 at 
java.util.concurrent.ConcurrentSkipListMap.doPut(ConcurrentSkipListMap.java:899)
 at 
java.util.concurrent.ConcurrentSkipListMap.put(ConcurrentSkipListMap.java:1581)
 at 
org.apache.hadoop.hbase.regionserver.CellSkipListSet.add(CellSkipListSet.java:134)
 at 
org.apache.hadoop.hbase.regionserver.DefaultMemStore.addToCellSet(DefaultMemStore.java:242)
 at 
org.apache.hadoop.hbase.regionserver.DefaultMemStore.internalAdd(DefaultMemStore.java:276)
 at 
org.apache.hadoop.hbase.regionserver.DefaultMemStore.add(DefaultMemStore.java:233)
 at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:686)
 at 
org.apache.hadoop.hbase.regionserver.HRegion.applyFamilyMapToMemstore(HRegion.java:3807)
 at 
org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3280)
 at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2944)
 at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2886)
 at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:765)
 at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:716)
 at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2146)
 at 
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33656)
 at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2191)
 at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
 at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:183)
 at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:163)


"RpcServer.RW.fifo.Q.write.handler=71,queue=8,port=60020" #131 daemon prio=5 
os_prio=0 tid=0x7faad0093800 nid=0x403b runnable [0x7faacf0ee000]
 java.lang.Thread.State: RUNNABLE
 at 
org.apache.hadoop.hbase.CellComparator.compareColumns(CellComparator.java:157)
 at 
org.apache.hadoop.hbase.CellComparator.compareWithoutRow(CellComparator.java:224)
 at org.apache.hadoop.hbase.CellComparator.compare(CellComparator.java:66)
 at org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:2020)
 at org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:1897)
 at 
java.util.concurrent.ConcurrentSkipListMap.cpr(ConcurrentSkipListMap.java:655)
 at 
java.util.concurrent.ConcurrentSkipListMap.doPut(ConcurrentSkipListMap.java:899)
 at 
java.util.concurrent.ConcurrentSkipListMap.put(ConcurrentSkipListMap.java:1581)
 at 
org.apache.hadoop.hbase.regionserver.CellSkipListSet.add(CellSkipListSet.java:134)
 at 
org.apache.hadoop.hbase.regionserver.DefaultMemStore.addToCellSet(DefaultMemStore.java:242)
 at 
org.apache.hadoop.hbase.regionserver.DefaultMemStore.internalAdd(DefaultMemStore.java:276)
 at 
org.apache.hadoop.hbase.regionserver.DefaultMemStore.add(DefaultMemStore.java:233)
 at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:686)
 at 

[jira] [Commented] (HBASE-20673) Reduce the number of Cell implementations; the profusion is distracting to users and JIT

2019-12-27 Thread zhuobin zheng (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-20673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17004384#comment-17004384
 ] 

zhuobin zheng commented on HBASE-20673:
---

Hi, [~stack]
I also encountered a similar problem in my production environment: *too many KeyValue implementation types confuse the JIT*.

This results in a large amount of wasted CPU, saturating a single server (48 cores, 4800% CPU) and significantly reducing read and write speeds.

Profiler analysis shows that itable stubs take a lot of CPU time (almost 70%).
Jstack shows a large number of read and write threads stuck at a few specific places where CellComparator compares:

[https://github.com/apache/hbase/blob/branch-1.2/hbase-common/src/main/java/org/apache/hadoop/hbase/CellComparator.java#L186]

[https://github.com/apache/hbase/blob/branch-1.2/hbase-common/src/main/java/org/apache/hadoop/hbase/CellComparator.java#L157]

The jstack output really confused me, because this line of code does nothing except resolve the address of the method actually being called.

But combined with the profiler result above (70% of CPU in itable stubs), I now think there are two possible reasons for the excessive CPU use:
 # Too many Cell implementations confuse the JIT, so interface calls fall back to a full itable scan.
 # KeyValue implements too many interfaces, which makes its itable long.

{code:java}
public class KeyValue implements Cell, HeapSize, Cloneable, SettableSequenceId, SettableTimestamp{code}

I think this situation occurs when the two types KeyValue / NoTagKeyValue are fairly evenly distributed.

Unfortunately, although the problem appears consistently in the production environment, I cannot reproduce it in the test environment, so I cannot provide a better test case.

I will now try to modify part of the code so that the vast majority of Cell instances are of a single type, and deploy it to the production environment to see whether it solves the saturated-CPU problem.
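To make the JIT point concrete, here is a small, self-contained toy sketch (not HBase code; the interface and class names are invented) of the difference between a call site that only ever sees one implementation and one that sees several. With a single concrete type the JIT can devirtualize and inline the interface call; a megamorphic site instead goes through itable dispatch on every call.

{code:java}
// Toy illustration of monomorphic vs. megamorphic interface call sites; not HBase code.
interface TinyCell {
  int rowLength();
}

final class KeyValueLike implements TinyCell {
  public int rowLength() { return 8; }
}

final class NoTagsLike implements TinyCell {
  public int rowLength() { return 8; }
}

final class OffheapLike implements TinyCell {
  public int rowLength() { return 8; }
}

public class CallSiteDemo {
  // The hot loop below calls rowLength() through the interface. If `cells` only ever
  // contains KeyValueLike, the call site stays monomorphic and the JIT can inline it.
  // If it mixes three implementations, the site becomes megamorphic and each call
  // goes through the interface (itable) dispatch path instead.
  static long sum(TinyCell[] cells) {
    long total = 0;
    for (TinyCell c : cells) {
      total += c.rowLength();
    }
    return total;
  }

  public static void main(String[] args) {
    TinyCell[] mixed = new TinyCell[30_000];
    for (int i = 0; i < mixed.length; i++) {
      // Cycle through three implementations to force a megamorphic call site.
      mixed[i] = (i % 3 == 0) ? new KeyValueLike()
               : (i % 3 == 1) ? new NoTagsLike()
               : new OffheapLike();
    }
    System.out.println(sum(mixed));
  }
}
{code}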

 

 

> Reduce the number of Cell implementations; the profusion is distracting to 
> users and JIT
> 
>
> Key: HBASE-20673
> URL: https://issues.apache.org/jira/browse/HBASE-20673
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: Michael Stack
>Assignee: Michael Stack
>Priority: Major
> Attachments: 0001-current.patch, 0001-current.patch, hits.20673.png
>
>
> We have a wild blossom of Cell implementations in hbase. Purge the bulk of 
> them. Make it so we do one type well. JIT gets confused if it has an abstract 
> or an interface and then the instantiated classes are myriad (megamorphic).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-23598) There are too much small WAL File

2019-12-19 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-23598:
--
Attachment: HBASE-23598.patch

> There are too much small WAL File
> -
>
> Key: HBASE-23598
> URL: https://issues.apache.org/jira/browse/HBASE-23598
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Affects Versions: 1.3.6, 2.2.2
> Environment: hbase version: cdh5-1.2.0_5.14.4
> hbase.wal.provider: multiwal
> hbase.wal.regiongrouping.numgroups: 4
> The attached wals file shows 100+ WAL files in wal-3, and some of them are very small.
>Reporter: zhuobin zheng
>Priority: Major
> Attachments: HBASE-23598.patch, wals
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> I found more than 100,000 WAL files in my 400-node HBase cluster. Too many WAL files make the cluster recover very slowly after a complete crash (in the split-log step), because too many WAL files generate too many ZK requests. By default, a WAL file starts to roll when it reaches the HDFS block size (256 MB in my case) * 0.95, but I found many small files (0-100 MB) in the WAL directory. Looking at the code, I found that with multiwal configured (4 WALs per RS in my case), as soon as a single WAL file reaches HDFS block size * 0.95, all of the WAL files roll, which produces a lot of small WAL files.
> I tried to modify the code to solve the problem (making each WAL roll independently). Although the change is very small, I am not sure whether it will cause other problems; it is currently being tested ...
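As a purely illustrative sketch of the behavioral difference described above (simplified pseudo-logic with invented class names, not the actual HBase WAL roller or the attached patch), compare rolling every WAL when one gets large with rolling only the WAL that crossed the threshold:

{code:java}
import java.util.List;

// Simplified illustration of the two rolling policies; not the actual HBase WAL code.
class WalSketch {
  long sizeBytes;
  void roll() { sizeBytes = 0; }  // pretend a new file is started
}

class RollPolicySketch {
  // 256 MB block size * 0.95, matching the numbers in the description.
  static final long ROLL_THRESHOLD = (long) (256L * 1024 * 1024 * 0.95);

  // Behaviour described in the report: once any WAL crosses the threshold,
  // every WAL in the group rolls, leaving many small files behind.
  static void rollAllWhenAnyIsFull(List<WalSketch> wals) {
    boolean anyFull = wals.stream().anyMatch(w -> w.sizeBytes >= ROLL_THRESHOLD);
    if (anyFull) {
      wals.forEach(WalSketch::roll);
    }
  }

  // Proposed behaviour: each WAL is checked and rolled independently.
  static void rollIndependently(List<WalSketch> wals) {
    for (WalSketch w : wals) {
      if (w.sizeBytes >= ROLL_THRESHOLD) {
        w.roll();
      }
    }
  }
}
{code}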



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-23598) There are too much small WAL File

2019-12-19 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HBASE-23598:
--
Attachment: wals
   Component/s: wal
 Affects Version/s: 2.2.2
   Description: 
I found more than 100,000 WAL files in my 400-node HBase cluster. Too many WAL files make the cluster recover very slowly after a complete crash (in the split-log step), because too many WAL files generate too many ZK requests. By default, a WAL file starts to roll when it reaches the HDFS block size (256 MB in my case) * 0.95, but I found many small files (0-100 MB) in the WAL directory. Looking at the code, I found that with multiwal configured (4 WALs per RS in my case), as soon as a single WAL file reaches HDFS block size * 0.95, all of the WAL files roll, which produces a lot of small WAL files.
I tried to modify the code to solve the problem (making each WAL roll independently). Although the change is very small, I am not sure whether it will cause other problems; it is currently being tested ...
   Environment: 
hbase version: cdh5-1.2.0_5.14.4

hbase.wal.provider: multiwal

hbase.wal.regiongrouping.numgroups: 4

The attached wals file shows 100+ WAL files in wal-3, and some of them are very small.
   Summary: There are too much small WAL File  (was: There are too 
much small WAL)
Remaining Estimate: 168h
 Original Estimate: 168h

> There are too much small WAL File
> -
>
> Key: HBASE-23598
> URL: https://issues.apache.org/jira/browse/HBASE-23598
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Affects Versions: 1.3.6, 2.2.2
> Environment: hbase version: cdh5-1.2.0_5.14.4
> hbase.wal.provider: multiwal
> hbase.wal.regiongrouping.numgroups: 4
> The attached wals file shows 100+ WAL files in wal-3, and some of them are very small.
>Reporter: zhuobin zheng
>Priority: Major
> Attachments: wals
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> I found more than 100,000 WAL files in my 400-node HBase cluster. Too many WAL files make the cluster recover very slowly after a complete crash (in the split-log step), because too many WAL files generate too many ZK requests. By default, a WAL file starts to roll when it reaches the HDFS block size (256 MB in my case) * 0.95, but I found many small files (0-100 MB) in the WAL directory. Looking at the code, I found that with multiwal configured (4 WALs per RS in my case), as soon as a single WAL file reaches HDFS block size * 0.95, all of the WAL files roll, which produces a lot of small WAL files.
> I tried to modify the code to solve the problem (making each WAL roll independently). Although the change is very small, I am not sure whether it will cause other problems; it is currently being tested ...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23598) There are too much small WAL

2019-12-19 Thread zhuobin zheng (Jira)
zhuobin zheng created HBASE-23598:
-

 Summary: There are too much small WAL
 Key: HBASE-23598
 URL: https://issues.apache.org/jira/browse/HBASE-23598
 Project: HBase
  Issue Type: Improvement
Affects Versions: 1.3.6
Reporter: zhuobin zheng






--
This message was sent by Atlassian Jira
(v8.3.4#803005)