[jira] [Commented] (HBASE-26780) HFileBlock.verifyOnDiskSizeMatchesHeader throw IOException: Passed in onDiskSizeWithHeader= A != B
[ https://issues.apache.org/jira/browse/HBASE-26780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762398#comment-17762398 ] Yutong Xiao commented on HBASE-26780:
-
[~ndimiduk] Hi, by the time we got feedback on this issue, the problematic file had already been compacted away, so the problem could not be reproduced and we could only review the related HBase code. When we hit the issue, B was 33, which is exactly the header size with checksum used in verifyOnDiskSizeMatchesHeader#getOnDiskSizeWithHeader#headerSize. That suggests the actual block size was missing. We currently retry reading the header, as in my MR. We have rolled the changes from that MR into production, and so far it looks good. If it were actually corruption in the HDFS block, we should hit it again. But in [~cribbee]'s log the error value is not the same, so the cause may be different. FYI, we are running 1.4.12 plus some customised patches.

> HFileBlock.verifyOnDiskSizeMatchesHeader throw IOException: Passed in
> onDiskSizeWithHeader= A != B
> --
>
> Key: HBASE-26780
> URL: https://issues.apache.org/jira/browse/HBASE-26780
> Project: HBase
> Issue Type: Bug
> Components: BlockCache
> Affects Versions: 2.2.2
> Reporter: yuzhang
> Priority: Major
> Attachments: IOException.png
>
>
> When I scan a region, HBase throws IOException: Passed in
> onDiskSizeWithHeader= A != B
> The HFile mentioned in the error message can be accessed normally.
> It recovers via the command: move region. I guess that the onDiskSizeWithHeader of
> the HFileBlock has been changed, and the RS gets the correct BlockHeader info after
> the region is reopened.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
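The value B = 33 matches the HFileBlock header size when checksums are in use. As a sanity check, the arithmetic can be written out; the constant names below are illustrative (not the real HBase identifiers), and the field widths follow the HFile v2 block header layout:

```java
public class HFileBlockHeaderSize {
    // 8-byte block magic + onDiskSizeWithoutHeader (int)
    // + uncompressedSizeWithoutHeader (int) + prevBlockOffset (long)
    static final int HEADER_SIZE_NO_CHECKSUM = 8 + 4 + 4 + 8; // 24

    // checksumType (byte) + bytesPerChecksum (int) + onDiskDataSizeWithHeader (int)
    static final int CHECKSUM_FIELDS = 1 + 4 + 4; // 9

    static final int HEADER_SIZE_WITH_CHECKSUM =
        HEADER_SIZE_NO_CHECKSUM + CHECKSUM_FIELDS; // 33

    public static void main(String[] args) {
        System.out.println(HEADER_SIZE_WITH_CHECKSUM); // prints 33
    }
}
```

So an error reporting B = 33 means the size computed from the cached header collapsed to just the header length, i.e. the block body size read as zero.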
[jira] [Commented] (HBASE-26780) HFileBlock.verifyOnDiskSizeMatchesHeader throw IOException: Passed in onDiskSizeWithHeader= A != B
[ https://issues.apache.org/jira/browse/HBASE-26780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17754506#comment-17754506 ] Yutong Xiao commented on HBASE-26780:
-
We also met this issue recently. The cause is that the block size read from the cached header in FSReaderImpl is incorrect (though it is not clear why the header is wrong without any IOException from the HDFS client). As the HFile itself is not corrupted, reading the header again from HDFS should avoid this issue. I raised an MR to re-read the header when the cached header does not match the block size.
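The retry idea can be sketched as follows. This is a simplified illustration, not the actual patch: `readHeaderFromStream` and `parseOnDiskSizeWithHeader` are hypothetical helpers standing in for the real HFileBlock internals, assuming the on-disk size (header included) is the int at offset 8 of the header plus the header length.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

public class HeaderRereadSketch {
    interface BlockReader {
        // Hypothetical: reads headerSize bytes at the block offset straight from HDFS.
        ByteBuffer readHeaderFromStream(long offset, int headerSize) throws IOException;
    }

    // Hypothetical stand-in for the header parsing: onDiskSizeWithoutHeader is
    // stored as the int at offset 8, so the size with header adds headerSize.
    static long parseOnDiskSizeWithHeader(ByteBuffer header, int headerSize) {
        return (long) header.getInt(8) + headerSize;
    }

    // If the size computed from the cached header does not match the expected
    // value, re-read the header from the stream once before giving up.
    static long verifiedOnDiskSize(BlockReader reader, ByteBuffer cachedHeader,
            long offset, int headerSize, long expected) throws IOException {
        long fromCache = parseOnDiskSizeWithHeader(cachedHeader, headerSize);
        if (fromCache == expected) {
            return fromCache;
        }
        // Cached header looks stale or corrupt: retry once against the file itself.
        ByteBuffer fresh = reader.readHeaderFromStream(offset, headerSize);
        long fromDisk = parseOnDiskSizeWithHeader(fresh, headerSize);
        if (fromDisk != expected) {
            throw new IOException("Passed in onDiskSizeWithHeader=" + expected
                + " != " + fromDisk + " even after re-reading the header");
        }
        return fromDisk;
    }
}
```

The single retry bounds the extra HDFS traffic: a healthy file pays one extra short read only when the cache is wrong, while genuine corruption still surfaces as an IOException.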
[jira] [Commented] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
[ https://issues.apache.org/jira/browse/HBASE-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17743649#comment-17743649 ] Yutong Xiao commented on HBASE-27962:
-
Added three sub-properties to make the ratio configuration more flexible.

> Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit
> various workloads
> ---
>
> Key: HBASE-27962
> URL: https://issues.apache.org/jira/browse/HBASE-27962
> Project: HBase
> Issue Type: Improvement
> Reporter: Yutong Xiao
> Assignee: Yutong Xiao
> Priority: Major
>
> We currently use the FastPathRWQueueRpcExecutor, but the numbers of
> read/write handlers are fixed, which makes RegionServer performance poor in
> our prod env.
> The logic is described below:
> * The basic architecture is the same as FastPathRWRpcExecutor.
> * Introduces a float shared_ratio in (0, 1.0) to indicate the ratio of
> shared handlers. (For example, with the ratio set to 0.2 and 100 handlers -
> 50 for write, 25 for get, 25 for scan - there will be 10 + 5 + 5 shared
> handlers and 40 isolated handlers for write, 20 for get and 20 for scan.)
> * A shared handler can run all three kinds of requests.
> * A handler is shared only when it is idle.
> * A shared handler is also bound to one kind of RPC queue and processes
> requests from that queue first.
> This improvement will improve resource utilization under various workloads
> and guarantee a level of R/W/S isolation for request processing at the same
> time.
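The example allocation in the description (shared_ratio = 0.2 over 50/25/25 write/get/scan handlers) is simple per-group arithmetic; `sharedCount` below is an illustrative helper, not an actual HBase method:

```java
public class SharedHandlerSplit {
    /** Number of handlers in one group that become shared, given the ratio. */
    static int sharedCount(int groupHandlers, float sharedRatio) {
        return (int) (groupHandlers * sharedRatio);
    }

    public static void main(String[] args) {
        int write = 50, get = 25, scan = 25;
        float ratio = 0.2f;
        int sharedWrite = sharedCount(write, ratio); // 10
        int sharedGet = sharedCount(get, ratio);     // 5
        int sharedScan = sharedCount(scan, ratio);   // 5
        // The remaining handlers stay isolated to their own queue type.
        System.out.printf("shared=%d+%d+%d, isolated write=%d get=%d scan=%d%n",
            sharedWrite, sharedGet, sharedScan,
            write - sharedWrite, get - sharedGet, scan - sharedScan);
    }
}
```

The "three sub-properties" mentioned above would presumably allow a separate ratio per group instead of one global value, which this helper already supports by passing a different ratio per call.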
[jira] [Comment Edited] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
[ https://issues.apache.org/jira/browse/HBASE-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17743132#comment-17743132 ] Yutong Xiao edited comment on HBASE-27962 at 7/14/23 12:24 PM:
---
One thing I need to make clear: shared handlers run the queued requests of their own type first (that is to say, a shared write handler processes write requests with higher priority), and they only run other request types when idling; when there are no idling handlers, the executor can be regarded as an RWQueueExecutor. I cannot agree that this is a serious problem, and I do not see how it makes slow requests hard to debug - we have a lot of metrics for debugging. Further, HBASE-27766 has no fastpath feature, so that mode has to pay the cost of queue-lock contention. As for the isolation model for reads and writes, we can also introduce two ratios, one controlling the shared writers and one controlling the shared readers. This would let clients tune the handler allocation. HBASE-27766 only considers idling scan handlers running gets, whereas HBASE-27962 also covers idling write handlers and idling get handlers. From my point of view, HBASE-27766 is redundant once HBASE-27962 is employed. Thank you~
[jira] [Commented] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
[ https://issues.apache.org/jira/browse/HBASE-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17742999#comment-17742999 ] Yutong Xiao commented on HBASE-27962:
-
HBASE-27962 is based on the fastpath feature introduced in HBASE-26551, which has been shown to perform better than the original RWQueueExecutor. Furthermore, AdaptiveFastPathRWRpcExecutor also covers the write handlers. Besides, AdaptiveFastPathRWRpcExecutor does not need to calculate a priority when taking requests. So, theoretically, AdaptiveFastPathRWRpcExecutor outperforms RpcStealQueue. What do you think [~haxiaolin]?
[jira] [Updated] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
[ https://issues.apache.org/jira/browse/HBASE-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27962:
Description:
We currently use the FastPathRWQueueRpcExecutor, but the numbers of read/write handlers are fixed, which makes RegionServer performance poor in our prod env. The logic is described below:
* The basic architecture is the same as FastPathRWRpcExecutor.
* Introduces a float shared_ratio in (0, 1.0) to indicate the ratio of shared handlers. (For example, with the ratio set to 0.2 and 100 handlers - 50 for write, 25 for get, 25 for scan - there will be 10 + 5 + 5 shared handlers and 40 isolated handlers for write, 20 for get and 20 for scan.)
* A shared handler can run all three kinds of requests.
* A handler is shared only when it is idle.
* A shared handler is also bound to one kind of RPC queue and processes requests from that queue first.
This improvement will improve resource utilization under various workloads and guarantee a level of R/W/S isolation for request processing at the same time.
[jira] [Updated] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
[ https://issues.apache.org/jira/browse/HBASE-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27962:
Description:
We currently use the FastPathRWQueueRpcExecutor, but the numbers of read/write handlers are fixed, which makes RegionServer performance poor in our prod env. The logic is described below:
* The basic architecture is the same as FastPathRWRpcExecutor.
* Introduces a float shared_ratio in (0, 1.0) to indicate the ratio of shared handlers. (For example, with the ratio set to 0.2 and 100 handlers - 50 for write, 25 for get, 25 for scan - there will be 10 + 5 + 5 shared handlers and 40 isolated handlers for write, 20 for get and 20 for scan.)
* A shared handler can run all three kinds of requests.
* A handler is shared only when it is idle.
* A shared handler is also bound to one kind of RPC queue and processes requests from that queue first.
This improvement will improve resource utilization in various workloads and guarantee a level of R/W/S isolation for request processing at the same time.
[jira] [Updated] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
[ https://issues.apache.org/jira/browse/HBASE-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27962:
Description:
We currently use the FastPathRWQueueRpcExecutor, but the numbers of read/write handlers are fixed, which makes RegionServer performance poor in our prod env. The logic is described below:
* The basic architecture is the same as FastPathRWRpcExecutor.
* Introduces a float shared_ratio in (0, 1.0) to indicate the ratio of shared handlers. (For example, if we have 100 handlers - 50 for write, 25 for get, 25 for scan - there will be 10 + 5 + 5 shared handlers and 40 isolated handlers for write, 20 for get and 20 for scan.)
* A shared handler can run all three kinds of requests.
* A handler is shared only when it is idle.
* A shared handler is also bound to one kind of RPC queue and processes requests from that queue first.
This improvement will improve resource utilization in various workloads and guarantee a level of R/W/S isolation for request processing at the same time.
[jira] [Updated] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
[ https://issues.apache.org/jira/browse/HBASE-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27962:
Description:
We currently use the FastPathRWQueueRpcExecutor, but the numbers of read/write handlers are fixed, which makes RegionServer performance poor in our prod env. The logic is described below:
* The basic architecture is the same as FastPathRWRpcExecutor.
* Introduces a float shared_ratio in (0, 1.0) to indicate the ratio of shared handlers. (For example, if we have 100 handlers - 50 for write, 25 for get, 25 for scan - there will be 10 + 5 + 5 shared handlers and 40 isolated handlers for write, 20 for get and 20 for scan.)
* A shared handler can run all three kinds of requests.
* A handler is shared only when it is idle.
* A shared handler is also bound to one kind of RPC queue and processes requests from that queue first.
This improvement will improve resource utilization in various workloads and guarantee a level of R/W/S isolation for request processing at the same time.
[jira] [Updated] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
[ https://issues.apache.org/jira/browse/HBASE-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27962:
Description:
We currently use the FastPathRWQueueRpcExecutor, but the numbers of read/write handlers are fixed, which makes RegionServer performance poor in our prod env. The logic is described below:
* The basic architecture is the same as FastPathRWRpcExecutor.
* Introduces a float shared_ratio in (0, 1.0) to indicate the ratio of shared handlers. (For example, if we have 100 handlers - 50 for write, 25 for get, 25 for scan - there will be 10 + 5 + 5 shared handlers and 40 isolated handlers for write, 20 for get and 20 for scan.)
* A shared handler can run all three kinds of requests.
* A handler is shared only when it is idle.
* A shared handler is also bound to one kind of RPC queue and processes requests from that queue first.
[jira] [Updated] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
[ https://issues.apache.org/jira/browse/HBASE-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27962: Description: We currently use the FastPathRWQueueRpcExecutor, but its numbers of read/write handlers are fixed, which makes RegionServer performance suboptimal in our production environment. The logic is described below: * The basic architecture is the same as FastPathRWRpcExecutor. * Introduce a float shared_ratio in (0, 1.0) to indicate the fraction of handlers that are shared. (For example, with 100 handlers split as 50 for write, 25 for get and 25 for scan, and shared_ratio = 0.2, there will be 10 + 5 + 5 shared handlers, plus 40 isolated handlers for write, 20 for get and 20 for scan.) * When there is no idle fastpath handler in one of the three handler groups, the executor tries to grab an idle shared handler. * Each shared handler is bound to its own R/W/S queue; it processes that kind of request first and only becomes shared after being pushed back onto the idle handler stacks. was:We currently use the FastPathRWQueue > Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit > various workloads > --- > > Key: HBASE-27962 > URL: https://issues.apache.org/jira/browse/HBASE-27962 > Project: HBase > Issue Type: Improvement >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > > We currently use the FastPathRWQueueRpcExecutor, but its numbers of > read/write handlers are fixed, which makes RegionServer performance > suboptimal in our production environment. > The logic is described below: > * The basic architecture is the same as FastPathRWRpcExecutor. > * Introduce a float shared_ratio in (0, 1.0) to indicate the fraction of > handlers that are shared. (For example, with 100 handlers split as 50 for > write, 25 for get and 25 for scan, and shared_ratio = 0.2, there will be > 10 + 5 + 5 shared handlers, plus 40 isolated handlers for write, 20 for get > and 20 for scan.)
> * When there is no idle fastpath handler in one of the three handler groups, > the executor tries to grab an idle shared handler. > * Each shared handler is bound to its own R/W/S queue; it processes that kind > of request first and only becomes shared after being pushed back onto the > idle handler stacks. -- This message was sent by Atlassian Jira (v8.20.10#820010)
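The shared/isolated split described in the example above is simple arithmetic; a minimal sketch follows. All class and field names here are illustrative assumptions, not the actual HBase patch.

```java
// Hypothetical sketch of the handler split described in the ticket:
// a shared_ratio fraction of each W/R/S group is donated to the shared pool,
// and the remainder stays isolated to its own queue.
public class AdaptiveHandlerSplit {
    public final int sharedWrite, sharedGet, sharedScan;
    public final int isolatedWrite, isolatedGet, isolatedScan;

    public AdaptiveHandlerSplit(int writeHandlers, int getHandlers, int scanHandlers,
                                float sharedRatio) {
        // Truncating toward zero keeps the shared pool at most sharedRatio
        // of each group; whatever is left stays dedicated to that group.
        sharedWrite = (int) (writeHandlers * sharedRatio);
        sharedGet = (int) (getHandlers * sharedRatio);
        sharedScan = (int) (scanHandlers * sharedRatio);
        isolatedWrite = writeHandlers - sharedWrite;
        isolatedGet = getHandlers - sharedGet;
        isolatedScan = scanHandlers - sharedScan;
    }
}
```

With 100 handlers split 50/25/25 and sharedRatio 0.2, this yields the 10 + 5 + 5 shared and 40/20/20 isolated handlers from the example in the description.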
[jira] [Updated] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
[ https://issues.apache.org/jira/browse/HBASE-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27962: Description: We currently use the FastPathRWQueue > Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit > various workloads > --- > > Key: HBASE-27962 > URL: https://issues.apache.org/jira/browse/HBASE-27962 > Project: HBase > Issue Type: Improvement >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > > We currently use the FastPathRWQueue -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
Yutong Xiao created HBASE-27962: --- Summary: Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads Key: HBASE-27962 URL: https://issues.apache.org/jira/browse/HBASE-27962 Project: HBase Issue Type: Improvement Reporter: Yutong Xiao Assignee: Yutong Xiao -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HBASE-27246) RSGroupMappingScript#getRSGroup has thread safety problem
[ https://issues.apache.org/jira/browse/HBASE-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27246: Summary: RSGroupMappingScript#getRSGroup has thread safety problem (was: RSGroupMappingScript#getRSGroup Should be Synchronized) > RSGroupMappingScript#getRSGroup has thread safety problem > - > > Key: HBASE-27246 > URL: https://issues.apache.org/jira/browse/HBASE-27246 > Project: HBase > Issue Type: Bug > Components: rsgroup >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Test.java, result.png > > > We are using version 1.4.12 and occasionally hit a problem during the table > creation phase. The master branch also has this problem. The error message is: > {code:java} > 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR > HBaseServiceImpl - hbase create table: xxx: failed. > (HBaseServiceImpl.java:116) > java.lang.RuntimeException: > org.apache.hadoop.hbase.constraint.ConstraintException: > org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup > (default > default) for this table's namespace does not exist. > {code} > The rsgroup here should be a single 'default', not two consecutive 'default's. > The code to get the RSGroup from a mapping script is: > {code:java} > String getRSGroup(String namespace, String tablename) { > if (rsgroupMappingScript == null) { > return null; > } > String[] exec = rsgroupMappingScript.getExecString(); > exec[1] = namespace; > exec[2] = tablename; > try { > rsgroupMappingScript.execute(); > } catch (IOException e) { > // This exception may happen, e.g. when the process doesn't have > // permission to run this script. > LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), > TableName.valueOf(namespace, tablename)); > return RSGroupInfo.DEFAULT_GROUP; > } > return rsgroupMappingScript.getOutput().trim(); > } > {code} > Here rsgroupMappingScript can be executed by multiple threads concurrently.
> To verify that this is a multi-threading issue, I ran a piece of code locally > and found that hadoop's ShellCommandExecutor is not thread-safe (I ran the > code with hadoop 2.10.0 and 3.3.2). Therefore we should make this method > synchronized. Besides, this issue is also present in the master branch. > The test code is attached and my rsgroup mapping script is very simple: > {code:java} > #!/bin/bash > namespace=$1 > tablename=$2 > echo default > {code} > The reproduced screenshot is also attached. -- This message was sent by Atlassian Jira (v8.20.10#820010)
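The race and the proposed fix can be sketched in isolation. ScriptShell below is a hypothetical stand-in for hadoop's ShellCommandExecutor (a mutable argument array plus an output buffer that are reused across calls); the actual patch simply makes getRSGroup synchronized.

```java
// Self-contained sketch of the fix: serialize access to a shared, stateful
// script executor. ScriptShell is an assumed stand-in, not the real
// org.apache.hadoop.util.Shell.ShellCommandExecutor.
public class RSGroupMapperSketch {
    static class ScriptShell {
        final String[] exec = new String[] { "rsgroup-mapping.sh", null, null };
        final StringBuilder output = new StringBuilder();

        void execute() throws InterruptedException {
            output.setLength(0);
            Thread.sleep(1);            // widen the window for interleaving
            output.append("default\n"); // pretend the script echoed its rsgroup
        }
    }

    private final ScriptShell shell = new ScriptShell();

    // Without synchronized, two threads can interleave writes to the shared
    // exec array and output buffer, producing garbage such as the
    // "default default" rsgroup seen in the error message above.
    synchronized String getRSGroup(String namespace, String tablename)
            throws InterruptedException {
        shell.exec[1] = namespace;
        shell.exec[2] = tablename;
        shell.execute();
        return shell.output.toString().trim();
    }
}
```

With the synchronized keyword in place, concurrent callers each observe a complete, untruncated script output.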
[jira] [Updated] (HBASE-27246) RSGroupMappingScript#getRSGroup Should be Synchronized
[ https://issues.apache.org/jira/browse/HBASE-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27246: Description: We are using version 1.4.12 and met a problem in table creation phase some time. The master branch also has this problem. The error message is: {code:java} 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR HBaseServiceImpl - hbase create table: xxx: failed. (HBaseServiceImpl.java:116) java.lang.RuntimeException: org.apache.hadoop.hbase.constraint.ConstraintException: org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup (default default) for this table's namespace does not exist. {code} The rsgroup here should be one 'default' but not two consecutive 'default'. The code to get RSGroup from a mapping script is: {code:java} String getRSGroup(String namespace, String tablename) { if (rsgroupMappingScript == null) { return null; } String[] exec = rsgroupMappingScript.getExecString(); exec[1] = namespace; exec[2] = tablename; try { rsgroupMappingScript.execute(); } catch (IOException e) { // This exception may happen, like process doesn't have permission to run this script. LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), TableName.valueOf(namespace, tablename)); return RSGroupInfo.DEFAULT_GROUP; } return rsgroupMappingScript.getOutput().trim(); } {code} here the rsgourpMappingScript could be executed by multi-threads. To test it is a multi-thread issue, I ran a piece of code locally and found that the hadoop ShellCommandExecutor is not thread-safe (I run the code with hadoop 2.10.0 and 3.3.2). So that we should make this method synchronized. Besides, I found that this issue is retained in master branch also. The test code is attached and my rsgroup mapping script is very simple: {code:java} #!/bin/bash namespace=$1 tablename=$2 echo default {code} The reproduced screenshot is also attached. 
was: We are using version 1.4.12 and met a problem in table creation phase some time. The error message is: {code:java} 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR HBaseServiceImpl - hbase create table: xxx: failed. (HBaseServiceImpl.java:116) java.lang.RuntimeException: org.apache.hadoop.hbase.constraint.ConstraintException: org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup (default default) for this table's namespace does not exist. {code} The rsgroup here should be one 'default' but not two consecutive 'default'. The code to get RSGroup from a mapping script is: {code:java} String getRSGroup(String namespace, String tablename) { if (rsgroupMappingScript == null) { return null; } String[] exec = rsgroupMappingScript.getExecString(); exec[1] = namespace; exec[2] = tablename; try { rsgroupMappingScript.execute(); } catch (IOException e) { // This exception may happen, like process doesn't have permission to run this script. LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), TableName.valueOf(namespace, tablename)); return RSGroupInfo.DEFAULT_GROUP; } return rsgroupMappingScript.getOutput().trim(); } {code} here the rsgourpMappingScript could be executed by multi-threads. To test it is a multi-thread issue, I ran a piece of code locally and found that the hadoop ShellCommandExecutor is not thread-safe (I run the code with hadoop 2.10.0 and 3.3.2). So that we should make this method synchronized. Besides, I found that this issue is retained in master branch also. The test code is attached and my rsgroup mapping script is very simple: {code:java} #!/bin/bash namespace=$1 tablename=$2 echo default {code} The reproduced screenshot is also attached. 
> RSGroupMappingScript#getRSGroup Should be Synchronized > -- > > Key: HBASE-27246 > URL: https://issues.apache.org/jira/browse/HBASE-27246 > Project: HBase > Issue Type: Bug > Components: rsgroup >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Test.java, result.png > > > We are using version 1.4.12 and met a problem in table creation phase some > time. The master branch also has this problem. The error message is: > {code:java} > 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR > HBaseServiceImpl - hbase create table: xxx: failed. > (HBaseServiceImpl.java:116) > java.lang.RuntimeException: > org.apache.hadoop.hbase.constraint.ConstraintException: > org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup > (default > default) for this table's namespace does not
[jira] [Updated] (HBASE-27246) RSGroupMappingScript#getRSGroup Should be Synchronized
[ https://issues.apache.org/jira/browse/HBASE-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27246: Component/s: rsgroup > RSGroupMappingScript#getRSGroup Should be Synchronized > -- > > Key: HBASE-27246 > URL: https://issues.apache.org/jira/browse/HBASE-27246 > Project: HBase > Issue Type: Bug > Components: rsgroup >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Test.java, result.png > > > We are using version 1.4.12 and met a problem in table creation phase some > time. The error message is: > {code:java} > 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR > HBaseServiceImpl - hbase create table: xxx: failed. > (HBaseServiceImpl.java:116) > java.lang.RuntimeException: > org.apache.hadoop.hbase.constraint.ConstraintException: > org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup > (default > default) for this table's namespace does not exist. > {code} > The rsgroup here should be one 'default' but not two consecutive 'default'. > The code to get RSGroup from a mapping script is: > {code:java} > String getRSGroup(String namespace, String tablename) { > if (rsgroupMappingScript == null) { > return null; > } > String[] exec = rsgroupMappingScript.getExecString(); > exec[1] = namespace; > exec[2] = tablename; > try { > rsgroupMappingScript.execute(); > } catch (IOException e) { > // This exception may happen, like process doesn't have permission to > run this script. > LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), > TableName.valueOf(namespace, tablename)); > return RSGroupInfo.DEFAULT_GROUP; > } > return rsgroupMappingScript.getOutput().trim(); > } > {code} > here the rsgourpMappingScript could be executed by multi-threads. > To test it is a multi-thread issue, I ran a piece of code locally and found > that the hadoop ShellCommandExecutor is not thread-safe (I run the code with > hadoop 2.10.0 and 3.3.2). 
So that we should make this method synchronized. > Besides, I found that this issue is retained in master branch also. > The test code is attached and my rsgroup mapping script is very simple: > {code:java} > #!/bin/bash > namespace=$1 > tablename=$2 > echo default > {code} > The reproduced screenshot is also attached. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HBASE-27246) RSGroupMappingScript#getRSGroup Should be Synchronized
[ https://issues.apache.org/jira/browse/HBASE-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27246: Description: We are using version 1.4.12 and met a problem in table creation phase some time. The error message is: {code:java} 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR HBaseServiceImpl - hbase create table: xxx: failed. (HBaseServiceImpl.java:116) java.lang.RuntimeException: org.apache.hadoop.hbase.constraint.ConstraintException: org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup (default default) for this table's namespace does not exist. {code} The rsgroup here should be one 'default' but not two consecutive 'default'. The code to get RSGroup from a mapping script is: {code:java} String getRSGroup(String namespace, String tablename) { if (rsgroupMappingScript == null) { return null; } String[] exec = rsgroupMappingScript.getExecString(); exec[1] = namespace; exec[2] = tablename; try { rsgroupMappingScript.execute(); } catch (IOException e) { // This exception may happen, like process doesn't have permission to run this script. LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), TableName.valueOf(namespace, tablename)); return RSGroupInfo.DEFAULT_GROUP; } return rsgroupMappingScript.getOutput().trim(); } {code} here the rsgourpMappingScript could be executed by multi-threads. To test it is a multi-thread issue, I ran a piece of code locally and found that the hadoop ShellCommandExecutor is not thread-safe (I run the code with hadoop 2.10.0 and 3.3.2). So that we should make this method synchronized. Besides, I found that this issue is retained in master branch also. The test code is attached and my rsgroup mapping script is very simple: {code:java} #!/bin/bash namespace=$1 tablename=$2 echo default {code} The reproduced screenshot is also attached. was: We are using version 1.4.12 and met a problem in table creation phase some time. 
The error message is: {code:java} 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR HBaseServiceImpl - hbase create table: xxx: failed. (HBaseServiceImpl.java:116) java.lang.RuntimeException: org.apache.hadoop.hbase.constraint.ConstraintException: org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup (default default) for this table's namespace does not exist. {code} The rsgroup here should be one 'default' but not two consecutive 'default'. The code to get RSGroup from a mapping script is: {code:java} String getRSGroup(String namespace, String tablename) { if (rsgroupMappingScript == null) { return null; } String[] exec = rsgroupMappingScript.getExecString(); exec[1] = namespace; exec[2] = tablename; try { rsgroupMappingScript.execute(); } catch (IOException e) { // This exception may happen, like process doesn't have permission to run this script. LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), TableName.valueOf(namespace, tablename)); return RSGroupInfo.DEFAULT_GROUP; } return rsgroupMappingScript.getOutput().trim(); } {code} here the rsgourpMappingScript could be executed by multi-threads. To test it is a multi-thread issue, I ran a piece of code locally and found that the hadoop ShellCommandExecutor is not thread-safe (I run the code with hadoop 2.10.0 and 3.3.2). So that we should make this method synchronized. The test code is attached and my rsgroup mapping script is very simple: {code:java} #!/bin/bash namespace=$1 tablename=$2 echo default {code} The reproduced screenshot is also attached. > RSGroupMappingScript#getRSGroup Should be Synchronized > -- > > Key: HBASE-27246 > URL: https://issues.apache.org/jira/browse/HBASE-27246 > Project: HBase > Issue Type: Bug >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Test.java, result.png > > > We are using version 1.4.12 and met a problem in table creation phase some > time. 
The error message is: > {code:java} > 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR > HBaseServiceImpl - hbase create table: xxx: failed. > (HBaseServiceImpl.java:116) > java.lang.RuntimeException: > org.apache.hadoop.hbase.constraint.ConstraintException: > org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup > (default > default) for this table's namespace does not exist. > {code} > The rsgroup here should be one 'default' but not two consecutive 'default'. > The code to get RSGroup from a mapping script is: > {code:java} > String
[jira] [Updated] (HBASE-27246) RSGroupMappingScript#getRSGroup Should be Synchronized
[ https://issues.apache.org/jira/browse/HBASE-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27246: Summary: RSGroupMappingScript#getRSGroup Should be Synchronized (was: RSGroupMappingScript#getRSGroup should be synchronized) > RSGroupMappingScript#getRSGroup Should be Synchronized > -- > > Key: HBASE-27246 > URL: https://issues.apache.org/jira/browse/HBASE-27246 > Project: HBase > Issue Type: Bug >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Test.java, result.png > > > We are using version 1.4.12 and met a problem in table creation phase some > time. The error message is: > {code:java} > 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR > HBaseServiceImpl - hbase create table: xxx: failed. > (HBaseServiceImpl.java:116) > java.lang.RuntimeException: > org.apache.hadoop.hbase.constraint.ConstraintException: > org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup > (default > default) for this table's namespace does not exist. > {code} > The rsgroup here should be one 'default' but not two consecutive 'default'. > The code to get RSGroup from a mapping script is: > {code:java} > String getRSGroup(String namespace, String tablename) { > if (rsgroupMappingScript == null) { > return null; > } > String[] exec = rsgroupMappingScript.getExecString(); > exec[1] = namespace; > exec[2] = tablename; > try { > rsgroupMappingScript.execute(); > } catch (IOException e) { > // This exception may happen, like process doesn't have permission to > run this script. > LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), > TableName.valueOf(namespace, tablename)); > return RSGroupInfo.DEFAULT_GROUP; > } > return rsgroupMappingScript.getOutput().trim(); > } > {code} > here the rsgourpMappingScript could be executed by multi-threads. 
> To test it is a multi-thread issue, I ran a piece of code locally and found > that the hadoop ShellCommandExecutor is not thread-safe (I run the code with > hadoop 2.10.0 and 3.3.2). So that we should make this method synchronized. > The test code is attached and my rsgroup mapping script is very simple: > {code:java} > #!/bin/bash > namespace=$1 > tablename=$2 > echo default > {code} > The reproduced screenshot is also attached. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HBASE-27246) RSGroupMappingScript#getRSGroup should be synchronized
[ https://issues.apache.org/jira/browse/HBASE-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27246: Summary: RSGroupMappingScript#getRSGroup should be synchronized (was: RSGroupMappingScript#getRSGroup should be synchronised) > RSGroupMappingScript#getRSGroup should be synchronized > -- > > Key: HBASE-27246 > URL: https://issues.apache.org/jira/browse/HBASE-27246 > Project: HBase > Issue Type: Bug >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Test.java, result.png > > > We are using version 1.4.12 and met a problem in table creation phase some > time. The error message is: > {code:java} > 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR > HBaseServiceImpl - hbase create table: xxx: failed. > (HBaseServiceImpl.java:116) > java.lang.RuntimeException: > org.apache.hadoop.hbase.constraint.ConstraintException: > org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup > (default > default) for this table's namespace does not exist. > {code} > The rsgroup here should be one 'default' but not two consecutive 'default'. > The code to get RSGroup from a mapping script is: > {code:java} > String getRSGroup(String namespace, String tablename) { > if (rsgroupMappingScript == null) { > return null; > } > String[] exec = rsgroupMappingScript.getExecString(); > exec[1] = namespace; > exec[2] = tablename; > try { > rsgroupMappingScript.execute(); > } catch (IOException e) { > // This exception may happen, like process doesn't have permission to > run this script. > LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), > TableName.valueOf(namespace, tablename)); > return RSGroupInfo.DEFAULT_GROUP; > } > return rsgroupMappingScript.getOutput().trim(); > } > {code} > here the rsgourpMappingScript could be executed by multi-threads. 
> To test it is a multi-thread issue, I ran a piece of code locally and found > that the hadoop ShellCommandExecutor is not thread-safe (I run the code with > hadoop 2.10.0 and 3.3.2). So that we should make this method synchronized. > The test code is attached and my rsgroup mapping script is very simple: > {code:java} > #!/bin/bash > namespace=$1 > tablename=$2 > echo default > {code} > The reproduced screenshot is also attached. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HBASE-27246) RSGroupMappingScript#getRSGroup should be synchronised
[ https://issues.apache.org/jira/browse/HBASE-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27246: Attachment: result.png > RSGroupMappingScript#getRSGroup should be synchronised > -- > > Key: HBASE-27246 > URL: https://issues.apache.org/jira/browse/HBASE-27246 > Project: HBase > Issue Type: Bug >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Test.java, result.png > > > We are using version 1.4.12 and met a problem in table creation phase some > time. The error message is: > {code:java} > 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR > HBaseServiceImpl - hbase create table: xxx: failed. > (HBaseServiceImpl.java:116) > java.lang.RuntimeException: > org.apache.hadoop.hbase.constraint.ConstraintException: > org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup > (default > default) for this table's namespace does not exist. > {code} > The rsgroup here should be one 'default' but not two consecutive 'default'. > The code to get RSGroup from a mapping script is: > {code:java} > String getRSGroup(String namespace, String tablename) { > if (rsgroupMappingScript == null) { > return null; > } > String[] exec = rsgroupMappingScript.getExecString(); > exec[1] = namespace; > exec[2] = tablename; > try { > rsgroupMappingScript.execute(); > } catch (IOException e) { > // This exception may happen, like process doesn't have permission to > run this script. > LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), > TableName.valueOf(namespace, tablename)); > return RSGroupInfo.DEFAULT_GROUP; > } > return rsgroupMappingScript.getOutput().trim(); > } > {code} > here the rsgourpMappingScript could be executed by multi-threads. > To test it is a multi-thread issue, I ran a piece of code locally and found > that the hadoop ShellCommandExecutor is not thread-safe (I run the code with > hadoop 2.10.0 and 3.3.2). 
So that we should make this method synchronized. > The test code is attached and my rsgroup mapping script is very simple: > {code:java} > #!/bin/bash > namespace=$1 > tablename=$2 > echo default > {code} > The reproduced screenshot is also attached. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HBASE-27246) RSGroupMappingScript#getRSGroup should be synchronised
[ https://issues.apache.org/jira/browse/HBASE-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27246: Description: We are using version 1.4.12 and met a problem in table creation phase some time. The error message is: {code:java} 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR HBaseServiceImpl - hbase create table: xxx: failed. (HBaseServiceImpl.java:116) java.lang.RuntimeException: org.apache.hadoop.hbase.constraint.ConstraintException: org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup (default default) for this table's namespace does not exist. {code} The rsgroup here should be one 'default' but not two consecutive 'default'. The code to get RSGroup from a mapping script is: {code:java} String getRSGroup(String namespace, String tablename) { if (rsgroupMappingScript == null) { return null; } String[] exec = rsgroupMappingScript.getExecString(); exec[1] = namespace; exec[2] = tablename; try { rsgroupMappingScript.execute(); } catch (IOException e) { // This exception may happen, like process doesn't have permission to run this script. LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), TableName.valueOf(namespace, tablename)); return RSGroupInfo.DEFAULT_GROUP; } return rsgroupMappingScript.getOutput().trim(); } {code} here the rsgourpMappingScript could be executed by multi-threads. To test it is a multi-thread issue, I ran a piece of code locally and found that the hadoop ShellCommandExecutor is not thread-safe (I run the code with hadoop 2.10.0 and 3.3.2). So that we should make this method synchronized. The test code is attached and my rsgroup mapping script is very simple: {code:java} #!/bin/bash namespace=$1 tablename=$2 echo default {code} The reproduced screenshot is also attached. was: We are using version 1.4.12 and met a problem in table creation phase some time. 
The error message is: {code:java} 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR HBaseServiceImpl - hbase create table: xxx: failed. (HBaseServiceImpl.java:116) java.lang.RuntimeException: org.apache.hadoop.hbase.constraint.ConstraintException: org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup (default default) for this table's namespace does not exist. {code} The rsgroup here should be one 'default' but not two consecutive 'default'. The code to get RSGroup from a mapping script is: {code:java} String getRSGroup(String namespace, String tablename) { if (rsgroupMappingScript == null) { return null; } String[] exec = rsgroupMappingScript.getExecString(); exec[1] = namespace; exec[2] = tablename; try { rsgroupMappingScript.execute(); } catch (IOException e) { // This exception may happen, like process doesn't have permission to run this script. LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), TableName.valueOf(namespace, tablename)); return RSGroupInfo.DEFAULT_GROUP; } return rsgroupMappingScript.getOutput().trim(); } {code} here the rsgourpMappingScript could be executed by multi-threads. To test it is a multi-thread issue, I ran a piece of code locally and found that the hadoop ShellCommandExecutor is not thread-safe (I run the code with hadoop 2.10.0 and 3.3.2). So that we should make this method synchronized. > RSGroupMappingScript#getRSGroup should be synchronised > -- > > Key: HBASE-27246 > URL: https://issues.apache.org/jira/browse/HBASE-27246 > Project: HBase > Issue Type: Bug >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Test.java, result.png > > > We are using version 1.4.12 and met a problem in table creation phase some > time. The error message is: > {code:java} > 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR > HBaseServiceImpl - hbase create table: xxx: failed. 
> (HBaseServiceImpl.java:116) > java.lang.RuntimeException: > org.apache.hadoop.hbase.constraint.ConstraintException: > org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup > (default > default) for this table's namespace does not exist. > {code} > The rsgroup here should be one 'default' but not two consecutive 'default'. > The code to get RSGroup from a mapping script is: > {code:java} > String getRSGroup(String namespace, String tablename) { > if (rsgroupMappingScript == null) { > return null; > } > String[] exec = rsgroupMappingScript.getExecString(); > exec[1] = namespace; > exec[2] = tablename; > try { >
[jira] [Created] (HBASE-27246) RSGroupMappingScript#getRSGroup should be synchronised
Yutong Xiao created HBASE-27246: --- Summary: RSGroupMappingScript#getRSGroup should be synchronised Key: HBASE-27246 URL: https://issues.apache.org/jira/browse/HBASE-27246 Project: HBase Issue Type: Bug Reporter: Yutong Xiao Assignee: Yutong Xiao Attachments: Test.java We are using version 1.4.12 and met a problem in table creation phase some time. The error message is: {code:java} 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR HBaseServiceImpl - hbase create table: xxx: failed. (HBaseServiceImpl.java:116) java.lang.RuntimeException: org.apache.hadoop.hbase.constraint.ConstraintException: org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup (default default) for this table's namespace does not exist. {code} The rsgroup here should be one 'default' but not two consecutive 'default'. The code to get RSGroup from a mapping script is: {code:java} String getRSGroup(String namespace, String tablename) { if (rsgroupMappingScript == null) { return null; } String[] exec = rsgroupMappingScript.getExecString(); exec[1] = namespace; exec[2] = tablename; try { rsgroupMappingScript.execute(); } catch (IOException e) { // This exception may happen, like process doesn't have permission to run this script. LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), TableName.valueOf(namespace, tablename)); return RSGroupInfo.DEFAULT_GROUP; } return rsgroupMappingScript.getOutput().trim(); } {code} here the rsgourpMappingScript could be executed by multi-threads. To test it is a multi-thread issue, I ran a piece of code locally and found that the hadoop ShellCommandExecutor is not thread-safe (I run the code with hadoop 2.10.0 and 3.3.2). So that we should make this method synchronized. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HBASE-27246) RSGroupMappingScript#getRSGroup should be synchronised
[ https://issues.apache.org/jira/browse/HBASE-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27246: Attachment: Test.java > RSGroupMappingScript#getRSGroup should be synchronised > -- > > Key: HBASE-27246 > URL: https://issues.apache.org/jira/browse/HBASE-27246 > Project: HBase > Issue Type: Bug >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Test.java > > > We are using version 1.4.12 and met a problem in table creation phase some > time. The error message is: > {code:java} > 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR > HBaseServiceImpl - hbase create table: xxx: failed. > (HBaseServiceImpl.java:116) > java.lang.RuntimeException: > org.apache.hadoop.hbase.constraint.ConstraintException: > org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup > (default > default) for this table's namespace does not exist. > {code} > The rsgroup here should be one 'default' but not two consecutive 'default'. > The code to get RSGroup from a mapping script is: > {code:java} > String getRSGroup(String namespace, String tablename) { > if (rsgroupMappingScript == null) { > return null; > } > String[] exec = rsgroupMappingScript.getExecString(); > exec[1] = namespace; > exec[2] = tablename; > try { > rsgroupMappingScript.execute(); > } catch (IOException e) { > // This exception may happen, like process doesn't have permission to > run this script. > LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), > TableName.valueOf(namespace, tablename)); > return RSGroupInfo.DEFAULT_GROUP; > } > return rsgroupMappingScript.getOutput().trim(); > } > {code} > here the rsgourpMappingScript could be executed by multi-threads. > To test it is a multi-thread issue, I ran a piece of code locally and found > that the hadoop ShellCommandExecutor is not thread-safe (I run the code with > hadoop 2.10.0 and 3.3.2). So that we should make this method synchronized. 
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (HBASE-23158) If KVs are in memstore, small batch get can come across MultiActionResultTooLarge
[ https://issues.apache.org/jira/browse/HBASE-23158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17489351#comment-17489351 ] Yutong Xiao edited comment on HBASE-23158 at 2/10/22, 6:00 AM: --- Encountered this problem as well. Could we just use the number of cells times a predefined block size to approximate the total block size (one cell, one block)? Cells in the memstore will be flushed shortly, so I think we can simply regard them as cells read from HFiles. This may resolve this annoying exception. The drawback is that the approximation is rough (different tables can have different block sizes). was (Author: xytss123): Encountered this problem also. For this problem, could we just use the number of cells times a predefined blocksize to approximate the total block size (one cell, one block)? For cells in memstore, they will be flushed in short time, so that we can just regard them as cells read from HFiles I think. This may solve this annoying exception. The problem is the approximation is rough granularity (different table can have different blocksize). What do you think? > If KVs are in memstore, small batch get can come across > MultiActionResultTooLarge > -- > > Key: HBASE-23158 > URL: https://issues.apache.org/jira/browse/HBASE-23158 > Project: HBase > Issue Type: Bug > Components: regionserver, rpc > Environment: [^TestMultiRespectsLimitsMemstore.patch] >Reporter: junfei liang >Priority: Minor > Attachments: TestMultiRespectsLimitsMemstore.patch > > > To protect against big scans, we set hbase.server.scanner.max.result.size = > 10MB in our customer hbase cluster; however, our clients can hit > MultiActionResultTooLarge even on small batch gets (e.g. a 15-item batch get > with a row size of about 5KB).
> after [HBASE-14978|https://issues.apache.org/jira/browse/HBASE-14978] hbase > take the data block reference into consideration, but the block size is 64KB > (the default value ), even if all cells are from different block , the block > size retained is less than 1MB, so what's the problem ? > finally i found that HBASE-14978 also consider the cell in memstore, as > MSLAB is enabled default, so if the cell is from memstore, cell backend array > can be large (2MB as default), so even if a small batch can meet this error, > is this reasonable ? > plus: > when throw MultiActionResultTooLarge exception, hbase client should retry > ignore rpc retry num, however if set retry num to zero, client will fail > without retry in this case. > > see attachment TestMultiRespectsLimitsMemstore for details. > -- This message was sent by Atlassian Jira (v8.20.1#820001)
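The arithmetic behind this report can be made concrete. With the default 64 KB HFile block, 15 gets touching 15 distinct blocks retain under 1 MB, well inside the 10 MB `hbase.server.scanner.max.result.size` quota; but when each cell lives in the memstore backed by a default 2 MB MSLAB chunk, the same 15 gets are accounted as about 30 MB. The sketch below is back-of-the-envelope math using only the default sizes quoted in this thread; the class and method names are illustrative, not HBase APIs.

```java
// Back-of-the-envelope accounting for HBASE-23158, using the defaults quoted
// in this thread (64 KB HFile blocks, 2 MB MSLAB chunks, 10 MB quota).
class BlockQuotaMath {
    static final long HFILE_BLOCK = 64L * 1024;        // default HFile block size
    static final long MSLAB_CHUNK = 2L * 1024 * 1024;  // default MSLAB chunk size
    static final long QUOTA = 10L * 1024 * 1024;       // max.result.size setting

    // Retained size if every cell pins one distinct HFile data block.
    static long fromHFiles(int cells) { return cells * HFILE_BLOCK; }

    // Retained size if every cell instead pins its backing MSLAB chunk.
    static long fromMemstore(int cells) { return cells * MSLAB_CHUNK; }
}
```

A 15-row batch stays under quota when read from HFiles (15 × 64 KB ≈ 0.94 MB) but trips it when read from the memstore (15 × 2 MB = 30 MB), which is why such a small batch can see MultiActionResultTooLarge.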
[jira] [Commented] (HBASE-23158) If KVs are in memstore, small batch get can come across MultiActionResultTooLarge
[ https://issues.apache.org/jira/browse/HBASE-23158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17489351#comment-17489351 ] Yutong Xiao commented on HBASE-23158: - Encountered this problem as well. Could we just use the number of cells times a predefined block size to approximate the total block size (one cell, one block)? Cells in the memstore will be flushed in a short time, so I think we can just regard them as cells read from HFiles. This may solve this annoying exception. The drawback is that the approximation is rough (different tables can have different block sizes). What do you think? > If KVs are in memstore, small batch get can come across > MultiActionResultTooLarge > -- > > Key: HBASE-23158 > URL: https://issues.apache.org/jira/browse/HBASE-23158 > Project: HBase > Issue Type: Bug > Components: regionserver, rpc > Environment: [^TestMultiRespectsLimitsMemstore.patch] >Reporter: junfei liang >Priority: Minor > Attachments: TestMultiRespectsLimitsMemstore.patch > > > to protect against big scan, we set hbase.server.scanner.max.result.size = > 10MB in our customer hbase cluster, however our clients can meet > MultiActionResultTooLarge even in small batch get (for ex. 15 batch get, and > row size is about 5KB ) . > after [HBASE-14978|https://issues.apache.org/jira/browse/HBASE-14978] hbase > take the data block reference into consideration, but the block size is 64KB > (the default value ), even if all cells are from different block , the block > size retained is less than 1MB, so what's the problem ? > finally i found that HBASE-14978 also consider the cell in memstore, as > MSLAB is enabled default, so if the cell is from memstore, cell backend array > can be large (2MB as default), so even if a small batch can meet this error, > is this reasonable ? 
> plus: > when throw MultiActionResultTooLarge exception, hbase client should retry > ignore rpc retry num, however if set retry num to zero, client will fail > without retry in this case. > > see attachment TestMultiRespectsLimitsMemstore for details. > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Comment Edited] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17483509#comment-17483509 ] Yutong Xiao edited comment on HBASE-26688 at 1/28/22, 1:33 AM: --- For non-empty Result objects, the logic is the same after the change. This only changes the behaviour of Result with empty cell list, which used to have a thread safety bug. A more common usage is to check the return value of Result#advance once first, if it is false, the client should search the next result. This usage is impacted by this bug. Even if the client wants to catch the exception, they may get the exception at the first time of advance() but not the second time, which is not the original logic. By the way, in the original code, if the cell list is null, the method also always returns false and never raises the exception. Result#isEmpty with null cell list and Result#isEmpty with empty list all return true. But in the original code, "empty" Results have different behaviours. Should we also care about this? This is actually a bug. If we want to keep return NoSuchElementException for Result with empty list. My opinion is we can change the Result class thread safe. (e.g. make Result#cellScannerIndex threadlocal?) was (Author: xytss123): For non-empty Result objects, the logic is the same after the change. This only changes the behaviour of Result with empty cell list, which used to have a thread safety bug. A more common usage is to check the return value of Result#advance once first, if it is false, the client should search the next result. This usage is impacted by this bug. Even if the client wants to catch the exception, they may get the exception at the first time of advance() but not the second time, which is not the original logic. By the way, in the original code, if the cell list is null, the method also always returns false and never raises the exception. 
Result#isEmpty with null cell list and Result#isEmpty with empty list all return true. But in the original code, "empty" Results have different behaviours. This is actually a bug. If we want to keep return NoSuchElementException for Result with empty list. My opinion is we can change the Result class thread safe. (e.g. make Result#cellScannerIndex threadlocal?) > Threads shared EMPTY_RESULT may lead to unexpected client job down. > --- > > Key: HBASE-26688 > URL: https://issues.apache.org/jira/browse/HBASE-26688 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.10 > > > Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce > the object creation. But these objects could be shared by multi client > threads. The Result#cellScannerIndex related methods could throw confusing > exception and make the client job down. Could refine the logic of these > methods. > The precreated objects in ProtoBufUtil.java: > {code:java} > private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; > private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); > final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); > final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); > private final static Result EMPTY_RESULT_STALE = > Result.create(EMPTY_CELL_ARRAY, null, true); > {code} > Result#advance > {code:java} > public boolean advance() { > if (cells == null) return false; > cellScannerIndex++; > if (cellScannerIndex < this.cells.length) { > return true; > } else if (cellScannerIndex == this.cells.length) { > return false; > } > // The index of EMPTY_RESULT could be incremented by multi threads and > throw this exception. 
> throw new NoSuchElementException("Cannot advance beyond the last cell"); > } > {code} > Result#current > {code:java} > public Cell current() { > if (cells == null > || cellScannerIndex == INITIAL_CELLSCANNER_INDEX > || cellScannerIndex >= cells.length) > return null; >// Although it is almost impossible, >// We can arrive here when the client threads share the common reused > EMPTY_RESULT. > return this.cells[cellScannerIndex]; > } > {code} > In this case, the client can easily got confusing exceptions even if they use > different connections, tables in different threads. > We should change the if condition cells == null to isEmpty() to avoid the > client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
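The proposed `cells == null` → `isEmpty()` change can be sketched on a stripped-down Result. `MiniResult` below is a hypothetical stand-in, not the real org.apache.hadoop.hbase.client.Result; it keeps only the fields the description quotes. With the fix, an empty but non-null cell array behaves like a null one: repeated `advance()` calls on a shared empty instance can only ever return false, instead of eventually overrunning `cells.length` and throwing.

```java
import java.util.NoSuchElementException;

// Stripped-down sketch of Result#advance with the proposed fix applied.
class MiniResult {
    static final int INITIAL_CELLSCANNER_INDEX = -1;
    private final Object[] cells; // stand-in for Cell[]
    private int cellScannerIndex = INITIAL_CELLSCANNER_INDEX;

    MiniResult(Object[] cells) { this.cells = cells; }

    boolean isEmpty() { return cells == null || cells.length == 0; }

    boolean advance() {
        if (isEmpty()) return false; // was: if (cells == null) return false;
        cellScannerIndex++;
        if (cellScannerIndex < cells.length) return true;
        if (cellScannerIndex == cells.length) return false;
        throw new NoSuchElementException("Cannot advance beyond the last cell");
    }
}
```

A non-empty Result keeps its original single-thread contract (true for each cell, then false, then the exception on further calls); only the shared empty instances stop throwing.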
[jira] [Comment Edited] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17483509#comment-17483509 ] Yutong Xiao edited comment on HBASE-26688 at 1/28/22, 1:33 AM: --- For non-empty Result objects, the logic is the same after the change. This only changes the behaviour of Result with empty cell list, which used to have a thread safety bug. A more common usage is to check the return value of Result#advance once first, if it is false, the client should search the next result. This usage is impacted by this bug. Even if the client wants to catch the exception, they may get the exception at the first time of advance() but not the second time, which is not the original logic. By the way, in the original code, if the cell list is null, the method also always returns false and never raises the exception. Result#isEmpty with null cell list and Result#isEmpty with empty list all return true. But in the original code, "empty" Results have different behaviours. Should we also care about this? If we want to keep return NoSuchElementException for Result with empty list. My opinion is we can change the Result class thread safe. (e.g. make Result#cellScannerIndex threadlocal?) was (Author: xytss123): For non-empty Result objects, the logic is the same after the change. This only changes the behaviour of Result with empty cell list, which used to have a thread safety bug. A more common usage is to check the return value of Result#advance once first, if it is false, the client should search the next result. This usage is impacted by this bug. Even if the client wants to catch the exception, they may get the exception at the first time of advance() but not the second time, which is not the original logic. By the way, in the original code, if the cell list is null, the method also always returns false and never raises the exception. Result#isEmpty with null cell list and Result#isEmpty with empty list all return true. 
But in the original code, "empty" Results have different behaviours. Should we also care about this? This is actually a bug. If we want to keep return NoSuchElementException for Result with empty list. My opinion is we can change the Result class thread safe. (e.g. make Result#cellScannerIndex threadlocal?) > Threads shared EMPTY_RESULT may lead to unexpected client job down. > --- > > Key: HBASE-26688 > URL: https://issues.apache.org/jira/browse/HBASE-26688 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.10 > > > Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce > the object creation. But these objects could be shared by multi client > threads. The Result#cellScannerIndex related methods could throw confusing > exception and make the client job down. Could refine the logic of these > methods. > The precreated objects in ProtoBufUtil.java: > {code:java} > private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; > private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); > final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); > final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); > private final static Result EMPTY_RESULT_STALE = > Result.create(EMPTY_CELL_ARRAY, null, true); > {code} > Result#advance > {code:java} > public boolean advance() { > if (cells == null) return false; > cellScannerIndex++; > if (cellScannerIndex < this.cells.length) { > return true; > } else if (cellScannerIndex == this.cells.length) { > return false; > } > // The index of EMPTY_RESULT could be incremented by multi threads and > throw this exception. 
> throw new NoSuchElementException("Cannot advance beyond the last cell"); > } > {code} > Result#current > {code:java} > public Cell current() { > if (cells == null > || cellScannerIndex == INITIAL_CELLSCANNER_INDEX > || cellScannerIndex >= cells.length) > return null; >// Although it is almost impossible, >// We can arrive here when the client threads share the common reused > EMPTY_RESULT. > return this.cells[cellScannerIndex]; > } > {code} > In this case, the client can easily got confusing exceptions even if they use > different connections, tables in different threads. > We should change the if condition cells == null to isEmpty() to avoid the > client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17483509#comment-17483509 ] Yutong Xiao commented on HBASE-26688: - For non-empty Result objects, the logic is the same after the change. This only changes the behaviour of a Result with an empty cell list, which used to have a thread-safety bug. A more common usage is to check the return value of Result#advance first: if it is false, the client should move to the next result. This usage is impacted by this bug. Even if the client wants to catch the exception, they may get it on the first call to advance() but not the second, which differs from the original logic. By the way, in the original code, if the cell list is null, the method also always returns false and never raises the exception. Result#isEmpty returns true for both a null cell list and an empty list, but in the original code these "empty" Results have different behaviours. This is actually a bug. If we want to keep raising NoSuchElementException for a Result with an empty list, my opinion is that we could make the Result class thread-safe (e.g. make Result#cellScannerIndex a ThreadLocal?) > Threads shared EMPTY_RESULT may lead to unexpected client job down. > --- > > Key: HBASE-26688 > URL: https://issues.apache.org/jira/browse/HBASE-26688 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.10 > > > Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce > the object creation. But these objects could be shared by multi client > threads. The Result#cellScannerIndex related methods could throw confusing > exception and make the client job down. Could refine the logic of these > methods. 
> The precreated objects in ProtoBufUtil.java: > {code:java} > private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; > private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); > final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); > final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); > private final static Result EMPTY_RESULT_STALE = > Result.create(EMPTY_CELL_ARRAY, null, true); > {code} > Result#advance > {code:java} > public boolean advance() { > if (cells == null) return false; > cellScannerIndex++; > if (cellScannerIndex < this.cells.length) { > return true; > } else if (cellScannerIndex == this.cells.length) { > return false; > } > // The index of EMPTY_RESULT could be incremented by multi threads and > throw this exception. > throw new NoSuchElementException("Cannot advance beyond the last cell"); > } > {code} > Result#current > {code:java} > public Cell current() { > if (cells == null > || cellScannerIndex == INITIAL_CELLSCANNER_INDEX > || cellScannerIndex >= cells.length) > return null; >// Although it is almost impossible, >// We can arrive here when the client threads share the common reused > EMPTY_RESULT. > return this.cells[cellScannerIndex]; > } > {code} > In this case, the client can easily got confusing exceptions even if they use > different connections, tables in different threads. > We should change the if condition cells == null to isEmpty() to avoid the > client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17483139#comment-17483139 ] Yutong Xiao commented on HBASE-26688: - Right. And we have marked in release note. > Threads shared EMPTY_RESULT may lead to unexpected client job down. > --- > > Key: HBASE-26688 > URL: https://issues.apache.org/jira/browse/HBASE-26688 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.10 > > > Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce > the object creation. But these objects could be shared by multi client > threads. The Result#cellScannerIndex related methods could throw confusing > exception and make the client job down. Could refine the logic of these > methods. > The precreated objects in ProtoBufUtil.java: > {code:java} > private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; > private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); > final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); > final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); > private final static Result EMPTY_RESULT_STALE = > Result.create(EMPTY_CELL_ARRAY, null, true); > {code} > Result#advance > {code:java} > public boolean advance() { > if (cells == null) return false; > cellScannerIndex++; > if (cellScannerIndex < this.cells.length) { > return true; > } else if (cellScannerIndex == this.cells.length) { > return false; > } > // The index of EMPTY_RESULT could be incremented by multi threads and > throw this exception. 
> throw new NoSuchElementException("Cannot advance beyond the last cell"); > } > {code} > Result#current > {code:java} > public Cell current() { > if (cells == null > || cellScannerIndex == INITIAL_CELLSCANNER_INDEX > || cellScannerIndex >= cells.length) > return null; >// Although it is almost impossible, >// We can arrive here when the client threads share the common reused > EMPTY_RESULT. > return this.cells[cellScannerIndex]; > } > {code} > In this case, the client can easily got confusing exceptions even if they use > different connections, tables in different threads. > We should change the if condition cells == null to isEmpty() to avoid the > client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17483044#comment-17483044 ] Yutong Xiao commented on HBASE-26688: - [~zhangduo] I just wanted to rebase on master and revise the commit message, but I mixed up the commits, so I opened another PR and closed the old one. Sorry for the inconvenience. > Threads shared EMPTY_RESULT may lead to unexpected client job down. > --- > > Key: HBASE-26688 > URL: https://issues.apache.org/jira/browse/HBASE-26688 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.10 > > > Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce > the object creation. But these objects could be shared by multi client > threads. The Result#cellScannerIndex related methods could throw confusing > exception and make the client job down. Could refine the logic of these > methods. > The precreated objects in ProtoBufUtil.java: > {code:java} > private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; > private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); > final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); > final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); > private final static Result EMPTY_RESULT_STALE = > Result.create(EMPTY_CELL_ARRAY, null, true); > {code} > Result#advance > {code:java} > public boolean advance() { > if (cells == null) return false; > cellScannerIndex++; > if (cellScannerIndex < this.cells.length) { > return true; > } else if (cellScannerIndex == this.cells.length) { > return false; > } > // The index of EMPTY_RESULT could be incremented by multi threads and > throw this exception. 
> throw new NoSuchElementException("Cannot advance beyond the last cell"); > } > {code} > Result#current > {code:java} > public Cell current() { > if (cells == null > || cellScannerIndex == INITIAL_CELLSCANNER_INDEX > || cellScannerIndex >= cells.length) > return null; >// Although it is almost impossible, >// We can arrive here when the client threads share the common reused > EMPTY_RESULT. > return this.cells[cellScannerIndex]; > } > {code} > In this case, the client can easily got confusing exceptions even if they use > different connections, tables in different threads. > We should change the if condition cells == null to isEmpty() to avoid the > client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Release Note: Result#advance with empty cell list will always return false but not raise NoSuchElementException when called multiple times. (was: Result#advance with empty cell list will always return false but not raise NoSuchElementException when called twice.) > Threads shared EMPTY_RESULT may lead to unexpected client job down. > --- > > Key: HBASE-26688 > URL: https://issues.apache.org/jira/browse/HBASE-26688 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.10 > > > Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce > the object creation. But these objects could be shared by multi client > threads. The Result#cellScannerIndex related methods could throw confusing > exception and make the client job down. Could refine the logic of these > methods. > The precreated objects in ProtoBufUtil.java: > {code:java} > private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; > private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); > final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); > final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); > private final static Result EMPTY_RESULT_STALE = > Result.create(EMPTY_CELL_ARRAY, null, true); > {code} > Result#advance > {code:java} > public boolean advance() { > if (cells == null) return false; > cellScannerIndex++; > if (cellScannerIndex < this.cells.length) { > return true; > } else if (cellScannerIndex == this.cells.length) { > return false; > } > // The index of EMPTY_RESULT could be incremented by multi threads and > throw this exception. 
> throw new NoSuchElementException("Cannot advance beyond the last cell"); > } > {code} > Result#current > {code:java} > public Cell current() { > if (cells == null > || cellScannerIndex == INITIAL_CELLSCANNER_INDEX > || cellScannerIndex >= cells.length) > return null; >// Although it is almost impossible, >// We can arrive here when the client threads share the common reused > EMPTY_RESULT. > return this.cells[cellScannerIndex]; > } > {code} > In this case, the client can easily got confusing exceptions even if they use > different connections, tables in different threads. > We should change the if condition cells == null to isEmpty() to avoid the > client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Release Note: Result#advance with empty cell list will always return false but not raise NoSuchElementException when called twice. (was: Result#advance with empty cell list will always return false but not raise NoSuchElementException.) > Threads shared EMPTY_RESULT may lead to unexpected client job down. > --- > > Key: HBASE-26688 > URL: https://issues.apache.org/jira/browse/HBASE-26688 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.10 > > > Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce > the object creation. But these objects could be shared by multi client > threads. The Result#cellScannerIndex related methods could throw confusing > exception and make the client job down. Could refine the logic of these > methods. > The precreated objects in ProtoBufUtil.java: > {code:java} > private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; > private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); > final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); > final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); > private final static Result EMPTY_RESULT_STALE = > Result.create(EMPTY_CELL_ARRAY, null, true); > {code} > Result#advance > {code:java} > public boolean advance() { > if (cells == null) return false; > cellScannerIndex++; > if (cellScannerIndex < this.cells.length) { > return true; > } else if (cellScannerIndex == this.cells.length) { > return false; > } > // The index of EMPTY_RESULT could be incremented by multi threads and > throw this exception. 
> throw new NoSuchElementException("Cannot advance beyond the last cell"); > } > {code} > Result#current > {code:java} > public Cell current() { > if (cells == null > || cellScannerIndex == INITIAL_CELLSCANNER_INDEX > || cellScannerIndex >= cells.length) > return null; >// Although it is almost impossible, >// We can arrive here when the client threads share the common reused > EMPTY_RESULT. > return this.cells[cellScannerIndex]; > } > {code} > In this case, the client can easily got confusing exceptions even if they use > different connections, tables in different threads. > We should change the if condition cells == null to isEmpty() to avoid the > client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Release Note: Result#advance with empty cell list will always return false but not raise NoSuchElementException. (was: Removed unit test case TestResult#testAdvanceTwiceOnEmptyCell.) > Threads shared EMPTY_RESULT may lead to unexpected client job down. > --- > > Key: HBASE-26688 > URL: https://issues.apache.org/jira/browse/HBASE-26688 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.10 > > > Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce > the object creation. But these objects could be shared by multi client > threads. The Result#cellScannerIndex related methods could throw confusing > exception and make the client job down. Could refine the logic of these > methods. > The precreated objects in ProtoBufUtil.java: > {code:java} > private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; > private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); > final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); > final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); > private final static Result EMPTY_RESULT_STALE = > Result.create(EMPTY_CELL_ARRAY, null, true); > {code} > Result#advance > {code:java} > public boolean advance() { > if (cells == null) return false; > cellScannerIndex++; > if (cellScannerIndex < this.cells.length) { > return true; > } else if (cellScannerIndex == this.cells.length) { > return false; > } > // The index of EMPTY_RESULT could be incremented by multi threads and > throw this exception. 
> throw new NoSuchElementException("Cannot advance beyond the last cell"); > } > {code} > Result#current > {code:java} > public Cell current() { > if (cells == null > || cellScannerIndex == INITIAL_CELLSCANNER_INDEX > || cellScannerIndex >= cells.length) > return null; >// Although it is almost impossible, >// We can arrive here when the client threads share the common reused > EMPTY_RESULT. > return this.cells[cellScannerIndex]; > } > {code} > In this case, the client can easily got confusing exceptions even if they use > different connections, tables in different threads. > We should change the if condition cells == null to isEmpty() to avoid the > client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Release Note: Removed unit test case TestResult#testAdvanceTwiceOnEmptyCell. > Threads shared EMPTY_RESULT may lead to unexpected client job down. > --- > > Key: HBASE-26688 > URL: https://issues.apache.org/jira/browse/HBASE-26688 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.10 > > > Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce > the object creation. But these objects could be shared by multi client > threads. The Result#cellScannerIndex related methods could throw confusing > exception and make the client job down. Could refine the logic of these > methods. > The precreated objects in ProtoBufUtil.java: > {code:java} > private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; > private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); > final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); > final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); > private final static Result EMPTY_RESULT_STALE = > Result.create(EMPTY_CELL_ARRAY, null, true); > {code} > Result#advance > {code:java} > public boolean advance() { > if (cells == null) return false; > cellScannerIndex++; > if (cellScannerIndex < this.cells.length) { > return true; > } else if (cellScannerIndex == this.cells.length) { > return false; > } > // The index of EMPTY_RESULT could be incremented by multi threads and > throw this exception. 
> throw new NoSuchElementException("Cannot advance beyond the last cell"); > } > {code} > Result#current > {code:java} > public Cell current() { > if (cells == null > || cellScannerIndex == INITIAL_CELLSCANNER_INDEX > || cellScannerIndex >= cells.length) > return null; >// Although it is almost impossible, >// We can arrive here when the client threads share the common reused > EMPTY_RESULT. > return this.cells[cellScannerIndex]; > } > {code} > In this case, the client can easily got confusing exceptions even if they use > different connections, tables in different threads. > We should change the if condition cells == null to isEmpty() to avoid the > client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17482484#comment-17482484 ] Yutong Xiao commented on HBASE-26688: - [~zhangduo] OK, will push another PR then.
> Threads shared EMPTY_RESULT may lead to unexpected client job down.
> ---
>
> Key: HBASE-26688
> URL: https://issues.apache.org/jira/browse/HBASE-26688
> Project: HBase
> Issue Type: Bug
> Components: Client
> Reporter: Yutong Xiao
> Assignee: Yutong Xiao
> Priority: Major
> Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.10
>
> Currently, we use a pre-created EMPTY_RESULT in ProtoBufUtil to reduce object creation. But these objects could be shared by multiple client threads. The Result#cellScannerIndex related methods could throw a confusing exception and bring the client job down. We could refine the logic of these methods.
> The pre-created objects in ProtoBufUtil.java:
> {code:java}
> private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{};
> private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY);
> final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true);
> final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false);
> private final static Result EMPTY_RESULT_STALE = Result.create(EMPTY_CELL_ARRAY, null, true);
> {code}
> Result#advance
> {code:java}
> public boolean advance() {
>   if (cells == null) return false;
>   cellScannerIndex++;
>   if (cellScannerIndex < this.cells.length) {
>     return true;
>   } else if (cellScannerIndex == this.cells.length) {
>     return false;
>   }
>   // The index of EMPTY_RESULT could be incremented by multiple threads, which triggers this exception.
>   throw new NoSuchElementException("Cannot advance beyond the last cell");
> }
> {code}
> Result#current
> {code:java}
> public Cell current() {
>   if (cells == null
>       || cellScannerIndex == INITIAL_CELLSCANNER_INDEX
>       || cellScannerIndex >= cells.length)
>     return null;
>   // Although it is almost impossible, we can arrive here when client threads share the common reused EMPTY_RESULT.
>   return this.cells[cellScannerIndex];
> }
> {code}
> In this case, the client can easily get confusing exceptions even when using different connections and tables in different threads.
> We should change the if condition from cells == null to isEmpty() to prevent the client from crashing on these exceptions. -- This message was sent by Atlassian Jira (v8.20.1#820001)
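The failure quoted above is deterministic once two callers share one empty-array Result: the first advance() returns false, the second throws. A minimal self-contained sketch of that cursor logic (class and field names are stand-ins for illustration, not HBase code):

```java
import java.util.NoSuchElementException;

// Hypothetical stand-in mirroring the Result cursor logic quoted above.
class SharedEmptyResult {
    private static final Object[] EMPTY_CELL_ARRAY = new Object[]{};
    private final Object[] cells = EMPTY_CELL_ARRAY;
    private int cellScannerIndex = -1; // INITIAL_CELLSCANNER_INDEX

    boolean advance() {
        if (cells == null) return false;
        cellScannerIndex++;
        if (cellScannerIndex < cells.length) {
            return true;
        } else if (cellScannerIndex == cells.length) {
            return false;
        }
        // A second caller (e.g. another thread sharing EMPTY_RESULT) lands here.
        throw new NoSuchElementException("Cannot advance beyond the last cell");
    }
}

public class EmptyResultRaceDemo {
    public static void main(String[] args) {
        SharedEmptyResult shared = new SharedEmptyResult();
        System.out.println(shared.advance()); // first caller: false, as expected
        try {
            shared.advance(); // second caller on the SAME instance
            System.out.println("no exception");
        } catch (NoSuchElementException e) {
            System.out.println("second advance threw: " + e.getMessage());
        }
    }
}
```

Two threads each calling advance() once on the shared instance behave like the two sequential calls above, which is why the exception surfaces even in clients using separate connections and tables.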
[jira] [Commented] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17482195#comment-17482195 ] Yutong Xiao commented on HBASE-26688: - As a thread-unsafe object shared among many client threads, it is hard to make EMPTY_RESULT follow single-threaded logic without synchronisation. In my opinion, we could just remove TestResult#testAdvanceTwiceOnEmptyCell and add some comments noting that the empty result behaves differently. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HBASE-26705) Backport HBASE-26688 to branch-1
Yutong Xiao created HBASE-26705: --- Summary: Backport HBASE-26688 to branch-1 Key: HBASE-26705 URL: https://issues.apache.org/jira/browse/HBASE-26705 Project: HBase Issue Type: Bug Components: Client Affects Versions: 1.7.1 Reporter: Yutong Xiao Assignee: Yutong Xiao Fix For: 1.7.2 Backport issue [HBASE-26688|https://issues.apache.org/jira/browse/HBASE-26688] to branch-1 to eliminate the client crash caused by the thread-shared EMPTY_RESULT. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Summary: Threads shared EMPTY_RESULT may lead to unexpected client job down. (was: Threads shared EMPTY_RESULT may lead to ) -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Threads shared EMPTY_RESULT may lead to
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Summary: Threads shared EMPTY_RESULT may lead to (was: Result#advance, current are not thread safe. But the pre-created EMPTY_RESULT like objects can be shared by multi client threads.) -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Result#advance, current are not thread safe. But the pre-created EMPTY_RESULT like objects can be shared by multi client threads.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Description: Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce the object creation. But these objects could be shared by multi client threads. The Result#cellScannerIndex related methods could throw confusing exception and make the client job down. Could refine the logic of these methods. The precreated objects in ProtoBufUtil.java: {code:java} private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); private final static Result EMPTY_RESULT_STALE = Result.create(EMPTY_CELL_ARRAY, null, true); {code} Result#advance {code:java} public boolean advance() { if (cells == null) return false; cellScannerIndex++; if (cellScannerIndex < this.cells.length) { return true; } else if (cellScannerIndex == this.cells.length) { return false; } // The index of EMPTY_RESULT could be incremented by multi threads and throw this exception. throw new NoSuchElementException("Cannot advance beyond the last cell"); } {code} Result#current {code:java} public Cell current() { if (cells == null || cellScannerIndex == INITIAL_CELLSCANNER_INDEX || cellScannerIndex >= cells.length) return null; // Although it is almost impossible, // We can arrive here when the client threads share the common reused EMPTY_RESULT. return this.cells[cellScannerIndex]; } {code} In this case, the client can easily got confusing exceptions even if they use different connections, tables in different threads. We should change the if condition cells == null to isEmpty() to avoid the client crashed from these exception. was: Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce the object creation. 
But these objects could be shared by multi client threads. The Result#cellScannerIndex related methods could throw confusing exception and make the client job down. Could refine the logic of these methods. The precreated objects in ProtoBufUtil.java: {code:java} private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); private final static Result EMPTY_RESULT_STALE = Result.create(EMPTY_CELL_ARRAY, null, true); {code} Result#advance {code:java} public boolean advance() { if (cells == null) return false; cellScannerIndex++; if (cellScannerIndex < this.cells.length) { return true; } else if (cellScannerIndex == this.cells.length) { return false; } // The index of EMPTY_RESULT could be incremented and throw this exception. throw new NoSuchElementException("Cannot advance beyond the last cell"); } {code} Result#current {code:java} public Cell current() { if (cells == null || cellScannerIndex == INITIAL_CELLSCANNER_INDEX || cellScannerIndex >= cells.length) return null; // Although it is almost impossible, // We can arrive here when the client threads share the common reused EMPTY_RESULT. return this.cells[cellScannerIndex]; } {code} In this case, the client can easily got confusing exceptions even if they use different connections, tables in different threads. We should change the if condition cells == null to isEmpty() to avoid the client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Result#advance, current are not thread safe. But the pre-created EMPTY_RESULT like objects can be shared by multi client threads.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Description: Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce the object creation. But these objects could be shared by multi client threads. The Result#cellScannerIndex related methods could throw confusing exception and make the client job down. Could refine the logic of these methods. The precreated objects in ProtoBufUtil.java: {code:java} private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); private final static Result EMPTY_RESULT_STALE = Result.create(EMPTY_CELL_ARRAY, null, true); {code} Result#advance {code:java} public boolean advance() { if (cells == null) return false; cellScannerIndex++; if (cellScannerIndex < this.cells.length) { return true; } else if (cellScannerIndex == this.cells.length) { return false; } // The index of EMPTY_RESULT could be incremented and throw this exception. throw new NoSuchElementException("Cannot advance beyond the last cell"); } {code} Result#current {code:java} public Cell current() { if (cells == null || cellScannerIndex == INITIAL_CELLSCANNER_INDEX || cellScannerIndex >= cells.length) return null; // Although it is almost impossible, // We can arrive here when the client threads share the common reused EMPTY_RESULT. return this.cells[cellScannerIndex]; } {code} In this case, the client can easily got confusing exceptions even if they use different connections, tables in different threads. We should change the if condition cells == null to isEmpty() to avoid the client crashed from these exception. was: Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce the object creation. 
But these objects could be shared by multi client threads. The Result#cellScannerIndex related methods could throw confusing exception and make the client job down. Could refine the logic of these methods. The precreated objects in ProtoBufUtil.java: {code:java} private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); private final static Result EMPTY_RESULT_STALE = Result.create(EMPTY_CELL_ARRAY, null, true); {code} Result#advance {code:java} public boolean advance() { if (cells == null) return false; cellScannerIndex++; if (cellScannerIndex < this.cells.length) { return true; } else if (cellScannerIndex == this.cells.length) { return false; } // The index of EMPTY_RESULT could be incremented and throw this exception. throw new NoSuchElementException("Cannot advance beyond the last cell"); } {code} Result#current {code:java} public Cell current() { if (cells == null || cellScannerIndex == INITIAL_CELLSCANNER_INDEX || cellScannerIndex >= cells.length) return null; // Although it is almost impossible, // We can arrive here when the client threads share the common reused EMPTY_RESULT. return this.cells[cellScannerIndex]; } {code} We should change the if condition cells == null to isEmpty() to avoid the client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Result#advance, current are not thread safe. But the pre-created EMPTY_RESULT like objects can be shared by multi client threads.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Description: Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce the object creation. But these objects could be shared by multi client threads. The Result#cellScannerIndex related methods could throw confusing exception and make the client job down. Could refine the logic of these methods. The precreated objects in ProtoBufUtil.java: {code:java} private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); private final static Result EMPTY_RESULT_STALE = Result.create(EMPTY_CELL_ARRAY, null, true); {code} Result#advance {code:java} public boolean advance() { if (cells == null) return false; cellScannerIndex++; if (cellScannerIndex < this.cells.length) { return true; } else if (cellScannerIndex == this.cells.length) { return false; } // The index of EMPTY_RESULT could be incremented and throw this exception. throw new NoSuchElementException("Cannot advance beyond the last cell"); } {code} Result#current {code:java} public Cell current() { if (cells == null || cellScannerIndex == INITIAL_CELLSCANNER_INDEX || cellScannerIndex >= cells.length) return null; // Although it is almost impossible, // We can arrive here when the client threads share the common reused EMPTY_RESULT. return this.cells[cellScannerIndex]; } {code} We should change the if condition cells == null to isEmpty() to avoid the client crashed from these exception. was: Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce the object creation. But these objects could be shared by multi client threads. 
The Result#cellScannerIndex related methods could throw confusing exception and make the client job down. Could refine the logic of these methods. The precreated objects in ProtoBufUtil.java: {code:java} private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); private final static Result EMPTY_RESULT_STALE = Result.create(EMPTY_CELL_ARRAY, null, true); {code} Result#advance {code:java} public boolean advance() { if (cells == null) return false; cellScannerIndex++; if (cellScannerIndex < this.cells.length) { return true; } else if (cellScannerIndex == this.cells.length) { return false; } // The index of EMPTY_RESULT could be incremented and throw this exception. throw new NoSuchElementException("Cannot advance beyond the last cell"); } {code} Result#current {code:java} public Cell current() { if (cells == null || cellScannerIndex == INITIAL_CELLSCANNER_INDEX || cellScannerIndex >= cells.length) return null; // When at the same time another thread invoke cellScanner to reset the index to -1, we will get problem. // although the possibility is small, but it can happen. return this.cells[cellScannerIndex]; } {code} We should change the if condition cells == null to isEmpty() to avoid the client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Result#advance, current are not thread safe. But the pre-created EMPTY_RESULT like objects can be shared by multi client threads.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Description: Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce the object creation. But these objects could be shared by multi client threads. The Result#cellScannerIndex related methods could throw confusing exception and make the client job down. Could refine the logic of these methods. The precreated objects in ProtoBufUtil.java: {code:java} private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); private final static Result EMPTY_RESULT_STALE = Result.create(EMPTY_CELL_ARRAY, null, true); {code} Result#advance {code:java} public boolean advance() { if (cells == null) return false; cellScannerIndex++; if (cellScannerIndex < this.cells.length) { return true; } else if (cellScannerIndex == this.cells.length) { return false; } // The index of EMPTY_RESULT could be incremented and throw this exception. throw new NoSuchElementException("Cannot advance beyond the last cell"); } {code} Result#current {code:java} public Cell current() { if (cells == null || cellScannerIndex == INITIAL_CELLSCANNER_INDEX || cellScannerIndex >= cells.length) return null; // When at the same time another thread invoke cellScanner to reset the index to -1, we will get problem. // although the possibility is small, but it can happen. return this.cells[cellScannerIndex]; } {code} We should change the if condition cells == null to isEmpty() to avoid the client crashed from these exception. was: Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce the object creation. But these objects could be shared by multi client threads. 
The Result#cellScannerIndex related methods could throw confusing exception and make the client job down. Could refine the logic of these methods. The precreated objects in ProtoBufUtil.java: {code:java} private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); private final static Result EMPTY_RESULT_STALE = Result.create(EMPTY_CELL_ARRAY, null, true); {code} Result#advance {code:java} public boolean advance() { if (cells == null) return false; cellScannerIndex++; if (cellScannerIndex < this.cells.length) { return true; } else if (cellScannerIndex == this.cells.length) { return false; } // The index of EMPTY_RESULT could be incremented and throw this exception. throw new NoSuchElementException("Cannot advance beyond the last cell"); } {code} Result#current {code:java} public Cell current() { if (cells == null || cellScannerIndex == INITIAL_CELLSCANNER_INDEX || cellScannerIndex >= cells.length) return null; // When at the same time another thread invoke cellScanner to reset the index to -1, we will get problem. // although the possibility is small, but it can happen. return this.cells[cellScannerIndex]; } {code} We can change the if condition cells == null to isEmpty() -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Result#advance, current are not thread safe. But the pre-created EMPTY_RESULT like objects can be shared by multi client threads.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Description: Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce the object creation. But these objects could be shared by multi client threads. The Result#cellScannerIndex related methods could throw confusing exception and make the client job down. Could refine the logic of these methods. The precreated objects in ProtoBufUtil.java: {code:java} private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); private final static Result EMPTY_RESULT_STALE = Result.create(EMPTY_CELL_ARRAY, null, true); {code} Result#advance {code:java} public boolean advance() { if (cells == null) return false; cellScannerIndex++; if (cellScannerIndex < this.cells.length) { return true; } else if (cellScannerIndex == this.cells.length) { return false; } // The index of EMPTY_RESULT could be incremented and throw this exception. throw new NoSuchElementException("Cannot advance beyond the last cell"); } {code} Result#current {code:java} public Cell current() { if (cells == null || cellScannerIndex == INITIAL_CELLSCANNER_INDEX || cellScannerIndex >= cells.length) return null; // When at the same time another thread invoke cellScanner to reset the index to -1, we will get problem. // although the possibility is small, but it can happen. return this.cells[cellScannerIndex]; } {code} We can change the if condition cells == null to isEmpty() was:Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce the object creation. But these objects could be shared by multi client threads. 
The Result#cellScannerIndex related methods could throw confusing exception and make the client job down. Could refine the logic of these methods. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Result#advance, current are not thread safe. But the pre-created EMPTY_RESULT like objects can be shared by multi client threads.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Description: Currently, we use a pre-created EMPTY_RESULT in ProtobufUtil to reduce object creation. But these objects can be shared by multiple client threads, so the Result#cellScannerIndex related methods could throw a confusing exception and crash the client job. We could refine the logic of these methods. > Result#advance, current are not thread safe. But the pre-created EMPTY_RESULT > like objects can be shared by multi client threads. > - > > Key: HBASE-26688 > URL: https://issues.apache.org/jira/browse/HBASE-26688 > Project: HBase > Issue Type: Bug > Reporter: Yutong Xiao > Assignee: Yutong Xiao > Priority: Major > > Currently, we use a pre-created EMPTY_RESULT in ProtobufUtil to reduce object creation. But these objects can be shared by multiple client threads, so the Result#cellScannerIndex related methods could throw a confusing exception and crash the client job. We could refine the logic of these methods. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HBASE-26688) Result#advance, current are not thread safe. But the pre-created EMPTY_RESULT like objects can be shared by multi client threads.
Yutong Xiao created HBASE-26688: --- Summary: Result#advance, current are not thread safe. But the pre-created EMPTY_RESULT like objects can be shared by multi client threads. Key: HBASE-26688 URL: https://issues.apache.org/jira/browse/HBASE-26688 Project: HBase Issue Type: Bug Reporter: Yutong Xiao Assignee: Yutong Xiao -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocates a new on-heap ByteBuffer to construct a new HFileBlock when getting cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here we introduce a RAMBuffer for those "hot" blocks in BucketCache. The idea is simple: the RAMBuffer is a timeout-expiring cache. When a multi-level block is read twice, we cache it in the RAMBuffer. When a block times out in the cache (e.g. 60s), meaning it has not been accessed for 60s, we evict it. Unlike LRU, we stop admitting new blocks once the whole RAMBuffer size reaches a threshold (the threshold is dynamic to fit different workloads). This prevents the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer when its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also ran a YCSB performance test. The setup: Size of BucketCache: 40 GB. Target table size: 112 GB. Properties: !Properties.png|height=250|width=250! The operation distribution of the YCSB workload is latest. Client Side Metrics: see the attachment ClientSideMetrics.png. Server Side GC: the current bucket cache triggered 217 GCs, costing 2.74 minutes in total; with RAMBuffer, the server side triggered 210 GCs, costing 2.56 minutes in total. As master and branch-2 use ByteBufferAllocator to manage the BucketCache memory allocation, the RAMBuffer may not bring as much GC improvement as on branch-1. > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project: HBase > Issue Type: Improvement > Components: BucketCache, Performance > Reporter: Yutong Xiao > Assignee: Yutong Xiao > Priority: Major > Fix For: 1.7.2 > > Attachments: ClientSideMetrics.png, Hit 100%.png, Properties.png > > > In branch-1, BucketCache just allocates a new on-heap ByteBuffer to construct a new HFileBlock when getting cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. > Here we introduce a RAMBuffer for those "hot" blocks in BucketCache. The idea is simple: the RAMBuffer is a timeout-expiring cache. When a multi-level block is read twice, we cache it in the RAMBuffer. When a block times out in the cache (e.g. 60s), meaning it has not been accessed for 60s, we evict it. > Unlike LRU, we stop admitting new blocks once the whole RAMBuffer size reaches a threshold (the threshold is dynamic to fit different workloads). This prevents the RAMBuffer from being churned. > {panel:title=The performance of RAMBuffer when its hit ratio is 100%} > !Hit 100%.png|height=250|width=250! > {panel} > I also ran a YCSB performance test. > The setup: > Size of BucketCache: 40 GB > Target table size: 112 GB > Properties: > !Properties.png|height=250|width=250! > The operation distribution of the YCSB workload is latest. > Client Side Metrics: see the attachment ClientSideMetrics.png. > Server Side GC: > The current bucket cache triggered 217 GCs, costing 2.74 minutes in total. > With RAMBuffer, the server side triggered 210 GCs, costing 2.56 minutes in total. -- This message was sent by Atlassian Jira (v8.20.1#820001)
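The admit-or-refuse policy described above (expire idle entries by timeout; refuse new entries when full, rather than evicting LRU-style) can be sketched as follows. This is a hypothetical illustration, not the actual patch: the names RamBuffer, maxBytes, and expireMillis are invented, the real threshold is dynamic, and expiry would run as a periodic background chore rather than an explicit call.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a small expire-after-access buffer that stops
// admitting new blocks when full instead of evicting, so hot entries
// are never churned out by a burst of cold reads.
class RamBuffer<K> {
    private static final class Entry {
        final byte[] block;
        volatile long lastAccess;
        Entry(byte[] block, long now) { this.block = block; this.lastAccess = now; }
    }

    private final Map<K, Entry> map = new ConcurrentHashMap<>();
    private final AtomicLong totalBytes = new AtomicLong();
    private final long maxBytes;      // dynamic in the real proposal; fixed here
    private final long expireMillis;  // e.g. 60_000 for the 60s example

    RamBuffer(long maxBytes, long expireMillis) {
        this.maxBytes = maxBytes;
        this.expireMillis = expireMillis;
    }

    // Admission control: once full, refuse new blocks rather than evict.
    boolean offer(K key, byte[] block) {
        if (totalBytes.get() + block.length > maxBytes) return false;
        if (map.putIfAbsent(key, new Entry(block, System.currentTimeMillis())) == null) {
            totalBytes.addAndGet(block.length);
            return true;
        }
        return false;
    }

    byte[] get(K key) {
        Entry e = map.get(key);
        if (e == null) return null;
        e.lastAccess = System.currentTimeMillis(); // touch: resets the expiry clock
        return e.block;
    }

    // Evict entries idle longer than expireMillis.
    void expire() {
        long now = System.currentTimeMillis();
        for (Iterator<Map.Entry<K, Entry>> it = map.entrySet().iterator(); it.hasNext();) {
            Map.Entry<K, Entry> me = it.next();
            if (now - me.getValue().lastAccess > expireMillis) {
                totalBytes.addAndGet(-me.getValue().block.length);
                it.remove();
            }
        }
    }
}
```

The design choice this illustrates: under an LRU policy a scan of cold blocks evicts the hot working set; here a cold block simply fails admission, and hot blocks only leave when they genuinely go idle past the timeout.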
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! The operation distribution of YCSB workload is latest. Client Side Metrics See the attachment ClientSideMetrics.png Server Side GC: The current bucket cache triggered 217 GCs, which costs 2.74 minutes in total. With RAMBuffer, the server side had 210 times GC and 2.56 minutes in total. As the master & branch-2 using ByteBufferAllocator to manage the bucketcache memory allocation, the RAMBuffer may not have GC improvement as much as branch-1 was: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. 
When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! Client Side Metrics See the attachment ClientSideMetrics.png Server Side GC: The current bucket cache triggered 217 GCs, which costs 2.74 minutes in total. With RAMBuffer, the server side had 210 times GC and 2.56 minutes in total. As the master & branch-2 using ByteBufferAllocator to manage the bucketcache memory allocation, the RAMBuffer may not have GC improvement as much as branch-1 > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project: HBase > Issue Type: Improvement > Components: BucketCache, Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 1.7.2 > > Attachments: ClientSideMetrics.png, Hit 100%.png, Properties.png > > > In branch-1, BucketCache just allocate new onheap bytebuffer to construct new > HFileBlock when get cached blocks. This rough allocation increases the GC > pressure for those "hot" blocks. > Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought > is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level > block is read twice, we cache it in the RAMBuffer. When the block timeout in > the cache (e.g. 60s), that means the block is not being accessed in 60s, we > evict it. 
Not like LRU, we do not cache block when the whole RAMBuffer size > reaches to a threshold (to fit different workload, the threshold is dynamic). > This will prevent the RAMBuffer from being churned. > {panel:title=The performance of RAMBuffer with its hit ratio is 100%} > !Hit 100%.png|height=250|width=250! > {panel} > I also did a YCSB performance test. > The circumstance is: > Size of BucketCache: 40 GB > Target table size: 112 GB > Properties: > !Properties.png|height=250|width=250! > The operation distribution of YCSB workload is latest. > Client Side Metrics > See the attachment ClientSideMetrics.png > Server Side GC: > The current bucket cache triggered 217 GCs, which costs 2.74 minutes in total. > With RAMBuffer, the server side had 210 times GC and 2.56 minutes in total. > As the master & branch-2 using
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! Client Side Metrics See the attachment ClientSideMetrics.png Server Side GC: The current bucket cache triggered 217 GCs, which costs 2.74 minutes in total. With RAMBuffer, the server side had 210 times GC and 2.56 minutes in total. As the master & branch-2 using ByteBufferAllocator to manage the bucketcache memory allocation, the RAMBuffer may not have GC improvement as much as branch-1 was: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 
60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! Client Side Metrics See the attachment ClientSideMetrics.png Server Side GC: The current bucket cache triggered 217 GCs, which costs 2.74 minutes in total. With RAMBuffer, the server side had 210 times GC and 2.56 minutes in total. > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project: HBase > Issue Type: Improvement > Components: BucketCache, Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 1.7.2 > > Attachments: ClientSideMetrics.png, Hit 100%.png, Properties.png > > > In branch-1, BucketCache just allocate new onheap bytebuffer to construct new > HFileBlock when get cached blocks. This rough allocation increases the GC > pressure for those "hot" blocks. > Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought > is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level > block is read twice, we cache it in the RAMBuffer. When the block timeout in > the cache (e.g. 60s), that means the block is not being accessed in 60s, we > evict it. Not like LRU, we do not cache block when the whole RAMBuffer size > reaches to a threshold (to fit different workload, the threshold is dynamic). > This will prevent the RAMBuffer from being churned. > {panel:title=The performance of RAMBuffer with its hit ratio is 100%} > !Hit 100%.png|height=250|width=250! 
> {panel} > I also did a YCSB performance test. > The circumstance is: > Size of BucketCache: 40 GB > Target table size: 112 GB > Properties: > !Properties.png|height=250|width=250! > Client Side Metrics > See the attachment ClientSideMetrics.png > Server Side GC: > The current bucket cache triggered 217 GCs, which costs 2.74 minutes in total. > With RAMBuffer, the server side had 210 times GC and 2.56 minutes in total. > As the master & branch-2 using ByteBufferAllocator to manage the bucketcache > memory allocation, the RAMBuffer may not have GC improvement as much as > branch-1 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! Client Side Metrics See the attachment ClientSideMetrics.png Server Side GC: The current bucket cache triggered 217 GCs, which costs 2.74 minutes in total. With RAMBuffer, the server side had 210 times GC and 2.56 minutes in total. was: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. 
Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! Client Side Metrics See the attachment ClientSideMetrics.png > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project: HBase > Issue Type: Improvement > Components: BucketCache, Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 1.7.2 > > Attachments: ClientSideMetrics.png, Hit 100%.png, Properties.png > > > In branch-1, BucketCache just allocate new onheap bytebuffer to construct new > HFileBlock when get cached blocks. This rough allocation increases the GC > pressure for those "hot" blocks. > Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought > is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level > block is read twice, we cache it in the RAMBuffer. When the block timeout in > the cache (e.g. 60s), that means the block is not being accessed in 60s, we > evict it. Not like LRU, we do not cache block when the whole RAMBuffer size > reaches to a threshold (to fit different workload, the threshold is dynamic). > This will prevent the RAMBuffer from being churned. > {panel:title=The performance of RAMBuffer with its hit ratio is 100%} > !Hit 100%.png|height=250|width=250! > {panel} > I also did a YCSB performance test. > The circumstance is: > Size of BucketCache: 40 GB > Target table size: 112 GB > Properties: > !Properties.png|height=250|width=250! 
> Client Side Metrics > See the attachment ClientSideMetrics.png > Server Side GC: > The current bucket cache triggered 217 GCs, which costs 2.74 minutes in total. > With RAMBuffer, the server side had 210 times GC and 2.56 minutes in total. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! Client Side Metrics See the attachment ClientSideMetrics.png was: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. 
{panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! Client Side Metrics !ClientSideMetrics.png|height=300|width=300! > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project: HBase > Issue Type: Improvement > Components: BucketCache, Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 1.7.2 > > Attachments: ClientSideMetrics.png, Hit 100%.png, Properties.png > > > In branch-1, BucketCache just allocate new onheap bytebuffer to construct new > HFileBlock when get cached blocks. This rough allocation increases the GC > pressure for those "hot" blocks. > Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought > is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level > block is read twice, we cache it in the RAMBuffer. When the block timeout in > the cache (e.g. 60s), that means the block is not being accessed in 60s, we > evict it. Not like LRU, we do not cache block when the whole RAMBuffer size > reaches to a threshold (to fit different workload, the threshold is dynamic). > This will prevent the RAMBuffer from being churned. > {panel:title=The performance of RAMBuffer with its hit ratio is 100%} > !Hit 100%.png|height=250|width=250! > {panel} > I also did a YCSB performance test. > The circumstance is: > Size of BucketCache: 40 GB > Target table size: 112 GB > Properties: > !Properties.png|height=250|width=250! > Client Side Metrics > See the attachment ClientSideMetrics.png -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! Client Side Metrics !ClientSideMetrics.png|height=250|width=250! was: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. 
{panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! {panel:title=YCSB Test} Client Side Metrics !ClientSideMetrics.png|height=250|width=250! {panel} > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project: HBase > Issue Type: Improvement > Components: BucketCache, Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 1.7.2 > > Attachments: ClientSideMetrics.png, Hit 100%.png, Properties.png > > > In branch-1, BucketCache just allocate new onheap bytebuffer to construct new > HFileBlock when get cached blocks. This rough allocation increases the GC > pressure for those "hot" blocks. > Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought > is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level > block is read twice, we cache it in the RAMBuffer. When the block timeout in > the cache (e.g. 60s), that means the block is not being accessed in 60s, we > evict it. Not like LRU, we do not cache block when the whole RAMBuffer size > reaches to a threshold (to fit different workload, the threshold is dynamic). > This will prevent the RAMBuffer from being churned. > {panel:title=The performance of RAMBuffer with its hit ratio is 100%} > !Hit 100%.png|height=250|width=250! > {panel} > I also did a YCSB performance test. > The circumstance is: > Size of BucketCache: 40 GB > Target table size: 112 GB > Properties: > !Properties.png|height=250|width=250! > Client Side Metrics > !ClientSideMetrics.png|height=250|width=250! -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! Client Side Metrics !ClientSideMetrics.png|height=300|width=300! was: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. 
{panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! Client Side Metrics !ClientSideMetrics.png|height=250|width=250! > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project: HBase > Issue Type: Improvement > Components: BucketCache, Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 1.7.2 > > Attachments: ClientSideMetrics.png, Hit 100%.png, Properties.png > > > In branch-1, BucketCache just allocate new onheap bytebuffer to construct new > HFileBlock when get cached blocks. This rough allocation increases the GC > pressure for those "hot" blocks. > Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought > is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level > block is read twice, we cache it in the RAMBuffer. When the block timeout in > the cache (e.g. 60s), that means the block is not being accessed in 60s, we > evict it. Not like LRU, we do not cache block when the whole RAMBuffer size > reaches to a threshold (to fit different workload, the threshold is dynamic). > This will prevent the RAMBuffer from being churned. > {panel:title=The performance of RAMBuffer with its hit ratio is 100%} > !Hit 100%.png|height=250|width=250! > {panel} > I also did a YCSB performance test. > The circumstance is: > Size of BucketCache: 40 GB > Target table size: 112 GB > Properties: > !Properties.png|height=250|width=250! > Client Side Metrics > !ClientSideMetrics.png|height=300|width=300! -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache simply allocates a new on-heap ByteBuffer to construct a new HFileBlock every time a cached block is read. This coarse allocation increases GC pressure for "hot" blocks. This change introduces a RAMBuffer for hot blocks in BucketCache. The idea is simple: the RAMBuffer is a timeout-expiring cache. When a multi-level block is read twice, we cache it in the RAMBuffer; when a block times out (e.g. it has not been accessed for 60s), we evict it. Unlike an LRU cache, we stop admitting blocks once the total RAMBuffer size reaches a threshold (the threshold is dynamic to fit different workloads). This prevents the RAMBuffer from being churned. {panel:title=Performance of the RAMBuffer at a 100% hit ratio} !Hit 100%.png|height=250|width=250! {panel} I also ran a YCSB performance test. The setup: size of BucketCache: 40 GB; target table size: 112 GB. Properties: !Properties.png|height=250|width=250!
{panel:title=YCSB Test}
Client-side metrics:
||Metric||BucketCache without RAMBuffer||BucketCache with RAMBuffer||
|[OVERALL], RunTime(ms)|1772005|1699253|
|[OVERALL], Throughput(ops/sec)|2821.6624670923616|2942.4694262714265|
|[TOTAL_GCS_PS_Scavenge], Count|2760|2714|
|[TOTAL_GC_TIME_PS_Scavenge], Time(ms)|17357|17158|
|[TOTAL_GC_TIME_%_PS_Scavenge], Time(%)|0.9795119088264423|1.0097378083193025|
|[TOTAL_GCS_PS_MarkSweep], Count|4|3|
|[TOTAL_GC_TIME_PS_MarkSweep], Time(ms)|217|172|
|[TOTAL_GC_TIME_%_PS_MarkSweep], Time(%)|0.012246015107180848|0.010122094826373705|
|[TOTAL_GCs], Count|2764|2717|
|[TOTAL_GC_TIME], Time(ms)|17574|17330|
|[TOTAL_GC_TIME_%], Time(%)|0.9917579239336233|1.0198599031456763|
|[READ], Operations|251|2499189|
|[READ], AverageLatency(us)|6831.8289292684285|6507.363253039286|
|[READ], MinLatency(us)|175|177|
|[READ], MaxLatency(us)|226431|102783|
|[READ], 95thPercentileLatency(us)|12863|12055|
|[READ], 99thPercentileLatency(us)|17823|16431|
|[READ], Return=OK|251|2499189|
|[CLEANUP], Operations|60|60|
|[CLEANUP], AverageLatency(us)|961.1|1247.81666|
|[CLEANUP], MinLatency(us)|2|2|
|[CLEANUP], MaxLatency(us)|56191|73471|
|[CLEANUP], 95thPercentileLatency(us)|73|72|
|[CLEANUP], 99thPercentileLatency(us)|541|605|
|[SCAN], Operations|249|2500811|
|[SCAN], AverageLatency(us)|14388.572877029152|13850.626164872116|
|[SCAN], MinLatency(us)|320|297|
|[SCAN], MaxLatency(us)|441343|368383|
|[SCAN], 95thPercentileLatency(us)|24751|23791|
|[SCAN], 99thPercentileLatency(us)|32287|30783|
|[SCAN], Return=OK|249|2500811|
{panel}
-- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! 
{panel:title=YCSB Test} Client Side Metrics ||BucketCache without RAMBuffer||BucketCache with RAMBuffer|| |[OVERALL], RunTime(ms), 1772005 [OVERALL], Throughput(ops/sec), 2821.6624670923616 [TOTAL_GCS_PS_Scavenge], Count, 2760 [TOTAL_GC_TIME_PS_Scavenge], Time(ms), 17357 [TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.9795119088264423 [TOTAL_GCS_PS_MarkSweep], Count, 4 [TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 217 [TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.012246015107180848 [TOTAL_GCs], Count, 2764 [TOTAL_GC_TIME], Time(ms), 17574 [TOTAL_GC_TIME_%], Time(%), 0.9917579239336233 [READ], Operations, 251 [READ], AverageLatency(us), 6831.8289292684285 [READ], MinLatency(us), 175 [READ], MaxLatency(us), 226431 [READ], 95thPercentileLatency(us), 12863 [READ], 99thPercentileLatency(us), 17823 [READ], Return=OK, 251 [CLEANUP], Operations, 60 [CLEANUP], AverageLatency(us), 961.1 [CLEANUP], MinLatency(us), 2 [CLEANUP], MaxLatency(us), 56191 [CLEANUP], 95thPercentileLatency(us), 73 [CLEANUP], 99thPercentileLatency(us), 541 [SCAN], Operations, 249 [SCAN], AverageLatency(us), 14388.572877029152 [SCAN], MinLatency(us), 320 [SCAN], MaxLatency(us), 441343 [SCAN], 95thPercentileLatency(us), 24751 [SCAN], 99thPercentileLatency(us), 32287 [SCAN], Return=OK, 249 |[OVERALL], RunTime(ms), 1699253 [OVERALL], Throughput(ops/sec), 2942.4694262714265 [TOTAL_GCS_PS_Scavenge], Count, 2714 [TOTAL_GC_TIME_PS_Scavenge], Time(ms), 17158 [TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 1.0097378083193025 [TOTAL_GCS_PS_MarkSweep], Count, 3 [TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 172 [TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.010122094826373705 [TOTAL_GCs], Count, 2717 [TOTAL_GC_TIME], Time(ms), 17330 [TOTAL_GC_TIME_%], Time(%), 1.0198599031456763 [READ], Operations, 2499189 [READ], AverageLatency(us), 6507.363253039286 [READ], MinLatency(us), 177 [READ], MaxLatency(us), 102783 [READ], 95thPercentileLatency(us), 12055 [READ], 99thPercentileLatency(us), 16431 [READ], Return=OK, 2499189 [CLEANUP], Operations, 60 
[CLEANUP], AverageLatency(us), 1247.81666 [CLEANUP], MinLatency(us), 2 [CLEANUP], MaxLatency(us), 73471 [CLEANUP], 95thPercentileLatency(us), 72 [CLEANUP], 99thPercentileLatency(us), 605 [SCAN], Operations, 2500811 [SCAN], AverageLatency(us), 13850.626164872116 [SCAN], MinLatency(us), 297 [SCAN], MaxLatency(us), 368383 [SCAN], 95thPercentileLatency(us), 23791 [SCAN], 99thPercentileLatency(us), 30783 [SCAN], Return=OK, 2500811 | {panel} was: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Properties: !Properties.png|height=250|width=250! {panel:title=YCSB Test} Some text with a title {panel} > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project:
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Properties: !Properties.png|height=250|width=250! {panel:title=YCSB Test} Some text with a title {panel} was: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! 
{panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Properties: !Properties.png! {panel:title=YCSB Test} Some text with a title {panel} > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project: HBase > Issue Type: Improvement > Components: BucketCache, Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 1.7.2 > > Attachments: Hit 100%.png, Properties.png > > > In branch-1, BucketCache just allocate new onheap bytebuffer to construct new > HFileBlock when get cached blocks. This rough allocation increases the GC > pressure for those "hot" blocks. > Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought > is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level > block is read twice, we cache it in the RAMBuffer. When the block timeout in > the cache (e.g. 60s), that means the block is not being accessed in 60s, we > evict it. Not like LRU, we do not cache block when the whole RAMBuffer size > reaches to a threshold (to fit different workload, the threshold is dynamic). > This will prevent the RAMBuffer from being churned. > {panel:title=The performance of RAMBuffer with its hit ratio is 100%} > !Hit 100%.png|height=250|width=250! > {panel} > I also did a YCSB performance test. > The circumstance is: > Size of BucketCache: 40 GB > Properties: > !Properties.png|height=250|width=250! > {panel:title=YCSB Test} > Some text with a title > {panel} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=100|width=100! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Properties: !Properties.png! {panel:title=YCSB Test} Some text with a title {panel} was: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png![size:50%] {panel} I also did a YCSB performance test. 
The circumstance is: Size of BucketCache: 40 GB Properties: !Properties.png! {panel:title=YCSB Test} Some text with a title {panel} > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project: HBase > Issue Type: Improvement > Components: BucketCache, Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 1.7.2 > > Attachments: Hit 100%.png, Properties.png > > > In branch-1, BucketCache just allocate new onheap bytebuffer to construct new > HFileBlock when get cached blocks. This rough allocation increases the GC > pressure for those "hot" blocks. > Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought > is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level > block is read twice, we cache it in the RAMBuffer. When the block timeout in > the cache (e.g. 60s), that means the block is not being accessed in 60s, we > evict it. Not like LRU, we do not cache block when the whole RAMBuffer size > reaches to a threshold (to fit different workload, the threshold is dynamic). > This will prevent the RAMBuffer from being churned. > {panel:title=The performance of RAMBuffer with its hit ratio is 100%} > !Hit 100%.png|height=100|width=100! > {panel} > I also did a YCSB performance test. > The circumstance is: > Size of BucketCache: 40 GB > Properties: > !Properties.png! > {panel:title=YCSB Test} > Some text with a title > {panel} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Properties: !Properties.png! {panel:title=YCSB Test} Some text with a title {panel} was: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=100|width=100! {panel} I also did a YCSB performance test. 
The circumstance is: Size of BucketCache: 40 GB Properties: !Properties.png! {panel:title=YCSB Test} Some text with a title {panel} > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project: HBase > Issue Type: Improvement > Components: BucketCache, Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 1.7.2 > > Attachments: Hit 100%.png, Properties.png > > > In branch-1, BucketCache just allocate new onheap bytebuffer to construct new > HFileBlock when get cached blocks. This rough allocation increases the GC > pressure for those "hot" blocks. > Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought > is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level > block is read twice, we cache it in the RAMBuffer. When the block timeout in > the cache (e.g. 60s), that means the block is not being accessed in 60s, we > evict it. Not like LRU, we do not cache block when the whole RAMBuffer size > reaches to a threshold (to fit different workload, the threshold is dynamic). > This will prevent the RAMBuffer from being churned. > {panel:title=The performance of RAMBuffer with its hit ratio is 100%} > !Hit 100%.png|height=250|width=250! > {panel} > I also did a YCSB performance test. > The circumstance is: > Size of BucketCache: 40 GB > Properties: > !Properties.png! > {panel:title=YCSB Test} > Some text with a title > {panel} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681:

Description:
In branch-1, BucketCache allocates a new on-heap ByteBuffer to construct a new HFileBlock every time it serves a cached block. This coarse allocation increases GC pressure for "hot" blocks.

This change introduces a RAMBuffer for those "hot" blocks in BucketCache. The idea is simple: the RAMBuffer is a timeout-expiring cache. When a multi-level block is read twice, we cache it in the RAMBuffer. When a block times out (e.g. after 60s), it has not been accessed for 60s, so we evict it. Unlike LRU, we stop admitting blocks once the total RAMBuffer size reaches a threshold (the threshold is dynamic, to fit different workloads). This prevents the RAMBuffer from being churned.

{panel:title=The performance of RAMBuffer with its hit ratio at 100%}
!Hit 100%.png!
{panel}

I also ran a YCSB performance test. The setup:
Size of BucketCache: 40 GB
Properties:
!Properties.png!

{panel:title=YCSB Test}
Some text with a title
{panel}

> Key: HBASE-26681
> URL: https://issues.apache.org/jira/browse/HBASE-26681
> Project: HBase
> Issue Type: Improvement
> Components: BucketCache, Performance
> Reporter: Yutong Xiao
> Assignee: Yutong Xiao
> Priority: Major
> Fix For: 1.7.2
> Attachments: Hit 100%.png, Properties.png

-- This message was sent by Atlassian Jira (v8.20.1#820001)
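The admission and eviction policy described above (cache a multi-level block on its second read, evict after 60s of inactivity, and stop admitting once a size threshold is reached rather than churning like LRU) can be sketched as follows. This is a minimal, hypothetical illustration: the class, method names, and fixed threshold are invented for clarity and are not HBase's actual BucketCache/RAMBuffer implementation, which uses a dynamic threshold and a real block type.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the RAMBuffer policy: admit a block on its second
// read, evict after 60s without access, and refuse new admissions (instead of
// evicting like LRU) once the buffer reaches its size threshold.
class RamBufferSketch {
    static final long EXPIRY_MS = 60_000;  // evict after 60s without access
    static final int ADMIT_AFTER = 2;      // cache a block on its second read

    static class Entry {
        final byte[] block;
        long lastAccessMs;
        Entry(byte[] block, long now) { this.block = block; this.lastAccessMs = now; }
    }

    final Map<String, Integer> accessCounts = new ConcurrentHashMap<>();
    final Map<String, Entry> buffer = new ConcurrentHashMap<>();
    final long capacityBytes;  // the real threshold is dynamic; fixed here for brevity
    long usedBytes = 0;

    RamBufferSketch(long capacityBytes) { this.capacityBytes = capacityBytes; }

    // Called on every read of a multi-level block. Returns the RAM copy when
    // present; otherwise counts the access and admits on the second read, but
    // only while the buffer is under its threshold (no churn past that point).
    synchronized byte[] onRead(String blockKey, byte[] blockFromBucketCache, long nowMs) {
        Entry e = buffer.get(blockKey);
        if (e != null) {
            e.lastAccessMs = nowMs;
            return e.block;
        }
        int count = accessCounts.merge(blockKey, 1, Integer::sum);
        if (count >= ADMIT_AFTER && usedBytes + blockFromBucketCache.length <= capacityBytes) {
            buffer.put(blockKey, new Entry(blockFromBucketCache, nowMs));
            usedBytes += blockFromBucketCache.length;
        }
        return blockFromBucketCache;
    }

    // Periodic sweep: drop any block that has been idle longer than EXPIRY_MS.
    synchronized void expire(long nowMs) {
        buffer.entrySet().removeIf(en -> {
            if (nowMs - en.getValue().lastAccessMs > EXPIRY_MS) {
                usedBytes -= en.getValue().block.length;
                return true;
            }
            return false;
        });
    }
}
```

Because admission stops at the threshold instead of evicting a resident block, a burst of one-off reads cannot push the genuinely hot blocks out, which is the churn-avoidance property the description claims over LRU.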
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Attachment: Properties.png
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Attachment: Hit 100%.png
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Attachment: Screen Shot 2022-01-18 at 22.01.04.png
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocates a new on-heap ByteBuffer to construct a new HFileBlock when getting cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here we introduce a RAMBuffer for those "hot" blocks in BucketCache. The idea is simple. The RAMBuffer is a timeout-based expiring cache. When a multi-level block is read twice, we cache it in the RAMBuffer. When a block times out in the cache (e.g. after 60s), meaning it has not been accessed within that period, we evict it. Unlike LRU, we do not cache new blocks once the total RAMBuffer size reaches a threshold (to fit different workloads, the threshold is dynamic). This prevents the RAMBuffer from being churned. I first ran a YCSB test to check the performance of the RAMBuffer with a 100% hit ratio. The result is: !Screen Shot 2022-01-18 at 22.01.04.png! was: (the same text, with the result given as a table headed ||BucketCache without RAMBuffer||BucketCache with RAMBuffer|| instead of the screenshot)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Attachment: (was: Screen Shot 2022-01-18 at 22.01.04.png)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocates a new on-heap ByteBuffer to construct a new HFileBlock when getting cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here we introduce a RAMBuffer for those "hot" blocks in BucketCache. The idea is simple. The RAMBuffer is a timeout-based expiring cache. When a multi-level block is read twice, we cache it in the RAMBuffer. When a block times out in the cache (e.g. after 60s), meaning it has not been accessed within that period, we evict it. Unlike LRU, we do not cache new blocks once the total RAMBuffer size reaches a threshold (to fit different workloads, the threshold is dynamic). This prevents the RAMBuffer from being churned. I first ran a YCSB test to check the performance of the RAMBuffer with a 100% hit ratio. The result is: ||BucketCache without RAMBuffer||BucketCache with RAMBuffer|| was: (the same text, without the YCSB test sentence and result table)
[jira] [Created] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
Yutong Xiao created HBASE-26681: --- Summary: Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput Key: HBASE-26681 URL: https://issues.apache.org/jira/browse/HBASE-26681 Project: HBase Issue Type: Improvement Components: BucketCache, Performance Reporter: Yutong Xiao Assignee: Yutong Xiao Fix For: 1.7.2 In branch-1, BucketCache just allocates a new on-heap ByteBuffer to construct a new HFileBlock when getting cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here we introduce a RAMBuffer for those "hot" blocks in BucketCache. The idea is simple. The RAMBuffer is a timeout-based expiring cache. When a multi-level block is read twice, we cache it in the RAMBuffer. When a block times out in the cache (e.g. after 60s), meaning it has not been accessed within that period, we evict it. Unlike LRU, we do not cache new blocks once the total RAMBuffer size reaches a threshold (to fit different workloads, the threshold is dynamic). This prevents the RAMBuffer from being churned.
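The expire-after-access policy described above can be sketched in plain Java. This is an illustrative sketch only, not the actual patch: the class name RamBuffer, the lazy eviction on read, and the method shapes are all assumptions made for clarity.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a timeout-expiring cache: an entry survives only while it keeps
// being read; an entry idle longer than the TTL is evicted (here lazily, on
// the next lookup). Unlike LRU, nothing is evicted to make room -- instead,
// admission would simply stop once a size threshold is reached.
final class RamBuffer<K, V> {
    private static final class Entry<V> {
        final V value;
        volatile long lastAccessNanos;
        Entry(V value, long now) { this.value = value; this.lastAccessNanos = now; }
    }

    private final Map<K, Entry<V>> map = new ConcurrentHashMap<>();
    private final long ttlNanos;

    RamBuffer(long ttlNanos) { this.ttlNanos = ttlNanos; }

    V get(K key) {
        long now = System.nanoTime();
        Entry<V> e = map.get(key);
        if (e == null) return null;
        if (now - e.lastAccessNanos > ttlNanos) {
            map.remove(key, e);   // expired: not accessed within the TTL window
            return null;
        }
        e.lastAccessNanos = now;  // touch: every read refreshes the timeout
        return e.value;
    }

    void put(K key, V value) {
        map.put(key, new Entry<>(value, System.nanoTime()));
    }
}
```

A real implementation would also enforce the dynamic size threshold on put() and sweep expired entries in the background rather than only on access.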
[jira] [Updated] (HBASE-26678) Backport HBASE-26579 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-26678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26678: Description: Our branch-1 cluster also hit the storage policy problem in production use. Backport the patch to branch-1. (was: Our branch-1 cluster also hit the storage policy issue in production use. Backport the patch to branch-1.)
[jira] [Updated] (HBASE-26678) Backport HBASE-26579 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-26678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26678: Description: Our branch-1 cluster also hit the storage policy issue in production use. Backport the patch to branch-1.
[jira] [Updated] (HBASE-26678) Backport HBASE-26579 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-26678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26678: External issue URL: https://issues.apache.org/jira/browse/HBASE-26579 Fix Version/s: 1.7.2
[jira] [Created] (HBASE-26678) Backport HBASE-26579 to branch-1
Yutong Xiao created HBASE-26678: --- Summary: Backport HBASE-26579 to branch-1 Key: HBASE-26678 URL: https://issues.apache.org/jira/browse/HBASE-26678 Project: HBase Issue Type: Task Reporter: Yutong Xiao Assignee: Yutong Xiao
[jira] [Commented] (HBASE-26596) region_mover should gracefully ignore null response from RSGroupAdmin#getRSGroupOfServer
[ https://issues.apache.org/jira/browse/HBASE-26596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476947#comment-17476947 ] Yutong Xiao commented on HBASE-26596: - [~vjasani] The PR has been pending for a while. Not sure whether the latest commit is OK to merge. Could you please take a look? > region_mover should gracefully ignore null response from > RSGroupAdmin#getRSGroupOfServer > > > Key: HBASE-26596 > URL: https://issues.apache.org/jira/browse/HBASE-26596 > Project: HBase > Issue Type: Bug > Components: mover, rsgroup >Affects Versions: 1.7.1 >Reporter: Viraj Jasani >Assignee: Yutong Xiao >Priority: Major > > If a regionserver has any non-daemon thread running even after its own > shutdown, the running non-daemon thread can prevent a clean JVM exit and the > regionserver can be stuck in a zombie state. We recently provided a > workaround for this in HBASE-26468: the regionserver exit hook waits 30s for > all non-daemon threads to stop before terminating the JVM abnormally. > However, if a regionserver is stuck in such a state, region_mover unload fails > with: > {code:java} > NoMethodError: undefined method `getName` for nil:NilClass > getSameRSGroupServers at /bin/region_mover.rb:503 > __ensure__ at /bin/region_mover.rb:313 > unloadRegions at /bin/region_mover.rb:310 > (root) at /bin/region_mover.rb:572 > {code} > This happens when the cluster has RSGroup enabled and the given server is > already stopped, so RSGroupAdmin#getRSGroupOfServer returns null (the > server is no longer running, so it is not part of any RSGroup). > region_mover should ride over this null response and gracefully exit from the > unloadRegions() call. > > We should also check whether the fix is applicable to branch-2 and above.
[jira] [Commented] (HBASE-26551) Add FastPath feature to HBase RWQueueRpcExecutor
[ https://issues.apache.org/jira/browse/HBASE-26551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17475078#comment-17475078 ] Yutong Xiao commented on HBASE-26551: - OK, will do it then. > Add FastPath feature to HBase RWQueueRpcExecutor > > > Key: HBASE-26551 > URL: https://issues.apache.org/jira/browse/HBASE-26551 > Project: HBase > Issue Type: Task > Components: rpc, Scheduler >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: QueueTimeComparison.png, QueueTimeComparisonWithMax.png > > > In ticket [HBASE-17808|https://issues.apache.org/jira/browse/HBASE-17808], > the author introduced a fastpath implementation for RWQueueRpcExecutor. It > aggregated 3 independent RpcExecutors to implement the mechanism. This > redundancy cost more memory and, by its own performance test, could not > outperform the original implementation. This time, I directly extended > RWQueueRpcExecutor to implement the fast-path mechanism. From my test results, > it has better queue-time performance than before. > YCSB Test: > Constant Configurations: > hbase.regionserver.handler.count: 1000 > hbase.ipc.server.callqueue.read.ratio: 0.5 > hbase.ipc.server.callqueue.handler.factor: 0.2 > Test Workload: > YCSB: 50% Write, 25% Get, 25% Scan. Max Scan length: 1000. 
> Client Threads: 100 > ||FastPathRWQueueRpcExecutor||RWQueueRpcExecutor|| > |[OVERALL], RunTime(ms), 909365 > [OVERALL], Throughput(ops/sec), 5498.3422498116815 > [TOTAL_GCS_PS_Scavenge], Count, 1208 > [TOTAL_GC_TIME_PS_Scavenge], Time(ms), 8006 > [TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.8803945610398465 > [TOTAL_GCS_PS_MarkSweep], Count, 2 > [TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 96 > [TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.010556817119638429 > [TOTAL_GCs], Count, 1210 > [TOTAL_GC_TIME], Time(ms), 8102 > [TOTAL_GC_TIME_%], Time(%), 0.8909513781594849 > [READ], Operations, 1248885 > [READ], AverageLatency(us), 14080.154160711354 > [READ], MinLatency(us), 269 > [READ], MaxLatency(us), 180735 > [READ], 95thPercentileLatency(us), 29775 > [READ], 99thPercentileLatency(us), 39391 > [READ], Return=OK, 1248885 > [CLEANUP], Operations, 200 > [CLEANUP], AverageLatency(us), 311.78 > [CLEANUP], MinLatency(us), 1 > [CLEANUP], MaxLatency(us), 59647 > [CLEANUP], 95thPercentileLatency(us), 26 > [CLEANUP], 99thPercentileLatency(us), 173 > [INSERT], Operations, 1251067 > [INSERT], AverageLatency(us), 14235.898240461942 > [INSERT], MinLatency(us), 393 > [INSERT], MaxLatency(us), 204159 > [INSERT], 95thPercentileLatency(us), 29919 > [INSERT], 99thPercentileLatency(us), 39647 > [INSERT], Return=OK, 1251067 > [UPDATE], Operations, 1249582 > [UPDATE], AverageLatency(us), 14166.923049467741 > [UPDATE], MinLatency(us), 321 > [UPDATE], MaxLatency(us), 203647 > [UPDATE], 95thPercentileLatency(us), 29855 > [UPDATE], 99thPercentileLatency(us), 39551 > [UPDATE], Return=OK, 1249582 > [SCAN], Operations, 1250466 > [SCAN], AverageLatency(us), 30056.68854251135 > [SCAN], MinLatency(us), 787 > [SCAN], MaxLatency(us), 509183 > [SCAN], 95thPercentileLatency(us), 57823 > [SCAN], 99thPercentileLatency(us), 74751 > [SCAN], Return=OK, 1250466|[OVERALL], RunTime(ms), 958763 > [OVERALL], Throughput(ops/sec), 5215.053146606617 > [TOTAL_GCS_PS_Scavenge], Count, 1264 > [TOTAL_GC_TIME_PS_Scavenge], 
Time(ms), 8680 > [TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.9053332262509086 > [TOTAL_GCS_PS_MarkSweep], Count, 1 > [TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 38 > [TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.00396344039142103 > [TOTAL_GCs], Count, 1265 > [TOTAL_GC_TIME], Time(ms), 8718 > [TOTAL_GC_TIME_%], Time(%), 0.909296423298 > [READ], Operations, 1250961 > [READ], AverageLatency(us), 14663.084518222391 > [READ], MinLatency(us), 320 > [READ], MaxLatency(us), 204415 > [READ], 95thPercentileLatency(us), 30815 > [READ], 99thPercentileLatency(us), 43071 > [READ], Return=OK, 1250961 > [CLEANUP], Operations, 200 > [CLEANUP], AverageLatency(us), 366.845 > [CLEANUP], MinLatency(us), 1 > [CLEANUP], MaxLatency(us), 70719 > [CLEANUP], 95thPercentileLatency(us), 36 > [CLEANUP], 99thPercentileLatency(us), 80 > [INSERT], Operations, 1248183 > [INSERT], AverageLatency(us), 14334.938754974231 > [INSERT], MinLatency(us), 390 > [INSERT], MaxLatency(us), 2828287 > [INSERT], 95thPercentileLatency(us), 30271 > [INSERT], 99thPercentileLatency(us), 41919 > [INSERT], Return=OK, 1248183 > [UPDATE], Operations, 1250212 > [UPDATE], AverageLatency(us), 14283.836318960304 > [UPDATE], MinLatency(us), 337 > [UPDATE], MaxLatency(us), 2828287 > [UPDATE], 95thPercentileLatency(us), 30255 > [UPDATE], 99thPercentileLatency(us), 41855 > [UPDATE], Return=OK, 1250212 > [SCAN], Operations, 1250644 > [SCAN], AverageLatency(us), 33153.01709839091 > [SCAN], MinLatency(us), 742 > [SCAN],
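The fast-path mechanism compared above can be sketched roughly as follows. This is a simplified illustration, not HBase's actual FastPathRWQueueRpcExecutor code: the names FastPathStack, HandlerSlot, dispatch and take are hypothetical, and the real implementation handles races between parking and enqueueing that this sketch glosses over. The idea is that an idle handler parks itself on a stack, and the dispatcher hands a call directly to a parked handler, skipping the shared call queue (and its queue time) entirely.

```java
import java.util.Deque;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedDeque;

// Sketch of the fast-path handler handoff: calls reach a free handler without
// ever touching the shared queue; only when all handlers are busy does a call
// fall back to normal queueing.
final class FastPathStack<T> {
    static final class HandlerSlot<T> {
        // One-slot private queue each handler blocks on while parked.
        final BlockingQueue<T> slot = new ArrayBlockingQueue<>(1);
    }

    private final Deque<HandlerSlot<T>> idle = new ConcurrentLinkedDeque<>();
    private final BlockingQueue<T> sharedQueue;

    FastPathStack(BlockingQueue<T> sharedQueue) { this.sharedQueue = sharedQueue; }

    // Called by the RPC reader thread for each incoming call.
    boolean dispatch(T call) {
        HandlerSlot<T> h = idle.pollLast();   // LIFO: reuse the most recently idle handler
        if (h != null) {
            return h.slot.offer(call);        // fast path: direct handoff, no queue time
        }
        return sharedQueue.offer(call);       // slow path: normal queueing
    }

    // Called by a handler thread to fetch its next call.
    T take(HandlerSlot<T> me) throws InterruptedException {
        T call = sharedQueue.poll();          // drain any backlog first
        if (call != null) return call;
        idle.addLast(me);                     // park: advertise this handler as idle
        return me.slot.take();                // block until a call is handed over
    }
}
```

Extending RWQueueRpcExecutor directly, as the comment describes, means the read, write and scan handler pools can each keep such a stack without duplicating three whole executors.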
[jira] [Updated] (HBASE-26659) The ByteBuffer of metadata in RAMQueueEntry in BucketCache could be reused.
[ https://issues.apache.org/jira/browse/HBASE-26659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26659: Description: Currently, the process to write HFileBlocks into the IOEngine in BucketCache is:
{code:java}
if (data instanceof HFileBlock) {
  // If an instance of HFileBlock, save on some allocations.
  HFileBlock block = (HFileBlock) data;
  ByteBuff sliceBuf = block.getBufferReadOnly();
  ByteBuffer metadata = block.getMetaData();
  ioEngine.write(sliceBuf, offset);
  ioEngine.write(metadata, offset + len - metadata.limit());
}
{code}
The getMetaData() function in HFileBlock is:
{code:java}
public ByteBuffer getMetaData() {
  ByteBuffer bb = ByteBuffer.allocate(BLOCK_METADATA_SPACE);
  bb = addMetaData(bb, true);
  bb.flip();
  return bb;
}
{code}
It allocates a new ByteBuffer every time. We could reuse a local variable of the WriterThread to avoid repeatedly allocating this small ByteBuffer. Reasons: 1. In a WriterThread, blocks in the doDrain() function are written into the IOEngine sequentially, so there is no multi-threading problem. 2. After IOEngine.write(), the data in the metadata ByteBuffer has been transferred safely into the byte array (ByteBufferIOEngine) or the FileChannel (FileIOEngine). Its lifecycle is confined to the if statement above, so it can be cleared and reused by the next block's write. was: (the same text, with earlier wording of the two reasons)
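A minimal sketch of the proposed reuse follows. The class name MetadataScratch, the buffer size, and the two fields written here are illustrative assumptions, not HBase's actual BLOCK_METADATA_SPACE layout; the point is only that clear()/flip() on one long-lived buffer replaces a fresh allocation per block, which is safe because doDrain() handles blocks one at a time and the IOEngine copies the bytes out before the next iteration.

```java
import java.nio.ByteBuffer;

// Sketch: a per-writer-thread scratch buffer reused for every block's
// metadata, instead of getMetaData() allocating a new ByteBuffer each time.
final class MetadataScratch {
    static final int BLOCK_METADATA_SPACE = 24; // illustrative size, not HBase's constant

    private final ByteBuffer scratch = ByteBuffer.allocate(BLOCK_METADATA_SPACE);

    // Re-fill the single scratch buffer with this block's metadata fields
    // (stand-ins for the real ones) and make it ready for ioEngine.write().
    ByteBuffer fill(long offset, int onDiskSize) {
        scratch.clear();          // reset position/limit; no new allocation
        scratch.putLong(offset);
        scratch.putInt(onDiskSize);
        scratch.flip();
        return scratch;
    }
}
```

Because the same buffer instance is returned every time, callers must finish consuming it before the next fill(), which is exactly the lifecycle the issue describes inside the if statement.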
[jira] [Updated] (HBASE-26659) The ByteBuffer of metadata in RAMQueueEntry in BucketCache could be reused.
[ https://issues.apache.org/jira/browse/HBASE-26659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26659: Description: Currently, the process to write HFileBlocks into the IOEngine in BucketCache is:
{code:java}
if (data instanceof HFileBlock) {
  // If an instance of HFileBlock, save on some allocations.
  HFileBlock block = (HFileBlock) data;
  ByteBuff sliceBuf = block.getBufferReadOnly();
  ByteBuffer metadata = block.getMetaData();
  ioEngine.write(sliceBuf, offset);
  ioEngine.write(metadata, offset + len - metadata.limit());
}
{code}
The getMetaData() function in HFileBlock is:
{code:java}
public ByteBuffer getMetaData() {
  ByteBuffer bb = ByteBuffer.allocate(BLOCK_METADATA_SPACE);
  bb = addMetaData(bb, true);
  bb.flip();
  return bb;
}
{code}
It allocates a new ByteBuffer every time. We could reuse a local variable of the WriterThread to avoid repeatedly allocating this small ByteBuffer. Reasons: 1. In a WriterThread, blocks in the doDrain() function are written into the IOEngine sequentially, so there is no multi-threading problem. 2. After IOEngine.write(), the data in the metadata ByteBuffer has been transferred safely into the byte array (ByteBufferIOEngine) or the FileChannel (FileIOEngine). Its lifecycle is confined to the if statement above. was: (the same text, without the doDrain() detail in reason 1)
[jira] [Updated] (HBASE-26659) The ByteBuffer of metadata in RAMQueueEntry in BucketCache could be reused.
[ https://issues.apache.org/jira/browse/HBASE-26659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26659: Description: Currently, the process to write HFileBlocks into the IOEngine in BucketCache is:
{code:java}
if (data instanceof HFileBlock) {
  // If an instance of HFileBlock, save on some allocations.
  HFileBlock block = (HFileBlock) data;
  ByteBuff sliceBuf = block.getBufferReadOnly();
  ByteBuffer metadata = block.getMetaData();
  ioEngine.write(sliceBuf, offset);
  ioEngine.write(metadata, offset + len - metadata.limit());
}
{code}
The getMetaData() function in HFileBlock is:
{code:java}
public ByteBuffer getMetaData() {
  ByteBuffer bb = ByteBuffer.allocate(BLOCK_METADATA_SPACE);
  bb = addMetaData(bb, true);
  bb.flip();
  return bb;
}
{code}
It allocates a new ByteBuffer every time. We could reuse a local variable of the WriterThread to avoid repeatedly allocating the metadata buffer. Reasons: 1. In a WriterThread, blocks are written into the IOEngine sequentially, so there is no multi-threading problem. 2. After IOEngine.write(), the data in the metadata ByteBuffer has been transferred safely into the byte array (ByteBufferIOEngine) or the FileChannel (FileIOEngine). Its lifecycle is confined to the if statement above. was: (the same text, with "serially" instead of "sequentially" in reason 1)
[jira] [Updated] (HBASE-26659) The ByteBuffer of metadata in RAMQueueEntry in BucketCache could be reused.
[ https://issues.apache.org/jira/browse/HBASE-26659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26659: Description: Currently, the process to write HFileBlocks into the IOEngine in BucketCache is:
{code:java}
if (data instanceof HFileBlock) {
  // If an instance of HFileBlock, save on some allocations.
  HFileBlock block = (HFileBlock) data;
  ByteBuff sliceBuf = block.getBufferReadOnly();
  ByteBuffer metadata = block.getMetaData();
  ioEngine.write(sliceBuf, offset);
  ioEngine.write(metadata, offset + len - metadata.limit());
}
{code}
The getMetaData() function in HFileBlock is:
{code:java}
public ByteBuffer getMetaData() {
  ByteBuffer bb = ByteBuffer.allocate(BLOCK_METADATA_SPACE);
  bb = addMetaData(bb, true);
  bb.flip();
  return bb;
}
{code}
It allocates a new ByteBuffer every time. We could reuse a local variable of the WriterThread to avoid repeatedly allocating the metadata buffer. Reasons: 1. In a WriterThread, blocks are written into the IOEngine serially, so there is no multi-threading problem. 2. After IOEngine.write(), the data in the metadata ByteBuffer has been transferred safely into the byte array (ByteBufferIOEngine) or the FileChannel (FileIOEngine). Its lifecycle is confined to the if statement above.
[jira] [Created] (HBASE-26659) The ByteBuffer of metadata in RAMQueueEntry in BucketCache could be reused.
Yutong Xiao created HBASE-26659: --- Summary: The ByteBuffer of metadata in RAMQueueEntry in BucketCache could be reused. Key: HBASE-26659 URL: https://issues.apache.org/jira/browse/HBASE-26659 Project: HBase Issue Type: Improvement Reporter: Yutong Xiao -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (HBASE-26659) The ByteBuffer of metadata in RAMQueueEntry in BucketCache could be reused.
[ https://issues.apache.org/jira/browse/HBASE-26659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao reassigned HBASE-26659: --- Assignee: Yutong Xiao > The ByteBuffer of metadata in RAMQueueEntry in BucketCache could be reused. > --- > > Key: HBASE-26659 > URL: https://issues.apache.org/jira/browse/HBASE-26659 > Project: HBase > Issue Type: Improvement >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001)
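The reuse proposed above can be sketched with plain java.nio, independent of HBase's classes. This is a minimal illustration under stated assumptions, not the actual patch: the value of BLOCK_METADATA_SPACE and the putLong/putInt fields are placeholders standing in for HFileBlock's real metadata layout, and the class stands in for per-WriterThread state.

```java
import java.nio.ByteBuffer;

public class MetadataBufferReuse {
  // Placeholder size, standing in for HFileBlock.BLOCK_METADATA_SPACE.
  static final int BLOCK_METADATA_SPACE = 24;

  // One buffer per writer thread: blocks are written serially within a
  // WriterThread, so a single reusable buffer is safe there.
  private final ByteBuffer metadataBuf = ByteBuffer.allocate(BLOCK_METADATA_SPACE);

  /** Fills the reusable buffer instead of allocating a new one per block. */
  ByteBuffer getMetaData(long offset, int onDiskSize) {
    metadataBuf.clear();            // reset position/limit for reuse
    metadataBuf.putLong(offset);    // stand-ins for the real metadata fields
    metadataBuf.putInt(onDiskSize);
    metadataBuf.flip();             // ready for ioEngine.write(...)
    return metadataBuf;
  }

  public static void main(String[] args) {
    MetadataBufferReuse w = new MetadataBufferReuse();
    ByteBuffer first = w.getMetaData(0L, 4096);
    ByteBuffer second = w.getMetaData(4096L, 8192);
    // The same backing buffer is handed out each time: no per-block allocation.
    System.out.println(first == second);   // true
    System.out.println(second.getLong(0)); // 4096
  }
}
```

Because the same buffer is handed back on every call, the consumer (the ioEngine.write above) must fully consume it before the next block is processed, which holds here precisely because a WriterThread writes blocks serially.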
[jira] [Updated] (HBASE-26635) Optimize decodeNumeric in OrderedBytes
[ https://issues.apache.org/jira/browse/HBASE-26635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26635: Description: Currently, when we decode a byte array to a big decimal, there are plenty of BigDecimal operations, which result in a lot of only once used BigDecimal objects. Furthermore, the BigDecimal calculations are slow. We could boost this function by using String concatenation and point movement of BigDecimal. The JMH benchmark is uploaded to the attachment. Also, I added a UT to test the encoding / decoding correctness of 200 random test samples. The logic of the process is same as the Optimization of the encode function in the ticket [HBASE-26566|https://issues.apache.org/jira/browse/HBASE-26566] was: Currently, when we decode a byte array to a big decimal, there are plenty of BigDecimal operations, which result in a lot of only once used BigDecimal objects. Furthermore, the BigDecimal calculations are slow. We could boost this function by using String concatenation and point movement of BigDecimal. The JMH benchmark is uploaded to the attachment. Also, I added a UT to test the encoding / decoding correctness of 200 random test samples. > Optimize decodeNumeric in OrderedBytes > -- > > Key: HBASE-26635 > URL: https://issues.apache.org/jira/browse/HBASE-26635 > Project: HBase > Issue Type: Improvement > Components: Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Benchmark-decoding.log, DecodeBenchmark.java > > > Currently, when we decode a byte array to a big decimal, there are plenty of > BigDecimal operations, which result in a lot of only once used BigDecimal > objects. Furthermore, the BigDecimal calculations are slow. We could boost > this function by using String concatenation and point movement of BigDecimal. > The JMH benchmark is uploaded to the attachment. > Also, I added a UT to test the encoding / decoding correctness of 200 random > test samples. 
> The logic of the process is the same as the optimization of the encode function > in the ticket [HBASE-26566|https://issues.apache.org/jira/browse/HBASE-26566] -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26635) Optimize decodeNumeric in OrderedBytes
[ https://issues.apache.org/jira/browse/HBASE-26635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26635: Attachment: DecodeBenchmark.java > Optimize decodeNumeric in OrderedBytes > -- > > Key: HBASE-26635 > URL: https://issues.apache.org/jira/browse/HBASE-26635 > Project: HBase > Issue Type: Improvement > Components: Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Benchmark-decoding.log, DecodeBenchmark.java > > > Currently, when we decode a byte array to a big decimal, there are plenty of > BigDecimal operations, which result in a lot of only once used BigDecimal > objects. Furthermore, the BigDecimal calculations are slow. We could boost > this function by using String concatenation and point movement of BigDecimal. > The JMH benchmark is uploaded to the attachment. > Also, I added a UT to test the encoding / decoding correctness of 200 random > test samples. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26635) Optimize decodeNumeric in OrderedBytes
[ https://issues.apache.org/jira/browse/HBASE-26635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26635: Description: Currently, when we decode a byte array to a big decimal, there are plenty of BigDecimal operations, which result in a lot of only once used BigDecimal objects. Furthermore, the BigDecimal calculations are slow. We could boost this function by using String concatenation and point movement of BigDecimal. The JMH benchmark is uploaded to the attachment. Also, I added a UT to test the encoding / decoding correctness of 200 random test samples. was:Currently, when we decode a byte array to a big decimal, there are plenty of BigDecimal operations, which result in a lot of only once used BigDecimal objects. Furthermore, the BigDecimal calculations are slow. We could boost this function by using String concatenation and point movement of BigDecimal. The JMH benchmark is uploaded to the attachment. > Optimize decodeNumeric in OrderedBytes > -- > > Key: HBASE-26635 > URL: https://issues.apache.org/jira/browse/HBASE-26635 > Project: HBase > Issue Type: Improvement > Components: Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Benchmark-decoding.log > > > Currently, when we decode a byte array to a big decimal, there are plenty of > BigDecimal operations, which result in a lot of only once used BigDecimal > objects. Furthermore, the BigDecimal calculations are slow. We could boost > this function by using String concatenation and point movement of BigDecimal. > The JMH benchmark is uploaded to the attachment. > Also, I added a UT to test the encoding / decoding correctness of 200 random > test samples. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26635) Optimize decodeNumeric in OrderedBytes
[ https://issues.apache.org/jira/browse/HBASE-26635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26635: Attachment: Benchmark-decoding.log > Optimize decodeNumeric in OrderedBytes > -- > > Key: HBASE-26635 > URL: https://issues.apache.org/jira/browse/HBASE-26635 > Project: HBase > Issue Type: Improvement > Components: Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Benchmark-decoding.log > > > Currently, when we decode a byte array to a big decimal, there are plenty of > BigDecimal operations, which result in a lot of only once used BigDecimal > objects. Furthermore, the BigDecimal calculations are slow. We could boost > this function by using String concatenation and point movement of BigDecimal. > The JMH benchmark is uploaded to the attachment. > Also, I added a UT to test the encoding / decoding correctness of 200 random > test samples. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HBASE-26635) Optimize decodeNumeric in OrderedBytes
Yutong Xiao created HBASE-26635: --- Summary: Optimize decodeNumeric in OrderedBytes Key: HBASE-26635 URL: https://issues.apache.org/jira/browse/HBASE-26635 Project: HBase Issue Type: Improvement Components: Performance Reporter: Yutong Xiao Assignee: Yutong Xiao Currently, when we decode a byte array to a big decimal, there are plenty of BigDecimal operations, which result in a lot of only once used BigDecimal objects. Furthermore, the BigDecimal calculations are slow. We could boost this function by using String concatenation and point movement of BigDecimal. The JMH benchmark is uploaded to the attachment. -- This message was sent by Atlassian Jira (v8.20.1#820001)
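The idea in the description — replacing per-digit BigDecimal arithmetic with string concatenation plus a single decimal-point move — can be illustrated outside OrderedBytes' actual wire format. The base-100 digit array and exponent below are assumptions that mimic its mantissa/exponent split, and decodeSlow/decodeFast are hypothetical names, not HBase methods:

```java
import java.math.BigDecimal;

public class DecodeNumericSketch {
  /** Old style: one BigDecimal multiply/add per base-100 digit. */
  static BigDecimal decodeSlow(int[] digits, int exp) {
    BigDecimal m = BigDecimal.ZERO;
    BigDecimal e = BigDecimal.ONE;
    for (int d : digits) {
      e = e.movePointLeft(2);                       // 100^-i
      m = m.add(BigDecimal.valueOf(d).multiply(e)); // accumulate one digit
    }
    return m.scaleByPowerOfTen(2 * exp);            // apply 100^exp
  }

  /** Optimized style: concatenate digits, build one BigDecimal, move the point. */
  static BigDecimal decodeFast(int[] digits, int exp) {
    StringBuilder sb = new StringBuilder("0.");
    for (int d : digits) {
      if (d < 10) sb.append('0');  // each base-100 digit is two decimal digits
      sb.append(d);
    }
    return new BigDecimal(sb.toString()).movePointRight(2 * exp);
  }

  public static void main(String[] args) {
    int[] digits = {12, 34, 5};                 // mantissa 0.123405 in base-100
    System.out.println(decodeSlow(digits, 2));  // 1234.05
    System.out.println(decodeFast(digits, 2));  // 1234.05
  }
}
```

The fast path performs exactly one BigDecimal construction and one point move per value, instead of one multiply and one add per digit, which is the kind of saving the attached JMH benchmark measures.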
[jira] [Commented] (HBASE-26564) Retire the method visitLogEntryBeforeWrite without RegionInfo in WALActionListener
[ https://issues.apache.org/jira/browse/HBASE-26564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17465921#comment-17465921 ] Yutong Xiao commented on HBASE-26564: - I have opened a new MR for branch-2. :) > Retire the method visitLogEntryBeforeWrite without RegionInfo in > WALActionListener > - > > Key: HBASE-26564 > URL: https://issues.apache.org/jira/browse/HBASE-26564 > Project: HBase > Issue Type: Task > Components: wal >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Minor > Fix For: 3.0.0-alpha-3 > > > The visitLogEntryBeforeWrite overload without a RegionInfo parameter is only > used in ReplicationSourceWALActionListener, so it has been retired from the > WALActionListener interface. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26629) Add expiration for long time vacant scanners in Thrift2
[ https://issues.apache.org/jira/browse/HBASE-26629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26629: Summary: Add expiration for long time vacant scanners in Thrift2 (was: Add expiration for long time vacant scanners) > Add expiration for long time vacant scanners in Thrift2 > --- > > Key: HBASE-26629 > URL: https://issues.apache.org/jira/browse/HBASE-26629 > Project: HBase > Issue Type: Improvement > Components: Performance, Thrift >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > > In the thrift1 implementation, ThriftHBaseServiceHandler holds a Cache with > an expire-after-access time, which lets long-vacant scanners be collected by GC. > However, thrift2 does not have this feature: it only uses a map to store scanners, > and the client must close each scanner manually. If it does not, expired scanners > live in memory forever. To address this, I applied the same cache expiration to > the thrift2 service. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26629) Add expiration for long time vacant scanners
[ https://issues.apache.org/jira/browse/HBASE-26629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26629: Component/s: Performance Thrift > Add expiration for long time vacant scanners > > > Key: HBASE-26629 > URL: https://issues.apache.org/jira/browse/HBASE-26629 > Project: HBase > Issue Type: Improvement > Components: Performance, Thrift >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > > In the thrift1 implementation, ThriftHBaseServiceHandler holds a Cache with > an expire-after-access time, which lets long-vacant scanners be collected by GC. > However, thrift2 does not have this feature: it only uses a map to store scanners, > and the client must close each scanner manually. If it does not, expired scanners > live in memory forever. To address this, I applied the same cache expiration to > the thrift2 service. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HBASE-26629) Add expiration for long time vacant scanners
Yutong Xiao created HBASE-26629: --- Summary: Add expiration for long time vacant scanners Key: HBASE-26629 URL: https://issues.apache.org/jira/browse/HBASE-26629 Project: HBase Issue Type: Improvement Reporter: Yutong Xiao Assignee: Yutong Xiao In the thrift1 implementation, ThriftHBaseServiceHandler holds a Cache with an expire-after-access time, which lets long-vacant scanners be collected by GC. However, thrift2 does not have this feature: it only uses a map to store scanners, and the client must close each scanner manually. If it does not, expired scanners live in memory forever. To address this, I applied the same cache expiration to the thrift2 service. -- This message was sent by Atlassian Jira (v8.20.1#820001)
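For illustration, the expire-after-access semantics borrowed from thrift1 can be sketched with the JDK alone. This is a hand-rolled stand-in, not the actual patch (which reuses a proper cache, as thrift1 does); ExpiringScannerMap and its method names are hypothetical:

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Minimal expire-after-access map: untouched scanners become evictable. */
public class ExpiringScannerMap<V> {
  private static final class Entry<V> {
    final V value;
    volatile long lastAccessNanos;
    Entry(V value, long now) { this.value = value; this.lastAccessNanos = now; }
  }

  private final Map<Integer, Entry<V>> scanners = new ConcurrentHashMap<>();
  private final long ttlNanos;

  public ExpiringScannerMap(long ttlNanos) { this.ttlNanos = ttlNanos; }

  public void put(int id, V scanner) {
    scanners.put(id, new Entry<>(scanner, System.nanoTime()));
  }

  /** Touches the entry, so actively used scanners never expire. */
  public V get(int id) {
    Entry<V> e = scanners.get(id);
    if (e == null) return null;
    e.lastAccessNanos = System.nanoTime();
    return e.value;
  }

  /** Evicts scanners untouched for longer than the TTL (clients that never closed). */
  public void evictExpired() {
    long now = System.nanoTime();
    for (Iterator<Entry<V>> it = scanners.values().iterator(); it.hasNext();) {
      if (now - it.next().lastAccessNanos > ttlNanos) it.remove();
    }
  }

  public int size() { return scanners.size(); }
}
```

A cache library additionally runs eviction as a side effect of normal reads and writes and can close the evicted scanner via a removal listener, which is why the patch reuses one rather than a plain map plus a sweeper.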
[jira] [Resolved] (HBASE-26604) Replace new allocation with ThreadLocal in CellBlockBuilder to reduce GC
[ https://issues.apache.org/jira/browse/HBASE-26604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao resolved HBASE-26604. - Resolution: Won't Fix The ThreadLocal approach has a potential synchronization concern. Won't do this then. > Replace new allocation with ThreadLocal in CellBlockBuilder to reduce GC > > > Key: HBASE-26604 > URL: https://issues.apache.org/jira/browse/HBASE-26604 > Project: HBase > Issue Type: Task >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Minor > > In the CellBlockBuilder decompress method, we currently allocate a new > ByteBufferOutputStream object on each invocation. > {code:java} > try { > // TODO: This is ugly. The buffer will be resized on us if we guess > wrong. > //TODO: reuse buffers. > bbos = new ByteBufferOutputStream(osInitialSize); > IOUtils.copy(cis, bbos); > bbos.close(); > return bbos.getByteBuffer(); > } finally { > CodecPool.returnDecompressor(poolDecompressor); > } > {code} > We can use a ThreadLocal variable to reuse the buffer in each thread, as: > {code:java} > try { > // TODO: This is ugly. The buffer will be resized on us if we guess > wrong. > if (this.decompressBuff.get() == null) { > this.decompressBuff.set(new ByteBufferOutputStream(osInitialSize)); > } > ByteBufferOutputStream localBbos = this.decompressBuff.get(); > localBbos.clear(); > IOUtils.copy(cis, localBbos); > return localBbos.getByteBuffer(); > } finally { > CodecPool.returnDecompressor(poolDecompressor); > } > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
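For reference, the ThreadLocal pattern discussed (and ultimately rejected) above looks roughly like this with JDK classes only. ByteArrayOutputStream stands in for HBase's ByteBufferOutputStream, and decompressInto is a hypothetical stand-in for the decompress path:

```java
import java.io.ByteArrayOutputStream;

public class ThreadLocalBufferSketch {
  static final int INITIAL_SIZE = 4 * 1024;

  // One buffer per thread; withInitial avoids the null-check-and-set
  // dance shown in the snippet above.
  private static final ThreadLocal<ByteArrayOutputStream> DECOMPRESS_BUF =
      ThreadLocal.withInitial(() -> new ByteArrayOutputStream(INITIAL_SIZE));

  static byte[] decompressInto(byte[] compressed) {
    ByteArrayOutputStream buf = DECOMPRESS_BUF.get();
    buf.reset();                                 // reuse the backing array across calls
    buf.write(compressed, 0, compressed.length); // stand-in for IOUtils.copy(cis, buf)
    return buf.toByteArray();
  }
}
```

One likely facet of the concern cited in the resolution: the real method returns a ByteBuffer that wraps the stream's internal array, so with reuse that buffer would be silently overwritten by the next decompress on the same thread if any consumer held onto it, and each pooled thread would permanently retain a buffer as large as the biggest block it ever decompressed.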