[jira] [Commented] (HBASE-26780) HFileBlock.verifyOnDiskSizeMatchesHeader throw IOException: Passed in onDiskSizeWithHeader= A != B
[ https://issues.apache.org/jira/browse/HBASE-26780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762398#comment-17762398 ] Yutong Xiao commented on HBASE-26780:
-
[~ndimiduk] Hi, by the time we got feedback on this issue, the problematic file had already been compacted away, so the problem could not be reproduced and we could only review the related HBase code. When we hit the issue, B was 33, which is exactly the header size with checksum used in verifyOnDiskSizeMatchesHeader#getOnDiskSizeWithHeader#headerSize. That suggests the actual block size was missing. We currently retry reading the header, as in my MR. We have rolled the changes from that MR into production, and so far it looks good. If it were actually corruption in the HDFS block, we should hit it again. But in [~cribbee]'s log the error value is not the same, so the cause may be different. FYI, we are running 1.4.12 plus some customised patches.

> HFileBlock.verifyOnDiskSizeMatchesHeader throw IOException: Passed in
> onDiskSizeWithHeader= A != B
> --
>
> Key: HBASE-26780
> URL: https://issues.apache.org/jira/browse/HBASE-26780
> Project: HBase
> Issue Type: Bug
> Components: BlockCache
> Affects Versions: 2.2.2
> Reporter: yuzhang
> Priority: Major
> Attachments: IOException.png
>
>
> When I scan a region, HBase throws IOException: Passed in
> onDiskSizeWithHeader= A != B
> The HFile mentioned in the error message can be accessed normally.
> It recovers via the command: move region. I guess that the onDiskSizeWithHeader of
> the HFileBlock has been changed, and the RS gets the correct BlockHeader info after
> the region is reopened.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
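The value B = 33 matches the HFileBlock header size when checksums are in use. As a sanity check, the arithmetic can be written out; the constant names below are illustrative (not the real HBase identifiers), and the field widths follow the HFile v2 block header layout:

```java
public class HFileBlockHeaderSize {
    // 8-byte block magic + onDiskSizeWithoutHeader (int)
    // + uncompressedSizeWithoutHeader (int) + prevBlockOffset (long)
    static final int HEADER_SIZE_NO_CHECKSUM = 8 + 4 + 4 + 8; // 24

    // checksumType (byte) + bytesPerChecksum (int) + onDiskDataSizeWithHeader (int)
    static final int CHECKSUM_FIELDS = 1 + 4 + 4; // 9

    static final int HEADER_SIZE_WITH_CHECKSUM =
        HEADER_SIZE_NO_CHECKSUM + CHECKSUM_FIELDS; // 33

    public static void main(String[] args) {
        System.out.println(HEADER_SIZE_WITH_CHECKSUM); // prints 33
    }
}
```

So an error reporting B = 33 means the size computed from the cached header collapsed to just the header length, i.e. the block body size read as zero.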
[jira] [Commented] (HBASE-26780) HFileBlock.verifyOnDiskSizeMatchesHeader throw IOException: Passed in onDiskSizeWithHeader= A != B
[ https://issues.apache.org/jira/browse/HBASE-26780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17754506#comment-17754506 ] Yutong Xiao commented on HBASE-26780:
-
We also met this issue recently. The cause is that the block size read from the cached header in FSReaderImpl is incorrect (though it is not clear why the header is wrong without any IOException from the HDFS client). As the HFile itself is not corrupted, reading the header again from HDFS should avoid this issue. I raised an MR to re-read the header when the cached header does not match the block size.
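The retry idea can be sketched as follows. This is a simplified illustration, not the actual patch: `readHeaderFromStream` and `parseOnDiskSizeWithHeader` are hypothetical helpers standing in for the real HFileBlock internals, assuming the on-disk size (header included) is the int at offset 8 of the header plus the header length.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

public class HeaderRereadSketch {
    interface BlockReader {
        // Hypothetical: reads headerSize bytes at the block offset straight from HDFS.
        ByteBuffer readHeaderFromStream(long offset, int headerSize) throws IOException;
    }

    // Hypothetical stand-in for the header parsing: onDiskSizeWithoutHeader is
    // stored as the int at offset 8, so the size with header adds headerSize.
    static long parseOnDiskSizeWithHeader(ByteBuffer header, int headerSize) {
        return (long) header.getInt(8) + headerSize;
    }

    // If the size computed from the cached header does not match the expected
    // value, re-read the header from the stream once before giving up.
    static long verifiedOnDiskSize(BlockReader reader, ByteBuffer cachedHeader,
            long offset, int headerSize, long expected) throws IOException {
        long fromCache = parseOnDiskSizeWithHeader(cachedHeader, headerSize);
        if (fromCache == expected) {
            return fromCache;
        }
        // Cached header looks stale or corrupt: retry once against the file itself.
        ByteBuffer fresh = reader.readHeaderFromStream(offset, headerSize);
        long fromDisk = parseOnDiskSizeWithHeader(fresh, headerSize);
        if (fromDisk != expected) {
            throw new IOException("Passed in onDiskSizeWithHeader=" + expected
                + " != " + fromDisk + " even after re-reading the header");
        }
        return fromDisk;
    }
}
```

The single retry bounds the extra HDFS traffic: a healthy file pays one extra short read only when the cache is wrong, while genuine corruption still surfaces as an IOException.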
[jira] [Commented] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
[ https://issues.apache.org/jira/browse/HBASE-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17743649#comment-17743649 ] Yutong Xiao commented on HBASE-27962:
-
Added three sub-properties to make the ratio configuration more flexible.

> Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit
> various workloads
> ---
>
> Key: HBASE-27962
> URL: https://issues.apache.org/jira/browse/HBASE-27962
> Project: HBase
> Issue Type: Improvement
> Reporter: Yutong Xiao
> Assignee: Yutong Xiao
> Priority: Major
>
> We currently use the FastPathRWQueueRpcExecutor, but the numbers of
> read/write handlers are fixed, which makes RegionServer performance poor in
> our prod env.
> The logic is described below:
> * The basic architecture is the same as FastPathRWRpcExecutor.
> * Introduces a float shared_ratio in (0, 1.0) to indicate the ratio of
> shared handlers. (For example, with the ratio set to 0.2 and 100 handlers -
> 50 for write, 25 for get, 25 for scan - there will be 10 + 5 + 5 shared
> handlers and 40 isolated handlers for write, 20 for get and 20 for scan.)
> * A shared handler can run all three kinds of requests.
> * A handler is shared only when it is idle.
> * A shared handler is also bound to one kind of RPC queue and processes
> requests from that queue first.
> This improvement will improve resource utilization under various workloads
> and guarantee a level of R/W/S isolation for request processing at the same
> time.
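The example allocation in the description (shared_ratio = 0.2 over 50/25/25 write/get/scan handlers) is simple per-group arithmetic; `sharedCount` below is an illustrative helper, not an actual HBase method:

```java
public class SharedHandlerSplit {
    /** Number of handlers in one group that become shared, given the ratio. */
    static int sharedCount(int groupHandlers, float sharedRatio) {
        return (int) (groupHandlers * sharedRatio);
    }

    public static void main(String[] args) {
        int write = 50, get = 25, scan = 25;
        float ratio = 0.2f;
        int sharedWrite = sharedCount(write, ratio); // 10
        int sharedGet = sharedCount(get, ratio);     // 5
        int sharedScan = sharedCount(scan, ratio);   // 5
        // The remaining handlers stay isolated to their own queue type.
        System.out.printf("shared=%d+%d+%d, isolated write=%d get=%d scan=%d%n",
            sharedWrite, sharedGet, sharedScan,
            write - sharedWrite, get - sharedGet, scan - sharedScan);
    }
}
```

The "three sub-properties" mentioned above would presumably allow a separate ratio per group instead of one global value, which this helper already supports by passing a different ratio per call.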
[jira] [Comment Edited] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
[ https://issues.apache.org/jira/browse/HBASE-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17743132#comment-17743132 ] Yutong Xiao edited comment on HBASE-27962 at 7/14/23 12:24 PM:
---
One thing I need to make clear: shared handlers run the queued requests of their own type first (that is to say, a shared write handler processes write requests with higher priority), and they only run other request types when idling; when there are no idling handlers, the executor can be regarded as an RWQueueExecutor. I cannot agree that this is a serious problem, and I do not see how it makes slow requests hard to debug - we have a lot of metrics for debugging. Further, HBASE-27766 has no fastpath feature, so that mode has to pay the cost of queue-lock contention. As for the isolation model for reads and writes, we can also introduce two ratios, one controlling the shared writers and one controlling the shared readers. This would let clients tune the handler allocation. HBASE-27766 only considers idling scan handlers running gets, whereas HBASE-27962 also covers idling write handlers and idling get handlers. From my point of view, HBASE-27766 is redundant once HBASE-27962 is employed. Thank you~
[jira] [Commented] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
[ https://issues.apache.org/jira/browse/HBASE-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17742999#comment-17742999 ] Yutong Xiao commented on HBASE-27962:
-
HBASE-27962 is based on the fastpath feature introduced in HBASE-26551, which has been shown to perform better than the original RWQueueExecutor. Furthermore, AdaptiveFastPathRWRpcExecutor also covers the write handlers. Besides, AdaptiveFastPathRWRpcExecutor does not need to calculate a priority when taking requests. So, theoretically, AdaptiveFastPathRWRpcExecutor outperforms RpcStealQueue. What do you think [~haxiaolin]?
[jira] [Updated] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
[ https://issues.apache.org/jira/browse/HBASE-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27962:
Description:
We currently use the FastPathRWQueueRpcExecutor, but the numbers of read/write handlers are fixed, which makes RegionServer performance poor in our prod env. The logic is described below:
* The basic architecture is the same as FastPathRWRpcExecutor.
* Introduces a float shared_ratio in (0, 1.0) to indicate the ratio of shared handlers. (For example, with the ratio set to 0.2 and 100 handlers - 50 for write, 25 for get, 25 for scan - there will be 10 + 5 + 5 shared handlers and 40 isolated handlers for write, 20 for get and 20 for scan.)
* A shared handler can run all three kinds of requests.
* A handler is shared only when it is idle.
* A shared handler is also bound to one kind of RPC queue and processes requests from that queue first.
This improvement will improve resource utilization under various workloads and guarantee a level of R/W/S isolation for request processing at the same time.
[jira] [Updated] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
[ https://issues.apache.org/jira/browse/HBASE-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27962:
Description:
We currently use the FastPathRWQueueRpcExecutor, but the numbers of read/write handlers are fixed, which makes RegionServer performance poor in our prod env. The logic is described below:
* The basic architecture is the same as FastPathRWRpcExecutor.
* Introduces a float shared_ratio in (0, 1.0) to indicate the ratio of shared handlers. (For example, with the ratio set to 0.2 and 100 handlers - 50 for write, 25 for get, 25 for scan - there will be 10 + 5 + 5 shared handlers and 40 isolated handlers for write, 20 for get and 20 for scan.)
* A shared handler can run all three kinds of requests.
* A handler is shared only when it is idle.
* A shared handler is also bound to one kind of RPC queue and processes requests from that queue first.
This improvement will improve resource utilization in various workloads and guarantee a level of R/W/S isolation for request processing at the same time.
[jira] [Updated] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
[ https://issues.apache.org/jira/browse/HBASE-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27962:
Description:
We currently use the FastPathRWQueueRpcExecutor, but the numbers of read/write handlers are fixed, which makes RegionServer performance poor in our prod env. The logic is described below:
* The basic architecture is the same as FastPathRWRpcExecutor.
* Introduces a float shared_ratio in (0, 1.0) to indicate the ratio of shared handlers. (For example, if we have 100 handlers - 50 for write, 25 for get, 25 for scan - there will be 10 + 5 + 5 shared handlers and 40 isolated handlers for write, 20 for get and 20 for scan.)
* A shared handler can run all three kinds of requests.
* A handler is shared only when it is idle.
* A shared handler is also bound to one kind of RPC queue and processes requests from that queue first.
This improvement will improve resource utilization in various workloads and guarantee a level of R/W/S isolation for request processing at the same time.
[jira] [Updated] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
[ https://issues.apache.org/jira/browse/HBASE-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27962:
Description:
We currently use the FastPathRWQueueRpcExecutor, but the numbers of read/write handlers are fixed, which makes RegionServer performance poor in our prod env. The logic is described below:
* The basic architecture is the same as FastPathRWRpcExecutor.
* Introduces a float shared_ratio in (0, 1.0) to indicate the ratio of shared handlers. (For example, if we have 100 handlers - 50 for write, 25 for get, 25 for scan - there will be 10 + 5 + 5 shared handlers and 40 isolated handlers for write, 20 for get and 20 for scan.)
* A shared handler can run all three kinds of requests.
* A handler is shared only when it is idle.
* A shared handler is also bound to one kind of RPC queue and processes requests from that queue first.
This improvement will improve resource utilization in various workloads and guarantee a level of R/W/S isolation for request processing at the same time.
[jira] [Updated] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
[ https://issues.apache.org/jira/browse/HBASE-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27962:
Description:
We currently use the FastPathRWQueueRpcExecutor, but the numbers of read/write handlers are fixed, which makes RegionServer performance poor in our prod env. The logic is described below:
* The basic architecture is the same as FastPathRWRpcExecutor.
* Introduces a float shared_ratio in (0, 1.0) to indicate the ratio of shared handlers. (For example, if we have 100 handlers - 50 for write, 25 for get, 25 for scan - there will be 10 + 5 + 5 shared handlers and 40 isolated handlers for write, 20 for get and 20 for scan.)
* A shared handler can run all three kinds of requests.
* A handler is shared only when it is idle.
* A shared handler is also bound to one kind of RPC queue and processes requests from that queue first.
[jira] [Updated] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
[ https://issues.apache.org/jira/browse/HBASE-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27962: Description: We currently use the FastPathRWQueueRpcExecutor, but its numbers of read/write handlers are fixed, which makes RegionServer performance suboptimal in our production environment. The logic is described below: * The basic architecture is the same as FastPathRWRpcExecutor. * Introduce a float shared_ratio in (0, 1.0) to indicate the fraction of handlers that are shared. (For example, with 100 handlers split as 50 for write, 25 for get and 25 for scan, and shared_ratio = 0.2, there will be 10 + 5 + 5 shared handlers, plus 40 isolated handlers for write, 20 for get and 20 for scan.) * When there is no idle fastpath handler in one of the three handler groups, the executor tries to grab an idle shared handler. * Each shared handler is bound to its own R/W/S queue; it processes that kind of request first and only becomes shared after being pushed back onto the idle handler stacks. was:We currently use the FastPathRWQueue > Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit > various workloads > --- > > Key: HBASE-27962 > URL: https://issues.apache.org/jira/browse/HBASE-27962 > Project: HBase > Issue Type: Improvement >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > > We currently use the FastPathRWQueueRpcExecutor, but its numbers of > read/write handlers are fixed, which makes RegionServer performance > suboptimal in our production environment. > The logic is described below: > * The basic architecture is the same as FastPathRWRpcExecutor. > * Introduce a float shared_ratio in (0, 1.0) to indicate the fraction of > handlers that are shared. (For example, with 100 handlers split as 50 for > write, 25 for get and 25 for scan, and shared_ratio = 0.2, there will be > 10 + 5 + 5 shared handlers, plus 40 isolated handlers for write, 20 for get > and 20 for scan.)
> * When there is no idle fastpath handler in one of the three handler groups, > the executor tries to grab an idle shared handler. > * Each shared handler is bound to its own R/W/S queue; it processes that kind > of request first and only becomes shared after being pushed back onto the > idle handler stacks. -- This message was sent by Atlassian Jira (v8.20.10#820010)
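The shared/isolated split described in the example above is simple arithmetic; a minimal sketch follows. All class and field names here are illustrative assumptions, not the actual HBase patch.

```java
// Hypothetical sketch of the handler split described in the ticket:
// a shared_ratio fraction of each W/R/S group is donated to the shared pool,
// and the remainder stays isolated to its own queue.
public class AdaptiveHandlerSplit {
    public final int sharedWrite, sharedGet, sharedScan;
    public final int isolatedWrite, isolatedGet, isolatedScan;

    public AdaptiveHandlerSplit(int writeHandlers, int getHandlers, int scanHandlers,
                                float sharedRatio) {
        // Truncating toward zero keeps the shared pool at most sharedRatio
        // of each group; whatever is left stays dedicated to that group.
        sharedWrite = (int) (writeHandlers * sharedRatio);
        sharedGet = (int) (getHandlers * sharedRatio);
        sharedScan = (int) (scanHandlers * sharedRatio);
        isolatedWrite = writeHandlers - sharedWrite;
        isolatedGet = getHandlers - sharedGet;
        isolatedScan = scanHandlers - sharedScan;
    }
}
```

With 100 handlers split 50/25/25 and sharedRatio 0.2, this yields the 10 + 5 + 5 shared and 40/20/20 isolated handlers from the example in the description.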
[jira] [Updated] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
[ https://issues.apache.org/jira/browse/HBASE-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27962: Description: We currently use the FastPathRWQueue > Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit > various workloads > --- > > Key: HBASE-27962 > URL: https://issues.apache.org/jira/browse/HBASE-27962 > Project: HBase > Issue Type: Improvement >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > > We currently use the FastPathRWQueue -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27962) Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads
Yutong Xiao created HBASE-27962: --- Summary: Introduce an AdaptiveFastPathRWRpcExecutor to make the W/R/S separations fit various workloads Key: HBASE-27962 URL: https://issues.apache.org/jira/browse/HBASE-27962 Project: HBase Issue Type: Improvement Reporter: Yutong Xiao Assignee: Yutong Xiao -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HBASE-27246) RSGroupMappingScript#getRSGroup has thread safety problem
[ https://issues.apache.org/jira/browse/HBASE-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27246: Summary: RSGroupMappingScript#getRSGroup has thread safety problem (was: RSGroupMappingScript#getRSGroup Should be Synchronized) > RSGroupMappingScript#getRSGroup has thread safety problem > - > > Key: HBASE-27246 > URL: https://issues.apache.org/jira/browse/HBASE-27246 > Project: HBase > Issue Type: Bug > Components: rsgroup >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Test.java, result.png > > > We are using version 1.4.12 and occasionally hit a problem during the table > creation phase. The master branch also has this problem. The error message is: > {code:java} > 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR > HBaseServiceImpl - hbase create table: xxx: failed. > (HBaseServiceImpl.java:116) > java.lang.RuntimeException: > org.apache.hadoop.hbase.constraint.ConstraintException: > org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup > (default > default) for this table's namespace does not exist. > {code} > The rsgroup here should be a single 'default', not two consecutive 'default's. > The code to get the RSGroup from a mapping script is: > {code:java} > String getRSGroup(String namespace, String tablename) { > if (rsgroupMappingScript == null) { > return null; > } > String[] exec = rsgroupMappingScript.getExecString(); > exec[1] = namespace; > exec[2] = tablename; > try { > rsgroupMappingScript.execute(); > } catch (IOException e) { > // This exception may happen, e.g. when the process doesn't have > // permission to run this script. > LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), > TableName.valueOf(namespace, tablename)); > return RSGroupInfo.DEFAULT_GROUP; > } > return rsgroupMappingScript.getOutput().trim(); > } > {code} > Here rsgroupMappingScript can be executed by multiple threads concurrently.
> To verify that this is a multi-threading issue, I ran a piece of code locally > and found that hadoop's ShellCommandExecutor is not thread-safe (I ran the > code with hadoop 2.10.0 and 3.3.2). Therefore we should make this method > synchronized. Besides, this issue is also present in the master branch. > The test code is attached and my rsgroup mapping script is very simple: > {code:java} > #!/bin/bash > namespace=$1 > tablename=$2 > echo default > {code} > The reproduced screenshot is also attached. -- This message was sent by Atlassian Jira (v8.20.10#820010)
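The race and the proposed fix can be sketched in isolation. ScriptShell below is a hypothetical stand-in for hadoop's ShellCommandExecutor (a mutable argument array plus an output buffer that are reused across calls); the actual patch simply makes getRSGroup synchronized.

```java
// Self-contained sketch of the fix: serialize access to a shared, stateful
// script executor. ScriptShell is an assumed stand-in, not the real
// org.apache.hadoop.util.Shell.ShellCommandExecutor.
public class RSGroupMapperSketch {
    static class ScriptShell {
        final String[] exec = new String[] { "rsgroup-mapping.sh", null, null };
        final StringBuilder output = new StringBuilder();

        void execute() throws InterruptedException {
            output.setLength(0);
            Thread.sleep(1);            // widen the window for interleaving
            output.append("default\n"); // pretend the script echoed its rsgroup
        }
    }

    private final ScriptShell shell = new ScriptShell();

    // Without synchronized, two threads can interleave writes to the shared
    // exec array and output buffer, producing garbage such as the
    // "default default" rsgroup seen in the error message above.
    synchronized String getRSGroup(String namespace, String tablename)
            throws InterruptedException {
        shell.exec[1] = namespace;
        shell.exec[2] = tablename;
        shell.execute();
        return shell.output.toString().trim();
    }
}
```

With the synchronized keyword in place, concurrent callers each observe a complete, untruncated script output.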
[jira] [Updated] (HBASE-27246) RSGroupMappingScript#getRSGroup Should be Synchronized
[ https://issues.apache.org/jira/browse/HBASE-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27246: Description: We are using version 1.4.12 and met a problem in table creation phase some time. The master branch also has this problem. The error message is: {code:java} 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR HBaseServiceImpl - hbase create table: xxx: failed. (HBaseServiceImpl.java:116) java.lang.RuntimeException: org.apache.hadoop.hbase.constraint.ConstraintException: org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup (default default) for this table's namespace does not exist. {code} The rsgroup here should be one 'default' but not two consecutive 'default'. The code to get RSGroup from a mapping script is: {code:java} String getRSGroup(String namespace, String tablename) { if (rsgroupMappingScript == null) { return null; } String[] exec = rsgroupMappingScript.getExecString(); exec[1] = namespace; exec[2] = tablename; try { rsgroupMappingScript.execute(); } catch (IOException e) { // This exception may happen, like process doesn't have permission to run this script. LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), TableName.valueOf(namespace, tablename)); return RSGroupInfo.DEFAULT_GROUP; } return rsgroupMappingScript.getOutput().trim(); } {code} here the rsgourpMappingScript could be executed by multi-threads. To test it is a multi-thread issue, I ran a piece of code locally and found that the hadoop ShellCommandExecutor is not thread-safe (I run the code with hadoop 2.10.0 and 3.3.2). So that we should make this method synchronized. Besides, I found that this issue is retained in master branch also. The test code is attached and my rsgroup mapping script is very simple: {code:java} #!/bin/bash namespace=$1 tablename=$2 echo default {code} The reproduced screenshot is also attached. 
was: We are using version 1.4.12 and met a problem in table creation phase some time. The error message is: {code:java} 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR HBaseServiceImpl - hbase create table: xxx: failed. (HBaseServiceImpl.java:116) java.lang.RuntimeException: org.apache.hadoop.hbase.constraint.ConstraintException: org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup (default default) for this table's namespace does not exist. {code} The rsgroup here should be one 'default' but not two consecutive 'default'. The code to get RSGroup from a mapping script is: {code:java} String getRSGroup(String namespace, String tablename) { if (rsgroupMappingScript == null) { return null; } String[] exec = rsgroupMappingScript.getExecString(); exec[1] = namespace; exec[2] = tablename; try { rsgroupMappingScript.execute(); } catch (IOException e) { // This exception may happen, like process doesn't have permission to run this script. LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), TableName.valueOf(namespace, tablename)); return RSGroupInfo.DEFAULT_GROUP; } return rsgroupMappingScript.getOutput().trim(); } {code} here the rsgourpMappingScript could be executed by multi-threads. To test it is a multi-thread issue, I ran a piece of code locally and found that the hadoop ShellCommandExecutor is not thread-safe (I run the code with hadoop 2.10.0 and 3.3.2). So that we should make this method synchronized. Besides, I found that this issue is retained in master branch also. The test code is attached and my rsgroup mapping script is very simple: {code:java} #!/bin/bash namespace=$1 tablename=$2 echo default {code} The reproduced screenshot is also attached. 
> RSGroupMappingScript#getRSGroup Should be Synchronized > -- > > Key: HBASE-27246 > URL: https://issues.apache.org/jira/browse/HBASE-27246 > Project: HBase > Issue Type: Bug > Components: rsgroup >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Test.java, result.png > > > We are using version 1.4.12 and met a problem in table creation phase some > time. The master branch also has this problem. The error message is: > {code:java} > 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR > HBaseServiceImpl - hbase create table: xxx: failed. > (HBaseServiceImpl.java:116) > java.lang.RuntimeException: > org.apache.hadoop.hbase.constraint.ConstraintException: > org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup > (default > default) for this table's namespace does not
[jira] [Updated] (HBASE-27246) RSGroupMappingScript#getRSGroup Should be Synchronized
[ https://issues.apache.org/jira/browse/HBASE-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27246: Component/s: rsgroup > RSGroupMappingScript#getRSGroup Should be Synchronized > -- > > Key: HBASE-27246 > URL: https://issues.apache.org/jira/browse/HBASE-27246 > Project: HBase > Issue Type: Bug > Components: rsgroup >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Test.java, result.png > > > We are using version 1.4.12 and met a problem in table creation phase some > time. The error message is: > {code:java} > 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR > HBaseServiceImpl - hbase create table: xxx: failed. > (HBaseServiceImpl.java:116) > java.lang.RuntimeException: > org.apache.hadoop.hbase.constraint.ConstraintException: > org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup > (default > default) for this table's namespace does not exist. > {code} > The rsgroup here should be one 'default' but not two consecutive 'default'. > The code to get RSGroup from a mapping script is: > {code:java} > String getRSGroup(String namespace, String tablename) { > if (rsgroupMappingScript == null) { > return null; > } > String[] exec = rsgroupMappingScript.getExecString(); > exec[1] = namespace; > exec[2] = tablename; > try { > rsgroupMappingScript.execute(); > } catch (IOException e) { > // This exception may happen, like process doesn't have permission to > run this script. > LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), > TableName.valueOf(namespace, tablename)); > return RSGroupInfo.DEFAULT_GROUP; > } > return rsgroupMappingScript.getOutput().trim(); > } > {code} > here the rsgourpMappingScript could be executed by multi-threads. > To test it is a multi-thread issue, I ran a piece of code locally and found > that the hadoop ShellCommandExecutor is not thread-safe (I run the code with > hadoop 2.10.0 and 3.3.2). 
So that we should make this method synchronized. > Besides, I found that this issue is retained in master branch also. > The test code is attached and my rsgroup mapping script is very simple: > {code:java} > #!/bin/bash > namespace=$1 > tablename=$2 > echo default > {code} > The reproduced screenshot is also attached. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HBASE-27246) RSGroupMappingScript#getRSGroup Should be Synchronized
[ https://issues.apache.org/jira/browse/HBASE-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27246: Description: We are using version 1.4.12 and met a problem in table creation phase some time. The error message is: {code:java} 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR HBaseServiceImpl - hbase create table: xxx: failed. (HBaseServiceImpl.java:116) java.lang.RuntimeException: org.apache.hadoop.hbase.constraint.ConstraintException: org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup (default default) for this table's namespace does not exist. {code} The rsgroup here should be one 'default' but not two consecutive 'default'. The code to get RSGroup from a mapping script is: {code:java} String getRSGroup(String namespace, String tablename) { if (rsgroupMappingScript == null) { return null; } String[] exec = rsgroupMappingScript.getExecString(); exec[1] = namespace; exec[2] = tablename; try { rsgroupMappingScript.execute(); } catch (IOException e) { // This exception may happen, like process doesn't have permission to run this script. LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), TableName.valueOf(namespace, tablename)); return RSGroupInfo.DEFAULT_GROUP; } return rsgroupMappingScript.getOutput().trim(); } {code} here the rsgourpMappingScript could be executed by multi-threads. To test it is a multi-thread issue, I ran a piece of code locally and found that the hadoop ShellCommandExecutor is not thread-safe (I run the code with hadoop 2.10.0 and 3.3.2). So that we should make this method synchronized. Besides, I found that this issue is retained in master branch also. The test code is attached and my rsgroup mapping script is very simple: {code:java} #!/bin/bash namespace=$1 tablename=$2 echo default {code} The reproduced screenshot is also attached. was: We are using version 1.4.12 and met a problem in table creation phase some time. 
The error message is: {code:java} 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR HBaseServiceImpl - hbase create table: xxx: failed. (HBaseServiceImpl.java:116) java.lang.RuntimeException: org.apache.hadoop.hbase.constraint.ConstraintException: org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup (default default) for this table's namespace does not exist. {code} The rsgroup here should be one 'default' but not two consecutive 'default'. The code to get RSGroup from a mapping script is: {code:java} String getRSGroup(String namespace, String tablename) { if (rsgroupMappingScript == null) { return null; } String[] exec = rsgroupMappingScript.getExecString(); exec[1] = namespace; exec[2] = tablename; try { rsgroupMappingScript.execute(); } catch (IOException e) { // This exception may happen, like process doesn't have permission to run this script. LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), TableName.valueOf(namespace, tablename)); return RSGroupInfo.DEFAULT_GROUP; } return rsgroupMappingScript.getOutput().trim(); } {code} here the rsgourpMappingScript could be executed by multi-threads. To test it is a multi-thread issue, I ran a piece of code locally and found that the hadoop ShellCommandExecutor is not thread-safe (I run the code with hadoop 2.10.0 and 3.3.2). So that we should make this method synchronized. The test code is attached and my rsgroup mapping script is very simple: {code:java} #!/bin/bash namespace=$1 tablename=$2 echo default {code} The reproduced screenshot is also attached. > RSGroupMappingScript#getRSGroup Should be Synchronized > -- > > Key: HBASE-27246 > URL: https://issues.apache.org/jira/browse/HBASE-27246 > Project: HBase > Issue Type: Bug >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Test.java, result.png > > > We are using version 1.4.12 and met a problem in table creation phase some > time. 
The error message is: > {code:java} > 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR > HBaseServiceImpl - hbase create table: xxx: failed. > (HBaseServiceImpl.java:116) > java.lang.RuntimeException: > org.apache.hadoop.hbase.constraint.ConstraintException: > org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup > (default > default) for this table's namespace does not exist. > {code} > The rsgroup here should be one 'default' but not two consecutive 'default'. > The code to get RSGroup from a mapping script is: > {code:java} > String
[jira] [Updated] (HBASE-27246) RSGroupMappingScript#getRSGroup Should be Synchronized
[ https://issues.apache.org/jira/browse/HBASE-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27246: Summary: RSGroupMappingScript#getRSGroup Should be Synchronized (was: RSGroupMappingScript#getRSGroup should be synchronized) > RSGroupMappingScript#getRSGroup Should be Synchronized > -- > > Key: HBASE-27246 > URL: https://issues.apache.org/jira/browse/HBASE-27246 > Project: HBase > Issue Type: Bug >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Test.java, result.png > > > We are using version 1.4.12 and met a problem in table creation phase some > time. The error message is: > {code:java} > 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR > HBaseServiceImpl - hbase create table: xxx: failed. > (HBaseServiceImpl.java:116) > java.lang.RuntimeException: > org.apache.hadoop.hbase.constraint.ConstraintException: > org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup > (default > default) for this table's namespace does not exist. > {code} > The rsgroup here should be one 'default' but not two consecutive 'default'. > The code to get RSGroup from a mapping script is: > {code:java} > String getRSGroup(String namespace, String tablename) { > if (rsgroupMappingScript == null) { > return null; > } > String[] exec = rsgroupMappingScript.getExecString(); > exec[1] = namespace; > exec[2] = tablename; > try { > rsgroupMappingScript.execute(); > } catch (IOException e) { > // This exception may happen, like process doesn't have permission to > run this script. > LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), > TableName.valueOf(namespace, tablename)); > return RSGroupInfo.DEFAULT_GROUP; > } > return rsgroupMappingScript.getOutput().trim(); > } > {code} > here the rsgourpMappingScript could be executed by multi-threads. 
> To test it is a multi-thread issue, I ran a piece of code locally and found > that the hadoop ShellCommandExecutor is not thread-safe (I run the code with > hadoop 2.10.0 and 3.3.2). So that we should make this method synchronized. > The test code is attached and my rsgroup mapping script is very simple: > {code:java} > #!/bin/bash > namespace=$1 > tablename=$2 > echo default > {code} > The reproduced screenshot is also attached. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HBASE-27246) RSGroupMappingScript#getRSGroup should be synchronized
[ https://issues.apache.org/jira/browse/HBASE-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27246: Summary: RSGroupMappingScript#getRSGroup should be synchronized (was: RSGroupMappingScript#getRSGroup should be synchronised) > RSGroupMappingScript#getRSGroup should be synchronized > -- > > Key: HBASE-27246 > URL: https://issues.apache.org/jira/browse/HBASE-27246 > Project: HBase > Issue Type: Bug >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Test.java, result.png > > > We are using version 1.4.12 and met a problem in table creation phase some > time. The error message is: > {code:java} > 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR > HBaseServiceImpl - hbase create table: xxx: failed. > (HBaseServiceImpl.java:116) > java.lang.RuntimeException: > org.apache.hadoop.hbase.constraint.ConstraintException: > org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup > (default > default) for this table's namespace does not exist. > {code} > The rsgroup here should be one 'default' but not two consecutive 'default'. > The code to get RSGroup from a mapping script is: > {code:java} > String getRSGroup(String namespace, String tablename) { > if (rsgroupMappingScript == null) { > return null; > } > String[] exec = rsgroupMappingScript.getExecString(); > exec[1] = namespace; > exec[2] = tablename; > try { > rsgroupMappingScript.execute(); > } catch (IOException e) { > // This exception may happen, like process doesn't have permission to > run this script. > LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), > TableName.valueOf(namespace, tablename)); > return RSGroupInfo.DEFAULT_GROUP; > } > return rsgroupMappingScript.getOutput().trim(); > } > {code} > here the rsgourpMappingScript could be executed by multi-threads. 
> To test it is a multi-thread issue, I ran a piece of code locally and found > that the hadoop ShellCommandExecutor is not thread-safe (I run the code with > hadoop 2.10.0 and 3.3.2). So that we should make this method synchronized. > The test code is attached and my rsgroup mapping script is very simple: > {code:java} > #!/bin/bash > namespace=$1 > tablename=$2 > echo default > {code} > The reproduced screenshot is also attached. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HBASE-27246) RSGroupMappingScript#getRSGroup should be synchronised
[ https://issues.apache.org/jira/browse/HBASE-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27246: Attachment: result.png > RSGroupMappingScript#getRSGroup should be synchronised > -- > > Key: HBASE-27246 > URL: https://issues.apache.org/jira/browse/HBASE-27246 > Project: HBase > Issue Type: Bug >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Test.java, result.png > > > We are using version 1.4.12 and met a problem in table creation phase some > time. The error message is: > {code:java} > 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR > HBaseServiceImpl - hbase create table: xxx: failed. > (HBaseServiceImpl.java:116) > java.lang.RuntimeException: > org.apache.hadoop.hbase.constraint.ConstraintException: > org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup > (default > default) for this table's namespace does not exist. > {code} > The rsgroup here should be one 'default' but not two consecutive 'default'. > The code to get RSGroup from a mapping script is: > {code:java} > String getRSGroup(String namespace, String tablename) { > if (rsgroupMappingScript == null) { > return null; > } > String[] exec = rsgroupMappingScript.getExecString(); > exec[1] = namespace; > exec[2] = tablename; > try { > rsgroupMappingScript.execute(); > } catch (IOException e) { > // This exception may happen, like process doesn't have permission to > run this script. > LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), > TableName.valueOf(namespace, tablename)); > return RSGroupInfo.DEFAULT_GROUP; > } > return rsgroupMappingScript.getOutput().trim(); > } > {code} > here the rsgourpMappingScript could be executed by multi-threads. > To test it is a multi-thread issue, I ran a piece of code locally and found > that the hadoop ShellCommandExecutor is not thread-safe (I run the code with > hadoop 2.10.0 and 3.3.2). 
So that we should make this method synchronized. > The test code is attached and my rsgroup mapping script is very simple: > {code:java} > #!/bin/bash > namespace=$1 > tablename=$2 > echo default > {code} > The reproduced screenshot is also attached. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HBASE-27246) RSGroupMappingScript#getRSGroup should be synchronised
[ https://issues.apache.org/jira/browse/HBASE-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27246: Description: We are using version 1.4.12 and met a problem in table creation phase some time. The error message is: {code:java} 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR HBaseServiceImpl - hbase create table: xxx: failed. (HBaseServiceImpl.java:116) java.lang.RuntimeException: org.apache.hadoop.hbase.constraint.ConstraintException: org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup (default default) for this table's namespace does not exist. {code} The rsgroup here should be one 'default' but not two consecutive 'default'. The code to get RSGroup from a mapping script is: {code:java} String getRSGroup(String namespace, String tablename) { if (rsgroupMappingScript == null) { return null; } String[] exec = rsgroupMappingScript.getExecString(); exec[1] = namespace; exec[2] = tablename; try { rsgroupMappingScript.execute(); } catch (IOException e) { // This exception may happen, like process doesn't have permission to run this script. LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), TableName.valueOf(namespace, tablename)); return RSGroupInfo.DEFAULT_GROUP; } return rsgroupMappingScript.getOutput().trim(); } {code} here the rsgourpMappingScript could be executed by multi-threads. To test it is a multi-thread issue, I ran a piece of code locally and found that the hadoop ShellCommandExecutor is not thread-safe (I run the code with hadoop 2.10.0 and 3.3.2). So that we should make this method synchronized. The test code is attached and my rsgroup mapping script is very simple: {code:java} #!/bin/bash namespace=$1 tablename=$2 echo default {code} The reproduced screenshot is also attached. was: We are using version 1.4.12 and met a problem in table creation phase some time. 
The error message is: {code:java} 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR HBaseServiceImpl - hbase create table: xxx: failed. (HBaseServiceImpl.java:116) java.lang.RuntimeException: org.apache.hadoop.hbase.constraint.ConstraintException: org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup (default default) for this table's namespace does not exist. {code} The rsgroup here should be one 'default' but not two consecutive 'default'. The code to get RSGroup from a mapping script is: {code:java} String getRSGroup(String namespace, String tablename) { if (rsgroupMappingScript == null) { return null; } String[] exec = rsgroupMappingScript.getExecString(); exec[1] = namespace; exec[2] = tablename; try { rsgroupMappingScript.execute(); } catch (IOException e) { // This exception may happen, like process doesn't have permission to run this script. LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), TableName.valueOf(namespace, tablename)); return RSGroupInfo.DEFAULT_GROUP; } return rsgroupMappingScript.getOutput().trim(); } {code} here the rsgourpMappingScript could be executed by multi-threads. To test it is a multi-thread issue, I ran a piece of code locally and found that the hadoop ShellCommandExecutor is not thread-safe (I run the code with hadoop 2.10.0 and 3.3.2). So that we should make this method synchronized. > RSGroupMappingScript#getRSGroup should be synchronised > -- > > Key: HBASE-27246 > URL: https://issues.apache.org/jira/browse/HBASE-27246 > Project: HBase > Issue Type: Bug >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Test.java, result.png > > > We are using version 1.4.12 and met a problem in table creation phase some > time. The error message is: > {code:java} > 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR > HBaseServiceImpl - hbase create table: xxx: failed. 
> (HBaseServiceImpl.java:116) > java.lang.RuntimeException: > org.apache.hadoop.hbase.constraint.ConstraintException: > org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup > (default > default) for this table's namespace does not exist. > {code} > The rsgroup here should be one 'default' but not two consecutive 'default'. > The code to get RSGroup from a mapping script is: > {code:java} > String getRSGroup(String namespace, String tablename) { > if (rsgroupMappingScript == null) { > return null; > } > String[] exec = rsgroupMappingScript.getExecString(); > exec[1] = namespace; > exec[2] = tablename; > try { >
[jira] [Created] (HBASE-27246) RSGroupMappingScript#getRSGroup should be synchronised
Yutong Xiao created HBASE-27246: --- Summary: RSGroupMappingScript#getRSGroup should be synchronised Key: HBASE-27246 URL: https://issues.apache.org/jira/browse/HBASE-27246 Project: HBase Issue Type: Bug Reporter: Yutong Xiao Assignee: Yutong Xiao Attachments: Test.java We are using version 1.4.12 and met a problem in table creation phase some time. The error message is: {code:java} 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR HBaseServiceImpl - hbase create table: xxx: failed. (HBaseServiceImpl.java:116) java.lang.RuntimeException: org.apache.hadoop.hbase.constraint.ConstraintException: org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup (default default) for this table's namespace does not exist. {code} The rsgroup here should be one 'default' but not two consecutive 'default'. The code to get RSGroup from a mapping script is: {code:java} String getRSGroup(String namespace, String tablename) { if (rsgroupMappingScript == null) { return null; } String[] exec = rsgroupMappingScript.getExecString(); exec[1] = namespace; exec[2] = tablename; try { rsgroupMappingScript.execute(); } catch (IOException e) { // This exception may happen, like process doesn't have permission to run this script. LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), TableName.valueOf(namespace, tablename)); return RSGroupInfo.DEFAULT_GROUP; } return rsgroupMappingScript.getOutput().trim(); } {code} here the rsgourpMappingScript could be executed by multi-threads. To test it is a multi-thread issue, I ran a piece of code locally and found that the hadoop ShellCommandExecutor is not thread-safe (I run the code with hadoop 2.10.0 and 3.3.2). So that we should make this method synchronized. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HBASE-27246) RSGroupMappingScript#getRSGroup should be synchronised
[ https://issues.apache.org/jira/browse/HBASE-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-27246: Attachment: Test.java > RSGroupMappingScript#getRSGroup should be synchronised > -- > > Key: HBASE-27246 > URL: https://issues.apache.org/jira/browse/HBASE-27246 > Project: HBase > Issue Type: Bug >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Test.java > > > We are using version 1.4.12 and met a problem in table creation phase some > time. The error message is: > {code:java} > 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR > HBaseServiceImpl - hbase create table: xxx: failed. > (HBaseServiceImpl.java:116) > java.lang.RuntimeException: > org.apache.hadoop.hbase.constraint.ConstraintException: > org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup > (default > default) for this table's namespace does not exist. > {code} > The rsgroup here should be one 'default' but not two consecutive 'default'. > The code to get RSGroup from a mapping script is: > {code:java} > String getRSGroup(String namespace, String tablename) { > if (rsgroupMappingScript == null) { > return null; > } > String[] exec = rsgroupMappingScript.getExecString(); > exec[1] = namespace; > exec[2] = tablename; > try { > rsgroupMappingScript.execute(); > } catch (IOException e) { > // This exception may happen, like process doesn't have permission to > run this script. > LOG.error("{}, placing {} back to default rsgroup", e.getMessage(), > TableName.valueOf(namespace, tablename)); > return RSGroupInfo.DEFAULT_GROUP; > } > return rsgroupMappingScript.getOutput().trim(); > } > {code} > here the rsgourpMappingScript could be executed by multi-threads. > To test it is a multi-thread issue, I ran a piece of code locally and found > that the hadoop ShellCommandExecutor is not thread-safe (I run the code with > hadoop 2.10.0 and 3.3.2). So that we should make this method synchronized. 
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (HBASE-23158) If KVs are in memstore, small batch get can come across MultiActionResultTooLarge
[ https://issues.apache.org/jira/browse/HBASE-23158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17489351#comment-17489351 ] Yutong Xiao edited comment on HBASE-23158 at 2/10/22, 6:00 AM: --- Encountered this problem as well. Could we just use the number of cells times a predefined block size to approximate the total block size (one cell, one block)? Cells in the memstore will be flushed shortly, so I think we can simply regard them as cells read from HFiles. This may resolve this annoying exception. The drawback is that the approximation is rough (different tables can have different block sizes). was (Author: xytss123): Encountered this problem also. For this problem, could we just use the number of cells times a predefined blocksize to approximate the total block size (one cell, one block)? For cells in memstore, they will be flushed in short time, so that we can just regard them as cells read from HFiles I think. This may solve this annoying exception. The problem is the approximation is rough granularity (different table can have different blocksize). What do you think? > If KVs are in memstore, small batch get can come across > MultiActionResultTooLarge > -- > > Key: HBASE-23158 > URL: https://issues.apache.org/jira/browse/HBASE-23158 > Project: HBase > Issue Type: Bug > Components: regionserver, rpc > Environment: [^TestMultiRespectsLimitsMemstore.patch] >Reporter: junfei liang >Priority: Minor > Attachments: TestMultiRespectsLimitsMemstore.patch > > > To protect against big scans, we set hbase.server.scanner.max.result.size = > 10MB in our customer hbase cluster; however, our clients can hit > MultiActionResultTooLarge even on small batch gets (e.g. a 15-item batch get > with a row size of about 5KB).
> after [HBASE-14978|https://issues.apache.org/jira/browse/HBASE-14978] hbase > take the data block reference into consideration, but the block size is 64KB > (the default value ), even if all cells are from different block , the block > size retained is less than 1MB, so what's the problem ? > finally i found that HBASE-14978 also consider the cell in memstore, as > MSLAB is enabled default, so if the cell is from memstore, cell backend array > can be large (2MB as default), so even if a small batch can meet this error, > is this reasonable ? > plus: > when throw MultiActionResultTooLarge exception, hbase client should retry > ignore rpc retry num, however if set retry num to zero, client will fail > without retry in this case. > > see attachment TestMultiRespectsLimitsMemstore for details. > -- This message was sent by Atlassian Jira (v8.20.1#820001)
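The arithmetic behind this report can be made concrete. With the default 64 KB HFile block, 15 gets touching 15 distinct blocks retain under 1 MB, well inside the 10 MB `hbase.server.scanner.max.result.size` quota; but when each cell lives in the memstore backed by a default 2 MB MSLAB chunk, the same 15 gets are accounted as about 30 MB. The sketch below is back-of-the-envelope math using only the default sizes quoted in this thread; the class and method names are illustrative, not HBase APIs.

```java
// Back-of-the-envelope accounting for HBASE-23158, using the defaults quoted
// in this thread (64 KB HFile blocks, 2 MB MSLAB chunks, 10 MB quota).
class BlockQuotaMath {
    static final long HFILE_BLOCK = 64L * 1024;        // default HFile block size
    static final long MSLAB_CHUNK = 2L * 1024 * 1024;  // default MSLAB chunk size
    static final long QUOTA = 10L * 1024 * 1024;       // max.result.size setting

    // Retained size if every cell pins one distinct HFile data block.
    static long fromHFiles(int cells) { return cells * HFILE_BLOCK; }

    // Retained size if every cell instead pins its backing MSLAB chunk.
    static long fromMemstore(int cells) { return cells * MSLAB_CHUNK; }
}
```

A 15-row batch stays under quota when read from HFiles (15 × 64 KB ≈ 0.94 MB) but trips it when read from the memstore (15 × 2 MB = 30 MB), which is why such a small batch can see MultiActionResultTooLarge.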
[jira] [Commented] (HBASE-23158) If KVs are in memstore, small batch get can come across MultiActionResultTooLarge
[ https://issues.apache.org/jira/browse/HBASE-23158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17489351#comment-17489351 ] Yutong Xiao commented on HBASE-23158: - Encountered this problem as well. Could we just use the number of cells times a predefined block size to approximate the total block size (one cell, one block)? Cells in the memstore will be flushed in a short time, so I think we can just regard them as cells read from HFiles. This may solve this annoying exception. The drawback is that the approximation is rough (different tables can have different block sizes). What do you think? > If KVs are in memstore, small batch get can come across > MultiActionResultTooLarge > -- > > Key: HBASE-23158 > URL: https://issues.apache.org/jira/browse/HBASE-23158 > Project: HBase > Issue Type: Bug > Components: regionserver, rpc > Environment: [^TestMultiRespectsLimitsMemstore.patch] >Reporter: junfei liang >Priority: Minor > Attachments: TestMultiRespectsLimitsMemstore.patch > > > to protect against big scan, we set hbase.server.scanner.max.result.size = > 10MB in our customer hbase cluster, however our clients can meet > MultiActionResultTooLarge even in small batch get (for ex. 15 batch get, and > row size is about 5KB ) . > after [HBASE-14978|https://issues.apache.org/jira/browse/HBASE-14978] hbase > take the data block reference into consideration, but the block size is 64KB > (the default value ), even if all cells are from different block , the block > size retained is less than 1MB, so what's the problem ? > finally i found that HBASE-14978 also consider the cell in memstore, as > MSLAB is enabled default, so if the cell is from memstore, cell backend array > can be large (2MB as default), so even if a small batch can meet this error, > is this reasonable ? 
> plus: > when throw MultiActionResultTooLarge exception, hbase client should retry > ignore rpc retry num, however if set retry num to zero, client will fail > without retry in this case. > > see attachment TestMultiRespectsLimitsMemstore for details. > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Comment Edited] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17483509#comment-17483509 ] Yutong Xiao edited comment on HBASE-26688 at 1/28/22, 1:33 AM: --- For non-empty Result objects, the logic is the same after the change. This only changes the behaviour of Result with empty cell list, which used to have a thread safety bug. A more common usage is to check the return value of Result#advance once first, if it is false, the client should search the next result. This usage is impacted by this bug. Even if the client wants to catch the exception, they may get the exception at the first time of advance() but not the second time, which is not the original logic. By the way, in the original code, if the cell list is null, the method also always returns false and never raises the exception. Result#isEmpty with null cell list and Result#isEmpty with empty list all return true. But in the original code, "empty" Results have different behaviours. Should we also care about this? This is actually a bug. If we want to keep return NoSuchElementException for Result with empty list. My opinion is we can change the Result class thread safe. (e.g. make Result#cellScannerIndex threadlocal?) was (Author: xytss123): For non-empty Result objects, the logic is the same after the change. This only changes the behaviour of Result with empty cell list, which used to have a thread safety bug. A more common usage is to check the return value of Result#advance once first, if it is false, the client should search the next result. This usage is impacted by this bug. Even if the client wants to catch the exception, they may get the exception at the first time of advance() but not the second time, which is not the original logic. By the way, in the original code, if the cell list is null, the method also always returns false and never raises the exception. 
Result#isEmpty with null cell list and Result#isEmpty with empty list all return true. But in the original code, "empty" Results have different behaviours. This is actually a bug. If we want to keep return NoSuchElementException for Result with empty list. My opinion is we can change the Result class thread safe. (e.g. make Result#cellScannerIndex threadlocal?) > Threads shared EMPTY_RESULT may lead to unexpected client job down. > --- > > Key: HBASE-26688 > URL: https://issues.apache.org/jira/browse/HBASE-26688 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.10 > > > Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce > the object creation. But these objects could be shared by multi client > threads. The Result#cellScannerIndex related methods could throw confusing > exception and make the client job down. Could refine the logic of these > methods. > The precreated objects in ProtoBufUtil.java: > {code:java} > private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; > private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); > final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); > final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); > private final static Result EMPTY_RESULT_STALE = > Result.create(EMPTY_CELL_ARRAY, null, true); > {code} > Result#advance > {code:java} > public boolean advance() { > if (cells == null) return false; > cellScannerIndex++; > if (cellScannerIndex < this.cells.length) { > return true; > } else if (cellScannerIndex == this.cells.length) { > return false; > } > // The index of EMPTY_RESULT could be incremented by multi threads and > throw this exception. 
> throw new NoSuchElementException("Cannot advance beyond the last cell"); > } > {code} > Result#current > {code:java} > public Cell current() { > if (cells == null > || cellScannerIndex == INITIAL_CELLSCANNER_INDEX > || cellScannerIndex >= cells.length) > return null; >// Although it is almost impossible, >// We can arrive here when the client threads share the common reused > EMPTY_RESULT. > return this.cells[cellScannerIndex]; > } > {code} > In this case, the client can easily got confusing exceptions even if they use > different connections, tables in different threads. > We should change the if condition cells == null to isEmpty() to avoid the > client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
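The proposed `cells == null` → `isEmpty()` change can be sketched on a stripped-down Result. `MiniResult` below is a hypothetical stand-in, not the real org.apache.hadoop.hbase.client.Result; it keeps only the fields the description quotes. With the fix, an empty but non-null cell array behaves like a null one: repeated `advance()` calls on a shared empty instance can only ever return false, instead of eventually overrunning `cells.length` and throwing.

```java
import java.util.NoSuchElementException;

// Stripped-down sketch of Result#advance with the proposed fix applied.
class MiniResult {
    static final int INITIAL_CELLSCANNER_INDEX = -1;
    private final Object[] cells; // stand-in for Cell[]
    private int cellScannerIndex = INITIAL_CELLSCANNER_INDEX;

    MiniResult(Object[] cells) { this.cells = cells; }

    boolean isEmpty() { return cells == null || cells.length == 0; }

    boolean advance() {
        if (isEmpty()) return false; // was: if (cells == null) return false;
        cellScannerIndex++;
        if (cellScannerIndex < cells.length) return true;
        if (cellScannerIndex == cells.length) return false;
        throw new NoSuchElementException("Cannot advance beyond the last cell");
    }
}
```

A non-empty Result keeps its original single-thread contract (true for each cell, then false, then the exception on further calls); only the shared empty instances stop throwing.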
[jira] [Comment Edited] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17483509#comment-17483509 ] Yutong Xiao edited comment on HBASE-26688 at 1/28/22, 1:33 AM: --- For non-empty Result objects, the logic is the same after the change. This only changes the behaviour of Result with empty cell list, which used to have a thread safety bug. A more common usage is to check the return value of Result#advance once first, if it is false, the client should search the next result. This usage is impacted by this bug. Even if the client wants to catch the exception, they may get the exception at the first time of advance() but not the second time, which is not the original logic. By the way, in the original code, if the cell list is null, the method also always returns false and never raises the exception. Result#isEmpty with null cell list and Result#isEmpty with empty list all return true. But in the original code, "empty" Results have different behaviours. Should we also care about this? If we want to keep return NoSuchElementException for Result with empty list. My opinion is we can change the Result class thread safe. (e.g. make Result#cellScannerIndex threadlocal?) was (Author: xytss123): For non-empty Result objects, the logic is the same after the change. This only changes the behaviour of Result with empty cell list, which used to have a thread safety bug. A more common usage is to check the return value of Result#advance once first, if it is false, the client should search the next result. This usage is impacted by this bug. Even if the client wants to catch the exception, they may get the exception at the first time of advance() but not the second time, which is not the original logic. By the way, in the original code, if the cell list is null, the method also always returns false and never raises the exception. Result#isEmpty with null cell list and Result#isEmpty with empty list all return true. 
But in the original code, "empty" Results have different behaviours. Should we also care about this? This is actually a bug. If we want to keep return NoSuchElementException for Result with empty list. My opinion is we can change the Result class thread safe. (e.g. make Result#cellScannerIndex threadlocal?) > Threads shared EMPTY_RESULT may lead to unexpected client job down. > --- > > Key: HBASE-26688 > URL: https://issues.apache.org/jira/browse/HBASE-26688 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.10 > > > Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce > the object creation. But these objects could be shared by multi client > threads. The Result#cellScannerIndex related methods could throw confusing > exception and make the client job down. Could refine the logic of these > methods. > The precreated objects in ProtoBufUtil.java: > {code:java} > private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; > private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); > final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); > final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); > private final static Result EMPTY_RESULT_STALE = > Result.create(EMPTY_CELL_ARRAY, null, true); > {code} > Result#advance > {code:java} > public boolean advance() { > if (cells == null) return false; > cellScannerIndex++; > if (cellScannerIndex < this.cells.length) { > return true; > } else if (cellScannerIndex == this.cells.length) { > return false; > } > // The index of EMPTY_RESULT could be incremented by multi threads and > throw this exception. 
> throw new NoSuchElementException("Cannot advance beyond the last cell"); > } > {code} > Result#current > {code:java} > public Cell current() { > if (cells == null > || cellScannerIndex == INITIAL_CELLSCANNER_INDEX > || cellScannerIndex >= cells.length) > return null; >// Although it is almost impossible, >// We can arrive here when the client threads share the common reused > EMPTY_RESULT. > return this.cells[cellScannerIndex]; > } > {code} > In this case, the client can easily got confusing exceptions even if they use > different connections, tables in different threads. > We should change the if condition cells == null to isEmpty() to avoid the > client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17483509#comment-17483509 ] Yutong Xiao commented on HBASE-26688: - For non-empty Result objects, the logic is the same after the change. This only changes the behaviour of a Result with an empty cell list, which used to have a thread-safety bug. A more common usage is to check the return value of Result#advance first: if it is false, the client should move to the next result. This usage is impacted by this bug. Even if the client wants to catch the exception, they may get it on the first call to advance() but not the second, which differs from the original logic. By the way, in the original code, if the cell list is null, the method also always returns false and never raises the exception. Result#isEmpty returns true for both a null cell list and an empty list, but in the original code these "empty" Results have different behaviours. This is actually a bug. If we want to keep raising NoSuchElementException for a Result with an empty list, my opinion is that we could make the Result class thread-safe (e.g. make Result#cellScannerIndex a ThreadLocal?) > Threads shared EMPTY_RESULT may lead to unexpected client job down. > --- > > Key: HBASE-26688 > URL: https://issues.apache.org/jira/browse/HBASE-26688 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.10 > > > Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce > the object creation. But these objects could be shared by multi client > threads. The Result#cellScannerIndex related methods could throw confusing > exception and make the client job down. Could refine the logic of these > methods. 
> The precreated objects in ProtoBufUtil.java: > {code:java} > private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; > private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); > final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); > final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); > private final static Result EMPTY_RESULT_STALE = > Result.create(EMPTY_CELL_ARRAY, null, true); > {code} > Result#advance > {code:java} > public boolean advance() { > if (cells == null) return false; > cellScannerIndex++; > if (cellScannerIndex < this.cells.length) { > return true; > } else if (cellScannerIndex == this.cells.length) { > return false; > } > // The index of EMPTY_RESULT could be incremented by multi threads and > throw this exception. > throw new NoSuchElementException("Cannot advance beyond the last cell"); > } > {code} > Result#current > {code:java} > public Cell current() { > if (cells == null > || cellScannerIndex == INITIAL_CELLSCANNER_INDEX > || cellScannerIndex >= cells.length) > return null; >// Although it is almost impossible, >// We can arrive here when the client threads share the common reused > EMPTY_RESULT. > return this.cells[cellScannerIndex]; > } > {code} > In this case, the client can easily got confusing exceptions even if they use > different connections, tables in different threads. > We should change the if condition cells == null to isEmpty() to avoid the > client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17483139#comment-17483139 ] Yutong Xiao commented on HBASE-26688: - Right. And we have marked in release note. > Threads shared EMPTY_RESULT may lead to unexpected client job down. > --- > > Key: HBASE-26688 > URL: https://issues.apache.org/jira/browse/HBASE-26688 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.10 > > > Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce > the object creation. But these objects could be shared by multi client > threads. The Result#cellScannerIndex related methods could throw confusing > exception and make the client job down. Could refine the logic of these > methods. > The precreated objects in ProtoBufUtil.java: > {code:java} > private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; > private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); > final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); > final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); > private final static Result EMPTY_RESULT_STALE = > Result.create(EMPTY_CELL_ARRAY, null, true); > {code} > Result#advance > {code:java} > public boolean advance() { > if (cells == null) return false; > cellScannerIndex++; > if (cellScannerIndex < this.cells.length) { > return true; > } else if (cellScannerIndex == this.cells.length) { > return false; > } > // The index of EMPTY_RESULT could be incremented by multi threads and > throw this exception. 
> throw new NoSuchElementException("Cannot advance beyond the last cell"); > } > {code} > Result#current > {code:java} > public Cell current() { > if (cells == null > || cellScannerIndex == INITIAL_CELLSCANNER_INDEX > || cellScannerIndex >= cells.length) > return null; >// Although it is almost impossible, >// We can arrive here when the client threads share the common reused > EMPTY_RESULT. > return this.cells[cellScannerIndex]; > } > {code} > In this case, the client can easily got confusing exceptions even if they use > different connections, tables in different threads. > We should change the if condition cells == null to isEmpty() to avoid the > client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17483044#comment-17483044 ] Yutong Xiao commented on HBASE-26688: - [~zhangduo] I just wanted to rebase on master and revise the commit message, but I mixed up the commits, so I opened another PR and closed the old one. Sorry for the inconvenience. > Threads shared EMPTY_RESULT may lead to unexpected client job down. > --- > > Key: HBASE-26688 > URL: https://issues.apache.org/jira/browse/HBASE-26688 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.10 > > > Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce > the object creation. But these objects could be shared by multi client > threads. The Result#cellScannerIndex related methods could throw confusing > exception and make the client job down. Could refine the logic of these > methods. > The precreated objects in ProtoBufUtil.java: > {code:java} > private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; > private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); > final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); > final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); > private final static Result EMPTY_RESULT_STALE = > Result.create(EMPTY_CELL_ARRAY, null, true); > {code} > Result#advance > {code:java} > public boolean advance() { > if (cells == null) return false; > cellScannerIndex++; > if (cellScannerIndex < this.cells.length) { > return true; > } else if (cellScannerIndex == this.cells.length) { > return false; > } > // The index of EMPTY_RESULT could be incremented by multi threads and > throw this exception. 
> throw new NoSuchElementException("Cannot advance beyond the last cell"); > } > {code} > Result#current > {code:java} > public Cell current() { > if (cells == null > || cellScannerIndex == INITIAL_CELLSCANNER_INDEX > || cellScannerIndex >= cells.length) > return null; >// Although it is almost impossible, >// We can arrive here when the client threads share the common reused > EMPTY_RESULT. > return this.cells[cellScannerIndex]; > } > {code} > In this case, the client can easily got confusing exceptions even if they use > different connections, tables in different threads. > We should change the if condition cells == null to isEmpty() to avoid the > client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Release Note: Result#advance with empty cell list will always return false but not raise NoSuchElementException when called multiple times. (was: Result#advance with empty cell list will always return false but not raise NoSuchElementException when called twice.) > Threads shared EMPTY_RESULT may lead to unexpected client job down. > --- > > Key: HBASE-26688 > URL: https://issues.apache.org/jira/browse/HBASE-26688 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.10 > > > Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce > the object creation. But these objects could be shared by multi client > threads. The Result#cellScannerIndex related methods could throw confusing > exception and make the client job down. Could refine the logic of these > methods. > The precreated objects in ProtoBufUtil.java: > {code:java} > private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; > private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); > final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); > final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); > private final static Result EMPTY_RESULT_STALE = > Result.create(EMPTY_CELL_ARRAY, null, true); > {code} > Result#advance > {code:java} > public boolean advance() { > if (cells == null) return false; > cellScannerIndex++; > if (cellScannerIndex < this.cells.length) { > return true; > } else if (cellScannerIndex == this.cells.length) { > return false; > } > // The index of EMPTY_RESULT could be incremented by multi threads and > throw this exception. 
> throw new NoSuchElementException("Cannot advance beyond the last cell"); > } > {code} > Result#current > {code:java} > public Cell current() { > if (cells == null > || cellScannerIndex == INITIAL_CELLSCANNER_INDEX > || cellScannerIndex >= cells.length) > return null; >// Although it is almost impossible, >// We can arrive here when the client threads share the common reused > EMPTY_RESULT. > return this.cells[cellScannerIndex]; > } > {code} > In this case, the client can easily got confusing exceptions even if they use > different connections, tables in different threads. > We should change the if condition cells == null to isEmpty() to avoid the > client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Release Note: Result#advance with empty cell list will always return false but not raise NoSuchElementException when called twice. (was: Result#advance with empty cell list will always return false but not raise NoSuchElementException.) > Threads shared EMPTY_RESULT may lead to unexpected client job down. > --- > > Key: HBASE-26688 > URL: https://issues.apache.org/jira/browse/HBASE-26688 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.10 > > > Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce > the object creation. But these objects could be shared by multi client > threads. The Result#cellScannerIndex related methods could throw confusing > exception and make the client job down. Could refine the logic of these > methods. > The precreated objects in ProtoBufUtil.java: > {code:java} > private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; > private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); > final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); > final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); > private final static Result EMPTY_RESULT_STALE = > Result.create(EMPTY_CELL_ARRAY, null, true); > {code} > Result#advance > {code:java} > public boolean advance() { > if (cells == null) return false; > cellScannerIndex++; > if (cellScannerIndex < this.cells.length) { > return true; > } else if (cellScannerIndex == this.cells.length) { > return false; > } > // The index of EMPTY_RESULT could be incremented by multi threads and > throw this exception. 
> throw new NoSuchElementException("Cannot advance beyond the last cell"); > } > {code} > Result#current > {code:java} > public Cell current() { > if (cells == null > || cellScannerIndex == INITIAL_CELLSCANNER_INDEX > || cellScannerIndex >= cells.length) > return null; >// Although it is almost impossible, >// We can arrive here when the client threads share the common reused > EMPTY_RESULT. > return this.cells[cellScannerIndex]; > } > {code} > In this case, the client can easily got confusing exceptions even if they use > different connections, tables in different threads. > We should change the if condition cells == null to isEmpty() to avoid the > client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Release Note: Result#advance with empty cell list will always return false but not raise NoSuchElementException. (was: Removed unit test case TestResult#testAdvanceTwiceOnEmptyCell.) > Threads shared EMPTY_RESULT may lead to unexpected client job down. > --- > > Key: HBASE-26688 > URL: https://issues.apache.org/jira/browse/HBASE-26688 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.10 > > > Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce > the object creation. But these objects could be shared by multi client > threads. The Result#cellScannerIndex related methods could throw confusing > exception and make the client job down. Could refine the logic of these > methods. > The precreated objects in ProtoBufUtil.java: > {code:java} > private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; > private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); > final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); > final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); > private final static Result EMPTY_RESULT_STALE = > Result.create(EMPTY_CELL_ARRAY, null, true); > {code} > Result#advance > {code:java} > public boolean advance() { > if (cells == null) return false; > cellScannerIndex++; > if (cellScannerIndex < this.cells.length) { > return true; > } else if (cellScannerIndex == this.cells.length) { > return false; > } > // The index of EMPTY_RESULT could be incremented by multi threads and > throw this exception. 
> throw new NoSuchElementException("Cannot advance beyond the last cell"); > } > {code} > Result#current > {code:java} > public Cell current() { > if (cells == null > || cellScannerIndex == INITIAL_CELLSCANNER_INDEX > || cellScannerIndex >= cells.length) > return null; >// Although it is almost impossible, >// We can arrive here when the client threads share the common reused > EMPTY_RESULT. > return this.cells[cellScannerIndex]; > } > {code} > In this case, the client can easily got confusing exceptions even if they use > different connections, tables in different threads. > We should change the if condition cells == null to isEmpty() to avoid the > client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Release Note: Removed unit test case TestResult#testAdvanceTwiceOnEmptyCell. > Threads shared EMPTY_RESULT may lead to unexpected client job down. > --- > > Key: HBASE-26688 > URL: https://issues.apache.org/jira/browse/HBASE-26688 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.10 > > > Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce > the object creation. But these objects could be shared by multi client > threads. The Result#cellScannerIndex related methods could throw confusing > exception and make the client job down. Could refine the logic of these > methods. > The precreated objects in ProtoBufUtil.java: > {code:java} > private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; > private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); > final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); > final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); > private final static Result EMPTY_RESULT_STALE = > Result.create(EMPTY_CELL_ARRAY, null, true); > {code} > Result#advance > {code:java} > public boolean advance() { > if (cells == null) return false; > cellScannerIndex++; > if (cellScannerIndex < this.cells.length) { > return true; > } else if (cellScannerIndex == this.cells.length) { > return false; > } > // The index of EMPTY_RESULT could be incremented by multi threads and > throw this exception. 
> throw new NoSuchElementException("Cannot advance beyond the last cell"); > } > {code} > Result#current > {code:java} > public Cell current() { > if (cells == null > || cellScannerIndex == INITIAL_CELLSCANNER_INDEX > || cellScannerIndex >= cells.length) > return null; >// Although it is almost impossible, >// We can arrive here when the client threads share the common reused > EMPTY_RESULT. > return this.cells[cellScannerIndex]; > } > {code} > In this case, the client can easily got confusing exceptions even if they use > different connections, tables in different threads. > We should change the if condition cells == null to isEmpty() to avoid the > client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17482484#comment-17482484 ] Yutong Xiao commented on HBASE-26688: - [~zhangduo] OK, will push another PR then.
> Threads shared EMPTY_RESULT may lead to unexpected client job down.
> ---
>
> Key: HBASE-26688
> URL: https://issues.apache.org/jira/browse/HBASE-26688
> Project: HBase
> Issue Type: Bug
> Components: Client
> Reporter: Yutong Xiao
> Assignee: Yutong Xiao
> Priority: Major
> Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.10
>
> Currently, we use a pre-created EMPTY_RESULT in ProtoBufUtil to reduce object creation. But these objects could be shared by multiple client threads. The Result#cellScannerIndex related methods could throw a confusing exception and bring the client job down. We could refine the logic of these methods.
> The pre-created objects in ProtoBufUtil.java:
> {code:java}
> private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{};
> private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY);
> final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true);
> final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false);
> private final static Result EMPTY_RESULT_STALE = Result.create(EMPTY_CELL_ARRAY, null, true);
> {code}
> Result#advance
> {code:java}
> public boolean advance() {
>   if (cells == null) return false;
>   cellScannerIndex++;
>   if (cellScannerIndex < this.cells.length) {
>     return true;
>   } else if (cellScannerIndex == this.cells.length) {
>     return false;
>   }
>   // The index of EMPTY_RESULT could be incremented by multiple threads, which triggers this exception.
>   throw new NoSuchElementException("Cannot advance beyond the last cell");
> }
> {code}
> Result#current
> {code:java}
> public Cell current() {
>   if (cells == null
>       || cellScannerIndex == INITIAL_CELLSCANNER_INDEX
>       || cellScannerIndex >= cells.length)
>     return null;
>   // Although it is almost impossible, we can arrive here when client threads share the common reused EMPTY_RESULT.
>   return this.cells[cellScannerIndex];
> }
> {code}
> In this case, the client can easily get confusing exceptions even when using different connections and tables in different threads.
> We should change the if condition from cells == null to isEmpty() to prevent the client from crashing on these exceptions. -- This message was sent by Atlassian Jira (v8.20.1#820001)
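The failure quoted above is deterministic once two callers share one empty-array Result: the first advance() returns false, the second throws. A minimal self-contained sketch of that cursor logic (class and field names are stand-ins for illustration, not HBase code):

```java
import java.util.NoSuchElementException;

// Hypothetical stand-in mirroring the Result cursor logic quoted above.
class SharedEmptyResult {
    private static final Object[] EMPTY_CELL_ARRAY = new Object[]{};
    private final Object[] cells = EMPTY_CELL_ARRAY;
    private int cellScannerIndex = -1; // INITIAL_CELLSCANNER_INDEX

    boolean advance() {
        if (cells == null) return false;
        cellScannerIndex++;
        if (cellScannerIndex < cells.length) {
            return true;
        } else if (cellScannerIndex == cells.length) {
            return false;
        }
        // A second caller (e.g. another thread sharing EMPTY_RESULT) lands here.
        throw new NoSuchElementException("Cannot advance beyond the last cell");
    }
}

public class EmptyResultRaceDemo {
    public static void main(String[] args) {
        SharedEmptyResult shared = new SharedEmptyResult();
        System.out.println(shared.advance()); // first caller: false, as expected
        try {
            shared.advance(); // second caller on the SAME instance
            System.out.println("no exception");
        } catch (NoSuchElementException e) {
            System.out.println("second advance threw: " + e.getMessage());
        }
    }
}
```

Two threads each calling advance() once on the shared instance behave like the two sequential calls above, which is why the exception surfaces even in clients using separate connections and tables.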
[jira] [Commented] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17482195#comment-17482195 ] Yutong Xiao commented on HBASE-26688: - As a thread-unsafe object shared among many client threads, it is hard to make EMPTY_RESULT follow single-threaded logic without synchronisation. In my opinion, we could just remove TestResult#testAdvanceTwiceOnEmptyCell and add some comments noting that the empty result behaves differently. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HBASE-26705) Backport HBASE-26688 to branch-1
Yutong Xiao created HBASE-26705: --- Summary: Backport HBASE-26688 to branch-1 Key: HBASE-26705 URL: https://issues.apache.org/jira/browse/HBASE-26705 Project: HBase Issue Type: Bug Components: Client Affects Versions: 1.7.1 Reporter: Yutong Xiao Assignee: Yutong Xiao Fix For: 1.7.2 Backport issue [HBASE-26688|https://issues.apache.org/jira/browse/HBASE-26688] to branch-1 to eliminate the client crash caused by the thread-shared EMPTY_RESULT. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Threads shared EMPTY_RESULT may lead to unexpected client job down.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Summary: Threads shared EMPTY_RESULT may lead to unexpected client job down. (was: Threads shared EMPTY_RESULT may lead to ) -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Threads shared EMPTY_RESULT may lead to
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Summary: Threads shared EMPTY_RESULT may lead to (was: Result#advance, current are not thread safe. But the pre-created EMPTY_RESULT like objects can be shared by multi client threads.) -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Result#advance, current are not thread safe. But the pre-created EMPTY_RESULT like objects can be shared by multi client threads.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Description: Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce the object creation. But these objects could be shared by multi client threads. The Result#cellScannerIndex related methods could throw confusing exception and make the client job down. Could refine the logic of these methods. The precreated objects in ProtoBufUtil.java: {code:java} private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); private final static Result EMPTY_RESULT_STALE = Result.create(EMPTY_CELL_ARRAY, null, true); {code} Result#advance {code:java} public boolean advance() { if (cells == null) return false; cellScannerIndex++; if (cellScannerIndex < this.cells.length) { return true; } else if (cellScannerIndex == this.cells.length) { return false; } // The index of EMPTY_RESULT could be incremented by multi threads and throw this exception. throw new NoSuchElementException("Cannot advance beyond the last cell"); } {code} Result#current {code:java} public Cell current() { if (cells == null || cellScannerIndex == INITIAL_CELLSCANNER_INDEX || cellScannerIndex >= cells.length) return null; // Although it is almost impossible, // We can arrive here when the client threads share the common reused EMPTY_RESULT. return this.cells[cellScannerIndex]; } {code} In this case, the client can easily got confusing exceptions even if they use different connections, tables in different threads. We should change the if condition cells == null to isEmpty() to avoid the client crashed from these exception. was: Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce the object creation. 
But these objects could be shared by multi client threads. The Result#cellScannerIndex related methods could throw confusing exception and make the client job down. Could refine the logic of these methods. The precreated objects in ProtoBufUtil.java: {code:java} private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); private final static Result EMPTY_RESULT_STALE = Result.create(EMPTY_CELL_ARRAY, null, true); {code} Result#advance {code:java} public boolean advance() { if (cells == null) return false; cellScannerIndex++; if (cellScannerIndex < this.cells.length) { return true; } else if (cellScannerIndex == this.cells.length) { return false; } // The index of EMPTY_RESULT could be incremented and throw this exception. throw new NoSuchElementException("Cannot advance beyond the last cell"); } {code} Result#current {code:java} public Cell current() { if (cells == null || cellScannerIndex == INITIAL_CELLSCANNER_INDEX || cellScannerIndex >= cells.length) return null; // Although it is almost impossible, // We can arrive here when the client threads share the common reused EMPTY_RESULT. return this.cells[cellScannerIndex]; } {code} In this case, the client can easily got confusing exceptions even if they use different connections, tables in different threads. We should change the if condition cells == null to isEmpty() to avoid the client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Result#advance, current are not thread safe. But the pre-created EMPTY_RESULT like objects can be shared by multi client threads.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Description: Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce the object creation. But these objects could be shared by multi client threads. The Result#cellScannerIndex related methods could throw confusing exception and make the client job down. Could refine the logic of these methods. The precreated objects in ProtoBufUtil.java: {code:java} private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); private final static Result EMPTY_RESULT_STALE = Result.create(EMPTY_CELL_ARRAY, null, true); {code} Result#advance {code:java} public boolean advance() { if (cells == null) return false; cellScannerIndex++; if (cellScannerIndex < this.cells.length) { return true; } else if (cellScannerIndex == this.cells.length) { return false; } // The index of EMPTY_RESULT could be incremented and throw this exception. throw new NoSuchElementException("Cannot advance beyond the last cell"); } {code} Result#current {code:java} public Cell current() { if (cells == null || cellScannerIndex == INITIAL_CELLSCANNER_INDEX || cellScannerIndex >= cells.length) return null; // Although it is almost impossible, // We can arrive here when the client threads share the common reused EMPTY_RESULT. return this.cells[cellScannerIndex]; } {code} In this case, the client can easily got confusing exceptions even if they use different connections, tables in different threads. We should change the if condition cells == null to isEmpty() to avoid the client crashed from these exception. was: Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce the object creation. 
But these objects could be shared by multi client threads. The Result#cellScannerIndex related methods could throw confusing exception and make the client job down. Could refine the logic of these methods. The precreated objects in ProtoBufUtil.java: {code:java} private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); private final static Result EMPTY_RESULT_STALE = Result.create(EMPTY_CELL_ARRAY, null, true); {code} Result#advance {code:java} public boolean advance() { if (cells == null) return false; cellScannerIndex++; if (cellScannerIndex < this.cells.length) { return true; } else if (cellScannerIndex == this.cells.length) { return false; } // The index of EMPTY_RESULT could be incremented and throw this exception. throw new NoSuchElementException("Cannot advance beyond the last cell"); } {code} Result#current {code:java} public Cell current() { if (cells == null || cellScannerIndex == INITIAL_CELLSCANNER_INDEX || cellScannerIndex >= cells.length) return null; // Although it is almost impossible, // We can arrive here when the client threads share the common reused EMPTY_RESULT. return this.cells[cellScannerIndex]; } {code} We should change the if condition cells == null to isEmpty() to avoid the client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Result#advance, current are not thread safe. But the pre-created EMPTY_RESULT like objects can be shared by multi client threads.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Description: Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce the object creation. But these objects could be shared by multi client threads. The Result#cellScannerIndex related methods could throw confusing exception and make the client job down. Could refine the logic of these methods. The precreated objects in ProtoBufUtil.java: {code:java} private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); private final static Result EMPTY_RESULT_STALE = Result.create(EMPTY_CELL_ARRAY, null, true); {code} Result#advance {code:java} public boolean advance() { if (cells == null) return false; cellScannerIndex++; if (cellScannerIndex < this.cells.length) { return true; } else if (cellScannerIndex == this.cells.length) { return false; } // The index of EMPTY_RESULT could be incremented and throw this exception. throw new NoSuchElementException("Cannot advance beyond the last cell"); } {code} Result#current {code:java} public Cell current() { if (cells == null || cellScannerIndex == INITIAL_CELLSCANNER_INDEX || cellScannerIndex >= cells.length) return null; // Although it is almost impossible, // We can arrive here when the client threads share the common reused EMPTY_RESULT. return this.cells[cellScannerIndex]; } {code} We should change the if condition cells == null to isEmpty() to avoid the client crashed from these exception. was: Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce the object creation. But these objects could be shared by multi client threads. 
The Result#cellScannerIndex related methods could throw confusing exception and make the client job down. Could refine the logic of these methods. The precreated objects in ProtoBufUtil.java: {code:java} private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); private final static Result EMPTY_RESULT_STALE = Result.create(EMPTY_CELL_ARRAY, null, true); {code} Result#advance {code:java} public boolean advance() { if (cells == null) return false; cellScannerIndex++; if (cellScannerIndex < this.cells.length) { return true; } else if (cellScannerIndex == this.cells.length) { return false; } // The index of EMPTY_RESULT could be incremented and throw this exception. throw new NoSuchElementException("Cannot advance beyond the last cell"); } {code} Result#current {code:java} public Cell current() { if (cells == null || cellScannerIndex == INITIAL_CELLSCANNER_INDEX || cellScannerIndex >= cells.length) return null; // When at the same time another thread invoke cellScanner to reset the index to -1, we will get problem. // although the possibility is small, but it can happen. return this.cells[cellScannerIndex]; } {code} We should change the if condition cells == null to isEmpty() to avoid the client crashed from these exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Result#advance, current are not thread safe. But the pre-created EMPTY_RESULT like objects can be shared by multi client threads.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Description: Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce the object creation. But these objects could be shared by multi client threads. The Result#cellScannerIndex related methods could throw confusing exception and make the client job down. Could refine the logic of these methods. The precreated objects in ProtoBufUtil.java: {code:java} private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); private final static Result EMPTY_RESULT_STALE = Result.create(EMPTY_CELL_ARRAY, null, true); {code} Result#advance {code:java} public boolean advance() { if (cells == null) return false; cellScannerIndex++; if (cellScannerIndex < this.cells.length) { return true; } else if (cellScannerIndex == this.cells.length) { return false; } // The index of EMPTY_RESULT could be incremented and throw this exception. throw new NoSuchElementException("Cannot advance beyond the last cell"); } {code} Result#current {code:java} public Cell current() { if (cells == null || cellScannerIndex == INITIAL_CELLSCANNER_INDEX || cellScannerIndex >= cells.length) return null; // When at the same time another thread invoke cellScanner to reset the index to -1, we will get problem. // although the possibility is small, but it can happen. return this.cells[cellScannerIndex]; } {code} We should change the if condition cells == null to isEmpty() to avoid the client crashed from these exception. was: Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce the object creation. But these objects could be shared by multi client threads. 
The Result#cellScannerIndex related methods could throw confusing exception and make the client job down. Could refine the logic of these methods. The precreated objects in ProtoBufUtil.java: {code:java} private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); private final static Result EMPTY_RESULT_STALE = Result.create(EMPTY_CELL_ARRAY, null, true); {code} Result#advance {code:java} public boolean advance() { if (cells == null) return false; cellScannerIndex++; if (cellScannerIndex < this.cells.length) { return true; } else if (cellScannerIndex == this.cells.length) { return false; } // The index of EMPTY_RESULT could be incremented and throw this exception. throw new NoSuchElementException("Cannot advance beyond the last cell"); } {code} Result#current {code:java} public Cell current() { if (cells == null || cellScannerIndex == INITIAL_CELLSCANNER_INDEX || cellScannerIndex >= cells.length) return null; // When at the same time another thread invoke cellScanner to reset the index to -1, we will get problem. // although the possibility is small, but it can happen. return this.cells[cellScannerIndex]; } {code} We can change the if condition cells == null to isEmpty() -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Result#advance, current are not thread safe. But the pre-created EMPTY_RESULT like objects can be shared by multi client threads.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Description: Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce the object creation. But these objects could be shared by multi client threads. The Result#cellScannerIndex related methods could throw confusing exception and make the client job down. Could refine the logic of these methods. The precreated objects in ProtoBufUtil.java: {code:java} private final static Cell[] EMPTY_CELL_ARRAY = new Cell[]{}; private final static Result EMPTY_RESULT = Result.create(EMPTY_CELL_ARRAY); final static Result EMPTY_RESULT_EXISTS_TRUE = Result.create(null, true); final static Result EMPTY_RESULT_EXISTS_FALSE = Result.create(null, false); private final static Result EMPTY_RESULT_STALE = Result.create(EMPTY_CELL_ARRAY, null, true); {code} Result#advance {code:java} public boolean advance() { if (cells == null) return false; cellScannerIndex++; if (cellScannerIndex < this.cells.length) { return true; } else if (cellScannerIndex == this.cells.length) { return false; } // The index of EMPTY_RESULT could be incremented and throw this exception. throw new NoSuchElementException("Cannot advance beyond the last cell"); } {code} Result#current {code:java} public Cell current() { if (cells == null || cellScannerIndex == INITIAL_CELLSCANNER_INDEX || cellScannerIndex >= cells.length) return null; // When at the same time another thread invoke cellScanner to reset the index to -1, we will get problem. // although the possibility is small, but it can happen. return this.cells[cellScannerIndex]; } {code} We can change the if condition cells == null to isEmpty() was:Currently, we use a pre-created EMPTY_RESULT in the ProtoBuf.util to reduce the object creation. But these objects could be shared by multi client threads. 
The Result#cellScannerIndex related methods could throw confusing exception and make the client job down. Could refine the logic of these methods. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26688) Result#advance, current are not thread safe. But the pre-created EMPTY_RESULT like objects can be shared by multi client threads.
[ https://issues.apache.org/jira/browse/HBASE-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26688: Description: Currently, we use a pre-created EMPTY_RESULT in ProtobufUtil to reduce object creation. But these objects can be shared by multiple client threads, so the Result#cellScannerIndex related methods could throw a confusing exception and crash the client job. We could refine the logic of these methods. > Result#advance, current are not thread safe. But the pre-created EMPTY_RESULT > like objects can be shared by multi client threads. > - > > Key: HBASE-26688 > URL: https://issues.apache.org/jira/browse/HBASE-26688 > Project: HBase > Issue Type: Bug > Reporter: Yutong Xiao > Assignee: Yutong Xiao > Priority: Major > > Currently, we use a pre-created EMPTY_RESULT in ProtobufUtil to reduce object creation. But these objects can be shared by multiple client threads, so the Result#cellScannerIndex related methods could throw a confusing exception and crash the client job. We could refine the logic of these methods. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HBASE-26688) Result#advance, current are not thread safe. But the pre-created EMPTY_RESULT like objects can be shared by multi client threads.
Yutong Xiao created HBASE-26688: --- Summary: Result#advance, current are not thread safe. But the pre-created EMPTY_RESULT like objects can be shared by multi client threads. Key: HBASE-26688 URL: https://issues.apache.org/jira/browse/HBASE-26688 Project: HBase Issue Type: Bug Reporter: Yutong Xiao Assignee: Yutong Xiao -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocates a new on-heap ByteBuffer to construct a new HFileBlock when getting cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here we introduce a RAMBuffer for those "hot" blocks in BucketCache. The idea is simple: the RAMBuffer is a timeout-expiring cache. When a multi-level block is read twice, we cache it in the RAMBuffer. When a block times out in the cache (e.g. 60s), meaning it has not been accessed for 60s, we evict it. Unlike LRU, we stop admitting new blocks once the whole RAMBuffer size reaches a threshold (the threshold is dynamic to fit different workloads). This prevents the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer when its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also ran a YCSB performance test. The setup: Size of BucketCache: 40 GB. Target table size: 112 GB. Properties: !Properties.png|height=250|width=250! The operation distribution of the YCSB workload is latest. Client Side Metrics: see the attachment ClientSideMetrics.png. Server Side GC: the current bucket cache triggered 217 GCs, costing 2.74 minutes in total; with RAMBuffer, the server side triggered 210 GCs, costing 2.56 minutes in total. As master and branch-2 use ByteBufferAllocator to manage the BucketCache memory allocation, the RAMBuffer may not bring as much GC improvement as on branch-1. > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project: HBase > Issue Type: Improvement > Components: BucketCache, Performance > Reporter: Yutong Xiao > Assignee: Yutong Xiao > Priority: Major > Fix For: 1.7.2 > > Attachments: ClientSideMetrics.png, Hit 100%.png, Properties.png > > > In branch-1, BucketCache just allocates a new on-heap ByteBuffer to construct a new HFileBlock when getting cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. > Here we introduce a RAMBuffer for those "hot" blocks in BucketCache. The idea is simple: the RAMBuffer is a timeout-expiring cache. When a multi-level block is read twice, we cache it in the RAMBuffer. When a block times out in the cache (e.g. 60s), meaning it has not been accessed for 60s, we evict it. > Unlike LRU, we stop admitting new blocks once the whole RAMBuffer size reaches a threshold (the threshold is dynamic to fit different workloads). This prevents the RAMBuffer from being churned. > {panel:title=The performance of RAMBuffer when its hit ratio is 100%} > !Hit 100%.png|height=250|width=250! > {panel} > I also ran a YCSB performance test. > The setup: > Size of BucketCache: 40 GB > Target table size: 112 GB > Properties: > !Properties.png|height=250|width=250! > The operation distribution of the YCSB workload is latest. > Client Side Metrics: see the attachment ClientSideMetrics.png. > Server Side GC: > The current bucket cache triggered 217 GCs, costing 2.74 minutes in total. > With RAMBuffer, the server side triggered 210 GCs, costing 2.56 minutes in total. -- This message was sent by Atlassian Jira (v8.20.1#820001)
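The admit-or-refuse policy described above (expire idle entries by timeout; refuse new entries when full, rather than evicting LRU-style) can be sketched as follows. This is a hypothetical illustration, not the actual patch: the names RamBuffer, maxBytes, and expireMillis are invented, the real threshold is dynamic, and expiry would run as a periodic background chore rather than an explicit call.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a small expire-after-access buffer that stops
// admitting new blocks when full instead of evicting, so hot entries
// are never churned out by a burst of cold reads.
class RamBuffer<K> {
    private static final class Entry {
        final byte[] block;
        volatile long lastAccess;
        Entry(byte[] block, long now) { this.block = block; this.lastAccess = now; }
    }

    private final Map<K, Entry> map = new ConcurrentHashMap<>();
    private final AtomicLong totalBytes = new AtomicLong();
    private final long maxBytes;      // dynamic in the real proposal; fixed here
    private final long expireMillis;  // e.g. 60_000 for the 60s example

    RamBuffer(long maxBytes, long expireMillis) {
        this.maxBytes = maxBytes;
        this.expireMillis = expireMillis;
    }

    // Admission control: once full, refuse new blocks rather than evict.
    boolean offer(K key, byte[] block) {
        if (totalBytes.get() + block.length > maxBytes) return false;
        if (map.putIfAbsent(key, new Entry(block, System.currentTimeMillis())) == null) {
            totalBytes.addAndGet(block.length);
            return true;
        }
        return false;
    }

    byte[] get(K key) {
        Entry e = map.get(key);
        if (e == null) return null;
        e.lastAccess = System.currentTimeMillis(); // touch: resets the expiry clock
        return e.block;
    }

    // Evict entries idle longer than expireMillis.
    void expire() {
        long now = System.currentTimeMillis();
        for (Iterator<Map.Entry<K, Entry>> it = map.entrySet().iterator(); it.hasNext();) {
            Map.Entry<K, Entry> me = it.next();
            if (now - me.getValue().lastAccess > expireMillis) {
                totalBytes.addAndGet(-me.getValue().block.length);
                it.remove();
            }
        }
    }
}
```

The design choice this illustrates: under an LRU policy a scan of cold blocks evicts the hot working set; here a cold block simply fails admission, and hot blocks only leave when they genuinely go idle past the timeout.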
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! The operation distribution of YCSB workload is latest. Client Side Metrics See the attachment ClientSideMetrics.png Server Side GC: The current bucket cache triggered 217 GCs, which costs 2.74 minutes in total. With RAMBuffer, the server side had 210 times GC and 2.56 minutes in total. As the master & branch-2 using ByteBufferAllocator to manage the bucketcache memory allocation, the RAMBuffer may not have GC improvement as much as branch-1 was: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. 
When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! Client Side Metrics See the attachment ClientSideMetrics.png Server Side GC: The current bucket cache triggered 217 GCs, which costs 2.74 minutes in total. With RAMBuffer, the server side had 210 times GC and 2.56 minutes in total. As the master & branch-2 using ByteBufferAllocator to manage the bucketcache memory allocation, the RAMBuffer may not have GC improvement as much as branch-1 > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project: HBase > Issue Type: Improvement > Components: BucketCache, Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 1.7.2 > > Attachments: ClientSideMetrics.png, Hit 100%.png, Properties.png > > > In branch-1, BucketCache just allocate new onheap bytebuffer to construct new > HFileBlock when get cached blocks. This rough allocation increases the GC > pressure for those "hot" blocks. > Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought > is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level > block is read twice, we cache it in the RAMBuffer. When the block timeout in > the cache (e.g. 60s), that means the block is not being accessed in 60s, we > evict it. 
Not like LRU, we do not cache block when the whole RAMBuffer size > reaches to a threshold (to fit different workload, the threshold is dynamic). > This will prevent the RAMBuffer from being churned. > {panel:title=The performance of RAMBuffer with its hit ratio is 100%} > !Hit 100%.png|height=250|width=250! > {panel} > I also did a YCSB performance test. > The circumstance is: > Size of BucketCache: 40 GB > Target table size: 112 GB > Properties: > !Properties.png|height=250|width=250! > The operation distribution of YCSB workload is latest. > Client Side Metrics > See the attachment ClientSideMetrics.png > Server Side GC: > The current bucket cache triggered 217 GCs, which costs 2.74 minutes in total. > With RAMBuffer, the server side had 210 times GC and 2.56 minutes in total. > As the master & branch-2 using
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! Client Side Metrics See the attachment ClientSideMetrics.png Server Side GC: The current bucket cache triggered 217 GCs, which costs 2.74 minutes in total. With RAMBuffer, the server side had 210 times GC and 2.56 minutes in total. As the master & branch-2 using ByteBufferAllocator to manage the bucketcache memory allocation, the RAMBuffer may not have GC improvement as much as branch-1 was: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 
60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! Client Side Metrics See the attachment ClientSideMetrics.png Server Side GC: The current bucket cache triggered 217 GCs, which costs 2.74 minutes in total. With RAMBuffer, the server side had 210 times GC and 2.56 minutes in total. > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project: HBase > Issue Type: Improvement > Components: BucketCache, Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 1.7.2 > > Attachments: ClientSideMetrics.png, Hit 100%.png, Properties.png > > > In branch-1, BucketCache just allocate new onheap bytebuffer to construct new > HFileBlock when get cached blocks. This rough allocation increases the GC > pressure for those "hot" blocks. > Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought > is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level > block is read twice, we cache it in the RAMBuffer. When the block timeout in > the cache (e.g. 60s), that means the block is not being accessed in 60s, we > evict it. Not like LRU, we do not cache block when the whole RAMBuffer size > reaches to a threshold (to fit different workload, the threshold is dynamic). > This will prevent the RAMBuffer from being churned. > {panel:title=The performance of RAMBuffer with its hit ratio is 100%} > !Hit 100%.png|height=250|width=250! 
> {panel} > I also did a YCSB performance test. > The circumstance is: > Size of BucketCache: 40 GB > Target table size: 112 GB > Properties: > !Properties.png|height=250|width=250! > Client Side Metrics > See the attachment ClientSideMetrics.png > Server Side GC: > The current bucket cache triggered 217 GCs, which costs 2.74 minutes in total. > With RAMBuffer, the server side had 210 times GC and 2.56 minutes in total. > As the master & branch-2 using ByteBufferAllocator to manage the bucketcache > memory allocation, the RAMBuffer may not have GC improvement as much as > branch-1 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! Client Side Metrics See the attachment ClientSideMetrics.png Server Side GC: The current bucket cache triggered 217 GCs, which costs 2.74 minutes in total. With RAMBuffer, the server side had 210 times GC and 2.56 minutes in total. was: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. 
Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! Client Side Metrics See the attachment ClientSideMetrics.png > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project: HBase > Issue Type: Improvement > Components: BucketCache, Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 1.7.2 > > Attachments: ClientSideMetrics.png, Hit 100%.png, Properties.png > > > In branch-1, BucketCache just allocate new onheap bytebuffer to construct new > HFileBlock when get cached blocks. This rough allocation increases the GC > pressure for those "hot" blocks. > Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought > is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level > block is read twice, we cache it in the RAMBuffer. When the block timeout in > the cache (e.g. 60s), that means the block is not being accessed in 60s, we > evict it. Not like LRU, we do not cache block when the whole RAMBuffer size > reaches to a threshold (to fit different workload, the threshold is dynamic). > This will prevent the RAMBuffer from being churned. > {panel:title=The performance of RAMBuffer with its hit ratio is 100%} > !Hit 100%.png|height=250|width=250! > {panel} > I also did a YCSB performance test. > The circumstance is: > Size of BucketCache: 40 GB > Target table size: 112 GB > Properties: > !Properties.png|height=250|width=250! 
> Client Side Metrics > See the attachment ClientSideMetrics.png > Server Side GC: > The current bucket cache triggered 217 GCs, which costs 2.74 minutes in total. > With RAMBuffer, the server side had 210 times GC and 2.56 minutes in total. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! Client Side Metrics See the attachment ClientSideMetrics.png was: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. 
{panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! Client Side Metrics !ClientSideMetrics.png|height=300|width=300! > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project: HBase > Issue Type: Improvement > Components: BucketCache, Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 1.7.2 > > Attachments: ClientSideMetrics.png, Hit 100%.png, Properties.png > > > In branch-1, BucketCache just allocate new onheap bytebuffer to construct new > HFileBlock when get cached blocks. This rough allocation increases the GC > pressure for those "hot" blocks. > Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought > is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level > block is read twice, we cache it in the RAMBuffer. When the block timeout in > the cache (e.g. 60s), that means the block is not being accessed in 60s, we > evict it. Not like LRU, we do not cache block when the whole RAMBuffer size > reaches to a threshold (to fit different workload, the threshold is dynamic). > This will prevent the RAMBuffer from being churned. > {panel:title=The performance of RAMBuffer with its hit ratio is 100%} > !Hit 100%.png|height=250|width=250! > {panel} > I also did a YCSB performance test. > The circumstance is: > Size of BucketCache: 40 GB > Target table size: 112 GB > Properties: > !Properties.png|height=250|width=250! > Client Side Metrics > See the attachment ClientSideMetrics.png -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! Client Side Metrics !ClientSideMetrics.png|height=250|width=250! was: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. 
{panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! {panel:title=YCSB Test} Client Side Metrics !ClientSideMetrics.png|height=250|width=250! {panel} > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project: HBase > Issue Type: Improvement > Components: BucketCache, Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 1.7.2 > > Attachments: ClientSideMetrics.png, Hit 100%.png, Properties.png > > > In branch-1, BucketCache just allocate new onheap bytebuffer to construct new > HFileBlock when get cached blocks. This rough allocation increases the GC > pressure for those "hot" blocks. > Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought > is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level > block is read twice, we cache it in the RAMBuffer. When the block timeout in > the cache (e.g. 60s), that means the block is not being accessed in 60s, we > evict it. Not like LRU, we do not cache block when the whole RAMBuffer size > reaches to a threshold (to fit different workload, the threshold is dynamic). > This will prevent the RAMBuffer from being churned. > {panel:title=The performance of RAMBuffer with its hit ratio is 100%} > !Hit 100%.png|height=250|width=250! > {panel} > I also did a YCSB performance test. > The circumstance is: > Size of BucketCache: 40 GB > Target table size: 112 GB > Properties: > !Properties.png|height=250|width=250! > Client Side Metrics > !ClientSideMetrics.png|height=250|width=250! -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! Client Side Metrics !ClientSideMetrics.png|height=300|width=300! was: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. 
{panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! Client Side Metrics !ClientSideMetrics.png|height=250|width=250! > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project: HBase > Issue Type: Improvement > Components: BucketCache, Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 1.7.2 > > Attachments: ClientSideMetrics.png, Hit 100%.png, Properties.png > > > In branch-1, BucketCache just allocate new onheap bytebuffer to construct new > HFileBlock when get cached blocks. This rough allocation increases the GC > pressure for those "hot" blocks. > Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought > is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level > block is read twice, we cache it in the RAMBuffer. When the block timeout in > the cache (e.g. 60s), that means the block is not being accessed in 60s, we > evict it. Not like LRU, we do not cache block when the whole RAMBuffer size > reaches to a threshold (to fit different workload, the threshold is dynamic). > This will prevent the RAMBuffer from being churned. > {panel:title=The performance of RAMBuffer with its hit ratio is 100%} > !Hit 100%.png|height=250|width=250! > {panel} > I also did a YCSB performance test. > The circumstance is: > Size of BucketCache: 40 GB > Target table size: 112 GB > Properties: > !Properties.png|height=250|width=250! > Client Side Metrics > !ClientSideMetrics.png|height=300|width=300! -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache simply allocates a new on-heap ByteBuffer to construct a new HFileBlock every time a cached block is read. This coarse allocation increases GC pressure for "hot" blocks. This change introduces a RAMBuffer for hot blocks in BucketCache. The idea is simple: the RAMBuffer is a timeout-expiring cache. When a multi-level block is read twice, we cache it in the RAMBuffer; when a block times out (e.g. it has not been accessed for 60s), we evict it. Unlike an LRU cache, we stop admitting blocks once the total RAMBuffer size reaches a threshold (the threshold is dynamic to fit different workloads). This prevents the RAMBuffer from being churned. {panel:title=Performance of the RAMBuffer at a 100% hit ratio} !Hit 100%.png|height=250|width=250! {panel} I also ran a YCSB performance test. The setup: size of BucketCache: 40 GB; target table size: 112 GB. Properties: !Properties.png|height=250|width=250!
{panel:title=YCSB Test}
Client-side metrics:
||Metric||BucketCache without RAMBuffer||BucketCache with RAMBuffer||
|[OVERALL], RunTime(ms)|1772005|1699253|
|[OVERALL], Throughput(ops/sec)|2821.6624670923616|2942.4694262714265|
|[TOTAL_GCS_PS_Scavenge], Count|2760|2714|
|[TOTAL_GC_TIME_PS_Scavenge], Time(ms)|17357|17158|
|[TOTAL_GC_TIME_%_PS_Scavenge], Time(%)|0.9795119088264423|1.0097378083193025|
|[TOTAL_GCS_PS_MarkSweep], Count|4|3|
|[TOTAL_GC_TIME_PS_MarkSweep], Time(ms)|217|172|
|[TOTAL_GC_TIME_%_PS_MarkSweep], Time(%)|0.012246015107180848|0.010122094826373705|
|[TOTAL_GCs], Count|2764|2717|
|[TOTAL_GC_TIME], Time(ms)|17574|17330|
|[TOTAL_GC_TIME_%], Time(%)|0.9917579239336233|1.0198599031456763|
|[READ], Operations|251|2499189|
|[READ], AverageLatency(us)|6831.8289292684285|6507.363253039286|
|[READ], MinLatency(us)|175|177|
|[READ], MaxLatency(us)|226431|102783|
|[READ], 95thPercentileLatency(us)|12863|12055|
|[READ], 99thPercentileLatency(us)|17823|16431|
|[READ], Return=OK|251|2499189|
|[CLEANUP], Operations|60|60|
|[CLEANUP], AverageLatency(us)|961.1|1247.81666|
|[CLEANUP], MinLatency(us)|2|2|
|[CLEANUP], MaxLatency(us)|56191|73471|
|[CLEANUP], 95thPercentileLatency(us)|73|72|
|[CLEANUP], 99thPercentileLatency(us)|541|605|
|[SCAN], Operations|249|2500811|
|[SCAN], AverageLatency(us)|14388.572877029152|13850.626164872116|
|[SCAN], MinLatency(us)|320|297|
|[SCAN], MaxLatency(us)|441343|368383|
|[SCAN], 95thPercentileLatency(us)|24751|23791|
|[SCAN], 99thPercentileLatency(us)|32287|30783|
|[SCAN], Return=OK|249|2500811|
{panel}
-- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Target table size: 112 GB Properties: !Properties.png|height=250|width=250! 
{panel:title=YCSB Test} Client Side Metrics ||BucketCache without RAMBuffer||BucketCache with RAMBuffer|| |[OVERALL], RunTime(ms), 1772005 [OVERALL], Throughput(ops/sec), 2821.6624670923616 [TOTAL_GCS_PS_Scavenge], Count, 2760 [TOTAL_GC_TIME_PS_Scavenge], Time(ms), 17357 [TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.9795119088264423 [TOTAL_GCS_PS_MarkSweep], Count, 4 [TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 217 [TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.012246015107180848 [TOTAL_GCs], Count, 2764 [TOTAL_GC_TIME], Time(ms), 17574 [TOTAL_GC_TIME_%], Time(%), 0.9917579239336233 [READ], Operations, 251 [READ], AverageLatency(us), 6831.8289292684285 [READ], MinLatency(us), 175 [READ], MaxLatency(us), 226431 [READ], 95thPercentileLatency(us), 12863 [READ], 99thPercentileLatency(us), 17823 [READ], Return=OK, 251 [CLEANUP], Operations, 60 [CLEANUP], AverageLatency(us), 961.1 [CLEANUP], MinLatency(us), 2 [CLEANUP], MaxLatency(us), 56191 [CLEANUP], 95thPercentileLatency(us), 73 [CLEANUP], 99thPercentileLatency(us), 541 [SCAN], Operations, 249 [SCAN], AverageLatency(us), 14388.572877029152 [SCAN], MinLatency(us), 320 [SCAN], MaxLatency(us), 441343 [SCAN], 95thPercentileLatency(us), 24751 [SCAN], 99thPercentileLatency(us), 32287 [SCAN], Return=OK, 249 |[OVERALL], RunTime(ms), 1699253 [OVERALL], Throughput(ops/sec), 2942.4694262714265 [TOTAL_GCS_PS_Scavenge], Count, 2714 [TOTAL_GC_TIME_PS_Scavenge], Time(ms), 17158 [TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 1.0097378083193025 [TOTAL_GCS_PS_MarkSweep], Count, 3 [TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 172 [TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.010122094826373705 [TOTAL_GCs], Count, 2717 [TOTAL_GC_TIME], Time(ms), 17330 [TOTAL_GC_TIME_%], Time(%), 1.0198599031456763 [READ], Operations, 2499189 [READ], AverageLatency(us), 6507.363253039286 [READ], MinLatency(us), 177 [READ], MaxLatency(us), 102783 [READ], 95thPercentileLatency(us), 12055 [READ], 99thPercentileLatency(us), 16431 [READ], Return=OK, 2499189 [CLEANUP], Operations, 60 
[CLEANUP], AverageLatency(us), 1247.81666 [CLEANUP], MinLatency(us), 2 [CLEANUP], MaxLatency(us), 73471 [CLEANUP], 95thPercentileLatency(us), 72 [CLEANUP], 99thPercentileLatency(us), 605 [SCAN], Operations, 2500811 [SCAN], AverageLatency(us), 13850.626164872116 [SCAN], MinLatency(us), 297 [SCAN], MaxLatency(us), 368383 [SCAN], 95thPercentileLatency(us), 23791 [SCAN], 99thPercentileLatency(us), 30783 [SCAN], Return=OK, 2500811 | {panel} was: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Properties: !Properties.png|height=250|width=250! {panel:title=YCSB Test} Some text with a title {panel} > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project:
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Properties: !Properties.png|height=250|width=250! {panel:title=YCSB Test} Some text with a title {panel} was: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! 
{panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Properties: !Properties.png! {panel:title=YCSB Test} Some text with a title {panel} > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project: HBase > Issue Type: Improvement > Components: BucketCache, Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 1.7.2 > > Attachments: Hit 100%.png, Properties.png > > > In branch-1, BucketCache just allocate new onheap bytebuffer to construct new > HFileBlock when get cached blocks. This rough allocation increases the GC > pressure for those "hot" blocks. > Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought > is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level > block is read twice, we cache it in the RAMBuffer. When the block timeout in > the cache (e.g. 60s), that means the block is not being accessed in 60s, we > evict it. Not like LRU, we do not cache block when the whole RAMBuffer size > reaches to a threshold (to fit different workload, the threshold is dynamic). > This will prevent the RAMBuffer from being churned. > {panel:title=The performance of RAMBuffer with its hit ratio is 100%} > !Hit 100%.png|height=250|width=250! > {panel} > I also did a YCSB performance test. > The circumstance is: > Size of BucketCache: 40 GB > Properties: > !Properties.png|height=250|width=250! > {panel:title=YCSB Test} > Some text with a title > {panel} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=100|width=100! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Properties: !Properties.png! {panel:title=YCSB Test} Some text with a title {panel} was: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png![size:50%] {panel} I also did a YCSB performance test. 
The circumstance is: Size of BucketCache: 40 GB Properties: !Properties.png! {panel:title=YCSB Test} Some text with a title {panel} > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project: HBase > Issue Type: Improvement > Components: BucketCache, Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 1.7.2 > > Attachments: Hit 100%.png, Properties.png > > > In branch-1, BucketCache just allocate new onheap bytebuffer to construct new > HFileBlock when get cached blocks. This rough allocation increases the GC > pressure for those "hot" blocks. > Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought > is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level > block is read twice, we cache it in the RAMBuffer. When the block timeout in > the cache (e.g. 60s), that means the block is not being accessed in 60s, we > evict it. Not like LRU, we do not cache block when the whole RAMBuffer size > reaches to a threshold (to fit different workload, the threshold is dynamic). > This will prevent the RAMBuffer from being churned. > {panel:title=The performance of RAMBuffer with its hit ratio is 100%} > !Hit 100%.png|height=100|width=100! > {panel} > I also did a YCSB performance test. > The circumstance is: > Size of BucketCache: 40 GB > Properties: > !Properties.png! > {panel:title=YCSB Test} > Some text with a title > {panel} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=250|width=250! {panel} I also did a YCSB performance test. The circumstance is: Size of BucketCache: 40 GB Properties: !Properties.png! {panel:title=YCSB Test} Some text with a title {panel} was: In branch-1, BucketCache just allocate new onheap bytebuffer to construct new HFileBlock when get cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level block is read twice, we cache it in the RAMBuffer. When the block timeout in the cache (e.g. 60s), that means the block is not being accessed in 60s, we evict it. Not like LRU, we do not cache block when the whole RAMBuffer size reaches to a threshold (to fit different workload, the threshold is dynamic). This will prevent the RAMBuffer from being churned. {panel:title=The performance of RAMBuffer with its hit ratio is 100%} !Hit 100%.png|height=100|width=100! {panel} I also did a YCSB performance test. 
The circumstance is: Size of BucketCache: 40 GB Properties: !Properties.png! {panel:title=YCSB Test} Some text with a title {panel} > Introduce a little RAMBuffer for bucketcache to reduce gc and improve > throughput > > > Key: HBASE-26681 > URL: https://issues.apache.org/jira/browse/HBASE-26681 > Project: HBase > Issue Type: Improvement > Components: BucketCache, Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Fix For: 1.7.2 > > Attachments: Hit 100%.png, Properties.png > > > In branch-1, BucketCache just allocate new onheap bytebuffer to construct new > HFileBlock when get cached blocks. This rough allocation increases the GC > pressure for those "hot" blocks. > Here introduce a RAMBuffer for those "hot" blocks in BucketCache. The thought > is simple. The RAMBuffer is an timeout expiring cache. When a Multi-level > block is read twice, we cache it in the RAMBuffer. When the block timeout in > the cache (e.g. 60s), that means the block is not being accessed in 60s, we > evict it. Not like LRU, we do not cache block when the whole RAMBuffer size > reaches to a threshold (to fit different workload, the threshold is dynamic). > This will prevent the RAMBuffer from being churned. > {panel:title=The performance of RAMBuffer with its hit ratio is 100%} > !Hit 100%.png|height=250|width=250! > {panel} > I also did a YCSB performance test. > The circumstance is: > Size of BucketCache: 40 GB > Properties: > !Properties.png! > {panel:title=YCSB Test} > Some text with a title > {panel} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681:

Description:
In branch-1, BucketCache allocates a new on-heap ByteBuffer to construct a new HFileBlock every time it serves a cached block. This coarse allocation increases GC pressure for "hot" blocks.

This change introduces a RAMBuffer for those "hot" blocks in BucketCache. The idea is simple: the RAMBuffer is a timeout-expiring cache. When a multi-level block is read twice, we cache it in the RAMBuffer. When a block times out (e.g. after 60s), it has not been accessed for 60s, so we evict it. Unlike LRU, we stop admitting blocks once the total RAMBuffer size reaches a threshold (the threshold is dynamic, to fit different workloads). This prevents the RAMBuffer from being churned.

{panel:title=The performance of RAMBuffer with its hit ratio at 100%}
!Hit 100%.png!
{panel}

I also ran a YCSB performance test. The setup:
Size of BucketCache: 40 GB
Properties:
!Properties.png!

{panel:title=YCSB Test}
Some text with a title
{panel}

> Key: HBASE-26681
> URL: https://issues.apache.org/jira/browse/HBASE-26681
> Project: HBase
> Issue Type: Improvement
> Components: BucketCache, Performance
> Reporter: Yutong Xiao
> Assignee: Yutong Xiao
> Priority: Major
> Fix For: 1.7.2
> Attachments: Hit 100%.png, Properties.png

-- This message was sent by Atlassian Jira (v8.20.1#820001)
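The admission and eviction policy described above (cache a multi-level block on its second read, evict after 60s of inactivity, and stop admitting once a size threshold is reached rather than churning like LRU) can be sketched as follows. This is a minimal, hypothetical illustration: the class, method names, and fixed threshold are invented for clarity and are not HBase's actual BucketCache/RAMBuffer implementation, which uses a dynamic threshold and a real block type.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the RAMBuffer policy: admit a block on its second
// read, evict after 60s without access, and refuse new admissions (instead of
// evicting like LRU) once the buffer reaches its size threshold.
class RamBufferSketch {
    static final long EXPIRY_MS = 60_000;  // evict after 60s without access
    static final int ADMIT_AFTER = 2;      // cache a block on its second read

    static class Entry {
        final byte[] block;
        long lastAccessMs;
        Entry(byte[] block, long now) { this.block = block; this.lastAccessMs = now; }
    }

    final Map<String, Integer> accessCounts = new ConcurrentHashMap<>();
    final Map<String, Entry> buffer = new ConcurrentHashMap<>();
    final long capacityBytes;  // the real threshold is dynamic; fixed here for brevity
    long usedBytes = 0;

    RamBufferSketch(long capacityBytes) { this.capacityBytes = capacityBytes; }

    // Called on every read of a multi-level block. Returns the RAM copy when
    // present; otherwise counts the access and admits on the second read, but
    // only while the buffer is under its threshold (no churn past that point).
    synchronized byte[] onRead(String blockKey, byte[] blockFromBucketCache, long nowMs) {
        Entry e = buffer.get(blockKey);
        if (e != null) {
            e.lastAccessMs = nowMs;
            return e.block;
        }
        int count = accessCounts.merge(blockKey, 1, Integer::sum);
        if (count >= ADMIT_AFTER && usedBytes + blockFromBucketCache.length <= capacityBytes) {
            buffer.put(blockKey, new Entry(blockFromBucketCache, nowMs));
            usedBytes += blockFromBucketCache.length;
        }
        return blockFromBucketCache;
    }

    // Periodic sweep: drop any block that has been idle longer than EXPIRY_MS.
    synchronized void expire(long nowMs) {
        buffer.entrySet().removeIf(en -> {
            if (nowMs - en.getValue().lastAccessMs > EXPIRY_MS) {
                usedBytes -= en.getValue().block.length;
                return true;
            }
            return false;
        });
    }
}
```

Because admission stops at the threshold instead of evicting a resident block, a burst of one-off reads cannot push the genuinely hot blocks out, which is the churn-avoidance property the description claims over LRU.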
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Attachment: Properties.png
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Attachment: Hit 100%.png
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Attachment: Screen Shot 2022-01-18 at 22.01.04.png
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocates a new on-heap ByteBuffer to construct a new HFileBlock when getting cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here we introduce a RAMBuffer for those "hot" blocks in BucketCache. The idea is simple. The RAMBuffer is a timeout-based expiring cache. When a multi-level block is read twice, we cache it in the RAMBuffer. When a block times out in the cache (e.g. after 60s), meaning it has not been accessed within that period, we evict it. Unlike LRU, we do not cache new blocks once the total RAMBuffer size reaches a threshold (to fit different workloads, the threshold is dynamic). This prevents the RAMBuffer from being churned. I first ran a YCSB test to check the performance of the RAMBuffer with a 100% hit ratio. The result is: !Screen Shot 2022-01-18 at 22.01.04.png! was: (the same text, with the result given as a table headed ||BucketCache without RAMBuffer||BucketCache with RAMBuffer|| instead of the screenshot)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Attachment: (was: Screen Shot 2022-01-18 at 22.01.04.png)
[jira] [Updated] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
[ https://issues.apache.org/jira/browse/HBASE-26681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26681: Description: In branch-1, BucketCache just allocates a new on-heap ByteBuffer to construct a new HFileBlock when getting cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here we introduce a RAMBuffer for those "hot" blocks in BucketCache. The idea is simple. The RAMBuffer is a timeout-based expiring cache. When a multi-level block is read twice, we cache it in the RAMBuffer. When a block times out in the cache (e.g. after 60s), meaning it has not been accessed within that period, we evict it. Unlike LRU, we do not cache new blocks once the total RAMBuffer size reaches a threshold (to fit different workloads, the threshold is dynamic). This prevents the RAMBuffer from being churned. I first ran a YCSB test to check the performance of the RAMBuffer with a 100% hit ratio. The result is: ||BucketCache without RAMBuffer||BucketCache with RAMBuffer|| was: (the same text, without the YCSB test sentence and result table)
[jira] [Created] (HBASE-26681) Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput
Yutong Xiao created HBASE-26681: --- Summary: Introduce a little RAMBuffer for bucketcache to reduce gc and improve throughput Key: HBASE-26681 URL: https://issues.apache.org/jira/browse/HBASE-26681 Project: HBase Issue Type: Improvement Components: BucketCache, Performance Reporter: Yutong Xiao Assignee: Yutong Xiao Fix For: 1.7.2 In branch-1, BucketCache just allocates a new on-heap ByteBuffer to construct a new HFileBlock when getting cached blocks. This rough allocation increases the GC pressure for those "hot" blocks. Here we introduce a RAMBuffer for those "hot" blocks in BucketCache. The idea is simple. The RAMBuffer is a timeout-based expiring cache. When a multi-level block is read twice, we cache it in the RAMBuffer. When a block times out in the cache (e.g. after 60s), meaning it has not been accessed within that period, we evict it. Unlike LRU, we do not cache new blocks once the total RAMBuffer size reaches a threshold (to fit different workloads, the threshold is dynamic). This prevents the RAMBuffer from being churned.
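The expire-after-access policy described above can be sketched in plain Java. This is an illustrative sketch only, not the actual patch: the class name RamBuffer, the lazy eviction on read, and the method shapes are all assumptions made for clarity.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a timeout-expiring cache: an entry survives only while it keeps
// being read; an entry idle longer than the TTL is evicted (here lazily, on
// the next lookup). Unlike LRU, nothing is evicted to make room -- instead,
// admission would simply stop once a size threshold is reached.
final class RamBuffer<K, V> {
    private static final class Entry<V> {
        final V value;
        volatile long lastAccessNanos;
        Entry(V value, long now) { this.value = value; this.lastAccessNanos = now; }
    }

    private final Map<K, Entry<V>> map = new ConcurrentHashMap<>();
    private final long ttlNanos;

    RamBuffer(long ttlNanos) { this.ttlNanos = ttlNanos; }

    V get(K key) {
        long now = System.nanoTime();
        Entry<V> e = map.get(key);
        if (e == null) return null;
        if (now - e.lastAccessNanos > ttlNanos) {
            map.remove(key, e);   // expired: not accessed within the TTL window
            return null;
        }
        e.lastAccessNanos = now;  // touch: every read refreshes the timeout
        return e.value;
    }

    void put(K key, V value) {
        map.put(key, new Entry<>(value, System.nanoTime()));
    }
}
```

A real implementation would also enforce the dynamic size threshold on put() and sweep expired entries in the background rather than only on access.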
[jira] [Updated] (HBASE-26678) Backport HBASE-26579 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-26678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26678: Description: Our branch-1 cluster also hit the storage policy problem in production use. Backport the patch to branch-1. (was: Our branch-1 cluster also hit the storage policy issue in production use. Backport the patch to branch-1.)
[jira] [Updated] (HBASE-26678) Backport HBASE-26579 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-26678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26678: Description: Our branch-1 cluster also hit the storage policy issue in production use. Backport the patch to branch-1.
[jira] [Updated] (HBASE-26678) Backport HBASE-26579 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-26678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26678: External issue URL: https://issues.apache.org/jira/browse/HBASE-26579 Fix Version/s: 1.7.2
[jira] [Created] (HBASE-26678) Backport HBASE-26579 to branch-1
Yutong Xiao created HBASE-26678: --- Summary: Backport HBASE-26579 to branch-1 Key: HBASE-26678 URL: https://issues.apache.org/jira/browse/HBASE-26678 Project: HBase Issue Type: Task Reporter: Yutong Xiao Assignee: Yutong Xiao
[jira] [Commented] (HBASE-26596) region_mover should gracefully ignore null response from RSGroupAdmin#getRSGroupOfServer
[ https://issues.apache.org/jira/browse/HBASE-26596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476947#comment-17476947 ] Yutong Xiao commented on HBASE-26596: - [~vjasani] The PR has been pending for a while. Not sure whether the latest commit is OK to merge. Could you please take a look? > region_mover should gracefully ignore null response from > RSGroupAdmin#getRSGroupOfServer > > > Key: HBASE-26596 > URL: https://issues.apache.org/jira/browse/HBASE-26596 > Project: HBase > Issue Type: Bug > Components: mover, rsgroup >Affects Versions: 1.7.1 >Reporter: Viraj Jasani >Assignee: Yutong Xiao >Priority: Major > > If a regionserver has any non-daemon thread running even after its own > shutdown, the running non-daemon thread can prevent a clean JVM exit and the > regionserver can be stuck in a zombie state. We recently provided a > workaround for this in HBASE-26468: the regionserver exit hook waits 30s for > all non-daemon threads to stop before terminating the JVM abnormally. > However, if a regionserver is stuck in such a state, region_mover unload fails > with: > {code:java} > NoMethodError: undefined method `getName` for nil:NilClass > getSameRSGroupServers at /bin/region_mover.rb:503 > __ensure__ at /bin/region_mover.rb:313 > unloadRegions at /bin/region_mover.rb:310 > (root) at /bin/region_mover.rb:572 > {code} > This happens when the cluster has RSGroup enabled and the given server is > already stopped, so RSGroupAdmin#getRSGroupOfServer returns null (the > server is no longer running, so it is not part of any RSGroup). > region_mover should ride over this null response and gracefully exit from the > unloadRegions() call. > > We should also check whether the fix is applicable to branch-2 and above.
[jira] [Commented] (HBASE-26551) Add FastPath feature to HBase RWQueueRpcExecutor
[ https://issues.apache.org/jira/browse/HBASE-26551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17475078#comment-17475078 ] Yutong Xiao commented on HBASE-26551: - OK, will do it then. > Add FastPath feature to HBase RWQueueRpcExecutor > > > Key: HBASE-26551 > URL: https://issues.apache.org/jira/browse/HBASE-26551 > Project: HBase > Issue Type: Task > Components: rpc, Scheduler >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: QueueTimeComparison.png, QueueTimeComparisonWithMax.png > > > In ticket [HBASE-17808|https://issues.apache.org/jira/browse/HBASE-17808], > the author introduced a fastpath implementation for RWQueueRpcExecutor. It > aggregated 3 independent RpcExecutors to implement the mechanism. This > redundancy cost more memory and, by its own performance test, could not > outperform the original implementation. This time, I directly extended > RWQueueRpcExecutor to implement the fast-path mechanism. From my test results, > it has better queue-time performance than before. > YCSB Test: > Constant Configurations: > hbase.regionserver.handler.count: 1000 > hbase.ipc.server.callqueue.read.ratio: 0.5 > hbase.ipc.server.callqueue.handler.factor: 0.2 > Test Workload: > YCSB: 50% Write, 25% Get, 25% Scan. Max Scan length: 1000. 
> Client Threads: 100 > ||FastPathRWQueueRpcExecutor||RWQueueRpcExecutor|| > |[OVERALL], RunTime(ms), 909365 > [OVERALL], Throughput(ops/sec), 5498.3422498116815 > [TOTAL_GCS_PS_Scavenge], Count, 1208 > [TOTAL_GC_TIME_PS_Scavenge], Time(ms), 8006 > [TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.8803945610398465 > [TOTAL_GCS_PS_MarkSweep], Count, 2 > [TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 96 > [TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.010556817119638429 > [TOTAL_GCs], Count, 1210 > [TOTAL_GC_TIME], Time(ms), 8102 > [TOTAL_GC_TIME_%], Time(%), 0.8909513781594849 > [READ], Operations, 1248885 > [READ], AverageLatency(us), 14080.154160711354 > [READ], MinLatency(us), 269 > [READ], MaxLatency(us), 180735 > [READ], 95thPercentileLatency(us), 29775 > [READ], 99thPercentileLatency(us), 39391 > [READ], Return=OK, 1248885 > [CLEANUP], Operations, 200 > [CLEANUP], AverageLatency(us), 311.78 > [CLEANUP], MinLatency(us), 1 > [CLEANUP], MaxLatency(us), 59647 > [CLEANUP], 95thPercentileLatency(us), 26 > [CLEANUP], 99thPercentileLatency(us), 173 > [INSERT], Operations, 1251067 > [INSERT], AverageLatency(us), 14235.898240461942 > [INSERT], MinLatency(us), 393 > [INSERT], MaxLatency(us), 204159 > [INSERT], 95thPercentileLatency(us), 29919 > [INSERT], 99thPercentileLatency(us), 39647 > [INSERT], Return=OK, 1251067 > [UPDATE], Operations, 1249582 > [UPDATE], AverageLatency(us), 14166.923049467741 > [UPDATE], MinLatency(us), 321 > [UPDATE], MaxLatency(us), 203647 > [UPDATE], 95thPercentileLatency(us), 29855 > [UPDATE], 99thPercentileLatency(us), 39551 > [UPDATE], Return=OK, 1249582 > [SCAN], Operations, 1250466 > [SCAN], AverageLatency(us), 30056.68854251135 > [SCAN], MinLatency(us), 787 > [SCAN], MaxLatency(us), 509183 > [SCAN], 95thPercentileLatency(us), 57823 > [SCAN], 99thPercentileLatency(us), 74751 > [SCAN], Return=OK, 1250466|[OVERALL], RunTime(ms), 958763 > [OVERALL], Throughput(ops/sec), 5215.053146606617 > [TOTAL_GCS_PS_Scavenge], Count, 1264 > [TOTAL_GC_TIME_PS_Scavenge], 
Time(ms), 8680 > [TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.9053332262509086 > [TOTAL_GCS_PS_MarkSweep], Count, 1 > [TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 38 > [TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.00396344039142103 > [TOTAL_GCs], Count, 1265 > [TOTAL_GC_TIME], Time(ms), 8718 > [TOTAL_GC_TIME_%], Time(%), 0.909296423298 > [READ], Operations, 1250961 > [READ], AverageLatency(us), 14663.084518222391 > [READ], MinLatency(us), 320 > [READ], MaxLatency(us), 204415 > [READ], 95thPercentileLatency(us), 30815 > [READ], 99thPercentileLatency(us), 43071 > [READ], Return=OK, 1250961 > [CLEANUP], Operations, 200 > [CLEANUP], AverageLatency(us), 366.845 > [CLEANUP], MinLatency(us), 1 > [CLEANUP], MaxLatency(us), 70719 > [CLEANUP], 95thPercentileLatency(us), 36 > [CLEANUP], 99thPercentileLatency(us), 80 > [INSERT], Operations, 1248183 > [INSERT], AverageLatency(us), 14334.938754974231 > [INSERT], MinLatency(us), 390 > [INSERT], MaxLatency(us), 2828287 > [INSERT], 95thPercentileLatency(us), 30271 > [INSERT], 99thPercentileLatency(us), 41919 > [INSERT], Return=OK, 1248183 > [UPDATE], Operations, 1250212 > [UPDATE], AverageLatency(us), 14283.836318960304 > [UPDATE], MinLatency(us), 337 > [UPDATE], MaxLatency(us), 2828287 > [UPDATE], 95thPercentileLatency(us), 30255 > [UPDATE], 99thPercentileLatency(us), 41855 > [UPDATE], Return=OK, 1250212 > [SCAN], Operations, 1250644 > [SCAN], AverageLatency(us), 33153.01709839091 > [SCAN], MinLatency(us), 742 > [SCAN],
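The fast-path mechanism compared above can be sketched roughly as follows. This is a simplified illustration, not HBase's actual FastPathRWQueueRpcExecutor code: the names FastPathStack, HandlerSlot, dispatch and take are hypothetical, and the real implementation handles races between parking and enqueueing that this sketch glosses over. The idea is that an idle handler parks itself on a stack, and the dispatcher hands a call directly to a parked handler, skipping the shared call queue (and its queue time) entirely.

```java
import java.util.Deque;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedDeque;

// Sketch of the fast-path handler handoff: calls reach a free handler without
// ever touching the shared queue; only when all handlers are busy does a call
// fall back to normal queueing.
final class FastPathStack<T> {
    static final class HandlerSlot<T> {
        // One-slot private queue each handler blocks on while parked.
        final BlockingQueue<T> slot = new ArrayBlockingQueue<>(1);
    }

    private final Deque<HandlerSlot<T>> idle = new ConcurrentLinkedDeque<>();
    private final BlockingQueue<T> sharedQueue;

    FastPathStack(BlockingQueue<T> sharedQueue) { this.sharedQueue = sharedQueue; }

    // Called by the RPC reader thread for each incoming call.
    boolean dispatch(T call) {
        HandlerSlot<T> h = idle.pollLast();   // LIFO: reuse the most recently idle handler
        if (h != null) {
            return h.slot.offer(call);        // fast path: direct handoff, no queue time
        }
        return sharedQueue.offer(call);       // slow path: normal queueing
    }

    // Called by a handler thread to fetch its next call.
    T take(HandlerSlot<T> me) throws InterruptedException {
        T call = sharedQueue.poll();          // drain any backlog first
        if (call != null) return call;
        idle.addLast(me);                     // park: advertise this handler as idle
        return me.slot.take();                // block until a call is handed over
    }
}
```

Extending RWQueueRpcExecutor directly, as the comment describes, means the read, write and scan handler pools can each keep such a stack without duplicating three whole executors.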
[jira] [Updated] (HBASE-26659) The ByteBuffer of metadata in RAMQueueEntry in BucketCache could be reused.
[ https://issues.apache.org/jira/browse/HBASE-26659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26659: Description: Currently, the process to write HFileBlocks into the IOEngine in BucketCache is:
{code:java}
if (data instanceof HFileBlock) {
  // If an instance of HFileBlock, save on some allocations.
  HFileBlock block = (HFileBlock) data;
  ByteBuff sliceBuf = block.getBufferReadOnly();
  ByteBuffer metadata = block.getMetaData();
  ioEngine.write(sliceBuf, offset);
  ioEngine.write(metadata, offset + len - metadata.limit());
}
{code}
The getMetaData() function in HFileBlock is:
{code:java}
public ByteBuffer getMetaData() {
  ByteBuffer bb = ByteBuffer.allocate(BLOCK_METADATA_SPACE);
  bb = addMetaData(bb, true);
  bb.flip();
  return bb;
}
{code}
It allocates a new ByteBuffer every time. We could reuse a local variable of the WriterThread to avoid repeatedly allocating this small ByteBuffer. Reasons: 1. In a WriterThread, blocks in the doDrain() function are written into the IOEngine sequentially, so there is no multi-threading problem. 2. After IOEngine.write(), the data in the metadata ByteBuffer has been transferred safely into the byte array (ByteBufferIOEngine) or the FileChannel (FileIOEngine). Its lifecycle is confined to the if statement above, so it can be cleared and reused by the next block's write. was: (the same text, with earlier wording of the two reasons)
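A minimal sketch of the proposed reuse follows. The class name MetadataScratch, the buffer size, and the two fields written here are illustrative assumptions, not HBase's actual BLOCK_METADATA_SPACE layout; the point is only that clear()/flip() on one long-lived buffer replaces a fresh allocation per block, which is safe because doDrain() handles blocks one at a time and the IOEngine copies the bytes out before the next iteration.

```java
import java.nio.ByteBuffer;

// Sketch: a per-writer-thread scratch buffer reused for every block's
// metadata, instead of getMetaData() allocating a new ByteBuffer each time.
final class MetadataScratch {
    static final int BLOCK_METADATA_SPACE = 24; // illustrative size, not HBase's constant

    private final ByteBuffer scratch = ByteBuffer.allocate(BLOCK_METADATA_SPACE);

    // Re-fill the single scratch buffer with this block's metadata fields
    // (stand-ins for the real ones) and make it ready for ioEngine.write().
    ByteBuffer fill(long offset, int onDiskSize) {
        scratch.clear();          // reset position/limit; no new allocation
        scratch.putLong(offset);
        scratch.putInt(onDiskSize);
        scratch.flip();
        return scratch;
    }
}
```

Because the same buffer instance is returned every time, callers must finish consuming it before the next fill(), which is exactly the lifecycle the issue describes inside the if statement.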
[jira] [Updated] (HBASE-26659) The ByteBuffer of metadata in RAMQueueEntry in BucketCache could be reused.
[ https://issues.apache.org/jira/browse/HBASE-26659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26659: Description: Currently, the process to write HFileBlocks into the IOEngine in BucketCache is:
{code:java}
if (data instanceof HFileBlock) {
  // If an instance of HFileBlock, save on some allocations.
  HFileBlock block = (HFileBlock) data;
  ByteBuff sliceBuf = block.getBufferReadOnly();
  ByteBuffer metadata = block.getMetaData();
  ioEngine.write(sliceBuf, offset);
  ioEngine.write(metadata, offset + len - metadata.limit());
}
{code}
The getMetaData() function in HFileBlock is:
{code:java}
public ByteBuffer getMetaData() {
  ByteBuffer bb = ByteBuffer.allocate(BLOCK_METADATA_SPACE);
  bb = addMetaData(bb, true);
  bb.flip();
  return bb;
}
{code}
It allocates a new ByteBuffer every time. We could reuse a local variable of the WriterThread to avoid repeatedly allocating this small ByteBuffer. Reasons: 1. In a WriterThread, blocks in the doDrain() function are written into the IOEngine sequentially, so there is no multi-threading problem. 2. After IOEngine.write(), the data in the metadata ByteBuffer has been transferred safely into the byte array (ByteBufferIOEngine) or the FileChannel (FileIOEngine). Its lifecycle is confined to the if statement above. was: (the same text, without the doDrain() detail in reason 1)
[jira] [Updated] (HBASE-26659) The ByteBuffer of metadata in RAMQueueEntry in BucketCache could be reused.
[ https://issues.apache.org/jira/browse/HBASE-26659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26659: Description: Currently, the process to write HFileBlocks into the IOEngine in BucketCache is:
{code:java}
if (data instanceof HFileBlock) {
  // If an instance of HFileBlock, save on some allocations.
  HFileBlock block = (HFileBlock) data;
  ByteBuff sliceBuf = block.getBufferReadOnly();
  ByteBuffer metadata = block.getMetaData();
  ioEngine.write(sliceBuf, offset);
  ioEngine.write(metadata, offset + len - metadata.limit());
}
{code}
The getMetaData() function in HFileBlock is:
{code:java}
public ByteBuffer getMetaData() {
  ByteBuffer bb = ByteBuffer.allocate(BLOCK_METADATA_SPACE);
  bb = addMetaData(bb, true);
  bb.flip();
  return bb;
}
{code}
It allocates a new ByteBuffer every time. We could reuse a local variable of the WriterThread to avoid repeatedly allocating the metadata buffer. Reasons: 1. In a WriterThread, blocks are written into the IOEngine sequentially, so there is no multi-threading problem. 2. After IOEngine.write(), the data in the metadata ByteBuffer has been transferred safely into the byte array (ByteBufferIOEngine) or the FileChannel (FileIOEngine). Its lifecycle is confined to the if statement above. was: (the same text, with "serially" instead of "sequentially" in reason 1)
[jira] [Updated] (HBASE-26659) The ByteBuffer of metadata in RAMQueueEntry in BucketCache could be reused.
[ https://issues.apache.org/jira/browse/HBASE-26659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26659: Description: Currently, the process to write HFileBlocks into the IOEngine in BucketCache is:
{code:java}
if (data instanceof HFileBlock) {
  // If an instance of HFileBlock, save on some allocations.
  HFileBlock block = (HFileBlock) data;
  ByteBuff sliceBuf = block.getBufferReadOnly();
  ByteBuffer metadata = block.getMetaData();
  ioEngine.write(sliceBuf, offset);
  ioEngine.write(metadata, offset + len - metadata.limit());
}
{code}
The getMetaData() function in HFileBlock is:
{code:java}
public ByteBuffer getMetaData() {
  ByteBuffer bb = ByteBuffer.allocate(BLOCK_METADATA_SPACE);
  bb = addMetaData(bb, true);
  bb.flip();
  return bb;
}
{code}
It allocates a new ByteBuffer every time. We could reuse a local variable of the WriterThread to avoid repeatedly allocating the metadata buffer. Reasons: 1. In a WriterThread, blocks are written into the IOEngine serially, so there is no multi-threading problem. 2. After IOEngine.write(), the data in the metadata ByteBuffer has been transferred safely into the byte array (ByteBufferIOEngine) or the FileChannel (FileIOEngine). Its lifecycle is confined to the if statement above.
[jira] [Created] (HBASE-26659) The ByteBuffer of metadata in RAMQueueEntry in BucketCache could be reused.
Yutong Xiao created HBASE-26659: --- Summary: The ByteBuffer of metadata in RAMQueueEntry in BucketCache could be reused. Key: HBASE-26659 URL: https://issues.apache.org/jira/browse/HBASE-26659 Project: HBase Issue Type: Improvement Reporter: Yutong Xiao -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (HBASE-26659) The ByteBuffer of metadata in RAMQueueEntry in BucketCache could be reused.
[ https://issues.apache.org/jira/browse/HBASE-26659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao reassigned HBASE-26659: --- Assignee: Yutong Xiao > The ByteBuffer of metadata in RAMQueueEntry in BucketCache could be reused. > --- > > Key: HBASE-26659 > URL: https://issues.apache.org/jira/browse/HBASE-26659 > Project: HBase > Issue Type: Improvement >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001)
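The reuse proposed above can be sketched with plain java.nio, independent of HBase's classes. This is a minimal illustration under stated assumptions, not the actual patch: the value of BLOCK_METADATA_SPACE and the putLong/putInt fields are placeholders standing in for HFileBlock's real metadata layout, and the class stands in for per-WriterThread state.

```java
import java.nio.ByteBuffer;

public class MetadataBufferReuse {
  // Placeholder size, standing in for HFileBlock.BLOCK_METADATA_SPACE.
  static final int BLOCK_METADATA_SPACE = 24;

  // One buffer per writer thread: blocks are written serially within a
  // WriterThread, so a single reusable buffer is safe there.
  private final ByteBuffer metadataBuf = ByteBuffer.allocate(BLOCK_METADATA_SPACE);

  /** Fills the reusable buffer instead of allocating a new one per block. */
  ByteBuffer getMetaData(long offset, int onDiskSize) {
    metadataBuf.clear();            // reset position/limit for reuse
    metadataBuf.putLong(offset);    // stand-ins for the real metadata fields
    metadataBuf.putInt(onDiskSize);
    metadataBuf.flip();             // ready for ioEngine.write(...)
    return metadataBuf;
  }

  public static void main(String[] args) {
    MetadataBufferReuse w = new MetadataBufferReuse();
    ByteBuffer first = w.getMetaData(0L, 4096);
    ByteBuffer second = w.getMetaData(4096L, 8192);
    // The same backing buffer is handed out each time: no per-block allocation.
    System.out.println(first == second);   // true
    System.out.println(second.getLong(0)); // 4096
  }
}
```

Because the same buffer is handed back on every call, the consumer (the ioEngine.write above) must fully consume it before the next block is processed, which holds here precisely because a WriterThread writes blocks serially.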
[jira] [Updated] (HBASE-26635) Optimize decodeNumeric in OrderedBytes
[ https://issues.apache.org/jira/browse/HBASE-26635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26635: Description: Currently, when we decode a byte array to a big decimal, there are plenty of BigDecimal operations, which result in a lot of only once used BigDecimal objects. Furthermore, the BigDecimal calculations are slow. We could boost this function by using String concatenation and point movement of BigDecimal. The JMH benchmark is uploaded to the attachment. Also, I added a UT to test the encoding / decoding correctness of 200 random test samples. The logic of the process is same as the Optimization of the encode function in the ticket [HBASE-26566|https://issues.apache.org/jira/browse/HBASE-26566] was: Currently, when we decode a byte array to a big decimal, there are plenty of BigDecimal operations, which result in a lot of only once used BigDecimal objects. Furthermore, the BigDecimal calculations are slow. We could boost this function by using String concatenation and point movement of BigDecimal. The JMH benchmark is uploaded to the attachment. Also, I added a UT to test the encoding / decoding correctness of 200 random test samples. > Optimize decodeNumeric in OrderedBytes > -- > > Key: HBASE-26635 > URL: https://issues.apache.org/jira/browse/HBASE-26635 > Project: HBase > Issue Type: Improvement > Components: Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Benchmark-decoding.log, DecodeBenchmark.java > > > Currently, when we decode a byte array to a big decimal, there are plenty of > BigDecimal operations, which result in a lot of only once used BigDecimal > objects. Furthermore, the BigDecimal calculations are slow. We could boost > this function by using String concatenation and point movement of BigDecimal. > The JMH benchmark is uploaded to the attachment. > Also, I added a UT to test the encoding / decoding correctness of 200 random > test samples. 
> The logic of the process is the same as the optimization of the encode function > in the ticket [HBASE-26566|https://issues.apache.org/jira/browse/HBASE-26566] -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26635) Optimize decodeNumeric in OrderedBytes
[ https://issues.apache.org/jira/browse/HBASE-26635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26635: Attachment: DecodeBenchmark.java > Optimize decodeNumeric in OrderedBytes > -- > > Key: HBASE-26635 > URL: https://issues.apache.org/jira/browse/HBASE-26635 > Project: HBase > Issue Type: Improvement > Components: Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Benchmark-decoding.log, DecodeBenchmark.java > > > Currently, when we decode a byte array to a big decimal, there are plenty of > BigDecimal operations, which result in a lot of only once used BigDecimal > objects. Furthermore, the BigDecimal calculations are slow. We could boost > this function by using String concatenation and point movement of BigDecimal. > The JMH benchmark is uploaded to the attachment. > Also, I added a UT to test the encoding / decoding correctness of 200 random > test samples. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26635) Optimize decodeNumeric in OrderedBytes
[ https://issues.apache.org/jira/browse/HBASE-26635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26635: Description: Currently, when we decode a byte array to a big decimal, there are plenty of BigDecimal operations, which result in a lot of only once used BigDecimal objects. Furthermore, the BigDecimal calculations are slow. We could boost this function by using String concatenation and point movement of BigDecimal. The JMH benchmark is uploaded to the attachment. Also, I added a UT to test the encoding / decoding correctness of 200 random test samples. was:Currently, when we decode a byte array to a big decimal, there are plenty of BigDecimal operations, which result in a lot of only once used BigDecimal objects. Furthermore, the BigDecimal calculations are slow. We could boost this function by using String concatenation and point movement of BigDecimal. The JMH benchmark is uploaded to the attachment. > Optimize decodeNumeric in OrderedBytes > -- > > Key: HBASE-26635 > URL: https://issues.apache.org/jira/browse/HBASE-26635 > Project: HBase > Issue Type: Improvement > Components: Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Benchmark-decoding.log > > > Currently, when we decode a byte array to a big decimal, there are plenty of > BigDecimal operations, which result in a lot of only once used BigDecimal > objects. Furthermore, the BigDecimal calculations are slow. We could boost > this function by using String concatenation and point movement of BigDecimal. > The JMH benchmark is uploaded to the attachment. > Also, I added a UT to test the encoding / decoding correctness of 200 random > test samples. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26635) Optimize decodeNumeric in OrderedBytes
[ https://issues.apache.org/jira/browse/HBASE-26635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26635: Attachment: Benchmark-decoding.log > Optimize decodeNumeric in OrderedBytes > -- > > Key: HBASE-26635 > URL: https://issues.apache.org/jira/browse/HBASE-26635 > Project: HBase > Issue Type: Improvement > Components: Performance >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > Attachments: Benchmark-decoding.log > > > Currently, when we decode a byte array to a big decimal, there are plenty of > BigDecimal operations, which result in a lot of only once used BigDecimal > objects. Furthermore, the BigDecimal calculations are slow. We could boost > this function by using String concatenation and point movement of BigDecimal. > The JMH benchmark is uploaded to the attachment. > Also, I added a UT to test the encoding / decoding correctness of 200 random > test samples. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HBASE-26635) Optimize decodeNumeric in OrderedBytes
Yutong Xiao created HBASE-26635: --- Summary: Optimize decodeNumeric in OrderedBytes Key: HBASE-26635 URL: https://issues.apache.org/jira/browse/HBASE-26635 Project: HBase Issue Type: Improvement Components: Performance Reporter: Yutong Xiao Assignee: Yutong Xiao Currently, when we decode a byte array to a big decimal, there are plenty of BigDecimal operations, which result in a lot of only once used BigDecimal objects. Furthermore, the BigDecimal calculations are slow. We could boost this function by using String concatenation and point movement of BigDecimal. The JMH benchmark is uploaded to the attachment. -- This message was sent by Atlassian Jira (v8.20.1#820001)
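The idea in the description — replacing per-digit BigDecimal arithmetic with string concatenation plus a single decimal-point move — can be illustrated outside OrderedBytes' actual wire format. The base-100 digit array and exponent below are assumptions that mimic its mantissa/exponent split, and decodeSlow/decodeFast are hypothetical names, not HBase methods:

```java
import java.math.BigDecimal;

public class DecodeNumericSketch {
  /** Old style: one BigDecimal multiply/add per base-100 digit. */
  static BigDecimal decodeSlow(int[] digits, int exp) {
    BigDecimal m = BigDecimal.ZERO;
    BigDecimal e = BigDecimal.ONE;
    for (int d : digits) {
      e = e.movePointLeft(2);                       // 100^-i
      m = m.add(BigDecimal.valueOf(d).multiply(e)); // accumulate one digit
    }
    return m.scaleByPowerOfTen(2 * exp);            // apply 100^exp
  }

  /** Optimized style: concatenate digits, build one BigDecimal, move the point. */
  static BigDecimal decodeFast(int[] digits, int exp) {
    StringBuilder sb = new StringBuilder("0.");
    for (int d : digits) {
      if (d < 10) sb.append('0');  // each base-100 digit is two decimal digits
      sb.append(d);
    }
    return new BigDecimal(sb.toString()).movePointRight(2 * exp);
  }

  public static void main(String[] args) {
    int[] digits = {12, 34, 5};                 // mantissa 0.123405 in base-100
    System.out.println(decodeSlow(digits, 2));  // 1234.05
    System.out.println(decodeFast(digits, 2));  // 1234.05
  }
}
```

The fast path performs exactly one BigDecimal construction and one point move per value, instead of one multiply and one add per digit, which is the kind of saving the attached JMH benchmark measures.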
[jira] [Commented] (HBASE-26564) Retire the method visitLogEntryBeforeWrite without RegionInfo in WALActionListener
[ https://issues.apache.org/jira/browse/HBASE-26564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17465921#comment-17465921 ] Yutong Xiao commented on HBASE-26564: - I have opened a new MR for branch-2. :) > Retire the method visitLogEntryBeforeWrite without RegionInfo in > WALActionListener > - > > Key: HBASE-26564 > URL: https://issues.apache.org/jira/browse/HBASE-26564 > Project: HBase > Issue Type: Task > Components: wal >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Minor > Fix For: 3.0.0-alpha-3 > > > The visitLogEntryBeforeWrite overload without a RegionInfo parameter is only > used in ReplicationSourceWALActionListener, so it has been retired from the > WALActionListener interface. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26629) Add expiration for long time vacant scanners in Thrift2
[ https://issues.apache.org/jira/browse/HBASE-26629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26629: Summary: Add expiration for long time vacant scanners in Thrift2 (was: Add expiration for long time vacant scanners) > Add expiration for long time vacant scanners in Thrift2 > --- > > Key: HBASE-26629 > URL: https://issues.apache.org/jira/browse/HBASE-26629 > Project: HBase > Issue Type: Improvement > Components: Performance, Thrift >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > > In the thrift1 implementation, ThriftHBaseServiceHandler holds a Cache with > an expire-after-access time, which lets long-vacant scanners be collected by GC. > However, thrift2 does not have this feature: it only uses a map to store scanners, > and the client must close each scanner manually. If it does not, expired scanners > live in memory forever. To address this, I applied the same cache expiration to > the thrift2 service. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HBASE-26629) Add expiration for long time vacant scanners
[ https://issues.apache.org/jira/browse/HBASE-26629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26629: Component/s: Performance Thrift > Add expiration for long time vacant scanners > > > Key: HBASE-26629 > URL: https://issues.apache.org/jira/browse/HBASE-26629 > Project: HBase > Issue Type: Improvement > Components: Performance, Thrift >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > > In the thrift1 implementation, ThriftHBaseServiceHandler holds a Cache with > an expire-after-access time, which lets long-vacant scanners be collected by GC. > However, thrift2 does not have this feature: it only uses a map to store scanners, > and the client must close each scanner manually. If it does not, expired scanners > live in memory forever. To address this, I applied the same cache expiration to > the thrift2 service. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HBASE-26629) Add expiration for long time vacant scanners
Yutong Xiao created HBASE-26629: --- Summary: Add expiration for long time vacant scanners Key: HBASE-26629 URL: https://issues.apache.org/jira/browse/HBASE-26629 Project: HBase Issue Type: Improvement Reporter: Yutong Xiao Assignee: Yutong Xiao In the thrift1 implementation, ThriftHBaseServiceHandler holds a Cache with an expire-after-access time, which lets long-vacant scanners be collected by GC. However, thrift2 does not have this feature: it only uses a map to store scanners, and the client must close each scanner manually. If it does not, expired scanners live in memory forever. To address this, I applied the same cache expiration to the thrift2 service. -- This message was sent by Atlassian Jira (v8.20.1#820001)
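For illustration, the expire-after-access semantics borrowed from thrift1 can be sketched with the JDK alone. This is a hand-rolled stand-in, not the actual patch (which reuses a proper cache, as thrift1 does); ExpiringScannerMap and its method names are hypothetical:

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Minimal expire-after-access map: untouched scanners become evictable. */
public class ExpiringScannerMap<V> {
  private static final class Entry<V> {
    final V value;
    volatile long lastAccessNanos;
    Entry(V value, long now) { this.value = value; this.lastAccessNanos = now; }
  }

  private final Map<Integer, Entry<V>> scanners = new ConcurrentHashMap<>();
  private final long ttlNanos;

  public ExpiringScannerMap(long ttlNanos) { this.ttlNanos = ttlNanos; }

  public void put(int id, V scanner) {
    scanners.put(id, new Entry<>(scanner, System.nanoTime()));
  }

  /** Touches the entry, so actively used scanners never expire. */
  public V get(int id) {
    Entry<V> e = scanners.get(id);
    if (e == null) return null;
    e.lastAccessNanos = System.nanoTime();
    return e.value;
  }

  /** Evicts scanners untouched for longer than the TTL (clients that never closed). */
  public void evictExpired() {
    long now = System.nanoTime();
    for (Iterator<Entry<V>> it = scanners.values().iterator(); it.hasNext();) {
      if (now - it.next().lastAccessNanos > ttlNanos) it.remove();
    }
  }

  public int size() { return scanners.size(); }
}
```

A cache library additionally runs eviction as a side effect of normal reads and writes and can close the evicted scanner via a removal listener, which is why the patch reuses one rather than a plain map plus a sweeper.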
[jira] [Resolved] (HBASE-26604) Replace new allocation with ThreadLocal in CellBlockBuilder to reduce GC
[ https://issues.apache.org/jira/browse/HBASE-26604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao resolved HBASE-26604. - Resolution: Won't Fix The ThreadLocal approach has a potential synchronization concern. Won't do this then. > Replace new allocation with ThreadLocal in CellBlockBuilder to reduce GC > > > Key: HBASE-26604 > URL: https://issues.apache.org/jira/browse/HBASE-26604 > Project: HBase > Issue Type: Task >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Minor > > In the CellBlockBuilder decompress method, we currently allocate a new > ByteBufferOutputStream object on each invocation. > {code:java} > try { > // TODO: This is ugly. The buffer will be resized on us if we guess > wrong. > //TODO: reuse buffers. > bbos = new ByteBufferOutputStream(osInitialSize); > IOUtils.copy(cis, bbos); > bbos.close(); > return bbos.getByteBuffer(); > } finally { > CodecPool.returnDecompressor(poolDecompressor); > } > {code} > We can use a ThreadLocal variable to reuse the buffer in each thread, as: > {code:java} > try { > // TODO: This is ugly. The buffer will be resized on us if we guess > wrong. > if (this.decompressBuff.get() == null) { > this.decompressBuff.set(new ByteBufferOutputStream(osInitialSize)); > } > ByteBufferOutputStream localBbos = this.decompressBuff.get(); > localBbos.clear(); > IOUtils.copy(cis, localBbos); > return localBbos.getByteBuffer(); > } finally { > CodecPool.returnDecompressor(poolDecompressor); > } > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
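For reference, the ThreadLocal pattern discussed (and ultimately rejected) above looks roughly like this with JDK classes only. ByteArrayOutputStream stands in for HBase's ByteBufferOutputStream, and decompressInto is a hypothetical stand-in for the decompress path:

```java
import java.io.ByteArrayOutputStream;

public class ThreadLocalBufferSketch {
  static final int INITIAL_SIZE = 4 * 1024;

  // One buffer per thread; withInitial avoids the null-check-and-set
  // dance shown in the snippet above.
  private static final ThreadLocal<ByteArrayOutputStream> DECOMPRESS_BUF =
      ThreadLocal.withInitial(() -> new ByteArrayOutputStream(INITIAL_SIZE));

  static byte[] decompressInto(byte[] compressed) {
    ByteArrayOutputStream buf = DECOMPRESS_BUF.get();
    buf.reset();                                 // reuse the backing array across calls
    buf.write(compressed, 0, compressed.length); // stand-in for IOUtils.copy(cis, buf)
    return buf.toByteArray();
  }
}
```

One likely facet of the concern cited in the resolution: the real method returns a ByteBuffer that wraps the stream's internal array, so with reuse that buffer would be silently overwritten by the next decompress on the same thread if any consumer held onto it, and each pooled thread would permanently retain a buffer as large as the biggest block it ever decompressed.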