[jira] [Resolved] (HBASE-28467) Integration of time-based priority caching into cacheOnRead read code paths.

2024-05-22 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-28467.
--
Resolution: Fixed

Merged into the feature branch. Thanks for the contribution, 
[~janardhan.hungund] !

> Integration of time-based priority caching into cacheOnRead read code paths.
> 
>
> Key: HBASE-28467
> URL: https://issues.apache.org/jira/browse/HBASE-28467
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Reporter: Janardhan Hungund
>Assignee: Janardhan Hungund
>Priority: Major
>  Labels: pull-request-available
>
> This Jira tracks the integration of time-based caching framework APIs into 
> read code paths.
> Thanks,
> Janardhan
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-27915) Update hbase_docker with an extra Dockerfile compatible with mac m1 platform

2024-05-22 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-27915.
--
Resolution: Fixed

Thanks for reviewing it [~swu]! I have merged this into master, branch-3, 
branch-2, branch-2.6, branch-2.5 and branch-2.4.

> Update hbase_docker with an extra Dockerfile compatible with mac m1 platform
> 
>
> Key: HBASE-27915
> URL: https://issues.apache.org/jira/browse/HBASE-27915
> Project: HBase
>  Issue Type: Bug
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
>  Labels: pull-request-available
>
> When trying to use the current Dockerfile under "./dev-support/hbase_docker" 
> on m1 macs, the docker build fails at the git clone & mvn build stage with 
> the error below:
> {noformat}
>  #0 8.214 qemu-x86_64: Could not open '/lib64/ld-linux-x86-64.so.2': No such 
> file or directory
> {noformat}
> It turns out for mac m1, we have to explicitly define the platform flag for 
> the ubuntu image. I thought we could add a note in this readme, together with 
> an "m1" subfolder containing a modified copy of this Dockerfile that works on 
> mac m1s.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28469) Integration of time-based priority caching into compaction paths.

2024-05-22 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-28469.
--
Resolution: Fixed

Merged into the feature branch. Thanks for the contribution [~vinayakhegde], 
and for reviewing [~janardhan.hungund] !

> Integration of time-based priority caching into compaction paths.
> -
>
> Key: HBASE-28469
> URL: https://issues.apache.org/jira/browse/HBASE-28469
> Project: HBase
>  Issue Type: Task
>Reporter: Janardhan Hungund
>Assignee: Vinayak Hegde
>Priority: Major
>  Labels: pull-request-available
>
> The time-based priority caching is dependent on the date-tiered compaction 
> that structures store files in a date-based tiered layout. This Jira tracks the 
> changes needed for the integration of this compaction strategy with the 
> data-tiering, to enable appropriate caching of hot data in the cache, while 
> the cold data can remain in cloud storage.
> Thanks,
> Janardhan



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28596) Optimise BucketCache usage upon regions splits/merges.

2024-05-15 Thread Wellington Chevreuil (Jira)
Wellington Chevreuil created HBASE-28596:


 Summary: Optimise BucketCache usage upon regions splits/merges.
 Key: HBASE-28596
 URL: https://issues.apache.org/jira/browse/HBASE-28596
 Project: HBase
  Issue Type: Improvement
Reporter: Wellington Chevreuil
Assignee: Wellington Chevreuil


This proposal aims to give users more flexibility to decide whether or not 
blocks from a parent region should be evicted, and also to optimise cache usage by 
resolving reference file blocks to the referred block in the cache.

Some extra context:

1) Originally, the default behaviour on splits was to rely on the 
"hbase.rs.evictblocksonclose" value to decide if the cached blocks from the 
parent split should be evicted or not. The resulting split daughters are then 
opened with refs to the parent file. If hbase.rs.prefetchblocksonopen is set, 
these openings will trigger a prefetch of the blocks from the parent split, now 
with cache keys from the ref path. That means, if "hbase.rs.evictblocksonclose" 
is false and “hbase.rs.prefetchblocksonopen” is true, we will be duplicating 
blocks in the cache. In scenarios where cache usage is at capacity and added 
latency for reading from the file system is high (for example reading from a 
cloud storage), this can have a severe impact, as the prefetch for the refs 
would trigger evictions. Also, the refs tend to be short lived, as compaction 
is triggered on the split daughters soon after they are opened.

2) HBASE-27474 changed the original behaviour described above, to now 
always evict blocks from the split parent once the split is completed, and to skip 
prefetch for refs (since refs are short lived). The side effect is that the 
daughters' blocks would only be cached once compaction is completed, but 
compaction itself will run slower since it needs to read the blocks from the 
file system. On regions as large as 20GB, the performance degradation reported 
by users has been severe.

This proposes a new “hbase.rs.evictblocksonsplit” configuration property that 
makes the eviction over split configurable. Depending on the use case, the 
impact of mass evictions due to cache capacity may be higher, in which case 
users might prefer to keep evicting split parent blocks. Additionally, it 
modifies the way we handle refs when caching. HBASE-27474 behaviour was to skip 
caching refs to avoid duplicate data in the cache as long as compaction was 
enabled, relying on the fact that refs from splits are usually short lived. 
Here, we propose modifying the block cache key lookup, so that we always 
resolve the referenced file first and look for the referenced file's 
block in the cache. That way we avoid duplicates in the cache and also expedite 
scan performance on the split daughters, as it’s now resolving the referenced 
file and reading from the cache.
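
To illustrate the reference-resolution idea, here is a minimal, self-contained sketch; the "<parentFile>.<parentRegion>" naming convention and the plain map used as a cache are simplifications for illustration only, not the actual StoreFileInfo/BucketCache code:

{code:java}
import java.util.HashMap;
import java.util.Map;

// Sketch of resolving a reference (half) file to the referred parent file
// before looking a block up in the cache, so daughters re-use the parent's
// cached blocks instead of caching duplicates under the ref's own name.
public class RefAwareCacheLookup {
  private final Map<String, byte[]> cache = new HashMap<>();

  // Cache keys are "<fileName>_<offset>" in this simplified model.
  private static String cacheKey(String fileName, long offset) {
    return fileName + "_" + offset;
  }

  // If the file looks like a reference ("<parentFile>.<parentRegion>"),
  // resolve it to the referred parent file name before building the key.
  private static String resolve(String fileName) {
    int dot = fileName.indexOf('.');
    return dot > 0 ? fileName.substring(0, dot) : fileName;
  }

  public byte[] getBlock(String fileName, long offset) {
    return cache.get(cacheKey(resolve(fileName), offset));
  }

  public void cacheBlock(String fileName, long offset, byte[] block) {
    cache.put(cacheKey(resolve(fileName), offset), block);
  }

  public static void main(String[] args) {
    RefAwareCacheLookup c = new RefAwareCacheLookup();
    // Block cached while the parent region was still open.
    c.cacheBlock("abc123hfile", 0L, new byte[] { 1, 2, 3 });
    // A scan on the daughter region, reading through the reference file,
    // finds the parent's cached block instead of missing and re-caching it.
    System.out.println(c.getBlock("abc123hfile.d4e5f6parent", 0L) != null); // true
  }
}
{code}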



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28595) Losing exception from scan RPC can lead to partial results

2024-05-15 Thread Wellington Chevreuil (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846674#comment-17846674
 ] 

Wellington Chevreuil commented on HBASE-28595:
--

[~csringhofer], it seems this is still the case for the master branch. Could you open 
a PR for the master branch instead?

> Losing exception from scan RPC can lead to partial results
> --
>
> Key: HBASE-28595
> URL: https://issues.apache.org/jira/browse/HBASE-28595
> Project: HBase
>  Issue Type: Bug
>  Components: Client, regionserver, Scanners
>Reporter: Csaba Ringhofer
>Assignee: Csaba Ringhofer
>Priority: Critical
>
> This was discovered in Apache Impala using HBase 2.2 based branch hbase 
> client and server. It is not clear yet whether other branches are also 
> affected.
> The issue happens if the server side of the scan throws an exception and 
> closes the scanner, but at the same time, the client gets an rpc connection 
> closed error and doesn't process the exception sent by the server. Client 
> then thinks it got a network error, which leads to retrying the RPC instead 
> of opening a new scanner. But then when the client retry reaches the server, 
> the server returns an empty ScanResponse instead of an error, leading to 
> closing the scanner on client side without returning any error.
> A few pointers to critical parts:
> region server:
> 1st call throws exception leading to closing (but not deleting) scanner:
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3539]
> 2nd call (retry of 1st) returns empty results:
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3403]
> client:
> some exceptions are handled as non-retriable at RPC level and are only 
> handled through opening a new scanner:
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java#L214]
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java#L367]
> This mechanism in the client only works if it gets the exception from the 
> server. If there are connection issues during the RPC then the client won't 
> really know the state of the server.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28595) Losing exception from scan RPC can lead to partial results

2024-05-15 Thread Wellington Chevreuil (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846653#comment-17846653
 ] 

Wellington Chevreuil commented on HBASE-28595:
--

Also, it seems we keep closed scanners for the time defined by 
*hbase.client.scanner.timeout.period*, which is the actual scanner lease 
expiration timeout. Maybe it would be a good idea to have a specific property 
for cleaning the closed scanners collection.

> Losing exception from scan RPC can lead to partial results
> --
>
> Key: HBASE-28595
> URL: https://issues.apache.org/jira/browse/HBASE-28595
> Project: HBase
>  Issue Type: Bug
>  Components: Client, regionserver, Scanners
>Reporter: Csaba Ringhofer
>Assignee: Csaba Ringhofer
>Priority: Critical
>
> This was discovered in Apache Impala using HBase 2.2 based branch hbase 
> client and server. It is not clear yet whether other branches are also 
> affected.
> The issue happens if the server side of the scan throws an exception and 
> closes the scanner, but at the same time, the client gets an rpc connection 
> closed error and doesn't process the exception sent by the server. Client 
> then thinks it got a network error, which leads to retrying the RPC instead 
> of opening a new scanner. But then when the client retry reaches the server, 
> the server returns an empty ScanResponse instead of an error, leading to 
> closing the scanner on client side without returning any error.
> A few pointers to critical parts:
> region server:
> 1st call throws exception leading to closing (but not deleting) scanner:
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3539]
> 2nd call (retry of 1st) returns empty results:
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3403]
> client:
> some exceptions are handled as non-retriable at RPC level and are only 
> handled through opening a new scanner:
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java#L214]
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java#L367]
> This mechanism in the client only works if it gets the exception from the 
> server. If there are connection issues during the RPC then the client won't 
> really know the state of the server.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28595) Losing exception from scan RPC can lead to partial results

2024-05-15 Thread Wellington Chevreuil (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846652#comment-17846652
 ] 

Wellington Chevreuil commented on HBASE-28595:
--

{quote}
By design, if the server side has already closed the scanner, the retry RPC 
should receive a UnknownScannerException.
{quote}
That was my thought too, until [~csringhofer] pointed me to this point 
[here|https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3403],
 where we catch the already closed scanner exception and return empty results.
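
To make the failure mode concrete, below is a small self-contained simulation of that server-side behaviour; the types and the closedScanners map are simplified stand-ins for the real RSRpcServices internals, not actual HBase code:

{code:java}
import java.util.HashSet;
import java.util.Set;

// Simplified simulation: the first scan attempt throws, which closes (but does
// not delete) the scanner; the retried attempt then sees a known-but-closed
// scanner and returns an empty response instead of an error, so the client
// ends the scan early with partial results.
public class ScanRetrySimulation {
  static class ScanResponse {
    final int results;
    final boolean moreResults;
    ScanResponse(int results, boolean moreResults) {
      this.results = results;
      this.moreResults = moreResults;
    }
  }

  private final Set<Long> closedScanners = new HashSet<>();

  ScanResponse scan(long scannerId, boolean failThisCall) throws Exception {
    if (closedScanners.contains(scannerId)) {
      // Retried RPC: the scanner is already closed, yet no UnknownScannerException
      // is raised; an empty response looks like a normal "end of scan" to the client.
      return new ScanResponse(0, false);
    }
    if (failThisCall) {
      closedScanners.add(scannerId);
      throw new Exception("scan failed; scanner closed on server");
    }
    return new ScanResponse(100, true);
  }

  public static void main(String[] args) throws Exception {
    ScanRetrySimulation server = new ScanRetrySimulation();
    try {
      server.scan(42L, true); // exception is lost to a connection error on the client
    } catch (Exception e) {
      // client only sees a connection problem, so it retries the same RPC
    }
    ScanResponse retry = server.scan(42L, false);
    System.out.println(retry.results + " results, moreResults=" + retry.moreResults);
  }
}
{code}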

> Losing exception from scan RPC can lead to partial results
> --
>
> Key: HBASE-28595
> URL: https://issues.apache.org/jira/browse/HBASE-28595
> Project: HBase
>  Issue Type: Bug
>  Components: Client, regionserver, Scanners
>Reporter: Csaba Ringhofer
>Assignee: Csaba Ringhofer
>Priority: Critical
>
> This was discovered in Apache Impala using HBase 2.2 based branch hbase 
> client and server. It is not clear yet whether other branches are also 
> affected.
> The issue happens if the server side of the scan throws an exception and 
> closes the scanner, but at the same time, the client gets an rpc connection 
> closed error and doesn't process the exception sent by the server. Client 
> then thinks it got a network error, which leads to retrying the RPC instead 
> of opening a new scanner. But then when the client retry reaches the server, 
> the server returns an empty ScanResponse instead of an error, leading to 
> closing the scanner on client side without returning any error.
> A few pointers to critical parts:
> region server:
> 1st call throws exception leading to closing (but not deleting) scanner:
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3539]
> 2nd call (retry of 1st) returns empty results:
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3403]
> client:
> some exceptions are handled as non-retriable at RPC level and are only 
> handled through opening a new scanner:
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java#L214]
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java#L367]
> This mechanism in the client only works if it gets the exception from the 
> server. If there are connection issues during the RPC then the client won't 
> really know the state of the server.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28595) Losing exception from scan RPC can lead to partial results

2024-05-15 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28595:
-
Description: 
This was discovered in Apache Impala using HBase 2.2 based branch hbase client 
and server. It is not clear yet whether other branches are also affected.

The issue happens if the server side of the scan throws an exception and closes 
the scanner, but at the same time, the client gets an rpc connection closed 
error and doesn't process the exception sent by the server. Client then thinks 
it got a network error, which leads to retrying the RPC instead of opening a 
new scanner. But then when the client retry reaches the server, the server 
returns an empty ScanResponse instead of an error, leading to closing the 
scanner on client side without returning any error.

A few pointers to critical parts:
region server:
1st call throws exception leading to closing (but not deleting) scanner:
[https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3539]
2nd call (retry of 1st) returns empty results:
[https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3403]

client:
some exceptions are handled as non-retriable at RPC level and are only handled 
through opening a new scanner:
[https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java#L214]
[https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java#L367]

This mechanism in the client only works if it gets the exception from the 
server. If there are connection issues during the RPC then the client won't 
really know the state of the server.

  was:
This was discovered in Apache Impala using HBase 2.2 based branch hbase client 
and server. It is not clear yet whether other branches are also affected.

The issue happens if the server side of the scan throws an exception and closes 
the scanner, but the client doesn't get the exact exception and it treats it as 
network error, which leads to retrying the RPC instead of opening a new 
scanner. In this case  the server returns an empty ScanResponse instead of an 
error when the RPC is retried, leading to closing the scanner on client side 
without returning any error.

A few pointers to critical parts:
region server:
1st call throws exception leading to closing (but not deleting) scanner:
https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3539
2nd call (retry of 1st) returns empty results:
https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3403

client:
some exceptions are handled as non-retriable at RPC level and are only handled 
through opening a new scanner:
https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java#L214
https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java#L367

This mechanism in the client only works if it gets the exception from the 
server. If there are connection issues during the RPC then the client won't 
really know the state of the server.


> Losing exception from scan RPC can lead to partial results
> --
>
> Key: HBASE-28595
> URL: https://issues.apache.org/jira/browse/HBASE-28595
> Project: HBase
>  Issue Type: Bug
>  Components: Client, regionserver, Scanners
>Reporter: Csaba Ringhofer
>Assignee: Csaba Ringhofer
>Priority: Critical
>
> This was discovered in Apache Impala using HBase 2.2 based branch hbase 
> client and server. It is not clear yet whether other branches are also 
> affected.
> The issue happens if the server side of the scan throws an exception and 
> closes the scanner, but at the same time, the client gets an rpc connection 
> closed error and doesn't process the exception sent by the server. Client 
> then thinks it got a network error, which leads to retrying the RPC instead 
> of opening a new scanner. But then when the client retry reaches the server, 
> the server returns an empty ScanResponse instead of an error, leading to 
> closing the scanner on client side without returning any error.
> A few pointers to critical parts:
> region server:
> 1st call throws 

[jira] [Assigned] (HBASE-28595) Losing exception from scan RPC can lead to partial results

2024-05-15 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil reassigned HBASE-28595:


Assignee: Csaba Ringhofer

> Losing exception from scan RPC can lead to partial results
> --
>
> Key: HBASE-28595
> URL: https://issues.apache.org/jira/browse/HBASE-28595
> Project: HBase
>  Issue Type: Bug
>  Components: Client, regionserver, Scanners
>Reporter: Csaba Ringhofer
>Assignee: Csaba Ringhofer
>Priority: Critical
>
> This was discovered in Apache Impala using HBase 2.2 based branch hbase 
> client and server. It is not clear yet whether other branches are also 
> affected.
> The issue happens if the server side of the scan throws an exception and 
> closes the scanner, but the client doesn't get the exact exception and it 
> treats it as network error, which leads to retrying the RPC instead of 
> opening a new scanner. In this case  the server returns an empty ScanResponse 
> instead of an error when the RPC is retried, leading to closing the scanner 
> on client side without returning any error.
> A few pointers to critical parts:
> region server:
> 1st call throws exception leading to closing (but not deleting) scanner:
> https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3539
> 2nd call (retry of 1st) returns empty results:
> https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3403
> client:
> some exceptions are handled as non-retriable at RPC level and are only 
> handled through opening a new scanner:
> https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java#L214
> https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java#L367
> This mechanism in the client only works if it gets the exception from the 
> server. If there are connection issues during the RPC then the client won't 
> really know the state of the server.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28246) Expose region cached size over JMX metrics and report in the RS UI

2024-05-09 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28246:
-
Fix Version/s: 2.7.0

> Expose region cached size over JMX metrics and report in the RS UI
> --
>
> Key: HBASE-28246
> URL: https://issues.apache.org/jira/browse/HBASE-28246
> Project: HBase
>  Issue Type: Improvement
>  Components: BucketCache
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 2.5.6, 4.0.0-alpha-1
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Fix For: 3.0.0-beta-1, 2.7.0
>
> Attachments: Screenshot 2023-12-06 at 22.58.17.png
>
>
> With a large file-based bucket cache, the prefetch executor can take a long time 
> to cache all of the dataset. It would be useful to report how much % 
> of a region's data is already cached, in order to give an idea of how much work 
> the prefetch executor has done.
> This PR adds jmx metrics for the region cache % and also reports the same in the 
> RS UI "Store File Metrics" tab as below:
> !Screenshot 2023-12-06 at 22.58.17.png|width=658,height=114!
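
For illustration only, the reported percentage boils down to a calculation along these lines; this is a simplified sketch, not the actual metrics wrapper code:

{code:java}
// Simplified illustration of a "region cached percentage" figure: total bytes
// of the region's store files versus the bytes of those files currently held
// in the bucket cache. Field names here are illustrative.
public class RegionCachedPercentExample {
  static long percentCached(long storeFileSizeBytes, long cachedBytes) {
    if (storeFileSizeBytes <= 0) {
      return 0;
    }
    return Math.min(100, (cachedBytes * 100) / storeFileSizeBytes);
  }

  public static void main(String[] args) {
    long storeFileSize = 10L * 1024 * 1024 * 1024; // 10 GB of store files
    long cached = 4L * 1024 * 1024 * 1024;         // 4 GB already prefetched
    System.out.println("Region cached: " + percentCached(storeFileSize, cached) + "%");
  }
}
{code}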



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28535) Implement a region server level configuration to enable/disable data-tiering

2024-05-02 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-28535.
--
Resolution: Fixed

Merged into the feature branch. Thanks for the contribution, 
[~janardhan.hungund] !

> Implement a region server level configuration to enable/disable data-tiering
> 
>
> Key: HBASE-28535
> URL: https://issues.apache.org/jira/browse/HBASE-28535
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Reporter: Janardhan Hungund
>Assignee: Janardhan Hungund
>Priority: Major
>  Labels: pull-request-available
>
> Provide the user with the ability to enable and disable the data tiering 
> feature. The time-based data tiering is applicable to a specific set of use 
> cases which write date-based records and mostly access recently written data.
> The feature, in general, should be avoided for use cases which do not 
> depend on date-based reads and writes, as the code flows which enable 
> data temperature checks can induce performance regressions.
> This Jira tracks adding a region-server-wide configuration to optionally 
> enable or disable the feature.
> Thanks,
> Janardhan



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28468) Integration of time-based priority caching logic into cache evictions.

2024-04-25 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-28468.
--
Resolution: Fixed

Merged into the feature branch.

> Integration of time-based priority caching logic into cache evictions.
> --
>
> Key: HBASE-28468
> URL: https://issues.apache.org/jira/browse/HBASE-28468
> Project: HBase
>  Issue Type: Task
>Reporter: Janardhan Hungund
>Assignee: Janardhan Hungund
>Priority: Major
>  Labels: pull-request-available
>
> When time-based priority caching is enabled, the block evictions 
> triggered when the cache is full should use the time-based priority caching 
> framework APIs to detect cold files and evict the blocks of those files 
> first. This ensures that hot data remains in the cache while cold data is 
> evicted.
> Thanks,
> Janardhan
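
A minimal sketch of that eviction ordering; the hot-data age threshold and the per-file max timestamp lookup are hypothetical stand-ins for the data-tiering framework APIs, not the actual BucketCache eviction code:

{code:java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch: when the cache is full, free blocks of cold files first. The isCold()
// check (file max timestamp older than a configured hot-data age) stands in for
// the time-based priority framework APIs.
public class ColdFirstEvictionSketch {
  static class CachedFile {
    final String name;
    final long maxTimestampMs; // newest cell timestamp in the file
    CachedFile(String name, long maxTimestampMs) {
      this.name = name;
      this.maxTimestampMs = maxTimestampMs;
    }
  }

  static boolean isCold(CachedFile f, long hotDataAgeMs) {
    return System.currentTimeMillis() - f.maxTimestampMs > hotDataAgeMs;
  }

  public static void main(String[] args) {
    long hotDataAgeMs = 7L * 24 * 60 * 60 * 1000; // keep the last 7 days hot
    List<CachedFile> files = new ArrayList<>();
    files.add(new CachedFile("recent-hfile", System.currentTimeMillis() - 3_600_000L));
    files.add(new CachedFile("old-hfile", System.currentTimeMillis() - 30L * 24 * 60 * 60 * 1000));
    // Evict cold files first, oldest data first; hot files only as a last resort.
    files.sort(Comparator.comparing((CachedFile f) -> !isCold(f, hotDataAgeMs))
      .thenComparingLong(f -> f.maxTimestampMs));
    files.forEach(f -> System.out.println("evict blocks of " + f.name));
  }
}
{code}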



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28533) Region split failure due to region quota limit leaves Hmaster's in memory state for the region in SPLITTING after procedure rollback

2024-04-25 Thread Wellington Chevreuil (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840720#comment-17840720
 ] 

Wellington Chevreuil commented on HBASE-28533:
--

Perfect, I have assigned this Jira to you [~droudy]. I have also added you to 
the list of hbase contributors, so you will be able to assign jiras to yourself 
in the future.

> Region split failure due to region quota limit leaves Hmaster's in memory 
> state for the region in SPLITTING after procedure rollback
> 
>
> Key: HBASE-28533
> URL: https://issues.apache.org/jira/browse/HBASE-28533
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
> Environment: Tested on HBase Version 2.5.8 and latest master branch 
>Reporter: Daniel Roudnitsky
>Assignee: Daniel Roudnitsky
>Priority: Major
>
> When a SplitTableRegionProcedure is run for a region whose namespace is at 
> its maximum region quota limit, the split procedure will fail and rollback, 
> and Hmaster's in memory RegionStateNode for the region is left in a SPLITTING 
> state. Hmaster will then refuse to start any subsequent merge/split/move 
> procedures for that region because it believes the region is not OPEN, until 
> it is restarted and the in memory record of region states is reset.
> In the first step of the split procedure SPLIT_TABLE_REGION_PREPARE the 
> parent region's RegionStateNode state is set to SPLITTING, and the transition 
> is not written to the meta table. In the next step 
> SPLIT_TABLE_REGION_PRE_OPERATION the region quota check is done, 
> QuotaExceededException is thrown and the procedure ends in ROLLEDBACK state 
> without reverting the RegionStateNode back to OPEN state. Hmaster is left 
> believing the region is in a SPLITTING state according to its in memory 
> RegionStates, while the region is still online on the assigned region server 
> and according to meta.
> To reproduce in HBase shell:
> {code:java}
> > create_namespace 'test_ns', {'hbase.namespace.quota.maxregions'=> 2}
> > create 'test_ns:test_table', 'f1', {NUMREGIONS => 2, SPLITALGO => 
> > 'UniformSplit'}
> > region_a = 
> > region_b = 
> > split region_a, 'x'
> # HMaster will report: 
> pid=405, state=ROLLEDBACK, 
> exception=org.apache.hadoop.hbase.quotas.QuotaExceededException via 
> master-split-regions:org.apache.hadoop.hbase.quotas.QuotaExceededException: 
> Region split not possible for : as quota limits are exceeded ; 
> SplitTableRegionProcedure table=test_ns:test_table, parent=...
> > merge_region region_a, region_b
> ERROR: org.apache.hadoop.hbase.exceptions.MergeRegionException: 
> org.apache.hadoop.hbase.client.DoNotRetryRegionException:  is not 
> OPEN; state=SPLITTING
> > stop_master # trigger hmaster failover 
> > merge_region region_a, region_b # merge now succeeds {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-28533) Region split failure due to region quota limit leaves Hmaster's in memory state for the region in SPLITTING after procedure rollback

2024-04-25 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil reassigned HBASE-28533:


Assignee: Daniel Roudnitsky

> Region split failure due to region quota limit leaves Hmaster's in memory 
> state for the region in SPLITTING after procedure rollback
> 
>
> Key: HBASE-28533
> URL: https://issues.apache.org/jira/browse/HBASE-28533
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
> Environment: Tested on HBase Version 2.5.8 and latest master branch 
>Reporter: Daniel Roudnitsky
>Assignee: Daniel Roudnitsky
>Priority: Major
>
> When a SplitTableRegionProcedure is run for a region whose namespace is at 
> its maximum region quota limit, the split procedure will fail and rollback, 
> and Hmaster's in memory RegionStateNode for the region is left in a SPLITTING 
> state. Hmaster will then refuse to start any subsequent merge/split/move 
> procedures for that region because it believes the region is not OPEN, until 
> it is restarted and the in memory record of region states is reset.
> In the first step of the split procedure SPLIT_TABLE_REGION_PREPARE the 
> parent region's RegionStateNode state is set to SPLITTING, and the transition 
> is not written to the meta table. In the next step 
> SPLIT_TABLE_REGION_PRE_OPERATION the region quota check is done, 
> QuotaExceededException is thrown and the procedure ends in ROLLEDBACK state 
> without reverting the RegionStateNode back to OPEN state. Hmaster is left 
> believing the region is in a SPLITTING state according to its in memory 
> RegionStates, while the region is still online on the assigned region server 
> and according to meta.
> To reproduce in HBase shell:
> {code:java}
> > create_namespace 'test_ns', {'hbase.namespace.quota.maxregions'=> 2}
> > create 'test_ns:test_table', 'f1', {NUMREGIONS => 2, SPLITALGO => 
> > 'UniformSplit'}
> > region_a = 
> > region_b = 
> > split region_a, 'x'
> # HMaster will report: 
> pid=405, state=ROLLEDBACK, 
> exception=org.apache.hadoop.hbase.quotas.QuotaExceededException via 
> master-split-regions:org.apache.hadoop.hbase.quotas.QuotaExceededException: 
> Region split not possible for : as quota limits are exceeded ; 
> SplitTableRegionProcedure table=test_ns:test_table, parent=...
> > merge_region region_a, region_b
> ERROR: org.apache.hadoop.hbase.exceptions.MergeRegionException: 
> org.apache.hadoop.hbase.client.DoNotRetryRegionException:  is not 
> OPEN; state=SPLITTING
> > stop_master # trigger hmaster failover 
> > merge_region region_a, region_b # merge now succeeds {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28527) Adjust BlockCacheKey to use the file path instead of file name.

2024-04-24 Thread Wellington Chevreuil (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840527#comment-17840527
 ] 

Wellington Chevreuil commented on HBASE-28527:
--

Can we resolve this as not needed [~janardhan.hungund] ?

> Adjust BlockCacheKey to use the file path instead of file name.
> ---
>
> Key: HBASE-28527
> URL: https://issues.apache.org/jira/browse/HBASE-28527
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Reporter: Janardhan Hungund
>Assignee: Janardhan Hungund
>Priority: Major
>  Labels: pull-request-available
>
> The time-based priority eviction policy relies on the presence of path in the 
> BlockCacheKey to fetch the required metadata to check data hotness and decide 
> whether or not to retain the block in the bucket cache.
> Hence, the constructor of BlockCacheKey is adjusted to take the file path as 
> the input parameter. The code paths that create the blockCacheKey and also 
> the unit tests need to be adjusted to pass the path instead of file name.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28527) Adjust BlockCacheKey to use the file path instead of file name.

2024-04-24 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28527:
-
Parent: (was: HBASE-28463)
Issue Type: Task  (was: Sub-task)

> Adjust BlockCacheKey to use the file path instead of file name.
> ---
>
> Key: HBASE-28527
> URL: https://issues.apache.org/jira/browse/HBASE-28527
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Reporter: Janardhan Hungund
>Assignee: Janardhan Hungund
>Priority: Major
>  Labels: pull-request-available
>
> The time-based priority eviction policy relies on the presence of path in the 
> BlockCacheKey to fetch the required metadata to check data hotness and decide 
> whether or not to retain the block in the bucket cache.
> Hence, the constructor of BlockCacheKey is adjusted to take the file path as 
> the input parameter. The code paths that create the blockCacheKey and also 
> the unit tests need to be adjusted to pass the path instead of file name.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28527) Adjust BlockCacheKey to use the file path instead of file name.

2024-04-24 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28527:
-
Parent: HBASE-28463
Issue Type: Sub-task  (was: Task)

> Adjust BlockCacheKey to use the file path instead of file name.
> ---
>
> Key: HBASE-28527
> URL: https://issues.apache.org/jira/browse/HBASE-28527
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: Janardhan Hungund
>Assignee: Janardhan Hungund
>Priority: Major
>  Labels: pull-request-available
>
> The time-based priority eviction policy relies on the presence of path in the 
> BlockCacheKey to fetch the required metadata to check data hotness and decide 
> whether or not to retain the block in the bucket cache.
> Hence, the constructor of BlockCacheKey is adjusted to take the file path as 
> the input parameter. The code paths that create the blockCacheKey and also 
> the unit tests need to be adjusted to pass the path instead of file name.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28466) Integration of time-based priority logic of bucket cache in prefetch functionality of HBase.

2024-04-22 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-28466.
--
Resolution: Fixed

Merged into feature branch. Thanks for the contribution, [~vinayakhegde] !

> Integration of time-based priority logic of bucket cache in prefetch 
> functionality of HBase.
> 
>
> Key: HBASE-28466
> URL: https://issues.apache.org/jira/browse/HBASE-28466
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Reporter: Janardhan Hungund
>Assignee: Vinayak Hegde
>Priority: Major
>  Labels: pull-request-available
>
> This Jira tracks the integration of the framework of APIs (implemented in 
> HBASE-28465) related to data tiering into the prefetch logic of HBase. The 
> implementation should filter out cold data and only prefetch 
> hot data into the bucket cache.
> Thanks,
> Janardhan
>  
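
A minimal sketch of that prefetch-side filtering; the hotness predicate is a hypothetical placeholder for the HBASE-28465 framework APIs, not the actual PrefetchExecutor code:

{code:java}
import java.util.List;
import java.util.function.Predicate;

// Sketch: before scheduling a file for prefetch, consult a hotness check
// supplied by the data-tiering framework and skip cold files.
public class PrefetchFilterSketch {
  static void prefetchHotFiles(List<String> hfiles, Predicate<String> isHot) {
    for (String hfile : hfiles) {
      if (isHot.test(hfile)) {
        System.out.println("scheduling prefetch for " + hfile);
      } else {
        System.out.println("skipping cold file " + hfile);
      }
    }
  }

  public static void main(String[] args) {
    // Toy hotness check: treat files whose names end in "-recent" as hot.
    prefetchHotFiles(List.of("hfile-recent", "hfile-2019"), f -> f.endsWith("-recent"));
  }
}
{code}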



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28527) Adjust BlockCacheKey to use the file path instead of file name.

2024-04-18 Thread Wellington Chevreuil (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838573#comment-17838573
 ] 

Wellington Chevreuil commented on HBASE-28527:
--

{quote}
The time-based priority eviction policy relies on the presence of path in the 
BlockCacheKey to fetch the required metadata to check data hotness and decide 
whether or not to retain the block in the bucket cache.
{quote}
Please explain why that's needed.

> Adjust BlockCacheKey to use the file path instead of file name.
> ---
>
> Key: HBASE-28527
> URL: https://issues.apache.org/jira/browse/HBASE-28527
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Reporter: Janardhan Hungund
>Assignee: Janardhan Hungund
>Priority: Major
>  Labels: pull-request-available
>
> The time-based priority eviction policy relies on the presence of path in the 
> BlockCacheKey to fetch the required metadata to check data hotness and decide 
> whether or not to retain the block in the bucket cache.
> Hence, the constructor of BlockCacheKey is adjusted to take the file path as 
> the input parameter. The code paths that create the blockCacheKey and also 
> the unit tests need to be adjusted to pass the path instead of file name.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28533) Region split failure due to region quota limit leaves Hmaster's in memory state for the region in SPLITTING after procedure rollback

2024-04-18 Thread Wellington Chevreuil (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838531#comment-17838531
 ] 

Wellington Chevreuil commented on HBASE-28533:
--

Thanks for reporting this and for the detailed troubleshooting explanation. Can you 
confirm this affects the newer releases as well? Also, please let me know 
if you plan to work on a fix for this, so we can get the jira assigned to you, 
[~droudy].

> Region split failure due to region quota limit leaves Hmaster's in memory 
> state for the region in SPLITTING after procedure rollback
> 
>
> Key: HBASE-28533
> URL: https://issues.apache.org/jira/browse/HBASE-28533
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 2.5.8
> Environment: HBase Version 2.5.8, 
> r37444de6531b1bdabf2e445c83d0268ab1a6f919, Thu Feb 29 15:37:32 PST 2024
>Reporter: Daniel Roudnitsky
>Priority: Major
>
> When a SplitTableRegionProcedure is run for a region whose namespace is at 
> its maximum region quota limit, the split procedure will fail and rollback, 
> and Hmaster's in memory RegionStateNode for the region is left in a SPLITTING 
> state. Hmaster will then refuse to start any subsequent merge/split/move 
> procedures for that region because it believes the region is not OPEN, until 
> it is restarted and the in memory record of region states is reset.
> In the first step of the split procedure SPLIT_TABLE_REGION_PREPARE the 
> parent region's RegionStateNode state is set to SPLITTING, and the transition 
> is not written to the meta table. In the next step 
> SPLIT_TABLE_REGION_PRE_OPERATION the region quota check is done, 
> QuotaExceededException is thrown and the procedure ends in ROLLEDBACK state 
> without reverting the RegionStateNode back to OPEN state. Hmaster is left 
> believing the region is in a SPLITTING state according to its in memory 
> RegionStates, while the region is still online on the assigned region server 
> and according to meta.
> To reproduce in HBase shell:
> {code:java}
> > create_namespace 'test_ns', {'hbase.namespace.quota.maxregions'=> 2}
> > create 'test_ns:test_table', 'f1', {NUMREGIONS => 2, SPLITALGO => 
> > 'UniformSplit'}
> > region_a = 
> > region_b = 
> > split region_a, 'x'
> # HMaster will report: 
> pid=405, state=ROLLEDBACK, 
> exception=org.apache.hadoop.hbase.quotas.QuotaExceededException via 
> master-split-regions:org.apache.hadoop.hbase.quotas.QuotaExceededException: 
> Region split not possible for : as quota limits are exceeded ; 
> SplitTableRegionProcedure table=test_ns:test_table, parent=...
> > merge_region region_a, region_b
> ERROR: org.apache.hadoop.hbase.exceptions.MergeRegionException: 
> org.apache.hadoop.hbase.client.DoNotRetryRegionException:  is not 
> OPEN; state=SPLITTING
> > stop_master # trigger hmaster failover 
> > merge_region region_a, region_b # merge now succeeds {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28528) Improvements in HFile prefetch

2024-04-17 Thread Wellington Chevreuil (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838051#comment-17838051
 ] 

Wellington Chevreuil commented on HBASE-28528:
--

Thanks for the follow up, [~rajeshbabu]. I have some thoughts:

 
{quote}
Currently, hfile prefetch on open is configurable cluster-wide. It would be better 
to make it configurable per table.
{quote}
Yeah, I think that should be helpful. Right now, the closest we have to this 
would be to disable BLOCKCACHE entirely in the CF config (see HBASE-28217).
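
For reference, a minimal sketch of that per-CF workaround through the Java admin API; the table and family names are examples only:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.util.Bytes;

// Turns the block cache off for a single column family, the current per-CF
// workaround mentioned above. "my_table" and "f1" are example names.
public class DisableCfBlockCache {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      ColumnFamilyDescriptor cf = ColumnFamilyDescriptorBuilder
        .newBuilder(Bytes.toBytes("f1"))
        .setBlockCacheEnabled(false) // no block caching (and hence no prefetch) for this family
        .build();
      admin.modifyColumnFamily(TableName.valueOf("my_table"), cf);
    }
  }
}
{code}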

{quote}
It would also be better to have region filters which allow specifying which 
regions' data can be prefetched.
{quote} 
That's interesting indeed; however, it would be a bit more sophisticated, as 
regions are a dynamic structure. I guess this would require an additional 
qualifier in the meta table and then additional API/shell commands to set this. 

> Improvements in HFile prefetch
> --
>
> Key: HBASE-28528
> URL: https://issues.apache.org/jira/browse/HBASE-28528
> Project: HBase
>  Issue Type: Improvement
>Reporter: Rajeshbabu Chintaguntla
>Assignee: Rajeshbabu Chintaguntla
>Priority: Major
>
> Currently, hfile prefetch on open is configurable cluster-wide. It would be 
> better to make it configurable per table. It would also be better to have 
> region filters which allow specifying which regions' data can be 
> prefetched. This will be useful when there are hot regions whose data 
> prefetching can help with low-latency requirements.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28292) Make Delay prefetch property to be dynamically configured

2024-04-16 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-28292.
--
Resolution: Fixed

The PR got merged into the master branch by [~psomogyi] and I have backported it into 
branch-3, branch-2, branch-2.6, branch-2.5 and branch-2.4. Thanks for the 
contributions, [~kabhishek4]!

> Make Delay prefetch property to be dynamically configured
> -
>
> Key: HBASE-28292
> URL: https://issues.apache.org/jira/browse/HBASE-28292
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.6.0, 2.4.17, 3.0.0-beta-1, 4.0.0-alpha-1, 2.7.0, 2.5.8
>Reporter: Abhishek Kothalikar
>Assignee: Abhishek Kothalikar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.0, 2.4.18, 3.0.0, 4.0.0-alpha-1, 2.7.0, 2.5.9
>
> Attachments: HBASE-28292.docx
>
>
> Make the prefetch delay configurable. The prefetch delay is associated to 
> hbase.hfile.prefetch.delay configuration. There are some cases where 
> configuring hbase.hfile.prefetch.delay would help in achieving better 
> throughput. 
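
A minimal sketch of how the hbase.hfile.prefetch.delay property named above can be read and applied when scheduling a prefetch; the default value and the scheduling code below are illustrative only, not the actual PrefetchExecutor implementation:

{code:java}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.conf.Configuration;

// Reads the prefetch delay from the configuration and schedules the prefetch
// task accordingly. Re-reading the value at scheduling time is one way the
// property can be picked up dynamically without a restart (an assumption for
// illustration; the real code path may differ).
public class PrefetchDelayExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    long delayMs = conf.getLong("hbase.hfile.prefetch.delay", 1000L); // default is a placeholder
    ScheduledExecutorService pool = Executors.newScheduledThreadPool(1);
    pool.schedule(() -> System.out.println("prefetching hfile blocks..."),
      delayMs, TimeUnit.MILLISECONDS);
    pool.shutdown();
  }
}
{code}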



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28292) Make Delay prefetch property to be dynamically configured

2024-04-16 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28292:
-
Fix Version/s: 2.6.0
   2.4.18
   3.0.0
   2.7.0
   2.5.9

> Make Delay prefetch property to be dynamically configured
> -
>
> Key: HBASE-28292
> URL: https://issues.apache.org/jira/browse/HBASE-28292
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.6.0, 2.4.17, 3.0.0-beta-1, 4.0.0-alpha-1, 2.7.0, 2.5.8
>Reporter: Abhishek Kothalikar
>Assignee: Abhishek Kothalikar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.0, 2.4.18, 3.0.0, 4.0.0-alpha-1, 2.7.0, 2.5.9
>
> Attachments: HBASE-28292.docx
>
>
> Make the prefetch delay configurable. The prefetch delay is associated to 
> hbase.hfile.prefetch.delay configuration. There are some cases where 
> configuring hbase.hfile.prefetch.delay would help in achieving better 
> throughput. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28292) Make Delay prefetch property to be dynamically configured

2024-04-16 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28292:
-
Affects Version/s: 2.5.8
   3.0.0-beta-1
   2.4.17
   2.6.0
   4.0.0-alpha-1
   2.7.0

> Make Delay prefetch property to be dynamically configured
> -
>
> Key: HBASE-28292
> URL: https://issues.apache.org/jira/browse/HBASE-28292
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.6.0, 2.4.17, 3.0.0-beta-1, 4.0.0-alpha-1, 2.7.0, 2.5.8
>Reporter: Abhishek Kothalikar
>Assignee: Abhishek Kothalikar
>Priority: Major
>  Labels: pull-request-available
> Attachments: HBASE-28292.docx
>
>
> Make the prefetch delay configurable. The prefetch delay is associated to 
> hbase.hfile.prefetch.delay configuration. There are some cases where 
> configuring hbase.hfile.prefetch.delay would help in achieving better 
> throughput. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28292) Make Delay prefetch property to be dynamically configured

2024-04-16 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28292:
-
Fix Version/s: 4.0.0-alpha-1

> Make Delay prefetch property to be dynamically configured
> -
>
> Key: HBASE-28292
> URL: https://issues.apache.org/jira/browse/HBASE-28292
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.6.0, 2.4.17, 3.0.0-beta-1, 4.0.0-alpha-1, 2.7.0, 2.5.8
>Reporter: Abhishek Kothalikar
>Assignee: Abhishek Kothalikar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1
>
> Attachments: HBASE-28292.docx
>
>
> Make the prefetch delay configurable. The prefetch delay is associated to 
> hbase.hfile.prefetch.delay configuration. There are some cases where 
> configuring hbase.hfile.prefetch.delay would help in achieving better 
> throughput. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28174) DELETE endpoint in REST API does not support deleting binary row keys/columns

2024-04-15 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28174:
-
Fix Version/s: 4.0.0-alpha-1

> DELETE endpoint in REST API does not support deleting binary row keys/columns
> -
>
> Key: HBASE-28174
> URL: https://issues.apache.org/jira/browse/HBASE-28174
> Project: HBase
>  Issue Type: Bug
>  Components: REST
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 2.5.6, 4.0.0-alpha-1
>Reporter: James Udiljak
>Assignee: James Udiljak
>Priority: Blocker
> Fix For: 2.6.0, 3.0.0-beta-1, 4.0.0-alpha-1
>
> Attachments: delete_base64_1.png
>
>
> h2. Notes
> This is the first time I have raised an issue in the ASF Jira. Please let me 
> know if there's anything I need to adjust on the issue to fit in with your 
> development flow.
> I have marked the priority as "blocker" because this issue blocks me as a 
> user of the HBase REST API from deploying an effective solution for our 
> setup. Please feel free to change this if the Priority field has another 
> meaning to you.
> I have also chosen 2.4.17 as the affected version because this is the version 
> I am running, however looking at the source code on GitHub in the default 
> branch, I think many other versions would be affected.
> h2. Description of Issue
> The DELETE operation in the [HBase REST 
> API|https://hbase.apache.org/1.2/apidocs/org/apache/hadoop/hbase/rest/package-summary.html#operation_delete]
>  requires specifying row keys and column families/offsets in the URI (i.e. as 
> UTF-8 text). This makes it impossible to specify a delete operation via the 
> REST API for a binary row key or column family/offset, as single bytes with a 
> decimal value greater than 127 are not valid in UTF-8.
> Percent-encoding these "high" values does not work around the issue, as the 
> HBase REST API uses Java's {{URLDecoder.Decode(percentEncodedString, 
> "UTF-8")}} function, which replaces any percent-encoded byte in the range 
> {{%80}} to {{%FF}} with the [replacement 
> character|https://en.wikipedia.org/wiki/Specials_(Unicode_block)#Replacement_character].
>  Even if this were not the case, the row-key is ultimately [converted to a 
> byte 
> array|https://github.com/apache/hbase/blob/rel/2.4.17/hbase-rest/src/main/java/org/apache/hadoop/hbase/rest/RowSpec.java#L60-L100]
>  using UTF-8 encoding, wherein code points >127 are encoded across multiple 
> bytes, corrupting the user-supplied row key.
> h2. Proposed Solution
> I do not believe it is possible to allow encoding of arbitrary bytes in the 
> URL for the DELETE endpoint without breaking compatibility for any users who 
> may have been unknowingly UTF-8 encoding their binary row keys. Even if it 
> were possible, the syntax would likely be terse.
> Instead, I propose a new version of the DELETE endpoint that would accept row 
> keys and column families/offsets in the request _body_ (using Base64 encoding 
> for the JSON and XML formats, and bare binary for protobuf). This new 
> endpoint would follow the same conventions as the PUT operations, except that 
> cell values would not need to be specified (unless the user is performing a 
> check-and-delete operation).
> As an additional benefit, using the request body could potentially allow for 
> deleting multiple rows in a single request, which would drastically improve 
> the efficiency of my use case.
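
A small, pure-JDK demonstration of the encoding problem described above; it is illustrative only and does not use any HBase REST code:

{code:java}
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Why a binary row key cannot survive the URI path: decoding a percent-encoded
// byte above 0x7F as UTF-8 yields the replacement character (U+FFFD), so the
// original byte value is lost. Base64 in a request body, as proposed above,
// round-trips the bytes exactly.
public class BinaryRowKeyEncoding {
  public static void main(String[] args) throws Exception {
    byte[] binaryRowKey = new byte[] { (byte) 0xFF, 0x00, (byte) 0x80, 0x41 };

    // URI-style handling: "%FF" decoded as UTF-8 is not the byte 0xFF.
    String decoded = URLDecoder.decode("%FF", "UTF-8");
    byte[] corrupted = decoded.getBytes(StandardCharsets.UTF_8);
    System.out.println("decoded code point: " + Integer.toHexString(decoded.charAt(0))); // fffd
    System.out.println("bytes after re-encoding: " + corrupted.length); // 3, not 1

    // Request-body style handling: Base64 round-trips the key losslessly.
    String base64 = Base64.getEncoder().encodeToString(binaryRowKey);
    byte[] roundTripped = Base64.getDecoder().decode(base64);
    System.out.println("base64 round-trip ok: "
      + java.util.Arrays.equals(binaryRowKey, roundTripped)); // true
  }
}
{code}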



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28505) Implement enforcement to require Date Tiered Compaction for Time Range Data Tiering

2024-04-12 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-28505.
--
Resolution: Fixed

Merged into HBASE-28463 feature branch.

> Implement enforcement to require Date Tiered Compaction for Time Range Data 
> Tiering
> ---
>
> Key: HBASE-28505
> URL: https://issues.apache.org/jira/browse/HBASE-28505
> Project: HBase
>  Issue Type: Task
>Reporter: Vinayak Hegde
>Assignee: Vinayak Hegde
>Priority: Major
>  Labels: pull-request-available
>
> The implementation should enforce the requirement of enabling Date Tiered 
> Compaction for Time Range Data Tiering. This restriction ensures that users 
> can fully benefit from Time Range Data Tiering functionality by disallowing 
> its usage unless Date Tiered Compaction is enabled.
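
A rough sketch of what such an enforcement check could look like at descriptor-validation time; the configuration keys and values used below are hypothetical placeholders, not confirmed feature-branch property names:

{code:java}
import java.util.Map;

// Rejects a table/CF configuration that enables time-range data tiering
// without date-tiered compaction. The keys "hbase.hstore.datatiering.type" and
// "hbase.hstore.engine.class" and their values are hypothetical placeholders.
public class DataTieringValidationSketch {
  static void validate(Map<String, String> cfConfig) {
    boolean timeRangeTiering =
      "TIME_RANGE".equals(cfConfig.get("hbase.hstore.datatiering.type")); // hypothetical key
    boolean dateTieredCompaction =
      String.valueOf(cfConfig.get("hbase.hstore.engine.class")).contains("DateTiered"); // hypothetical check
    if (timeRangeTiering && !dateTieredCompaction) {
      throw new IllegalArgumentException(
        "Time Range Data Tiering requires Date Tiered Compaction to be enabled");
    }
  }

  public static void main(String[] args) {
    validate(Map.of("hbase.hstore.datatiering.type", "NONE")); // passes
    try {
      validate(Map.of("hbase.hstore.datatiering.type", "TIME_RANGE")); // rejected
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
{code}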



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28465) Implementation of framework for time-based priority bucket-cache.

2024-04-08 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-28465.
--
Resolution: Fixed

Merged into the [HBASE-28463|https://github.com/apache/hbase/tree/HBASE-28463] 
feature branch.

> Implementation of framework for time-based priority bucket-cache.
> -
>
> Key: HBASE-28465
> URL: https://issues.apache.org/jira/browse/HBASE-28465
> Project: HBase
>  Issue Type: Task
>Reporter: Janardhan Hungund
>Assignee: Vinayak Hegde
>Priority: Major
>  Labels: pull-request-available
>
> In this Jira, we track the implementation of the framework for the time-based 
> priority cache.
> This framework would help us get the required metadata of the HFiles and 
> decide on the hotness or coldness of the data.
> Thanks,
> Janardhan



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28458) BucketCache.notifyFileCachingCompleted may incorrectly consider a file fully cached

2024-04-05 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-28458.
--
Resolution: Fixed

Merged into master, branch-3, branch-2 and branch-2.6. Thanks for reviewing it 
[~zhangduo] [~psomogyi] !

> BucketCache.notifyFileCachingCompleted may incorrectly consider a file fully 
> cached
> ---
>
> Key: HBASE-28458
> URL: https://issues.apache.org/jira/browse/HBASE-28458
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.0-beta-1, 4.0.0-alpha-1, 2.7.0
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.0, 3.0.0, 4.0.0-alpha-1, 2.7.0
>
>
> Noticed that 
> TestBucketCachePersister.testPrefetchBlockEvictionWhilePrefetchRunning was 
> flakey, failing whenever the block eviction happened while prefetch was still 
> ongoing.
> In the test, we pass an instance of BucketCache directly to the cache config, 
> so the test is actually placing both data and meta blocks in the bucket 
> cache. So sometimes, the test calls BucketCache.notifyFileCachingCompleted 
> after it has already evicted two blocks.  
> Inside BucketCache.notifyFileCachingCompleted, we iterate through the 
> backingMap entry set, counting the number of blocks for the given file. Then, to 
> consider whether the file is fully cached or not, we do the following 
> validation:
> {noformat}
> if (dataBlockCount == count.getValue() || totalBlockCount == 
> count.getValue()) {
>   LOG.debug("File {} has now been fully cached.", fileName);
>   fileCacheCompleted(fileName, size);
> }  {noformat}
> But the test generates 57 total blocks, 55 data and 2 meta blocks. It evicts 
> two blocks and asserts that the file hasn't been considered fully cached. 
> When these evictions happen while prefetch is still going, we'll pass that 
> check, as the number of blocks for the file in the backingMap would still 
> be 55, which is what we pass as dataBlockCount.
> As BucketCache is intended for storing data blocks only, I believe we should 
> make sure BucketCache.notifyFileCachingCompleted only accounts for data 
> blocks. Also, the 
> TestBucketCachePersister.testPrefetchBlockEvictionWhilePrefetchRunning should 
> be updated to consistently reproduce the eviction concurrent to the prefetch. 
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28458) BucketCache.notifyFileCachingCompleted may incorrectly consider a file fully cached

2024-04-05 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28458:
-
Affects Version/s: (was: 2.6.1)

> BucketCache.notifyFileCachingCompleted may incorrectly consider a file fully 
> cached
> ---
>
> Key: HBASE-28458
> URL: https://issues.apache.org/jira/browse/HBASE-28458
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.0-beta-1, 4.0.0-alpha-1, 2.7.0
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.0, 3.0.0, 4.0.0-alpha-1, 2.7.0
>
>
> Noticed that 
> TestBucketCachePersister.testPrefetchBlockEvictionWhilePrefetchRunning was 
> flakey, failing whenever the block eviction happened while prefetch was still 
> ongoing.
> In the test, we pass an instance of BucketCache directly to the cache config, 
> so the test is actually placing both data and meta blocks in the bucket 
> cache. So sometimes, the test calls BucketCache.notifyFileCachingCompleted 
> after it has already evicted two blocks.
> Inside BucketCache.notifyFileCachingCompleted, we iterate through the 
> backingMap entry set, counting the number of blocks for the given file. Then, to 
> consider whether the file is fully cached or not, we do the following 
> validation:
> {noformat}
> if (dataBlockCount == count.getValue() || totalBlockCount == 
> count.getValue()) {
>   LOG.debug("File {} has now been fully cached.", fileName);
>   fileCacheCompleted(fileName, size);
> }  {noformat}
> But the test generates 57 total blocks, 55 data and 2 meta blocks. It evicts 
> two blocks and asserts that the file hasn't been considered fully cached. 
> When these evictions happen while prefetch is still going, we'll pass that 
> check, as the number of blocks for the file in the backingMap would still 
> be 55, which is what we pass as dataBlockCount.
> As BucketCache is intended for storing data blocks only, I believe we should 
> make sure BucketCache.notifyFileCachingCompleted only accounts for data 
> blocks. Also, the 
> TestBucketCachePersister.testPrefetchBlockEvictionWhilePrefetchRunning should 
> be updated to consistently reproduce the eviction concurrent to the prefetch. 
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28458) BucketCache.notifyFileCachingCompleted may incorrectly consider a file fully cached

2024-04-05 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28458:
-
Fix Version/s: 2.6.0
   3.0.0
   2.7.0

> BucketCache.notifyFileCachingCompleted may incorrectly consider a file fully 
> cached
> ---
>
> Key: HBASE-28458
> URL: https://issues.apache.org/jira/browse/HBASE-28458
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.0-beta-1, 4.0.0-alpha-1, 2.7.0, 2.6.1
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.0, 3.0.0, 4.0.0-alpha-1, 2.7.0
>
>
> Noticed that 
> TestBucketCachePersister.testPrefetchBlockEvictionWhilePrefetchRunning was 
> flakey, failing whenever the block eviction happened while prefetch was still 
> ongoing.
> In the test, we pass an instance of BucketCache directly to the cache config, 
> so the test is actually placing both data and meta blocks in the bucket 
> cache. So sometimes, the test calls BucketCache.notifyFileCachingCompleted 
> after it has already evicted two blocks.
> Inside BucketCache.notifyFileCachingCompleted, we iterate through the 
> backingMap entry set, counting the number of blocks for the given file. Then, to 
> consider whether the file is fully cached or not, we do the following 
> validation:
> {noformat}
> if (dataBlockCount == count.getValue() || totalBlockCount == 
> count.getValue()) {
>   LOG.debug("File {} has now been fully cached.", fileName);
>   fileCacheCompleted(fileName, size);
> }  {noformat}
> But the test generates 57 total blocks, 55 data and 2 meta blocks. It evicts 
> two blocks and asserts that the file hasn't been considered fully cached. 
> When these evictions happen while prefetch is still going, we'll pass that 
> check, as the number of blocks for the file in the backingMap would still 
> be 55, which is what we pass as dataBlockCount.
> As BucketCache is intended for storing data blocks only, I believe we should 
> make sure BucketCache.notifyFileCachingCompleted only accounts for data 
> blocks. Also, the 
> TestBucketCachePersister.testPrefetchBlockEvictionWhilePrefetchRunning should 
> be updated to consistently reproduce the eviction concurrent to the prefetch. 
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28458) BucketCache.notifyFileCachingCompleted may incorrectly consider a file fully cached

2024-04-05 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28458:
-
Affects Version/s: 2.6.1

> BucketCache.notifyFileCachingCompleted may incorrectly consider a file fully 
> cached
> ---
>
> Key: HBASE-28458
> URL: https://issues.apache.org/jira/browse/HBASE-28458
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.0-beta-1, 4.0.0-alpha-1, 2.7.0, 2.6.1
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1
>
>
> Noticed that 
> TestBucketCachePersister.testPrefetchBlockEvictionWhilePrefetchRunning was 
> flakey, failing whenever the block eviction happened while prefetch was still 
> ongoing.
> In the test, we pass an instance of BucketCache directly to the cache config, 
> so the test is actually placing both data and meta blocks in the bucket 
> cache. So sometimes, the test calls BucketCache.notifyFileCachingCompleted 
> after it has already evicted two blocks.
> Inside BucketCache.notifyFileCachingCompleted, we iterate through the 
> backingMap entry set, counting the number of blocks for the given file. Then, to 
> consider whether the file is fully cached or not, we do the following 
> validation:
> {noformat}
> if (dataBlockCount == count.getValue() || totalBlockCount == 
> count.getValue()) {
>   LOG.debug("File {} has now been fully cached.", fileName);
>   fileCacheCompleted(fileName, size);
> }  {noformat}
> But the test generates 57 total blocks, 55 data and 2 meta blocks. It evicts 
> two blocks and asserts that the file hasn't been considered fully cached. 
> When these evictions happen while prefetch is still going, we'll pass that 
> check, as the number of blocks for the file in the backingMap would still 
> be 55, which is what we pass as dataBlockCount.
> As BucketCache is intended for storing data blocks only, I believe we should 
> make sure BucketCache.notifyFileCachingCompleted only accounts for data 
> blocks. Also, the 
> TestBucketCachePersister.testPrefetchBlockEvictionWhilePrefetchRunning should 
> be updated to consistently reproduce the eviction concurrent to the prefetch. 
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28466) Integration of time-based priority logic of bucket cache in prefetch functionality of HBase.

2024-04-03 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28466:
-
Issue Type: Task  (was: New Feature)

> Integration of time-based priority logic of bucket cache in prefetch 
> functionality of HBase.
> 
>
> Key: HBASE-28466
> URL: https://issues.apache.org/jira/browse/HBASE-28466
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Reporter: Janardhan Hungund
>Priority: Major
>
> This Jira tracks the integration of the framework of APIs (implemented in 
> HBASE-28465) related to data tiering into the prefetch logic of HBase. The 
> implementation should filter out the cold data and enable the prefetching of 
> hot data into the bucket cache.
> Thanks,
> Janardhan
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28467) Integration of time-based priority caching into write and read code paths.

2024-04-03 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28467:
-
Issue Type: Task  (was: New Feature)

> Integration of time-based priority caching into write and read code paths.
> --
>
> Key: HBASE-28467
> URL: https://issues.apache.org/jira/browse/HBASE-28467
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Reporter: Janardhan Hungund
>Priority: Major
>
> This Jira tracks the integration of time-based caching framework APIs into 
> read and write code paths. These code paths are:
> 1. cache-on-read: where the data blocks are cached when they are read during 
> scans.
> 2. cache-on-writes: where the data blocks are cached when they are 
> created/written.
> There could be other code paths which need to be fixed.
> Thanks,
> Janardhan
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28469) Integration of time-based priority caching into compaction paths.

2024-04-03 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28469:
-
Issue Type: Task  (was: New Feature)

> Integration of time-based priority caching into compaction paths.
> -
>
> Key: HBASE-28469
> URL: https://issues.apache.org/jira/browse/HBASE-28469
> Project: HBase
>  Issue Type: Task
>Reporter: Janardhan Hungund
>Priority: Major
>
> The time-based priority caching is dependent on the date-tiered compaction 
> that structures store files in a date-based tiered layout. This Jira tracks the 
> changes needed for the integration of this compaction strategy with the 
> data-tiering to enable appropriate caching of hot data in the cache, while 
> the cold data can remain in cloud storage.
> Thanks,
> Janardhan



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28468) Integration of time-based priority caching logic into cache evictions.

2024-04-03 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28468:
-
Issue Type: Task  (was: New Feature)

> Integration of time-based priority caching logic into cache evictions.
> --
>
> Key: HBASE-28468
> URL: https://issues.apache.org/jira/browse/HBASE-28468
> Project: HBase
>  Issue Type: Task
>Reporter: Janardhan Hungund
>Priority: Major
>
> When the time-based priority caching is enabled, the block evictions 
> triggered when the cache is full should use the time-based priority caching 
> framework APIs to detect the cold files and evict the blocks of those files 
> first. This ensures that the hot data remains in cache while the cold data is 
> evicted from cache.
> Thanks,
> Janardhan



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28465) Implementation of framework for time-based priority bucket-cache.

2024-04-03 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28465:
-
Issue Type: Task  (was: New Feature)

> Implementation of framework for time-based priority bucket-cache.
> -
>
> Key: HBASE-28465
> URL: https://issues.apache.org/jira/browse/HBASE-28465
> Project: HBase
>  Issue Type: Task
>Reporter: Janardhan Hungund
>Assignee: Vinayak Hegde
>Priority: Major
>
> In this Jira, we track the implementation of framework for the time-based 
> priority cache.
> This framework would help us to get the required metadata of the HFiles and 
> helps us make the decision about the hotness or coldness of data.
> Thanks,
> Janardhan



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28458) BucketCache.notifyFileCachingCompleted may incorrectly consider a file fully cached

2024-04-02 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28458:
-
Fix Version/s: 4.0.0-alpha-1

> BucketCache.notifyFileCachingCompleted may incorrectly consider a file fully 
> cached
> ---
>
> Key: HBASE-28458
> URL: https://issues.apache.org/jira/browse/HBASE-28458
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.0-beta-1, 4.0.0-alpha-1, 2.7.0
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1
>
>
> Noticed that 
> TestBucketCachePersister.testPrefetchBlockEvictionWhilePrefetchRunning was 
> flakey, failing whenever the block eviction happened while prefetch was still 
> ongoing.
> In the test, we pass an instance of BucketCache directly to the cache config, 
> so the test is actually placing both data and meta blocks in the bucket 
> cache. So sometimes, the test calls BucketCache.notifyFileCachingCompleted 
> after it has already evicted two blocks.
> Inside BucketCache.notifyFileCachingCompleted, we iterate through the 
> backingMap entry set, counting the number of blocks for the given file. Then, to 
> consider whether the file is fully cached or not, we do the following 
> validation:
> {noformat}
> if (dataBlockCount == count.getValue() || totalBlockCount == 
> count.getValue()) {
>   LOG.debug("File {} has now been fully cached.", fileName);
>   fileCacheCompleted(fileName, size);
> }  {noformat}
> But the test generates 57 total blocks, 55 data and 2 meta blocks. It evicts 
> two blocks and asserts that the file hasn't been considered fully cached. 
> When these evictions happen while prefetch is still going, we'll pass that 
> check, as the number of blocks for the file in the backingMap would still 
> be 55, which is what we pass as dataBlockCount.
> As BucketCache is intended for storing data blocks only, I believe we should 
> make sure BucketCache.notifyFileCachingCompleted only accounts for data 
> blocks. Also, the 
> TestBucketCachePersister.testPrefetchBlockEvictionWhilePrefetchRunning should 
> be updated to consistently reproduce the eviction concurrent to the prefetch. 
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28458) BucketCache.notifyFileCachingCompleted may incorrectly consider a file fully cached

2024-04-02 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28458:
-
Affects Version/s: 3.0.0-beta-1
   2.6.0
   4.0.0-alpha-1
   2.7.0

> BucketCache.notifyFileCachingCompleted may incorrectly consider a file fully 
> cached
> ---
>
> Key: HBASE-28458
> URL: https://issues.apache.org/jira/browse/HBASE-28458
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.0-beta-1, 4.0.0-alpha-1, 2.7.0
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
>  Labels: pull-request-available
>
> Noticed that 
> TestBucketCachePersister.testPrefetchBlockEvictionWhilePrefetchRunning was 
> flakey, failing whenever the block eviction happened while prefetch was still 
> ongoing.
> In the test, we pass an instance of BucketCache directly to the cache config, 
> so the test is actually placing both data and meta blocks in the bucket 
> cache. So sometimes, the test call BucketCache.notifyFileCachingCompleted 
> after the it has already evicted two blocks.  
> Inside BucketCache.notifyFileCachingCompleted, we iterate through the 
> backingMap entry set, counting number of blocks for the given file. Then, to 
> consider whether the file is fully cached or not, we do the following 
> validation:
> {noformat}
> if (dataBlockCount == count.getValue() || totalBlockCount == 
> count.getValue()) {
>   LOG.debug("File {} has now been fully cached.", fileName);
>   fileCacheCompleted(fileName, size);
> }  {noformat}
> But the test generates 57 total blocks, 55 data and 2 meta blocks. It evicts 
> two blocks and asserts that the file hasn't been considered fully cached. 
> When these evictions happen while prefetch is still going, we'll pass that 
> check, as the the number of blocks for the file in the backingMap would still 
> be 55, which is what we pass as dataBlockCount.
> As BucketCache is intended for storing data blocks only, I believe we should 
> make sure BucketCache.notifyFileCachingCompleted only accounts for data 
> blocks. Also, the 
> TestBucketCachePersister.testPrefetchBlockEvictionWhilePrefetchRunning should 
> be updated to consistently reproduce the eviction concurrent to the prefetch. 
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28463) Time Based Priority for BucketCache

2024-03-28 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28463:
-
Description: 
This Jira introduces the feature of time-based data tiering in HBase to 
optimize storage efficiency and access performance by segregating data based on 
its recency. By keeping recent data in the bucket cache (backed by faster 
storage types like SSDs) and evicting older data, the system aims to provide a 
more flexible control over the cache allocation and eviction logic via 
configuration, allowing for defining time priorities for cached data. 

The need for a more extensive cache allocation mechanism becomes even more 
critical on HBase deployments where cache access translates into significant 
performance gains, such as when using cloud storage as the underlying file 
system.

The data is segregated into hot or cold categories based on its age. The recent 
data within a specific time range (configured as hot-data-age) is treated as 
hot and is stored in the cache, while the older data is stored and accessed 
from the file system.

This feature intends to provide TCO gains by optimizing the utilization of 
the high-cost bucket cache. It is a perfect fit for use cases with date-based 
data writes where scans focus on recently written data.

Please find the detailed design document of the feature attached with the Jira.

Thanks,

Janardhan

  was:
This Jira introduces the feature of time-based data tiering in HBase to 
optimize storage efficiency and access performance by segregating data based on 
its recency. By keeping recent data in the bucket cache (backed by faster 
storage types like SSDs) and evicting older data, the system aims to provide a 
more flexible control over the cache allocation and eviction logic via 
configuration, allowing for defining time priorities for cached data. 

The need for a more extensive cache allocation mechanism becomes even more 
critical on HBase deployments where cache access reflects on significant 
performance gains, such as when using cloud storage as the underlying file 
system.

The data is segregated into hot or cold categories based on its age. The recent 
data within a specific time range (configured as hot-data-age) is treated as 
hot and is stored in the ephemeral cache, while the older data is stored and 
accessed from the cloud storage.

This feature intends to provide the TCO gains by optimizing the utilization of 
high cost bucket cache. Perfect fit for the use cases that have the date-based 
data writes while the scans focus on the recently written data.

Please find the detailed design document of the feature attached with the Jira.



Thanks,

Janardhan


> Time Based Priority for BucketCache
> ---
>
> Key: HBASE-28463
> URL: https://issues.apache.org/jira/browse/HBASE-28463
> Project: HBase
>  Issue Type: New Feature
>  Components: BucketCache
>Reporter: Janardhan Hungund
>Assignee: Rahul Agarkar
>Priority: Major
>
> This Jira introduces the feature of time-based data tiering in HBase to 
> optimize storage efficiency and access performance by segregating data based 
> on its recency. By keeping recent data in the bucket cache (backed by faster 
> storage types like SSDs) and evicting older data, the system aims to provide 
> a more flexible control over the cache allocation and eviction logic via 
> configuration, allowing for defining time priorities for cached data. 
> The need for a more extensive cache allocation mechanism becomes even more 
> critical on HBase deployments where cache access translates into significant 
> performance gains, such as when using cloud storage as the underlying file 
> system.
> The data is segregated into hot or cold categories based on its age. The 
> recent data within a specific time range (configured as hot-data-age) is 
> treated as hot and is stored in the cache, while the older data is stored and 
> accessed from the file system.
> This feature intends to provide TCO gains by optimizing the utilization 
> of the high-cost bucket cache. It is a perfect fit for use cases with 
> date-based data writes where scans focus on recently written data.
> Please find the detailed design document of the feature attached with the 
> Jira.
> Thanks,
> Janardhan
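
For illustration of the hot/cold decision described above, here is a minimal, self-contained sketch assuming a hypothetical hot-data-age threshold expressed in milliseconds; the class and method names are placeholders, not the actual feature API:

{noformat}
// Illustrative only: a minimal hot/cold check driven by a configured
// "hot data age"; this is not the actual HBASE-28463 implementation.
public class HotDataCheck {
  // hypothetical threshold: maximum age (millis) for data to be considered hot
  private final long hotDataAgeMillis;

  public HotDataCheck(long hotDataAgeMillis) {
    this.hotDataAgeMillis = hotDataAgeMillis;
  }

  // maxTimestamp would come from the HFile metadata (e.g. the max cell timestamp)
  public boolean isHot(long maxTimestamp, long now) {
    return (now - maxTimestamp) <= hotDataAgeMillis;
  }

  public static void main(String[] args) {
    HotDataCheck check = new HotDataCheck(7L * 24 * 60 * 60 * 1000); // 7 days
    long now = System.currentTimeMillis();
    System.out.println(check.isHot(now - 3_600_000L, now));                // recent file: true
    System.out.println(check.isHot(now - 30L * 24 * 60 * 60 * 1000, now)); // old file: false
  }
}
{noformat}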



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28450) BuckeCache.evictBlocksByHfileName won't work after a cache recovery from file

2024-03-28 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-28450.
--
Resolution: Fixed

Merged to master, branch-3, branch-2 and branch-2.6. Thanks for the reviews, 
[~psomogyi]  [~ankit.jhil]!

> BuckeCache.evictBlocksByHfileName won't work after a cache recovery from file
> -
>
> Key: HBASE-28450
> URL: https://issues.apache.org/jira/browse/HBASE-28450
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.0-beta-1, 4.0.0-alpha-1, 2.7.0
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2, 2.6.1
>
>
> HBASE-27313, HBASE-27686 and HBASE-27743 have extended BucketCache persistent 
> cache capabilities to make it resilient to RS crashes or non graceful stops, 
> when using file based ioengine for BucketCache.
> BucketCache maintains two main collections for tracking blocks in the cache: 
> backingMap and blocksByHFile. The former is used as the main index of blocks 
> for the actual cache, whilst the latter is a set of all blocks in the cache 
> ordered by name, in order to conveniently and efficiently retrieve the list 
> of all blocks from a single file in the BucketCache.evictBlocksByHfile method.
>  
> The problem is that at cache recovery time, we are not populating the 
> blocksByHFile set, which causes any calls to the BucketCache.evictBlocksByHfile 
> method to not evict any blocks once we have recovered the cache from the 
> cache persistence file (for instance, after an RS restart).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28450) BuckeCache.evictBlocksByHfileName won't work after a cache recovery from file

2024-03-27 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28450:
-
Fix Version/s: 2.6.1

> BuckeCache.evictBlocksByHfileName won't work after a cache recovery from file
> -
>
> Key: HBASE-28450
> URL: https://issues.apache.org/jira/browse/HBASE-28450
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.0-beta-1, 4.0.0-alpha-1, 2.7.0
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2, 2.6.1
>
>
> HBASE-27313, HBASE-27686 and HBASE-27743 have extended BucketCache persistent 
> cache capabilities to make it resilient to RS crashes or non graceful stops, 
> when using file based ioengine for BucketCache.
> BucketCache maintains two main collections for tracking blocks in the cache: 
> backingMap and blocksByHFile. The former is used as the main index of blocks 
> for the actual cache, whilst the latter is a set of all blocks in the cache 
> ordered by name, in order to conveniently and efficiently retrieve the list 
> of all blocks from a single file in the BucketCache.evictBlocksByHfile method.
>  
> The problem is that at cache recovery time, we are not populating the 
> blocksByHFile set, which causes any calls to the BucketCache.evictBlocksByHfile 
> method to not evict any blocks once we have recovered the cache from the 
> cache persistence file (for instance, after an RS restart).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28450) BuckeCache.evictBlocksByHfileName won't work after a cache recovery from file

2024-03-27 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28450:
-
Affects Version/s: 2.6.0

> BuckeCache.evictBlocksByHfileName won't work after a cache recovery from file
> -
>
> Key: HBASE-28450
> URL: https://issues.apache.org/jira/browse/HBASE-28450
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.0-beta-1, 4.0.0-alpha-1, 2.7.0
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2
>
>
> HBASE-27313, HBASE-27686 and HBASE-27743 have extended BucketCache persistent 
> cache capabilities to make it resilient to RS crashes or non graceful stops, 
> when using file based ioengine for BucketCache.
> BucketCache maintains two main collections for tracking blocks in the cache: 
> backingMap and blocksByHFile. The former is used as the main index of blocks 
> for the actual cache, whilst the latter is a set of all blocks in the cache 
> ordered by name, in order to conveniently and efficiently retrieve the list 
> of all blocks from a single file in the BucketCache.evictBlocksByHfile method.
>  
> The problem is that at cache recovery time, we are not populating the 
> blocksByHFile set, which causes any calls to the BucketCache.evictBlocksByHfile 
> method to not evict any blocks once we have recovered the cache from the 
> cache persistence file (for instance, after an RS restart).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28450) BuckeCache.evictBlocksByHfileName won't work after a cache recovery from file

2024-03-27 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28450:
-
Fix Version/s: 4.0.0-alpha-1
   2.7.0
   3.0.0-beta-2

> BuckeCache.evictBlocksByHfileName won't work after a cache recovery from file
> -
>
> Key: HBASE-28450
> URL: https://issues.apache.org/jira/browse/HBASE-28450
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta-1, 4.0.0-alpha-1, 2.7.0
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2
>
>
> HBASE-27313, HBASE-27686 and HBASE-27743 have extended BucketCache persistent 
> cache capabilities to make it resilient to RS crashes or non graceful stops, 
> when using file based ioengine for BucketCache.
> BucketCache maintains two main collections for tracking blocks in the cache: 
> backingMap and blocksByHFile. The former is used as the main index of blocks 
> for the actual cache, whilst the latter is a set of all blocks in the cache 
> ordered by name, in order to conveniently and efficiently retrieve the list 
> of all blocks from a single file in the BucketCache.evictBlocksByHfile method.
>  
> The problem is that at cache recovery time, we are not populating the 
> blocksByHFile set, which causes any calls to the BucketCache.evictBlocksByHfile 
> method to not evict any blocks once we have recovered the cache from the 
> cache persistence file (for instance, after an RS restart).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28450) BuckeCache.evictBlocksByHfileName won't work after a cache recovery from file

2024-03-27 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28450:
-
Affects Version/s: 3.0.0-beta-1
   4.0.0-alpha-1
   2.7.0

> BuckeCache.evictBlocksByHfileName won't work after a cache recovery from file
> -
>
> Key: HBASE-28450
> URL: https://issues.apache.org/jira/browse/HBASE-28450
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta-1, 4.0.0-alpha-1, 2.7.0
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
>  Labels: pull-request-available
>
> HBASE-27313, HBASE-27686 and HBASE-27743 have extended BucketCache persistent 
> cache capabilities to make it resilient to RS crashes or non graceful stops, 
> when using file based ioengine for BucketCache.
> BucketCache maintains two main collections for tracking blocks in the cache: 
> backingMap and blocksByHFile. The former is used as the main index of blocks 
> for the actual cache, whilst the latter is a set of all blocks in the cache 
> ordered by name, in order to conveniently and efficiently retrieve the list 
> of all blocks from a single file in the BucketCache.evictBlocksByHfile method.
>  
> The problem is that at cache recovery time, we are not populating the 
> blocksByHFile set, which causes any calls to the BucketCache.evictBlocksByHfile 
> method to not evict any blocks once we have recovered the cache from the 
> cache persistence file (for instance, after an RS restart).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28458) BucketCache.notifyFileCachingCompleted may incorrectly consider a file fully cached

2024-03-26 Thread Wellington Chevreuil (Jira)
Wellington Chevreuil created HBASE-28458:


 Summary: BucketCache.notifyFileCachingCompleted may incorrectly 
consider a file fully cached
 Key: HBASE-28458
 URL: https://issues.apache.org/jira/browse/HBASE-28458
 Project: HBase
  Issue Type: Bug
Reporter: Wellington Chevreuil
Assignee: Wellington Chevreuil


Noticed that 
TestBucketCachePersister.testPrefetchBlockEvictionWhilePrefetchRunning was 
flakey, failing whenever the block eviction happened while prefetch was still 
ongoing.

In the test, we pass an instance of BucketCache directly to the cache config, 
so the test is actually placing both data and meta blocks in the bucket cache. 
So sometimes, the test calls BucketCache.notifyFileCachingCompleted after it 
has already evicted two blocks.

Inside BucketCache.notifyFileCachingCompleted, we iterate through the 
backingMap entry set, counting the number of blocks for the given file. Then, to 
consider whether the file is fully cached or not, we do the following 
validation:
{noformat}
if (dataBlockCount == count.getValue() || totalBlockCount == count.getValue()) {
  LOG.debug("File {} has now been fully cached.", fileName);
  fileCacheCompleted(fileName, size);
}  {noformat}
But the test generates 57 total blocks, 55 data and 2 meta blocks. It evicts 
two blocks and asserts that the file hasn't been considered fully cached. When 
these evictions happen while prefetch is still going, we'll pass that check, as 
the number of blocks for the file in the backingMap would still be 55, 
which is what we pass as dataBlockCount.

As BucketCache is intended for storing data blocks only, I believe we should 
make sure BucketCache.notifyFileCachingCompleted only accounts for data blocks. 
Also, the 
TestBucketCachePersister.testPrefetchBlockEvictionWhilePrefetchRunning should 
be updated to consistently reproduce the eviction concurrent to the prefetch. 
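
To make the proposed fix concrete, here is a minimal sketch of counting only data blocks before declaring a file fully cached. The types below are simplified stand-ins, not the real BucketCache internals:

{noformat}
import java.util.Map;

// Illustrative sketch of the idea: META blocks are ignored, so evicting a meta
// block no longer lets the data-block count accidentally satisfy the check.
public class FullyCachedCheck {
  enum BlockType { DATA, META }

  static class BlockKey {
    final String hfileName;
    final BlockType type;
    BlockKey(String hfileName, BlockType type) {
      this.hfileName = hfileName;
      this.type = type;
    }
  }

  static boolean isFullyCached(Map<BlockKey, Long> backingMap,
                               String fileName, long expectedDataBlockCount) {
    long dataBlocksSeen = 0;
    for (BlockKey key : backingMap.keySet()) {
      // count only DATA blocks belonging to this file
      if (key.hfileName.equals(fileName) && key.type == BlockType.DATA) {
        dataBlocksSeen++;
      }
    }
    return dataBlocksSeen == expectedDataBlockCount;
  }
}
{noformat}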

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28450) BuckeCache.evictBlocksByHfileName won't work after a cache recovery from file

2024-03-20 Thread Wellington Chevreuil (Jira)
Wellington Chevreuil created HBASE-28450:


 Summary: BuckeCache.evictBlocksByHfileName won't work after a 
cache recovery from file
 Key: HBASE-28450
 URL: https://issues.apache.org/jira/browse/HBASE-28450
 Project: HBase
  Issue Type: Bug
Reporter: Wellington Chevreuil
Assignee: Wellington Chevreuil


HBASE-27313, HBASE-27686 and HBASE-27743 have extended BucketCache persistent 
cache capabilities to make it resilient to RS crashes or non graceful stops, 
when using file based ioengine for BucketCache.

BucketCache maintains two main collections for tracking blocks in the cache: 
backingMap and blocksByHFile. The former is used as the main index of blocks 
for the actual cache, whilst the latter is a set of all blocks in the cache 
ordered by name, in order to conveniently and efficiently retrieve the list of 
all blocks from a single file in the BucketCache.evictBlocksByHfile method.

 

The problem is that at cache recovery time, we are not populating the blocksByHFile 
set, which causes any calls to the BucketCache.evictBlocksByHfile method to not 
evict any blocks once we have recovered the cache from the cache persistence 
file (for instance, after an RS restart).
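
For illustration, a simplified sketch of the missing step, using stand-in collections rather than the real BucketCache fields: when the backingMap is rebuilt from the persistence file, the per-file block index has to be repopulated as well, otherwise evictBlocksByHfileName finds nothing to evict after a restart:

{noformat}
import java.util.Map;
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListSet;

// Illustrative only; not the actual HBASE-28450 patch.
public class CacheRecoverySketch {
  // simplified stand-ins for BucketCache's backingMap and blocksByHFile
  private final Map<String, Long> backingMap = new ConcurrentHashMap<>();
  private final NavigableSet<String> blocksByHFile = new ConcurrentSkipListSet<>();

  void recoverFromPersistedIndex(Map<String, Long> persistedEntries) {
    for (Map.Entry<String, Long> entry : persistedEntries.entrySet()) {
      backingMap.put(entry.getKey(), entry.getValue());
      // the missing step: keep the per-file block index in sync with the backingMap
      blocksByHFile.add(entry.getKey());
    }
  }
}
{noformat}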



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28447) New configuration to override the hfile specific blocksize

2024-03-20 Thread Wellington Chevreuil (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828954#comment-17828954
 ] 

Wellington Chevreuil commented on HBASE-28447:
--

[~bbeaudreault] , I think [~gourab.taparia] means a global config property that 
would go on hbase-site.xml. IIRC, currently we just make it configurable at 
table or CF level, meaning if you want to change it for all your schema, you 
need to update that individually for all tables/CFs. Is that right, 
[~gourab.taparia] ?

> New configuration to override the hfile specific blocksize
> --
>
> Key: HBASE-28447
> URL: https://issues.apache.org/jira/browse/HBASE-28447
> Project: HBase
>  Issue Type: Improvement
>Reporter: Gourab Taparia
>Assignee: Gourab Taparia
>Priority: Minor
>
> Right now there is no config attached to the HFile block size by which we can 
> override the default. The default is set to 64 KB in 
> HConstants.DEFAULT_BLOCKSIZE . We need a new config which can control this 
> value.
> Since the BLOCKSIZE is tracked at the column family level - we will need to 
> respect the CFD value first. Configuration settings are also something 
> that can be set in the schema, at the column or table level, and will override 
> the relevant values from the site file. Below is the precedence order we can 
> use to get the final blocksize value :
> {code:java}
> ColumnFamilyDescriptor.BLOCKSIZE > schema level site configuration overrides 
> > site configuration > HConstants.DEFAULT_BLOCKSIZE{code}
> PS: There is one related config “hbase.mapreduce.hfileoutputformat.blocksize” 
> however that is specific to map-reduce jobs.
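
For illustration only, a small sketch of how that precedence order could be resolved; the structure and parameter names are assumptions, not the final change:

{noformat}
// Illustrative only: resolve the effective block size following the precedence
// ColumnFamilyDescriptor.BLOCKSIZE > schema-level override > site config > default.
public class BlockSizeResolver {
  static final int DEFAULT_BLOCKSIZE = 64 * 1024; // HConstants.DEFAULT_BLOCKSIZE

  // cfBlockSize: value from the ColumnFamilyDescriptor, or null if unset
  // schemaOverride: table/CF-level configuration override, or null if unset
  // siteConfigValue: value from hbase-site.xml, or null if not configured
  static int resolveBlockSize(Integer cfBlockSize, Integer schemaOverride,
                              Integer siteConfigValue) {
    if (cfBlockSize != null) {
      return cfBlockSize;        // explicit BLOCKSIZE on the column family wins
    }
    if (schemaOverride != null) {
      return schemaOverride;     // schema-level configuration override
    }
    if (siteConfigValue != null) {
      return siteConfigValue;    // global site configuration
    }
    return DEFAULT_BLOCKSIZE;    // hard-coded default
  }
}
{noformat}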



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-27826) Region split and merge time while offline is O(n) with respect to number of store files

2024-03-13 Thread Wellington Chevreuil (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17826794#comment-17826794
 ] 

Wellington Chevreuil commented on HBASE-27826:
--

{quote}

Like createLink(), deleteLink(), createReference(), deleteReference(), and so 
on. The SFT becomes responsible for listing the link and reference files among 
the store contents. Today we sometimes go directly to the filesystem for 
listing stores, still. We do direct filesystem access for making and 
discovering link and reference files. This is wrong. SFT should be the 
exclusive way we track and discover store contents.

{quote}

Yeah, I underestimated what splitFile meant; I was thinking of it merely as creating 
refs on each of the daughter regions. I agree, we should make SFT the central point 
for these FS interactions.

> Region split and merge time while offline is O(n) with respect to number of 
> store files
> ---
>
> Key: HBASE-27826
> URL: https://issues.apache.org/jira/browse/HBASE-27826
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.5.4
>Reporter: Andrew Kyle Purtell
>Priority: Major
>
> This is a significant availability issue when HFiles are on S3.
> HBASE-26079 ({_}Use StoreFileTracker when splitting and merging{_}) changed 
> the split and merge table procedure implementations to indirect through the 
> StoreFileTracker implementation when selecting HFiles to be merged or split, 
> rather than directly listing those using file system APIs. It also changed 
> the commit logic in HRegionFileSystem to add the link/ref files on resulting 
> split or merged regions to the StoreFileTracker. However, the creation of a 
> link file is still a filesystem operation and creating a “file” on S3 can 
> take well over a second. If, for example there are 20 store files in a 
> region, which is not uncommon, after the region is taken offline for a split 
> (or merge) it may require more than 20 seconds to create the link files 
> before the results can be brought back online, creating a severe availability 
> problem. Splits and merges are supposed to be fast, completing in less than a 
> second, certainly less than a few seconds. This has been true when HFiles are 
> stored on HDFS only because file creation operations there are nearly 
> instantaneous. 
> There are two issues but both can be handled with modifications to the store 
> file tracker interface and the file based store file tracker implementation. 
> When the file based store file tracker is enabled the HFile links should 
> be virtual entities that only exist in the file manifest. We do not require 
> physical files in the filesystem to serve as links now. That is the magic of 
> this file tracker: the manifest file replaces the need to list the 
> filesystem.
> Then, when splitting or merging, the HFile links should be collected into a 
> list and committed in one batch using a new FILE file tracker interface, 
> requiring only one update of the manifest file in S3, bringing the time 
> requirement for this operation to O(1) down from O(n).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-27826) Region split and merge time while offline is O(n) with respect to number of store files

2024-03-13 Thread Wellington Chevreuil (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17826761#comment-17826761
 ] 

Wellington Chevreuil commented on HBASE-27826:
--

Thanks for the heads-up, [~zhangduo], and for picking this up, [~prathyu6]!

 

Summarising my understanding from the discussion:

1) We will define a splitFiles method in the StoreFileTracker interface, so that 
everywhere we currently implement split logic (like in SplitTableRegionProcedure) 
we would now delegate to the StoreFileTracker implementation;

2) The DefaultStoreFileTracker implementation would still create actual ref/link 
files in the split daughter regions, whilst the FileBasedTracker impl would keep 
the link/ref in its metadata only.

3) We would need to change the format of the meta files of FileBasedTracker to 
include the parent region location for the split daughters' "inherited" files.

 

Seems reasonable to me, looking forward to the design doc/initial PR.
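
For illustration, a rough sketch of what such an SFT-level split hook could look like; the method name and signature are assumptions, not the agreed design:

{noformat}
import java.util.List;

// Illustrative only. A file-based tracker could record the new refs in its
// manifest in one metadata update, while the default tracker would still create
// the physical link/ref files on the filesystem.
interface StoreFileTrackerSketch {
  void splitFiles(String parentRegion, String daughterRegion,
                  List<String> parentStoreFiles);
}
{noformat}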

 

 

 

> Region split and merge time while offline is O(n) with respect to number of 
> store files
> ---
>
> Key: HBASE-27826
> URL: https://issues.apache.org/jira/browse/HBASE-27826
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.5.4
>Reporter: Andrew Kyle Purtell
>Priority: Major
>
> This is a significant availability issue when HFiles are on S3.
> HBASE-26079 ({_}Use StoreFileTracker when splitting and merging{_}) changed 
> the split and merge table procedure implementations to indirect through the 
> StoreFileTracker implementation when selecting HFiles to be merged or split, 
> rather than directly listing those using file system APIs. It also changed 
> the commit logic in HRegionFileSystem to add the link/ref files on resulting 
> split or merged regions to the StoreFileTracker. However, the creation of a 
> link file is still a filesystem operation and creating a “file” on S3 can 
> take well over a second. If, for example there are 20 store files in a 
> region, which is not uncommon, after the region is taken offline for a split 
> (or merge) it may require more than 20 seconds to create the link files 
> before the results can be brought back online, creating a severe availability 
> problem. Splits and merges are supposed to be fast, completing in less than a 
> second, certainly less than a few seconds. This has been true when HFiles are 
> stored on HDFS only because file creation operations there are nearly 
> instantaneous. 
> There are two issues but both can be handled with modifications to the store 
> file tracker interface and the file based store file tracker implementation. 
> When the file based store file tracker is enabled the HFile links should 
> be virtual entities that only exist in the file manifest. We do not require 
> physical files in the filesystem to serve as links now. That is the magic of 
> this file tracker: the manifest file replaces the need to list the 
> filesystem.
> Then, when splitting or merging, the HFile links should be collected into a 
> list and committed in one batch using a new FILE file tracker interface, 
> requiring only one update of the manifest file in S3, bringing the time 
> requirement for this operation to O(1) down from O(n).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28364) Warn: Cache key had block type null, but was found in L1 cache

2024-02-16 Thread Wellington Chevreuil (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817888#comment-17817888
 ] 

Wellington Chevreuil commented on HBASE-28364:
--

{quote}Do you think it’s an actually problem we should track down?
{quote}
Yeah, I had verified that already; it's not actually an issue. When we are 
reading blocks sequentially, we have no way to know the actual next 
block type, so we pass null to the blockType param 
[here|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderImpl.java#L758].
 So it's expected; that logging should be removed. I'm gonna revert it.

> Warn: Cache key had block type null, but was found in L1 cache
> --
>
> Key: HBASE-28364
> URL: https://issues.apache.org/jira/browse/HBASE-28364
> Project: HBase
>  Issue Type: Bug
>Reporter: Bryan Beaudreault
>Priority: Major
>
> I'm ITBLL testing branch-2.6 and am seeing lots of these warns. This is new 
> to me. I would expect a warn to be on the rare side or be indicative of a 
> problem, but unclear from the code.
> cc [~wchevreuil] 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28364) Warn: Cache key had block type null, but was found in L1 cache

2024-02-13 Thread Wellington Chevreuil (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817178#comment-17817178
 ] 

Wellington Chevreuil commented on HBASE-28364:
--

Indeed, I have added that logging. I thought this would be less common than it 
apparently is. Let me adjust the log level to DEBUG.

> Warn: Cache key had block type null, but was found in L1 cache
> --
>
> Key: HBASE-28364
> URL: https://issues.apache.org/jira/browse/HBASE-28364
> Project: HBase
>  Issue Type: Bug
>Reporter: Bryan Beaudreault
>Priority: Major
>
> I'm ITBLL testing branch-2.6 and am seeing lots of these warns. This is new 
> to me. I would expect a warn to be on the rare side or be indicative of a 
> problem, but unclear from the code.
> cc [~wchevreuil] 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28337) Positive connection test in TestShadeSaslAuthenticationProvider runs with Kerberos instead of Shade authentication

2024-02-08 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28337:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Merged into master, branch-3, branch-2, branch-2.6, branch-2.5 and branch-2.4. 
Thanks for the contribution, [~andor] !

> Positive connection test in TestShadeSaslAuthenticationProvider runs with 
> Kerberos instead of Shade authentication
> --
>
> Key: HBASE-28337
> URL: https://issues.apache.org/jira/browse/HBASE-28337
> Project: HBase
>  Issue Type: Test
>Affects Versions: 2.6.0, 2.4.17, 3.0.0-beta-1, 2.5.7, 2.7.0
>Reporter: Andor Molnar
>Assignee: Andor Molnar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.0, 2.4.18, 4.0.0-alpha-1, 2.7.0, 2.5.8, 3.0.0-beta-2
>
>
> The positive test (testPositiveAuthentication) in 
> TestShadeSaslAuthenticationProvider doesn't create a new user in 
> user1.doAs(), so it will use the already Kerberos authenticated user instead 
> of re-authenticating with the token. 
> As a consequence it doesn't reveal a problem introduced with HBASE-23881 
> which will cause clients to timeout if authenticated with a SASL mech which 
> doesn't create a reply token in case of successful authentication.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28337) Positive connection test in TestShadeSaslAuthenticationProvider runs with Kerberos instead of Shade authentication

2024-02-08 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28337:
-
Fix Version/s: 2.4.18

> Positive connection test in TestShadeSaslAuthenticationProvider runs with 
> Kerberos instead of Shade authentication
> --
>
> Key: HBASE-28337
> URL: https://issues.apache.org/jira/browse/HBASE-28337
> Project: HBase
>  Issue Type: Test
>Affects Versions: 2.6.0, 2.4.17, 3.0.0-beta-1, 2.5.7, 2.7.0
>Reporter: Andor Molnar
>Assignee: Andor Molnar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.0, 2.4.18, 4.0.0-alpha-1, 2.7.0, 2.5.8, 3.0.0-beta-2
>
>
> The positive test (testPositiveAuthentication) in 
> TestShadeSaslAuthenticationProvider doesn't create a new user in 
> user1.doAs(), so it will use the already Kerberos authenticated user instead 
> of re-authenticating with the token. 
> As a consequence it doesn't reveal a problem introduced with HBASE-23881 
> which will cause clients to timeout if authenticated with a SASL mech which 
> doesn't create a reply token in case of successful authentication.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28337) Positive connection test in TestShadeSaslAuthenticationProvider runs with Kerberos instead of Shade authentication

2024-02-08 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28337:
-
Fix Version/s: 2.6.0
   2.5.8

> Positive connection test in TestShadeSaslAuthenticationProvider runs with 
> Kerberos instead of Shade authentication
> --
>
> Key: HBASE-28337
> URL: https://issues.apache.org/jira/browse/HBASE-28337
> Project: HBase
>  Issue Type: Test
>Affects Versions: 2.6.0, 2.4.17, 3.0.0-beta-1, 2.5.7, 2.7.0
>Reporter: Andor Molnar
>Assignee: Andor Molnar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.0, 4.0.0-alpha-1, 2.7.0, 2.5.8, 3.0.0-beta-2
>
>
> The positive test (testPositiveAuthentication) in 
> TestShadeSaslAuthenticationProvider doesn't create a new user in 
> user1.doAs(), so it will use the already Kerberos authenticated user instead 
> of re-authenticating with the token. 
> As a consequence it doesn't reveal a problem introduced with HBASE-23881 
> which will cause clients to timeout if authenticated with a SASL mech which 
> doesn't create a reply token in case of successful authentication.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28337) Positive connection test in TestShadeSaslAuthenticationProvider runs with Kerberos instead of Shade authentication

2024-02-08 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28337:
-
Fix Version/s: 2.7.0

> Positive connection test in TestShadeSaslAuthenticationProvider runs with 
> Kerberos instead of Shade authentication
> --
>
> Key: HBASE-28337
> URL: https://issues.apache.org/jira/browse/HBASE-28337
> Project: HBase
>  Issue Type: Test
>Affects Versions: 2.6.0, 2.4.17, 3.0.0-beta-1, 2.5.7, 2.7.0
>Reporter: Andor Molnar
>Assignee: Andor Molnar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2
>
>
> The positive test (testPositiveAuthentication) in 
> TestShadeSaslAuthenticationProvider doesn't create a new user in 
> user1.doAs(), so it will use the already Kerberos authenticated user instead 
> of re-authenticating with the token. 
> As a consequence it doesn't reveal a problem introduced with HBASE-23881 
> which will cause clients to timeout if authenticated with a SASL mech which 
> doesn't create a reply token in case of successful authentication.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28337) Positive connection test in TestShadeSaslAuthenticationProvider runs with Kerberos instead of Shade authentication

2024-02-08 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28337:
-
Fix Version/s: 3.0.0-beta-2

> Positive connection test in TestShadeSaslAuthenticationProvider runs with 
> Kerberos instead of Shade authentication
> --
>
> Key: HBASE-28337
> URL: https://issues.apache.org/jira/browse/HBASE-28337
> Project: HBase
>  Issue Type: Test
>Affects Versions: 2.6.0, 2.4.17, 3.0.0-beta-1, 2.5.7, 2.7.0
>Reporter: Andor Molnar
>Assignee: Andor Molnar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1, 3.0.0-beta-2
>
>
> The positive test (testPositiveAuthentication) in 
> TestShadeSaslAuthenticationProvider doesn't create a new user in 
> user1.doAs(), so it will use the already Kerberos authenticated user instead 
> of re-authenticating with the token. 
> As a consequence it doesn't reveal a problem introduced with HBASE-23881 
> which will cause clients to timeout if authenticated with a SASL mech which 
> doesn't create a reply token in case of successful authentication.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28337) Positive connection test in TestShadeSaslAuthenticationProvider runs with Kerberos instead of Shade authentication

2024-02-08 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28337:
-
Fix Version/s: 4.0.0-alpha-1

> Positive connection test in TestShadeSaslAuthenticationProvider runs with 
> Kerberos instead of Shade authentication
> --
>
> Key: HBASE-28337
> URL: https://issues.apache.org/jira/browse/HBASE-28337
> Project: HBase
>  Issue Type: Test
>Affects Versions: 2.6.0, 2.4.17, 3.0.0-beta-1, 2.5.7, 2.7.0
>Reporter: Andor Molnar
>Assignee: Andor Molnar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1
>
>
> The positive test (testPositiveAuthentication) in 
> TestShadeSaslAuthenticationProvider doesn't create a new user in 
> user1.doAs(), so it will use the already Kerberos authenticated user instead 
> of re-authenticating with the token. 
> As a consequence it doesn't reveal a problem introduced with HBASE-23881 
> which will cause clients to timeout if authenticated with a SASL mech which 
> doesn't create a reply token in case of successful authentication.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28303) Interrupt cache prefetch thread when a heap usage threshold is reached

2024-02-06 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-28303.
--
Resolution: Fixed

Merged into master, branch-3 and branch-2.

> Interrupt cache prefetch thread when a heap usage threshold is reached
> --
>
> Key: HBASE-28303
> URL: https://issues.apache.org/jira/browse/HBASE-28303
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.6.0, 2.4.17, 3.0.0-beta-1, 4.0.0-alpha-1, 2.5.7, 2.7.0
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Fix For: 3.0.0, 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2
>
>
> Mostly critical when using non heap cache implementations, such as offheap or 
> file based. If the cache medium is too large and there are many blocks to be 
> cached, it may create a lot of cache index objects in the RegionServer heap. 
> We should have guardrails to prevent caching from exhausting available 
> heap.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28303) Interrupt cache prefetch thread when a heap usage threshold is reached

2024-02-06 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28303:
-
Fix Version/s: 2.7.0

> Interrupt cache prefetch thread when a heap usage threshold is reached
> --
>
> Key: HBASE-28303
> URL: https://issues.apache.org/jira/browse/HBASE-28303
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.6.0, 2.4.17, 3.0.0-beta-1, 4.0.0-alpha-1, 2.5.7, 2.7.0
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Fix For: 3.0.0, 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2
>
>
> Mostly critical when using non heap cache implementations, such as offheap or 
> file based. If the cache medium is too large and there are many blocks to be 
> cached, it may create a lot of cache index objects in the RegionServer heap. 
> We should have guardrails to prevent caching from exhausting available 
> heap.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28303) Interrupt cache prefetch thread when a heap usage threshold is reached

2024-01-30 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28303:
-
Fix Version/s: 3.0.0

> Interrupt cache prefetch thread when a heap usage threshold is reached
> --
>
> Key: HBASE-28303
> URL: https://issues.apache.org/jira/browse/HBASE-28303
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.6.0, 2.4.17, 3.0.0-beta-1, 4.0.0-alpha-1, 2.5.7, 2.7.0
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Fix For: 3.0.0, 4.0.0-alpha-1
>
>
> Mostly critical when using non heap cache implementations, such as offheap or 
> file based. If the cache medium is too large and there are many blocks to be 
> cached, it may create a lot of cache index objects in the RegionServer heap. 
> We should have guardrails to prevent caching from exhausting available 
> heap.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28303) Interrupt cache prefetch thread when a heap usage threshold is reached

2024-01-30 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28303:
-
Fix Version/s: 3.0.0-beta-2

> Interrupt cache prefetch thread when a heap usage threshold is reached
> --
>
> Key: HBASE-28303
> URL: https://issues.apache.org/jira/browse/HBASE-28303
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.6.0, 2.4.17, 3.0.0-beta-1, 4.0.0-alpha-1, 2.5.7, 2.7.0
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Fix For: 3.0.0, 4.0.0-alpha-1, 3.0.0-beta-2
>
>
> Mostly critical when using non-heap cache implementations, such as offheap or 
> file-based ones. If the cache medium is too large and there are many blocks to 
> be cached, it may create a lot of cache index objects in the RegionServer 
> heap. We should have guardrails to prevent caching from exhausting the 
> available heap.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28303) Interrupt cache prefetch thread when a heap usage threshold is reached

2024-01-26 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28303:
-
Release Note: 
This change adds a new configuration property, "hbase.rs.prefetchheapusage", 
which specifies a heap usage limit below which the prefetch threads are allowed 
to execute. This is meant to prevent excessive heap usage by bucket cache 
mapping objects when using the file or offheap engines.

The value should be defined as a double representing the heap usage 
percentage; its default is 1d (100% of the heap), which disables the feature. 
For example, to set an 80% heap usage threshold, set 
"hbase.rs.prefetchheapusage" to "0.8" in the RegionServer hbase-site.xml 
configuration.

> Interrupt cache prefetch thread when a heap usage threshold is reached
> --
>
> Key: HBASE-28303
> URL: https://issues.apache.org/jira/browse/HBASE-28303
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.6.0, 2.4.17, 3.0.0-beta-1, 4.0.0-alpha-1, 2.5.7, 2.7.0
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Fix For: 4.0.0-alpha-1
>
>
> Mostly critical when using non-heap cache implementations, such as offheap or 
> file-based ones. If the cache medium is too large and there are many blocks to 
> be cached, it may create a lot of cache index objects in the RegionServer 
> heap. We should have guardrails to prevent caching from exhausting the 
> available heap.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28303) Interrupt cache prefetch thread when a heap usage threshold is reached

2024-01-26 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28303:
-
Fix Version/s: 4.0.0-alpha-1

> Interrupt cache prefetch thread when a heap usage threshold is reached
> --
>
> Key: HBASE-28303
> URL: https://issues.apache.org/jira/browse/HBASE-28303
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.6.0, 2.4.17, 3.0.0-beta-1, 4.0.0-alpha-1, 2.5.7, 2.7.0
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Fix For: 4.0.0-alpha-1
>
>
> Mostly critical when using non-heap cache implementations, such as offheap or 
> file-based ones. If the cache medium is too large and there are many blocks to 
> be cached, it may create a lot of cache index objects in the RegionServer 
> heap. We should have guardrails to prevent caching from exhausting the 
> available heap.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28303) Interrupt cache prefetch thread when a heap usage threshold is reached

2024-01-26 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28303:
-
Affects Version/s: 2.5.7
   3.0.0-beta-1
   2.4.17
   2.6.0
   4.0.0-alpha-1
   2.7.0

> Interrupt cache prefetch thread when a heap usage threshold is reached
> --
>
> Key: HBASE-28303
> URL: https://issues.apache.org/jira/browse/HBASE-28303
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.6.0, 2.4.17, 3.0.0-beta-1, 4.0.0-alpha-1, 2.5.7, 2.7.0
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
>
> Mostly critical when using non-heap cache implementations, such as offheap or 
> file-based ones. If the cache medium is too large and there are many blocks to 
> be cached, it may create a lot of cache index objects in the RegionServer 
> heap. We should have guardrails to prevent caching from exhausting the 
> available heap.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28303) Interrupt cache prefetch thread when a heap usage threshold is reached

2024-01-10 Thread Wellington Chevreuil (Jira)
Wellington Chevreuil created HBASE-28303:


 Summary: Interrupt cache prefetch thread when a heap usage 
threshold is reached
 Key: HBASE-28303
 URL: https://issues.apache.org/jira/browse/HBASE-28303
 Project: HBase
  Issue Type: Improvement
Reporter: Wellington Chevreuil
Assignee: Wellington Chevreuil


Mostly critical when using non-heap cache implementations, such as offheap or 
file-based ones. If the cache medium is too large and there are many blocks to 
be cached, it may create a lot of cache index objects in the RegionServer heap. 
We should have guardrails to prevent caching from exhausting the available 
heap.
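
To make the guardrail idea concrete, below is a minimal sketch (with a 
hypothetical threshold constant, not the actual HBase implementation) of the 
kind of heap-usage check a prefetch loop could perform before caching the next 
block.

{code:java}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class PrefetchHeapGuardrailSketch {
  // Hypothetical threshold: stop prefetching once 80% of the heap is in use.
  private static final double HEAP_USAGE_THRESHOLD = 0.8;

  /** Returns true while it is still safe to keep prefetching blocks into the cache. */
  static boolean belowHeapThreshold() {
    MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    // getMax() may be -1 if the max heap size is undefined; treat that as "no limit".
    if (heap.getMax() <= 0) {
      return true;
    }
    double used = (double) heap.getUsed() / heap.getMax();
    return used < HEAP_USAGE_THRESHOLD;
  }

  public static void main(String[] args) {
    System.out.println(belowHeapThreshold()
      ? "Heap usage OK, prefetch may continue"
      : "Heap usage above threshold, interrupting prefetch");
  }
}
{code}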



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28259) Add java.base/java.io=ALL-UNNAMED open to jdk11_jvm_flags

2024-01-05 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28259:
-
Fix Version/s: 2.6.0
   2.4.18
   2.5.8

> Add  java.base/java.io=ALL-UNNAMED open to jdk11_jvm_flags
> --
>
> Key: HBASE-28259
> URL: https://issues.apache.org/jira/browse/HBASE-28259
> Project: HBase
>  Issue Type: Bug
>  Components: java
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 4.0.0-alpha-1, 2.5.7, 2.7.0
>Reporter: Moran
>Assignee: Moran
>Priority: Trivial
> Fix For: 2.6.0, 2.4.18, 4.0.0-alpha-1, 2.7.0, 2.5.8, 3.0.0-beta-2
>
>
> hbase shell
> 2023-12-13T23:49:50.846+08:00 [main] WARN FilenoUtil : Native subprocess 
> control requires open access to the JDK IO subsystem
> Pass '--add-opens java.base/sun.nio.ch=ALL-UNNAMED --add-opens 
> java.base/java.io=ALL-UNNAMED' to enable.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28259) Add java.base/java.io=ALL-UNNAMED open to jdk11_jvm_flags

2024-01-05 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28259:
-
Affects Version/s: 2.7.0

> Add  java.base/java.io=ALL-UNNAMED open to jdk11_jvm_flags
> --
>
> Key: HBASE-28259
> URL: https://issues.apache.org/jira/browse/HBASE-28259
> Project: HBase
>  Issue Type: Bug
>  Components: java
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 4.0.0-alpha-1, 2.5.7, 2.7.0
>Reporter: Moran
>Assignee: Moran
>Priority: Trivial
> Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2
>
>
> hbase shell
> 2023-12-13T23:49:50.846+08:00 [main] WARN FilenoUtil : Native subprocess 
> control requires open access to the JDK IO subsystem
> Pass '--add-opens java.base/sun.nio.ch=ALL-UNNAMED --add-opens 
> java.base/java.io=ALL-UNNAMED' to enable.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28259) Add java.base/java.io=ALL-UNNAMED open to jdk11_jvm_flags

2024-01-05 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28259:
-
Affects Version/s: 2.5.7
   2.4.17
   3.0.0-alpha-4
   2.6.0
   4.0.0-alpha-1

> Add  java.base/java.io=ALL-UNNAMED open to jdk11_jvm_flags
> --
>
> Key: HBASE-28259
> URL: https://issues.apache.org/jira/browse/HBASE-28259
> Project: HBase
>  Issue Type: Bug
>  Components: java
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 4.0.0-alpha-1, 2.5.7
>Reporter: Moran
>Assignee: Moran
>Priority: Trivial
>
> hbase shell
> 2023-12-13T23:49:50.846+08:00 [main] WARN FilenoUtil : Native subprocess 
> control requires open access to the JDK IO subsystem
> Pass '--add-opens java.base/sun.nio.ch=ALL-UNNAMED --add-opens 
> java.base/java.io=ALL-UNNAMED' to enable.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28259) Add java.base/java.io=ALL-UNNAMED open to jdk11_jvm_flags

2024-01-05 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28259:
-
Fix Version/s: 4.0.0-alpha-1
   2.7.0
   3.0.0-beta-2

> Add  java.base/java.io=ALL-UNNAMED open to jdk11_jvm_flags
> --
>
> Key: HBASE-28259
> URL: https://issues.apache.org/jira/browse/HBASE-28259
> Project: HBase
>  Issue Type: Bug
>  Components: java
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 4.0.0-alpha-1, 2.5.7
>Reporter: Moran
>Assignee: Moran
>Priority: Trivial
> Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2
>
>
> hbase shell
> 2023-12-13T23:49:50.846+08:00 [main] WARN FilenoUtil : Native subprocess 
> control requires open access to the JDK IO subsystem
> Pass '--add-opens java.base/sun.nio.ch=ALL-UNNAMED --add-opens 
> java.base/java.io=ALL-UNNAMED' to enable.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28259) Add java.base/java.io=ALL-UNNAMED open to jdk11_jvm_flags

2024-01-05 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-28259.
--
Resolution: Fixed

Merged to master, branch-3, branch-2, branch-2.6, branch-2.5 and branch-2.4. 
Thanks for the contribution, [~mrzhao] !

> Add  java.base/java.io=ALL-UNNAMED open to jdk11_jvm_flags
> --
>
> Key: HBASE-28259
> URL: https://issues.apache.org/jira/browse/HBASE-28259
> Project: HBase
>  Issue Type: Bug
>  Components: java
>Reporter: Moran
>Assignee: Moran
>Priority: Trivial
>
> hbase shell
> 2023-12-13T23:49:50.846+08:00 [main] WARN FilenoUtil : Native subprocess 
> control requires open access to the JDK IO subsystem
> Pass '--add-opens java.base/sun.nio.ch=ALL-UNNAMED --add-opens 
> java.base/java.io=ALL-UNNAMED' to enable.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-28259) Add java.base/java.io=ALL-UNNAMED open to jdk11_jvm_flags

2023-12-15 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil reassigned HBASE-28259:


Assignee: Moran

> Add  java.base/java.io=ALL-UNNAMED open to jdk11_jvm_flags
> --
>
> Key: HBASE-28259
> URL: https://issues.apache.org/jira/browse/HBASE-28259
> Project: HBase
>  Issue Type: Bug
>  Components: java
>Reporter: Moran
>Assignee: Moran
>Priority: Trivial
>
> hbase shell
> 2023-12-13T23:49:50.846+08:00 [main] WARN FilenoUtil : Native subprocess 
> control requires open access to the JDK IO subsystem
> Pass '--add-opens java.base/sun.nio.ch=ALL-UNNAMED --add-opens 
> java.base/java.io=ALL-UNNAMED' to enable.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28246) Expose region cached size over JMX metrics and report in the RS UI

2023-12-14 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-28246.
--
Resolution: Fixed

Thanks for reviewing it, [~psomogyi]! Merged into master, branch-3 and branch-2.

> Expose region cached size over JMX metrics and report in the RS UI
> --
>
> Key: HBASE-28246
> URL: https://issues.apache.org/jira/browse/HBASE-28246
> Project: HBase
>  Issue Type: Improvement
>  Components: BucketCache
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 2.5.6, 4.0.0-alpha-1
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Fix For: 3.0.0-beta-1, 4.0.0-alpha-1
>
> Attachments: Screenshot 2023-12-06 at 22.58.17.png
>
>
> With a large file-based bucket cache, the prefetch executor can take a long 
> time to cache all of the dataset. It would be useful to report how much of 
> each region's data is already cached, in order to give an idea of how much 
> work the prefetch executor has done.
> This PR adds JMX metrics for the region cached percentage and also reports the 
> same in the RS UI "Store File Metrics" tab as below:
> !Screenshot 2023-12-06 at 22.58.17.png|width=658,height=114!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28246) Expose region cached size over JMX metrics and report in the RS UI

2023-12-14 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28246:
-
Fix Version/s: 3.0.0-beta-1

> Expose region cached size over JMX metrics and report in the RS UI
> --
>
> Key: HBASE-28246
> URL: https://issues.apache.org/jira/browse/HBASE-28246
> Project: HBase
>  Issue Type: Improvement
>  Components: BucketCache
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 2.5.6, 4.0.0-alpha-1
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Fix For: 3.0.0-beta-1, 4.0.0-alpha-1
>
> Attachments: Screenshot 2023-12-06 at 22.58.17.png
>
>
> With a large file-based bucket cache, the prefetch executor can take a long 
> time to cache all of the dataset. It would be useful to report how much of 
> each region's data is already cached, in order to give an idea of how much 
> work the prefetch executor has done.
> This PR adds JMX metrics for the region cached percentage and also reports the 
> same in the RS UI "Store File Metrics" tab as below:
> !Screenshot 2023-12-06 at 22.58.17.png|width=658,height=114!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28246) Expose region cached size over JMX metrics and report in the RS UI

2023-12-14 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28246:
-
Affects Version/s: 2.5.6
   2.4.17
   3.0.0-alpha-4
   2.6.0

> Expose region cached size over JMX metrics and report in the RS UI
> --
>
> Key: HBASE-28246
> URL: https://issues.apache.org/jira/browse/HBASE-28246
> Project: HBase
>  Issue Type: Improvement
>  Components: BucketCache
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 2.5.6, 4.0.0-alpha-1
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Fix For: 4.0.0-alpha-1
>
> Attachments: Screenshot 2023-12-06 at 22.58.17.png
>
>
> With a large file-based bucket cache, the prefetch executor can take a long 
> time to cache all of the dataset. It would be useful to report how much of 
> each region's data is already cached, in order to give an idea of how much 
> work the prefetch executor has done.
> This PR adds JMX metrics for the region cached percentage and also reports the 
> same in the RS UI "Store File Metrics" tab as below:
> !Screenshot 2023-12-06 at 22.58.17.png|width=658,height=114!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28246) Expose region cached size over JMX metrics and report in the RS UI

2023-12-14 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28246:
-
Affects Version/s: 4.0.0-alpha-1

> Expose region cached size over JMX metrics and report in the RS UI
> --
>
> Key: HBASE-28246
> URL: https://issues.apache.org/jira/browse/HBASE-28246
> Project: HBase
>  Issue Type: Improvement
>  Components: BucketCache
>Affects Versions: 4.0.0-alpha-1
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Attachments: Screenshot 2023-12-06 at 22.58.17.png
>
>
> With a large file-based bucket cache, the prefetch executor can take a long 
> time to cache all of the dataset. It would be useful to report how much of 
> each region's data is already cached, in order to give an idea of how much 
> work the prefetch executor has done.
> This PR adds JMX metrics for the region cached percentage and also reports the 
> same in the RS UI "Store File Metrics" tab as below:
> !Screenshot 2023-12-06 at 22.58.17.png|width=658,height=114!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28246) Expose region cached size over JMX metrics and report in the RS UI

2023-12-14 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28246:
-
Fix Version/s: 4.0.0-alpha-1

> Expose region cached size over JMX metrics and report in the RS UI
> --
>
> Key: HBASE-28246
> URL: https://issues.apache.org/jira/browse/HBASE-28246
> Project: HBase
>  Issue Type: Improvement
>  Components: BucketCache
>Affects Versions: 4.0.0-alpha-1
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Fix For: 4.0.0-alpha-1
>
> Attachments: Screenshot 2023-12-06 at 22.58.17.png
>
>
> With a large file-based bucket cache, the prefetch executor can take a long 
> time to cache all of the dataset. It would be useful to report how much of 
> each region's data is already cached, in order to give an idea of how much 
> work the prefetch executor has done.
> This PR adds JMX metrics for the region cached percentage and also reports the 
> same in the RS UI "Store File Metrics" tab as below:
> !Screenshot 2023-12-06 at 22.58.17.png|width=658,height=114!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28251) [SFT] Add description for specifying SFT impl during snapshot recovery

2023-12-11 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-28251.
--
Resolution: Fixed

Merged into master branch. Thanks for reviewing this, [~psomogyi] , 
[~nihaljain.cs] and [~zhangduo] !

> [SFT] Add description for specifying SFT impl during snapshot recovery
> --
>
> Key: HBASE-28251
> URL: https://issues.apache.org/jira/browse/HBASE-28251
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 4.0.0-alpha-1
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Fix For: 4.0.0-alpha-1
>
>
> HBASE-26286 added an option to the clone_snapshot command that allows 
> specifying the SFT implementation during snapshot recovery. This is really 
> useful when recovering snapshots imported from clusters that don't use the 
> same SFT impl as the one where we are cloning it. Without this, the 
> cloned/restored table will get created with the SFT impl of its original 
> cluster, requiring extra conversion steps using the MIGRATION tracker.
> This also fixes formatting for the "Bulk Data Generator Tool", which is 
> currently displayed as a sub-topic of the SFT chapter. It should have its own 
> chapter.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28251) [SFT] Add description for specifying SFT impl during snapshot recovery

2023-12-11 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28251:
-
Fix Version/s: 4.0.0-alpha-1

> [SFT] Add description for specifying SFT impl during snapshot recovery
> --
>
> Key: HBASE-28251
> URL: https://issues.apache.org/jira/browse/HBASE-28251
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 4.0.0-alpha-1
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Fix For: 4.0.0-alpha-1
>
>
> HBASE-26286 added an option to the clone_snapshot command that allows 
> specifying the SFT implementation during snapshot recovery. This is really 
> useful when recovering snapshots imported from clusters that don't use the 
> same SFT impl as the one where we are cloning it. Without this, the 
> cloned/restored table will get created with the SFT impl of its original 
> cluster, requiring extra conversion steps using the MIGRATION tracker.
> This also fixes formatting for the "Bulk Data Generator Tool", which is 
> currently displayed as a sub-topic of the SFT chapter. It should have its own 
> chapter.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28251) [SFT] Add description for specifying SFT impl during snapshot recovery

2023-12-11 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28251:
-
Affects Version/s: 4.0.0-alpha-1

> [SFT] Add description for specifying SFT impl during snapshot recovery
> --
>
> Key: HBASE-28251
> URL: https://issues.apache.org/jira/browse/HBASE-28251
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 4.0.0-alpha-1
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
>
> HBASE-26286 added an option to the clone_snapshot command that allows 
> specifying the SFT implementation during snapshot recovery. This is really 
> useful when recovering snapshots imported from clusters that don't use the 
> same SFT impl as the one where we are cloning it. Without this, the 
> cloned/restored table will get created with the SFT impl of its original 
> cluster, requiring extra conversion steps using the MIGRATION tracker.
> This also fixes formatting for the "Bulk Data Generator Tool", which is 
> currently displayed as a sub-topic of the SFT chapter. It should have its own 
> chapter.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work started] (HBASE-28251) [SFT] Add description for specifying SFT impl during snapshot recovery

2023-12-08 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-28251 started by Wellington Chevreuil.

> [SFT] Add description for specifying SFT impl during snapshot recovery
> --
>
> Key: HBASE-28251
> URL: https://issues.apache.org/jira/browse/HBASE-28251
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
>
> HBASE-26286 added an option to the clone_snapshot command that allows 
> specifying the SFT implementation during snapshot recovery. This is really 
> useful when recovering snapshots imported from clusters that don't use the 
> same SFT impl as the one where we are cloning it. Without this, the 
> cloned/restored table will get created with the SFT impl of its original 
> cluster, requiring extra conversion steps using the MIGRATION tracker.
> This also fixes formatting for the "Bulk Data Generator Tool", which is 
> currently displayed as a sub-topic of the SFT chapter. It should have its own 
> chapter.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28251) [SFT] Add description for specifying SFT impl during snapshot recovery

2023-12-08 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28251:
-
Parent: HBASE-26584
Issue Type: Sub-task  (was: Task)

> [SFT] Add description for specifying SFT impl during snapshot recovery
> --
>
> Key: HBASE-28251
> URL: https://issues.apache.org/jira/browse/HBASE-28251
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
>
> HBASE-26286 added an option to the clone_snapshot command that allows 
> specifying the SFT implementation during snapshot recovery. This is really 
> useful when recovering snapshots imported from clusters that don't use the 
> same SFT impl as the one where we are cloning it. Without this, the 
> cloned/restored table will get created with the SFT impl of its original 
> cluster, requiring extra conversion steps using the MIGRATION tracker.
> This also fixes formatting for the "Bulk Data Generator Tool", which is 
> currently displayed as a sub-topic of the SFT chapter. It should have its own 
> chapter.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28251) [SFT] Add description for specifying SFT impl during snapshot recovery

2023-12-08 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28251:
-
Issue Type: Task  (was: Improvement)

> [SFT] Add description for specifying SFT impl during snapshot recovery
> --
>
> Key: HBASE-28251
> URL: https://issues.apache.org/jira/browse/HBASE-28251
> Project: HBase
>  Issue Type: Task
>  Components: documentation
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
>
> HBASE-26286 added an option to the clone_snapshot command that allows 
> specifying the SFT implementation during snapshot recovery. This is really 
> useful when recovering snapshots imported from clusters that don't use the 
> same SFT impl as the one where we are cloning it. Without this, the 
> cloned/restored table will get created with the SFT impl of its original 
> cluster, requiring extra conversion steps using the MIGRATION tracker.
> This also fixes formatting for the "Bulk Data Generator Tool", which is 
> currently displayed as a sub-topic of the SFT chapter. It should have its own 
> chapter.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28251) [SFT] Add description for specifying SFT impl during snapshot recovery

2023-12-08 Thread Wellington Chevreuil (Jira)
Wellington Chevreuil created HBASE-28251:


 Summary: [SFT] Add description for specifying SFT impl during 
snapshot recovery
 Key: HBASE-28251
 URL: https://issues.apache.org/jira/browse/HBASE-28251
 Project: HBase
  Issue Type: Improvement
  Components: documentation
Reporter: Wellington Chevreuil
Assignee: Wellington Chevreuil


HBASE-26286 added an option to the clone_snapshot command that allows 
specifying the SFT implementation during snapshot recovery. This is really 
useful when recovering snapshots imported from clusters that don't use the same 
SFT impl as the one where we are cloning it. Without this, the cloned/restored 
table will get created with the SFT impl of its original cluster, requiring 
extra conversion steps using the MIGRATION tracker.

This also fixes formatting for the "Bulk Data Generator Tool", which is 
currently displayed as a sub-topic of the SFT chapter. It should have its own 
chapter.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28209) Create a jmx metrics to expose the oldWALs directory size

2023-12-08 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-28209.
--
Resolution: Fixed

I have now merged the branch-2 PR and cherry-picked into branch-2.6, branch-2.5 
and branch-2.4. Thanks for the contribution, [~vinayakhegde] !

> Create a jmx metrics to expose the oldWALs directory size
> -
>
> Key: HBASE-28209
> URL: https://issues.apache.org/jira/browse/HBASE-28209
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 2.5.6, 4.0.0-alpha-1
>Reporter: Vinayak Hegde
>Assignee: Vinayak Hegde
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 4.0.0-alpha-1, 2.5.7, 2.7.0
>
>
> Create a JMX metric that returns the size of the oldWALs directory, in bytes.
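
As an illustrative aside, the directory size in bytes can be obtained from the 
FileSystem API roughly as sketched below; this is not the actual metric wiring, 
and the path assumes the default layout where oldWALs lives under the 
configured HBase root directory.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class OldWalsSizeSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Assumption: oldWALs sits directly under hbase.rootdir (the default layout);
    // the "/hbase" fallback is only for this sketch.
    Path oldWals = new Path(conf.get("hbase.rootdir", "/hbase"), "oldWALs");
    FileSystem fs = oldWals.getFileSystem(conf);
    // ContentSummary#getLength() is the total size, in bytes, of all files under the path.
    long oldWalsSizeBytes = fs.getContentSummary(oldWals).getLength();
    System.out.println("oldWALs directory size (bytes): " + oldWalsSizeBytes);
  }
}
{code}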



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28209) Create a jmx metrics to expose the oldWALs directory size

2023-12-08 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28209:
-
Fix Version/s: 2.6.0
   2.4.18
   2.5.7
   2.7.0

> Create a jmx metrics to expose the oldWALs directory size
> -
>
> Key: HBASE-28209
> URL: https://issues.apache.org/jira/browse/HBASE-28209
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 2.5.6, 4.0.0-alpha-1
>Reporter: Vinayak Hegde
>Assignee: Vinayak Hegde
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 4.0.0-alpha-1, 2.5.7, 2.7.0
>
>
> Create a JMX metric that returns the size of the oldWALs directory, in bytes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28246) Expose region cached size over JMX metrics and report in the RS UI

2023-12-06 Thread Wellington Chevreuil (Jira)
Wellington Chevreuil created HBASE-28246:


 Summary: Expose region cached size over JMX metrics and report in 
the RS UI
 Key: HBASE-28246
 URL: https://issues.apache.org/jira/browse/HBASE-28246
 Project: HBase
  Issue Type: Improvement
  Components: BucketCache
Reporter: Wellington Chevreuil
Assignee: Wellington Chevreuil
 Attachments: Screenshot 2023-12-06 at 22.58.17.png

With a large file-based bucket cache, the prefetch executor can take a long 
time to cache all of the dataset. It would be useful to report how much of each 
region's data is already cached, in order to give an idea of how much work the 
prefetch executor has done.

This PR adds JMX metrics for the region cached percentage and also reports the 
same in the RS UI "Store File Metrics" tab as below:

!Screenshot 2023-12-06 at 22.58.17.png|width=658,height=114!
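
As a rough illustration of the metric itself (all names below are hypothetical, 
not the actual HBase code), the per-region cached percentage boils down to the 
ratio of store file bytes already present in the block cache to the total store 
file size.

{code:java}
public class RegionCachedPercentSketch {
  /**
   * Hypothetical helper: percentage of a region's store file data already cached.
   *
   * @param cachedBytes bytes of the region's store files currently held in the cache
   * @param totalStoreFileBytes total size of the region's store files
   */
  static float cachedPercent(long cachedBytes, long totalStoreFileBytes) {
    if (totalStoreFileBytes <= 0) {
      return 0f;
    }
    return (cachedBytes * 100f) / totalStoreFileBytes;
  }

  public static void main(String[] args) {
    // e.g. 6 GiB of a 10 GiB region already prefetched -> 60%
    System.out.println(cachedPercent(6L << 30, 10L << 30));
  }
}
{code}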



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28209) Create a jmx metrics to expose the oldWALs directory size

2023-12-04 Thread Wellington Chevreuil (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17792780#comment-17792780
 ] 

Wellington Chevreuil commented on HBASE-28209:
--

Thanks for this contribution, [~vinayakhegde]! I have now merged it to the 
master and branch-3 branches. I'm getting some conflicts when cherry-picking to 
the branch-2 based branches; would you mind submitting a new PR targeting 
branch-2?

> Create a jmx metrics to expose the oldWALs directory size
> -
>
> Key: HBASE-28209
> URL: https://issues.apache.org/jira/browse/HBASE-28209
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 2.5.6, 4.0.0-alpha-1
>Reporter: Vinayak Hegde
>Assignee: Vinayak Hegde
>Priority: Major
> Fix For: 3.0.0-beta-1, 4.0.0-alpha-1
>
>
> Create a JMX metric that returns the size of the oldWALs directory, in bytes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28209) Create a jmx metrics to expose the oldWALs directory size

2023-12-04 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28209:
-
Affects Version/s: 2.5.6
   2.4.17
   3.0.0-alpha-4
   2.6.0
   4.0.0-alpha-1

> Create a jmx metrics to expose the oldWALs directory size
> -
>
> Key: HBASE-28209
> URL: https://issues.apache.org/jira/browse/HBASE-28209
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 2.5.6, 4.0.0-alpha-1
>Reporter: Vinayak Hegde
>Assignee: Vinayak Hegde
>Priority: Major
>
> Create a JMX metric that returns the size of the oldWALs directory, in bytes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28209) Create a jmx metrics to expose the oldWALs directory size

2023-12-04 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28209:
-
Fix Version/s: 3.0.0-beta-1
   4.0.0-alpha-1

> Create a jmx metrics to expose the oldWALs directory size
> -
>
> Key: HBASE-28209
> URL: https://issues.apache.org/jira/browse/HBASE-28209
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 2.5.6, 4.0.0-alpha-1
>Reporter: Vinayak Hegde
>Assignee: Vinayak Hegde
>Priority: Major
> Fix For: 3.0.0-beta-1, 4.0.0-alpha-1
>
>
> Create a JMX metric that returns the size of the oldWALs directory, in bytes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28186) Rebase CacheAwareBalance related commits into master branch

2023-11-30 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28186:
-
Affects Version/s: 2.5.6
   2.4.17
   3.0.0-alpha-4
   2.6.0
   4.0.0-alpha-1

> Rebase CacheAwareBalance related commits into master branch
> ---
>
> Key: HBASE-28186
> URL: https://issues.apache.org/jira/browse/HBASE-28186
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 2.5.6, 4.0.0-alpha-1
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28186) Rebase CacheAwareBalance related commits into master branch

2023-11-30 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-28186.
--
Resolution: Fixed

Thanks for helping with the backport to branch-2, [~ragarkar] ! 

> Rebase CacheAwareBalance related commits into master branch
> ---
>
> Key: HBASE-28186
> URL: https://issues.apache.org/jira/browse/HBASE-28186
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 2.5.6, 4.0.0-alpha-1
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Fix For: 2.6.0, 3.0.0-beta-1, 4.0.0-alpha-1, 2.7.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28186) Rebase CacheAwareBalance related commits into master branch

2023-11-30 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28186:
-
Fix Version/s: 2.7.0
   2.6.0
   3.0.0-beta-1
   4.0.0-alpha-1

> Rebase CacheAwareBalance related commits into master branch
> ---
>
> Key: HBASE-28186
> URL: https://issues.apache.org/jira/browse/HBASE-28186
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 2.5.6, 4.0.0-alpha-1
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Fix For: 2.6.0, 3.0.0-beta-1, 4.0.0-alpha-1, 2.7.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-27891) Report heap used by BucketCache as a jmx metric

2023-11-30 Thread Wellington Chevreuil (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791650#comment-17791650
 ] 

Wellington Chevreuil commented on HBASE-27891:
--

Are you guys currently working on this, [~bbeaudreault]? As our cloud storage 
solution strongly relies on the bucket cache for performance, we are very 
interested in this.
{quote}For example, we have a server with 500k blocks in the bucket cache, and 
according to a heap dump it was holding around 260mb
{quote}
Yep. Pay special attention to block sizes if you are using compression. This is 
because, by default, we consider the uncompressed size when delimiting a block, 
so we may reach the configured size (say, the default 64KB), then compress it 
and write it to disk. In some of our customers' deployments, this was resulting 
in blocks as small as 5KB, thus increasing the number of blocks (and therefore, 
the number of objects held by the BucketCache in the RSes' heaps). We 
implemented the BlockCompressedSizePredicator to mitigate this in HBASE-27264, 
with the PreviousBlockCompressionRatePredicator, which calculates the 
compression ratio of the previous block when deciding what the raw size 
boundary should be.
{quote}The major contributors I saw were the offsetLock, blocksByHFile set, and 
backingMap.
{quote}
We also recently found a potential leak in blocksByHFile. It is mostly fixed by 
HBASE-26305, but if you are on a version that doesn't include it, you might 
face this problem. HBASE-26305 on its own doesn't solve potential leaks in the 
case of allocation failures, so we also submitted HBASE-28211 for that.
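
To make the block-size point above concrete, here is a rough sketch of the 
previous-block compression-rate idea (the exact formula used by HBASE-27264 may 
differ): scale the configured block size by the compression ratio observed on 
the previous block, so the compressed block written to disk lands near the 
configured size instead of ending up tiny.

{code:java}
public class CompressionRateBoundarySketch {
  /**
   * Hypothetical sketch: pick the raw size boundary for the next block by scaling
   * the configured block size with the previous block's uncompressed/compressed ratio.
   */
  static long adjustedRawBoundary(long configuredBlockSize,
      long prevUncompressedSize, long prevCompressedSize) {
    if (prevCompressedSize <= 0) {
      return configuredBlockSize;
    }
    double compressionRatio = (double) prevUncompressedSize / prevCompressedSize;
    return (long) (configuredBlockSize * compressionRatio);
  }

  public static void main(String[] args) {
    // Previous block: 64KB raw compressed down to 5KB (ratio ~12.8).
    // The next block may then grow to ~819KB raw, so its compressed size lands near 64KB.
    System.out.println(adjustedRawBoundary(64 * 1024, 64 * 1024, 5 * 1024));
  }
}
{code}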

> Report heap used by BucketCache as a jmx metric
> ---
>
> Key: HBASE-27891
> URL: https://issues.apache.org/jira/browse/HBASE-27891
> Project: HBase
>  Issue Type: Improvement
>Reporter: Bryan Beaudreault
>Priority: Major
>
> The BucketCache can take a non-trivial amount of heap, especially for very 
> large cache sizes. For example, we have a server with 500k blocks in the 
> bucket cache, and according to a heap dump it was holding around 260mb. One 
> needs to account for this when determining the size of heap to use, so we 
> should report it.
> The major contributors I saw were the offsetLock, blocksByHFile set, and 
> backingMap.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28211) BucketCache.blocksByHFile may leak on allocationFailure or if we reach io errors tolerated

2023-11-29 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-28211.
--
Resolution: Fixed

Thanks for reviewing it, [~zhangduo]! I have now merged it to master, branch-3, 
branch-2, branch-2.5 and branch-2.6.

> BucketCache.blocksByHFile may leak on allocationFailure or if we reach io 
> errors tolerated
> --
>
> Key: HBASE-28211
> URL: https://issues.apache.org/jira/browse/HBASE-28211
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 2.5.6, 4.0.0-alpha-1
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 4.0.0-alpha-1, 2.5.7
>
>
> We add blocks to BucketCache.blocksByHFile in doDrain before we have actually 
> added the block to the cache successfully. We may still fail to cache the 
> block if it is too big to fit in any of the configured bucket sizes, or if we 
> fail to write it to the ioengine and reach the tolerated IO errors threshold. 
> In such cases, the related block would remain in BucketCache.blocksByHFile 
> indefinitely.
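
For illustration, a minimal sketch of the bookkeeping problem described above, 
using hypothetical stand-in types rather than the real BucketCache internals: 
if the per-file block index is updated before the cache write is known to have 
succeeded, a failed write (block too large for any bucket, IO error threshold 
reached) leaves a dangling entry, whereas recording the block only after a 
successful write avoids the leak.

{code:java}
import java.util.HashSet;
import java.util.Set;

public class BlocksByHFileLeakSketch {
  // Hypothetical stand-in for BucketCache's per-file block index.
  private final Set<String> blocksByHFile = new HashSet<>();

  /** Pretend cache write; fails e.g. when the block doesn't fit any bucket size. */
  private boolean writeToCache(String blockKey) {
    return !blockKey.contains("too-big");
  }

  /** Leaky pattern: the index is updated even when the cache write fails. */
  void cacheBlockLeaky(String blockKey) {
    blocksByHFile.add(blockKey);
    writeToCache(blockKey); // on failure, blockKey stays in blocksByHFile forever
  }

  /** Safer pattern: only record the block once the write has actually succeeded. */
  void cacheBlockSafe(String blockKey) {
    if (writeToCache(blockKey)) {
      blocksByHFile.add(blockKey);
    }
  }

  public static void main(String[] args) {
    BlocksByHFileLeakSketch sketch = new BlocksByHFileLeakSketch();
    sketch.cacheBlockLeaky("hfile-1/offset-0/too-big"); // leaks an index entry
    sketch.cacheBlockSafe("hfile-2/offset-0/too-big");  // leaves no trace on failure
    System.out.println(sketch.blocksByHFile);           // only the leaked entry remains
  }
}
{code}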



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28211) BucketCache.blocksByHFile may leak on allocationFailure or if we reach io errors tolerated

2023-11-29 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28211:
-
Affects Version/s: 2.5.6
   2.4.17
   3.0.0-alpha-4
   2.6.0
   4.0.0-alpha-1

> BucketCache.blocksByHFile may leak on allocationFailure or if we reach io 
> errors tolerated
> --
>
> Key: HBASE-28211
> URL: https://issues.apache.org/jira/browse/HBASE-28211
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 2.5.6, 4.0.0-alpha-1
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
>
> We add blocks to BucketCache.blocksByHFile in doDrain before we have actually 
> added the block to the cache successfully. We may still fail to cache the 
> block if it is too big to fit in any of the configured bucket sizes, or if we 
> fail to write it to the ioengine and reach the tolerated IO errors threshold. 
> In such cases, the related block would remain in BucketCache.blocksByHFile 
> indefinitely.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28211) BucketCache.blocksByHFile may leak on allocationFailure or if we reach io errors tolerated

2023-11-29 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28211:
-
Fix Version/s: 2.6.0
   2.4.18
   3.0.0-beta-1
   4.0.0-alpha-1
   2.5.7

> BucketCache.blocksByHFile may leak on allocationFailure or if we reach io 
> errors tolerated
> --
>
> Key: HBASE-28211
> URL: https://issues.apache.org/jira/browse/HBASE-28211
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 2.5.6, 4.0.0-alpha-1
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 4.0.0-alpha-1, 2.5.7
>
>
> We add blocks to BucketCache.blocksByHFile in doDrain before we have actually 
> added the block to the cache successfully. We may still fail to cache the 
> block if it is too big to fit in any of the configured bucket sizes, or if we 
> fail to write it to the ioengine and reach the tolerated IO errors threshold. 
> In such cases, the related block would remain in BucketCache.blocksByHFile 
> indefinitely.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28217) PrefetchExecutor should not run for files from CFs that have disabled BLOCKCACHE

2023-11-28 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-28217.
--
Resolution: Fixed

Thanks for reviewing it, [~psomogyi]. I have merged it to master, branch-3, 
branch-2, branch-2.5 and branch-2.4.

> PrefetchExecutor should not run for files from CFs that have disabled 
> BLOCKCACHE
> 
>
> Key: HBASE-28217
> URL: https://issues.apache.org/jira/browse/HBASE-28217
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 2.5.6, 4.0.0-alpha-1
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 4.0.0-alpha-1, 2.5.7
>
>
> HFilePReadReader relies on the return value of CacheConfig.shouldPrefetchOnOpen 
> to decide whether it should run the PrefetchExecutor for the files. 
> Currently, CacheConfig.shouldPrefetchOnOpen returns true if 
> "hbase.rs.prefetchblocksonopen" is set to true in the config, OR 
> PREFETCH_BLOCKS_ON_OPEN is set to true at the CF level.
> There's also CacheConfig.shouldCacheDataOnRead, which returns true only if 
> both hbase.block.data.cacheonread is set to true in the config AND BLOCKCACHE 
> is set to true at the CF level.
> If BLOCKCACHE is set to false at the CF level, HFilePReadReader will still run 
> the PrefetchExecutor to read all the file's blocks from the FileSystem, only 
> to then find out that those blocks shouldn't be cached.
> I believe we should change CacheConfig.shouldPrefetchOnOpen to return true 
> only if CacheConfig.shouldCacheDataOnRead is also true.
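
The proposed change boils down to combining the two predicates described above. 
A hedged sketch (plain booleans standing in for the config and CF-level flags, 
not the actual CacheConfig code):

{code:java}
public class PrefetchOnOpenPredicateSketch {
  /** Current behaviour as described: prefetch is requested globally or at the CF level. */
  static boolean shouldPrefetchOnOpen(boolean prefetchOnOpenConf, boolean prefetchOnOpenCf) {
    return prefetchOnOpenConf || prefetchOnOpenCf;
  }

  /** As described: caching on read requires both the config flag and CF-level BLOCKCACHE. */
  static boolean shouldCacheDataOnRead(boolean cacheDataOnReadConf, boolean blockCacheEnabledCf) {
    return cacheDataOnReadConf && blockCacheEnabledCf;
  }

  /** Proposed behaviour: only prefetch when the blocks would actually be cached. */
  static boolean shouldPrefetchOnOpenProposed(boolean prefetchOnOpenConf, boolean prefetchOnOpenCf,
      boolean cacheDataOnReadConf, boolean blockCacheEnabledCf) {
    return shouldPrefetchOnOpen(prefetchOnOpenConf, prefetchOnOpenCf)
      && shouldCacheDataOnRead(cacheDataOnReadConf, blockCacheEnabledCf);
  }

  public static void main(String[] args) {
    // CF has BLOCKCACHE=false: prefetch runs today (true), but not with the proposed check.
    System.out.println(shouldPrefetchOnOpen(true, false));                       // true
    System.out.println(shouldPrefetchOnOpenProposed(true, false, true, false));  // false
  }
}
{code}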



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28217) PrefetchExecutor should not run for files from CFs that have disabled BLOCKCACHE

2023-11-28 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28217:
-
Fix Version/s: 2.6.0
   2.4.18
   3.0.0-beta-1
   4.0.0-alpha-1
   2.5.7

> PrefetchExecutor should not run for files from CFs that have disabled 
> BLOCKCACHE
> 
>
> Key: HBASE-28217
> URL: https://issues.apache.org/jira/browse/HBASE-28217
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 2.5.6, 4.0.0-alpha-1
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 4.0.0-alpha-1, 2.5.7
>
>
> HFilePReadReader relies on the return value of CacheConfig.shouldPrefetchOnOpen 
> to decide whether it should run the PrefetchExecutor for the files. 
> Currently, CacheConfig.shouldPrefetchOnOpen returns true if 
> "hbase.rs.prefetchblocksonopen" is set to true in the config, OR 
> PREFETCH_BLOCKS_ON_OPEN is set to true at the CF level.
> There's also CacheConfig.shouldCacheDataOnRead, which returns true only if 
> both hbase.block.data.cacheonread is set to true in the config AND BLOCKCACHE 
> is set to true at the CF level.
> If BLOCKCACHE is set to false at the CF level, HFilePReadReader will still run 
> the PrefetchExecutor to read all the file's blocks from the FileSystem, only 
> to then find out that those blocks shouldn't be cached.
> I believe we should change CacheConfig.shouldPrefetchOnOpen to return true 
> only if CacheConfig.shouldCacheDataOnRead is also true.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28217) PrefetchExecutor should not run for files from CFs that have disabled BLOCKCACHE

2023-11-28 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28217:
-
Affects Version/s: 2.5.6
   2.4.17
   3.0.0-alpha-4
   2.6.0
   4.0.0-alpha-1

> PrefetchExecutor should not run for files from CFs that have disabled 
> BLOCKCACHE
> 
>
> Key: HBASE-28217
> URL: https://issues.apache.org/jira/browse/HBASE-28217
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.4.17, 2.5.6, 4.0.0-alpha-1
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
>
> HFilePReadReader relies on the return value of CacheConfig.shouldPrefetchOnOpen 
> to decide whether it should run the PrefetchExecutor for the files. 
> Currently, CacheConfig.shouldPrefetchOnOpen returns true if 
> "hbase.rs.prefetchblocksonopen" is set to true in the config, OR 
> PREFETCH_BLOCKS_ON_OPEN is set to true at the CF level.
> There's also CacheConfig.shouldCacheDataOnRead, which returns true only if 
> both hbase.block.data.cacheonread is set to true in the config AND BLOCKCACHE 
> is set to true at the CF level.
> If BLOCKCACHE is set to false at the CF level, HFilePReadReader will still run 
> the PrefetchExecutor to read all the file's blocks from the FileSystem, only 
> to then find out that those blocks shouldn't be cached.
> I believe we should change CacheConfig.shouldPrefetchOnOpen to return true 
> only if CacheConfig.shouldCacheDataOnRead is also true.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

