[jira] [Commented] (HBASE-28084) Incremental backups should be forbidden after deleting backups

2024-05-07 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844462#comment-17844462
 ] 

Ray Mattingly commented on HBASE-28084:
---

{quote}Imagine I have a set of backups (Full1, Incr2, Incr3), delete the last 
backup (Incr3), and then create a new incremental backup (Incr4).

This backup history will now show: Full1, Incr2, Incr4.
{quote}
In this case, shouldn't Incr4 recognize that the last existing backup is Incr2, 
so that Incr4 covers from the end of Incr2 to now if possible (i.e., the WALs 
are still around), or throw if that is not possible?

I'm concerned that this proposal would mean that any incremental backup 
corruption or ungraceful failure would necessitate a new full backup. A nicer 
UX would be to delete the bad incremental backup and try again; a rough sketch 
of the continuity check that could enable this follows.
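
For illustration, a minimal sketch of that check, with hypothetical names 
(these are not actual HBase backup APIs):
{code:java}
// Hypothetical sketch: before starting an incremental backup, verify that we
// can cover the whole window since the latest surviving backup, rather than
// silently skipping whatever a deleted backup had captured.
long expectedStartTs = latestSurvivingBackup.getCompleteTs();
if (!walsAvailableSince(expectedStartTs)) {
  throw new IOException("Cannot create incremental backup: WALs since "
      + expectedStartTs + " are no longer available; take a full backup instead");
}
// Otherwise the new incremental covers [expectedStartTs, now] and leaves no gap.
{code}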

> Incremental backups should be forbidden after deleting backups
> --
>
> Key: HBASE-28084
> URL: https://issues.apache.org/jira/browse/HBASE-28084
> Project: HBase
>  Issue Type: Bug
>  Components: backuprestore
>Reporter: Dieter De Paepe
>Priority: Major
>
> Imagine I have a set of backups (Full1, Incr2, Incr3), delete the last backup 
> (Incr3), and then create a new incremental backup (Incr4).
> This backup history will now show: Full1, Incr2, Incr4.
> However, restoring Incr4 will not contain the data that was captured in 
> Incr3, effectively leading to data loss. This will certainly surprise some 
> users.
> I suggest adding some internal bookkeeping to prevent incremental backups in 
> case the most recent backup was deleted.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28562) Ancestor calculation of backups is wrong

2024-05-02 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843012#comment-17843012
 ] 

Ray Mattingly commented on HBASE-28562:
---

I like the simplicity, but I think we're missing some necessary complexity: 
backups may apply to some tables but not others, and we need to fetch every 
ancestor rather than just one.

> Ancestor calculation of backups is wrong
> 
>
> Key: HBASE-28562
> URL: https://issues.apache.org/jira/browse/HBASE-28562
> Project: HBase
>  Issue Type: Bug
>  Components: backuprestore
>Affects Versions: 2.6.0, 3.0.0
>Reporter: Dieter De Paepe
>Priority: Major
>  Labels: pull-request-available
>
> This is the same issue as HBASE-25870, but I think the fix there was wrong.
> This issue can prevent creation of (incremental) backups when data of 
> unrelated backups was damaged on backup storage.
> Minimal example to reproduce from source:
>  * Add the following to `conf/hbase-site.xml` to enable backups:
> {code:java}
> <property>
>   <name>hbase.backup.enable</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hbase.master.logcleaner.plugins</name>
>   <value>org.apache.hadoop.hbase.master.cleaner.TimeToLiveLogCleaner,org.apache.hadoop.hbase.master.cleaner.TimeToLiveProcedureWALCleaner,org.apache.hadoop.hbase.master.cleaner.TimeToLiveMasterLocalStoreWALCleaner,org.apache.hadoop.hbase.backup.master.BackupLogCleaner</value>
> </property>
> <property>
>   <name>hbase.procedure.master.classes</name>
>   <value>org.apache.hadoop.hbase.backup.master.LogRollMasterProcedureManager</value>
> </property>
> <property>
>   <name>hbase.procedure.regionserver.classes</name>
>   <value>org.apache.hadoop.hbase.backup.regionserver.LogRollRegionServerProcedureManager</value>
> </property>
> <property>
>   <name>hbase.coprocessor.region.classes</name>
>   <value>org.apache.hadoop.hbase.backup.BackupObserver</value>
> </property>
> <property>
>   <name>hbase.fs.tmp.dir</name>
>   <value>file:/tmp/hbase-tmp</value>
> </property>
> {code}
>  * Start HBase and open a shell: {{bin/start-hbase.sh}}, {{bin/hbase shell}}
>  * Execute the following commands ("put" & "create" commands in the hbase 
> shell, the other commands on the command line):
> {code:java}
> create 'experiment', 'fam' 
> put 'experiment', 'row1', 'fam:b', 'value1'
> bin/hbase backup create full file:/tmp/hbasebackup
> Backup session backup_1714649896776 finished. Status: SUCCESS
> put 'experiment', 'row2', 'fam:b', 'value2'
> bin/hbase backup create incremental file:/tmp/hbasebackup
> Backup session backup_1714649920488 finished. Status: SUCCESS
> put 'experiment', 'row3', 'fam:b', 'value3'
> bin/hbase backup create incremental file:/tmp/hbasebackup
> Backup session backup_1714650054960 finished. Status: SUCCESS
> (Delete the files corresponding to the first incremental backup - 
> backup_1714649920488 in this example)
> put 'experiment', 'row4', 'fam:a', 'value4'
> bin/hbase backup create full file:/tmp/hbasebackup
> Backup session backup_1714650236911 finished. Status: SUCCESS
> put 'experiment', 'row5', 'fam:a', 'value5'
> bin/hbase backup create incremental file:/tmp/hbasebackup
> Backup session backup_1714650289957 finished. Status: SUCCESS
> put 'experiment', 'row6', 'fam:a', 'value6'
> bin/hbase backup create incremental 
> file:/tmp/hbasebackup2024-05-02T13:45:27,534 ERROR [main {}] 
> impl.BackupManifest: file:/tmp/hbasebackup/backup_1714649920488 does not exist
> 2024-05-02T13:45:27,534 ERROR [main {}] impl.TableBackupClient: Unexpected 
> Exception : file:/tmp/hbasebackup/backup_1714649920488 does not exist
> org.apache.hadoop.hbase.backup.impl.BackupException: 
> file:/tmp/hbasebackup/backup_1714649920488 does not exist
>     at 
> org.apache.hadoop.hbase.backup.impl.BackupManifest.(BackupManifest.java:451)
>  ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT]
>     at 
> org.apache.hadoop.hbase.backup.impl.BackupManifest.(BackupManifest.java:402)
>  ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT]
>     at 
> org.apache.hadoop.hbase.backup.impl.BackupManager.getAncestors(BackupManager.java:331)
>  ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT]
>     at 
> org.apache.hadoop.hbase.backup.impl.BackupManager.getAncestors(BackupManager.java:353)
>  ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT]
>     at 
> org.apache.hadoop.hbase.backup.impl.TableBackupClient.addManifest(TableBackupClient.java:286)
>  ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT]
>     at 
> org.apache.hadoop.hbase.backup.impl.TableBackupClient.completeBackup(TableBackupClient.java:351)
>  ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT]
>     at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:314)
>  ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT]
>     at 
> org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:603)
>  ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT]
>     at 
> 

[jira] [Commented] (HBASE-28562) Ancestor calculation of backups is wrong

2024-05-02 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843008#comment-17843008
 ] 

Ray Mattingly commented on HBASE-28562:
---

Yes, we've experienced huge backup manifests due to some bugs in the 
getAncestors call and its underlying BackupManifest#canCoverImage method.

The BackupManifest#canCoverImage method specifies that its fullImages parameter 
is intended to only be full backup images, not incremental. Its name implies 
this, and [a comment makes that 
clear|https://github.com/apache/hbase/blob/2c3abae18aa35e2693b64b143316817d4569d0c3/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupManifest.java#L614]:

"each image of fullImages must not be an incremental image"

But we pass in all ancestors, including incremental images, to this method. For 
example: 
[https://github.com/apache/hbase/blob/6b672cc0717e762ecaad203714099b962c035ef0/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupManager.java#L320]

And the BackupManifest#canCoverImage does not assert the precondition well — 
instead of throwing an IllegalArgumentException, it proceeds and will just 
return false [if any of the given ancestors are incremental 
backups|https://github.com/apache/hbase/blob/2c3abae18aa35e2693b64b143316817d4569d0c3/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupManifest.java#L619]!
 This means that, once an incremental backup ancestor has been found, all 
subsequent backup images will also be considered ancestors and this will 
balloon the backup manifest size. This could also be a factor in why checking 
the entirety of backup history is problematic for you. We probably need to 
largely refactor getAncestors and/or canCoverImage. A sketch of the missing 
precondition guard follows.
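
For illustration (a sketch, not the actual patch), a guard like this at the 
top of canCoverImage would make the contract explicit:
{code:java}
// Sketch: enforce the documented precondition instead of silently returning
// false, which is what currently balloons the ancestor list.
private static void assertAllFullImages(List<BackupImage> fullImages) {
  for (BackupImage image : fullImages) {
    if (image.getType() == BackupType.INCREMENTAL) {
      throw new IllegalArgumentException(
          "fullImages must not contain incremental image " + image.getBackupId());
    }
  }
}
{code}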

> Ancestor calculation of backups is wrong
> 
>
> Key: HBASE-28562
> URL: https://issues.apache.org/jira/browse/HBASE-28562
> Project: HBase
>  Issue Type: Bug
>  Components: backuprestore
>Affects Versions: 2.6.0, 3.0.0
>Reporter: Dieter De Paepe
>Priority: Major
>  Labels: pull-request-available
>
> This is the same issue as HBASE-25870, but I think the fix there was wrong.
> This issue can prevent creation of (incremental) backups when data of 
> unrelated backups was damaged on backup storage.
> Minimal example to reproduce from source:
>  * Add the following to `conf/hbase-site.xml` to enable backups:
> {code:java}
> <property>
>   <name>hbase.backup.enable</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hbase.master.logcleaner.plugins</name>
>   <value>org.apache.hadoop.hbase.master.cleaner.TimeToLiveLogCleaner,org.apache.hadoop.hbase.master.cleaner.TimeToLiveProcedureWALCleaner,org.apache.hadoop.hbase.master.cleaner.TimeToLiveMasterLocalStoreWALCleaner,org.apache.hadoop.hbase.backup.master.BackupLogCleaner</value>
> </property>
> <property>
>   <name>hbase.procedure.master.classes</name>
>   <value>org.apache.hadoop.hbase.backup.master.LogRollMasterProcedureManager</value>
> </property>
> <property>
>   <name>hbase.procedure.regionserver.classes</name>
>   <value>org.apache.hadoop.hbase.backup.regionserver.LogRollRegionServerProcedureManager</value>
> </property>
> <property>
>   <name>hbase.coprocessor.region.classes</name>
>   <value>org.apache.hadoop.hbase.backup.BackupObserver</value>
> </property>
> <property>
>   <name>hbase.fs.tmp.dir</name>
>   <value>file:/tmp/hbase-tmp</value>
> </property>
> {code}
>  * Start HBase and open a shell: {{bin/start-hbase.sh}}, {{bin/hbase shell}}
>  * Execute the following commands ("put" & "create" commands in the hbase 
> shell, the other commands on the command line):
> {code:java}
> create 'experiment', 'fam' 
> put 'experiment', 'row1', 'fam:b', 'value1'
> bin/hbase backup create full file:/tmp/hbasebackup
> Backup session backup_1714649896776 finished. Status: SUCCESS
> put 'experiment', 'row2', 'fam:b', 'value2'
> bin/hbase backup create incremental file:/tmp/hbasebackup
> Backup session backup_1714649920488 finished. Status: SUCCESS
> put 'experiment', 'row3', 'fam:b', 'value3'
> bin/hbase backup create incremental file:/tmp/hbasebackup
> Backup session backup_1714650054960 finished. Status: SUCCESS
> (Delete the files corresponding to the first incremental backup - 
> backup_1714649920488 in this example)
> put 'experiment', 'row4', 'fam:a', 'value4'
> bin/hbase backup create full file:/tmp/hbasebackup
> Backup session backup_1714650236911 finished. Status: SUCCESS
> put 'experiment', 'row5', 'fam:a', 'value5'
> bin/hbase backup create incremental file:/tmp/hbasebackup
> Backup session backup_1714650289957 finished. Status: SUCCESS
> put 'experiment', 'row6', 'fam:a', 'value6'
> bin/hbase backup create incremental 
> file:/tmp/hbasebackup2024-05-02T13:45:27,534 ERROR [main {}] 
> impl.BackupManifest: file:/tmp/hbasebackup/backup_1714649920488 does not exist
> 2024-05-02T13:45:27,534 ERROR [main {}] impl.TableBackupClient: Unexpected 
> Exception : file:/tmp/hbasebackup/backup_1714649920488 does not exist
> org.apache.hadoop.hbase.backup.impl.BackupException: 

[jira] [Created] (HBASE-28513) Secondary replica balancing squashes all other cost considerations

2024-04-10 Thread Ray Mattingly (Jira)
Ray Mattingly created HBASE-28513:
-

 Summary: Secondary replica balancing squashes all other cost 
considerations
 Key: HBASE-28513
 URL: https://issues.apache.org/jira/browse/HBASE-28513
 Project: HBase
  Issue Type: Improvement
Reporter: Ray Mattingly


I have a larger write-up available 
[here.|https://git.hubteam.com/gist/rmattingly/8bc9cbe7c422db12ffc9cd1825069bd7]

Basically there are a few cost functions with relatively huge default 
multipliers. For example, `PrimaryRegionCountSkewCostFunction` has a default 
multiplier of 100,000, while things like StoreFileCostFunction have a 
multiplier of 5. Having any multiplier of 100k while others are single digit 
basically makes the latter category totally irrelevant to the balancer's 
considerations.

I understand that it's critical to distribute a region's replicas across 
multiple hosts/racks, but I don't think we should do this at the expense of all 
other balancer considerations.

For example, maybe we could have two types of balancer considerations: costs 
(as we do now), and conditionals (for the more discrete considerations, like 
">1 replica of the same region should not exist on a single host"). This would 
allow us to prioritize replica distribution _and_ maintain consideration for 
things like storefile balance.
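
In the meantime the multipliers are at least configurable; something like the 
following could soften the skew (a sketch, and the property key is an 
assumption to verify against the StochasticLoadBalancer source):
{code:java}
// Sketch: lower the replica-distribution multiplier so other cost functions
// (e.g. StoreFileCostFunction) are not rounded into irrelevance. The key name
// is an assumption to verify against the balancer source; it would normally
// live in hbase-site.xml rather than be set programmatically.
conf.setFloat("hbase.master.balancer.stochastic.regionReplicaHostCostKey", 1000f);
{code}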



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28429) Quotas should have a configurable minimum wait interval

2024-03-31 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly resolved HBASE-28429.
---
Resolution: Won't Fix

https://issues.apache.org/jira/browse/HBASE-28453 is a better solution

> Quotas should have a configurable minimum wait interval
> ---
>
> Key: HBASE-28429
> URL: https://issues.apache.org/jira/browse/HBASE-28429
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ray Mattingly
>Assignee: Ray Mattingly
>Priority: Major
>
> At my day job we're attempting to roll out read size throttling by default for 
> thousands of distinct users across hundreds of multi-tenant clusters.
> During our rollout we've observed that throttles with a 1 second refill 
> interval will yield relatively tiny wait intervals disproportionately often. 
> From what we've seen, wait intervals are <=5ms on approximately 20-50% of our 
> RpcThrottlingExceptions; this could sound theoretically promising if latency 
> is your top priority. But, in reality, this makes it very difficult to 
> configure a throttle-tolerant HBase client because retries become very prone 
> to near-immediate exhaustion, and throttled clients quickly saturate the 
> cluster's RPC layer with rapid-fire retries.
> One can combat this with the FixedIntervalRateLimiter, but that's a very 
> heavy-handed approach from latency's perspective, and can still yield tiny 
> intervals that exhaust retries and erroneously fail client operations under 
> significant load.
> With this in mind, I'm proposing that we introduce a configurable minimum 
> wait interval for quotas, defaulted to 0. This would make quotas much more 
> usable at scale from our perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28453) Support a middle ground between the Average and Fixed interval rate limiters

2024-03-21 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly updated HBASE-28453:
--
Description: 
h3. Background

HBase quotas support two rate limiters: a "fixed" and an "average" interval 
rate limiter.
h4. FixedIntervalRateLimiter

The fixed interval rate limiter is simpler: it has a TimeUnit, say 1 second, 
and it refills a resource allotment on the recurring interval. So you may get 
10 resources every second, and if you exhaust all 10 resources in the first 
millisecond of an interval then you will need to wait 999ms to acquire even 1 
more resource.
h4. AverageIntervalRateLimiter

The average interval rate limiter, HBase's default, allows for more flexibly 
timed refilling of the resource allotment. Extending our previous example, say 
you have a 10 reads/sec quota and you have exhausted all 10 resources within 
1ms of the last full refill. If you request 1 more read then, rather than 
returning a 999ms wait interval indicating the next full refill time, the rate 
limiter will recognize that you only need to wait 99ms before 1 read can be 
available. After 100ms has passed in aggregate since the last full refill, the 
limiter will refill 1/10th of the limit to satisfy the request for 1/10th of 
the resources.
h3. The Problems with Current RateLimiters

The problem with the fixed interval rate limiter is that it is too strict from 
a latency perspective. It results in quota limits to which we cannot fully 
subscribe with any consistency.

The problem with the average interval rate limiter is that, in practice, it is 
far too optimistic. For example, a real rate limiter might limit to 100MB/sec 
of read IO per machine. Any multigets that come in will require only a tiny 
fraction of this limit; for example, a 64kb block is only 0.06% of the total. 
As a result, the vast majority of wait intervals end up being tiny — like <5ms. 
This can actually cause an inverse of your intention, where setting up a 
throttle causes a DDOS of your RPC layer via continuous throttling and 
~immediate retrying. I've discussed this problem in 
https://issues.apache.org/jira/browse/HBASE-28429 and proposed a minimum wait 
interval as the solution there; after some more thinking, I believe this new 
rate limiter would be a less hacky solution to this deficit so I'd like to 
close that Jira in favor of this one.

See the attached chart where I put in place a 10k req/sec/machine throttle for 
this user at 10:43 to try to curb this high traffic, and it resulted in a huge 
spike of req/sec due to the throttle/retry loop created by the 
AverageIntervalRateLimiter.
h3. Original Proposal: PartialIntervalRateLimiter as a Solution

I've implemented a RateLimiter which allows partial chunks of the overall 
interval to be refilled; by default these chunks are 10% (or 100ms of a 1s 
interval). I've deployed this to a test cluster at my day job and have seen 
it really help our ability to fully subscribe to a quota limit without 
executing superfluous retries. See the other attached chart, which shows a 
cluster undergoing a rolling restart from FixedIntervalRateLimiter to my new 
PartialIntervalRateLimiter, and how it is then able to fully subscribe to 
its allotted 25MB/sec/machine read IO quota.
h3. Updated Proposal: Improving FixedIntervalRateLimiter

Rather than implement a new rate limiter, we can make a lower-touch change 
which just adds support for a refill interval that is less than the time unit 
on a FixedIntervalRateLimiter. This can be a no-op change for those who have 
not opted into the feature by having the refill interval default to the time 
unit. For clarity, see [my branch 
here|https://github.com/apache/hbase/compare/master...HubSpot:hbase:HBASE-28453]
 which I will PR soon. A standalone sketch of the mechanism follows.
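
As a rough illustration (not the actual HBase classes): a fixed-interval 
limiter whose allotment refills in sub-interval chunks, degenerating to the 
classic fixed behavior when refillIntervalMs equals timeUnitMs.
{code:java}
// Sketch only: "limit" resources per timeUnitMs, refilled in chunks every
// refillIntervalMs (e.g. 10% chunks when refillIntervalMs = timeUnitMs / 10).
public class ChunkedFixedIntervalRateLimiter {
  private final long limit;
  private final long timeUnitMs;
  private final long refillIntervalMs;
  private long available;
  private long lastRefillMs;

  public ChunkedFixedIntervalRateLimiter(long limit, long timeUnitMs, long refillIntervalMs) {
    this.limit = limit;
    this.timeUnitMs = timeUnitMs;
    this.refillIntervalMs = refillIntervalMs;
    this.available = limit;
    this.lastRefillMs = System.currentTimeMillis();
  }

  /** Returns 0 if the amount was acquired, else the ms until the next chunk refill. */
  public synchronized long tryAcquire(long amount) {
    long now = System.currentTimeMillis();
    long chunks = (now - lastRefillMs) / refillIntervalMs;
    if (chunks > 0) {
      // Each elapsed chunk refills a proportional share of the per-time-unit limit.
      available = Math.min(limit, available + chunks * limit * refillIntervalMs / timeUnitMs);
      lastRefillMs += chunks * refillIntervalMs;
    }
    if (available >= amount) {
      available -= amount;
      return 0;
    }
    return refillIntervalMs - (now - lastRefillMs);
  }
}
{code}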

  was:
h3. Background

HBase quotas support two rate limiters: a "fixed" and an "average" interval 
rate limiter.
h4. FixedIntervalRateLimiter

The fixed interval rate limiter is simpler: it has a TimeUnit, say 1 second, 
and it refills a resource allotment on the recurring interval. So you may get 
10 resources every second, and if you exhaust all 10 resources in the first 
millisecond of an interval then you will need to wait 999ms to acquire even 1 
more resource.
h4. AverageIntervalRateLimiter

The average interval rate limiter, HBase's default, allows for more flexibly 
timed refilling of the resource allotment. Extending our previous example, say 
you have a 10 reads/sec quota and you have exhausted all 10 resources within 
1ms of the last full refill. If you request 1 more read then, rather than 
returning a 999ms wait interval indicating the next full refill time, the rate 
limiter will recognize that you only need to wait 99ms before 1 read can be 
available. After 100ms has passed in aggregate since the last full refill, it 
will support the refilling of 1/10th the limit to facilitate the request for 

[jira] [Assigned] (HBASE-28453) Support a middle ground between the Average and Fixed interval rate limiters

2024-03-21 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly reassigned HBASE-28453:
-

Assignee: Ray Mattingly

> Support a middle ground between the Average and Fixed interval rate limiters
> 
>
> Key: HBASE-28453
> URL: https://issues.apache.org/jira/browse/HBASE-28453
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.6.0
>Reporter: Ray Mattingly
>Assignee: Ray Mattingly
>Priority: Major
> Attachments: Screenshot 2024-03-21 at 2.08.51 PM.png, Screenshot 
> 2024-03-21 at 2.30.01 PM.png
>
>
> h3. Background
> HBase quotas support two rate limiters: a "fixed" and an "average" interval 
> rate limiter.
> h4. FixedIntervalRateLimiter
> The fixed interval rate limiter is simpler: it has a TimeUnit, say 1 second, 
> and it refills a resource allotment on the recurring interval. So you may get 
> 10 resources every second, and if you exhaust all 10 resources in the first 
> millisecond of an interval then you will need to wait 999ms to acquire even 1 
> more resource.
> h4. AverageIntervalRateLimiter
> The average interval rate limiter, HBase's default, allows for more flexibly 
> timed refilling of the resource allotment. Extending our previous example, 
> say you have a 10 reads/sec quota and you have exhausted all 10 resources 
> within 1ms of the last full refill. If you request 1 more read then, rather 
> than returning a 999ms wait interval indicating the next full refill time, 
> the rate limiter will recognize that you only need to wait 99ms before 1 read 
> can be available. After 100ms has passed in aggregate since the last full 
> refill, the limiter will refill 1/10th of the limit to satisfy the request 
> for 1/10th of the resources.
> h3. The Problems with Current RateLimiters
> The problem with the fixed interval rate limiter is that it is too strict 
> from a latency perspective. It results in quota limits to which we cannot 
> fully subscribe with any consistency.
> The problem with the average interval rate limiter is that, in practice, it 
> is far too optimistic. For example, a real rate limiter might limit to 
> 100MB/sec of read IO per machine. Any multigets that come in will require 
> only a tiny fraction of this limit; for example, a 64kb block is only 0.06% 
> of the total. As a result, the vast majority of wait intervals end up being 
> tiny — like <5ms. This can actually cause an inverse of your intention, where 
> setting up a throttle causes a DDOS of your RPC layer via continuous 
> throttling and ~immediate retrying. I've discussed this problem in 
> https://issues.apache.org/jira/browse/HBASE-28429 and proposed a minimum wait 
> interval as the solution there; after some more thinking, I believe this new 
> rate limiter would be a less hacky solution to this deficit so I'd like to 
> close that Jira in favor of this one.
> See the attached chart where I put in place a 10k req/sec/machine throttle 
> for this user at 10:43 to try to curb this high traffic, and it resulted in a 
> huge spike of req/sec due to the throttle/retry loop created by the 
> AverageIntervalRateLimiter.
> h3. PartialIntervalRateLimiter as a Solution
> I've implemented a RateLimiter which allows partial chunks of the overall 
> interval to be refilled; by default these chunks are 10% (or 100ms of a 1s 
> interval). I've deployed this to a test cluster at my day job and have seen 
> it really help our ability to fully subscribe to a quota limit without 
> executing superfluous retries. See the other attached chart, which shows a 
> cluster undergoing a rolling restart from FixedIntervalRateLimiter to 
> my new PartialIntervalRateLimiter, and how it is then able to fully subscribe 
> to its allotted 25MB/sec/machine read IO quota.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28429) Quotas should have a configurable minimum wait interval

2024-03-21 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829667#comment-17829667
 ] 

Ray Mattingly commented on HBASE-28429:
---

I think we should close this and instead favor 
https://issues.apache.org/jira/browse/HBASE-28453

> Quotas should have a configurable minimum wait interval
> ---
>
> Key: HBASE-28429
> URL: https://issues.apache.org/jira/browse/HBASE-28429
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ray Mattingly
>Assignee: Ray Mattingly
>Priority: Major
>
> At my day job we're attempting to roll out read size throttling by default for 
> thousands of distinct users across hundreds of multi-tenant clusters.
> During our rollout we've observed that throttles with a 1 second refill 
> interval will yield relatively tiny wait intervals disproportionately often. 
> From what we've seen, wait intervals are <=5ms on approximately 20-50% of our 
> RpcThrottlingExceptions; this could sound theoretically promising if latency 
> is your top priority. But, in reality, this makes it very difficult to 
> configure a throttle-tolerant HBase client because retries become very prone 
> to near-immediate exhaustion, and throttled clients quickly saturate the 
> cluster's RPC layer with rapid-fire retries.
> One can combat this with the FixedIntervalRateLimiter, but that's a very 
> heavy-handed approach from latency's perspective, and can still yield tiny 
> intervals that exhaust retries and erroneously fail client operations under 
> significant load.
> With this in mind, I'm proposing that we introduce a configurable minimum 
> wait interval for quotas, defaulted to 0. This would make quotas much more 
> usable at scale from our perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28453) Support a middle ground between the Average and Fixed interval rate limiters

2024-03-21 Thread Ray Mattingly (Jira)
Ray Mattingly created HBASE-28453:
-

 Summary: Support a middle ground between the Average and Fixed 
interval rate limiters
 Key: HBASE-28453
 URL: https://issues.apache.org/jira/browse/HBASE-28453
 Project: HBase
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Ray Mattingly
 Attachments: Screenshot 2024-03-21 at 2.08.51 PM.png, Screenshot 
2024-03-21 at 2.30.01 PM.png

h3. Background

HBase quotas support two rate limiters: a "fixed" and an "average" interval 
rate limiter.
h4. FixedIntervalRateLimiter

The fixed interval rate limiter is simpler: it has a TimeUnit, say 1 second, 
and it refills a resource allotment on the recurring interval. So you may get 
10 resources every second, and if you exhaust all 10 resources in the first 
millisecond of an interval then you will need to wait 999ms to acquire even 1 
more resource.
h4. AverageIntervalRateLimiter

The average interval rate limiter, HBase's default, allows for more flexibly 
timed refilling of the resource allotment. Extending our previous example, say 
you have a 10 reads/sec quota and you have exhausted all 10 resources within 
1ms of the last full refill. If you request 1 more read then, rather than 
returning a 999ms wait interval indicating the next full refill time, the rate 
limiter will recognize that you only need to wait 99ms before 1 read can be 
available. After 100ms has passed in aggregate since the last full refill, the 
limiter will refill 1/10th of the limit to satisfy the request for 1/10th of 
the resources.
h3. The Problems with Current RateLimiters

The problem with the fixed interval rate limiter is that it is too strict from 
a latency perspective. It results in quota limits to which we cannot fully 
subscribe with any consistency.

The problem with the average interval rate limiter is that, in practice, it is 
far too optimistic. For example, a real rate limiter might limit to 100MB/sec 
of read IO per machine. Any multigets that come in will require only a tiny 
fraction of this limit; for example, a 64kb block is only 0.06% of the total. 
As a result, the vast majority of wait intervals end up being tiny — like <5ms. 
This can actually cause an inverse of your intention, where setting up a 
throttle causes a DDOS of your RPC layer via continuous throttling and 
~immediate retrying. I've discussed this problem in 
https://issues.apache.org/jira/browse/HBASE-28429 and proposed a minimum wait 
interval as the solution there; after some more thinking, I believe this new 
rate limiter would be a less hacky solution to this deficit so I'd like to 
close that Jira in favor of this one.

See the attached chart where I put in place a 10k req/sec/machine throttle for 
this user at 10:43 to try to curb this high traffic, and it resulted in a huge 
spike of req/sec due to the throttle/retry loop created by the 
AverageIntervalRateLimiter.
h3. PartialIntervalRateLimiter as a Solution

I've implemented a RateLimiter which allows partial chunks of the overall 
interval to be refilled; by default these chunks are 10% (or 100ms of a 1s 
interval). I've deployed this to a test cluster at my day job and have seen 
it really help our ability to fully subscribe to a quota limit without 
executing superfluous retries. See the other attached chart, which shows a 
cluster undergoing a rolling restart from FixedIntervalRateLimiter to my 
new PartialIntervalRateLimiter, and how it is then able to fully subscribe to 
its allotted 25MB/sec/machine read IO quota.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-28430) RpcThrottlingException messages should describe the throttled access pattern

2024-03-08 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly reassigned HBASE-28430:
-

Assignee: Ray Mattingly

> RpcThrottlingException messages should describe the throttled access pattern
> 
>
> Key: HBASE-28430
> URL: https://issues.apache.org/jira/browse/HBASE-28430
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ray Mattingly
>Assignee: Ray Mattingly
>Priority: Major
>
> Right now we catch RpcThrottlingExceptions and have some debug logging in 
> [RegionServerRpcQuotaManager|https://github.com/apache/hbase/blob/98eb3e01b352684de3c647a6fda6208a657c4607/hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/RegionServerRpcQuotaManager.java#L234-L236]
>  — this is adequate support for in-depth investigation, but it is not readily 
> transparent to users.
> For example, at my day job we have proxy APIs which sit between HBase and 
> many microservices. We throttle these microservices in isolation, but the 
> RpcThrottlingExceptions appear to be indiscriminate in the stdout of the 
> proxy API.
> If we added the given username, table, and namespace to 
> RpcThrottlingException messages then understanding the nature and specificity 
> of any given throttle violation should be much more straightforward. Given 
> that quotas/throttling is most useful in a multi-tenant environment, I would 
> anticipate this being a pretty universal usability pain point.
> It would be a bit more complicated, but we should also consider including 
> more information about the rate limiter which has been violated. For example, 
> what is the current configured read size limit that we've exceeded?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28430) RpcThrottlingException messages should describe the throttled access pattern

2024-03-08 Thread Ray Mattingly (Jira)
Ray Mattingly created HBASE-28430:
-

 Summary: RpcThrottlingException messages should describe the 
throttled access pattern
 Key: HBASE-28430
 URL: https://issues.apache.org/jira/browse/HBASE-28430
 Project: HBase
  Issue Type: Improvement
Reporter: Ray Mattingly


Right now we catch RpcThrottlingExceptions and have some debug logging in 
[RegionServerRpcQuotaManager|https://github.com/apache/hbase/blob/98eb3e01b352684de3c647a6fda6208a657c4607/hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/RegionServerRpcQuotaManager.java#L234-L236]
 — this is adequate support for in-depth investigation, but it is not readily 
transparent to users.

For example, at my day job we have proxy APIs which sit between HBase and many 
microservices. We throttle these microservices in isolation, but the 
RpcThrottlingExceptions appear to be indiscriminate in the stdout of the proxy 
API.

If we added the given username, table, and namespace to RpcThrottlingException 
messages then understanding the nature and specificity of any given throttle 
violation should be much more straightforward. Given that quotas/throttling is 
most useful in a multi-tenant environment, I would anticipate this being a 
pretty universal usability pain point.

It would be a bit more complicated, but we should also consider including more 
information about the rate limiter which has been violated. For example, what 
is the current configured read size limit that we've exceeded?
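
For example, the enriched message might look roughly like this (a sketch; the 
variable names and surrounding plumbing are hypothetical):
{code:java}
// Sketch of an enriched throttling message; all names here are illustrative.
String message = String.format(
    "%s exceeded for user=%s, table=%s, namespace=%s: waitInterval=%dms",
    throttleType, username, table, namespace, waitInterval);
{code}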



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-28429) Quotas should have a configurable minimum wait interval

2024-03-08 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly reassigned HBASE-28429:
-

Assignee: Ray Mattingly

> Quotas should have a configurable minimum wait interval
> ---
>
> Key: HBASE-28429
> URL: https://issues.apache.org/jira/browse/HBASE-28429
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ray Mattingly
>Assignee: Ray Mattingly
>Priority: Major
>
> At my day job we're attempting to roll out read size throttling by default for 
> thousands of distinct users across hundreds of multi-tenant clusters.
> During our rollout we've observed that throttles with a 1 second refill 
> interval will yield relatively tiny wait intervals disproportionately often. 
> From what we've seen, wait intervals are <=5ms on approximately 20-50% of our 
> RpcThrottlingExceptions; this could sound theoretically promising if latency 
> is your top priority. But, in reality, this makes it very difficult to 
> configure a throttle-tolerant HBase client because retries become very prone 
> to near-immediate exhaustion, and throttled clients quickly saturate the 
> cluster's RPC layer with rapid-fire retries.
> One can combat this with the FixedIntervalRateLimiter, but that's a very 
> heavy-handed approach from latency's perspective, and can still yield tiny 
> intervals that exhaust retries and erroneously fail client operations under 
> significant load.
> With this in mind, I'm proposing that we introduce a configurable minimum 
> wait interval for quotas, defaulted to 0. This would make quotas much more 
> usable at scale from our perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28429) Quotas should have a configurable minimum wait interval

2024-03-08 Thread Ray Mattingly (Jira)
Ray Mattingly created HBASE-28429:
-

 Summary: Quotas should have a configurable minimum wait interval
 Key: HBASE-28429
 URL: https://issues.apache.org/jira/browse/HBASE-28429
 Project: HBase
  Issue Type: Improvement
Reporter: Ray Mattingly


At my day job we're attempting to roll out read size throttling by default for 
thousands of distinct users across hundreds of multi-tenant clusters.

During our rollout we've observed that throttles with a 1 second refill 
interval will yield relatively tiny wait intervals disproportionately often. 
From what we've seen, wait intervals are <=5ms on approximately 20-50% of our 
RpcThrottlingExceptions; this could sound theoretically promising if latency is 
your top priority. But, in reality, this makes it very difficult to configure a 
throttle-tolerant HBase client because retries become very prone to 
near-immediate exhaustion, and throttled clients quickly saturate the cluster's 
RPC layer with rapid-fire retries.

One can combat this with the FixedIntervalRateLimiter, but that's a very 
heavy-handed approach from latency's perspective, and can still yield tiny intervals 
that exhaust retries and erroneously fail client operations under significant 
load.

With this in mind, I'm proposing that we introduce a configurable minimum wait 
interval for quotas, defaulted to 0. This would make quotas much more usable at 
scale from our perspective.
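
A minimal sketch of the proposal, assuming a hypothetical config key and a 
default of 0 so existing behavior is unchanged:
{code:java}
// Sketch: clamp tiny wait intervals to a configurable floor.
// The config key is hypothetical.
long minWaitMs = conf.getLong("hbase.quota.min.wait.interval.ms", 0);
long waitInterval = Math.max(limiter.waitInterval(consumed), minWaitMs);
RpcThrottlingException.throwRequestSizeExceeded(waitInterval);
{code}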



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28385) Quota estimates are too optimistic for large scans

2024-02-20 Thread Ray Mattingly (Jira)
Ray Mattingly created HBASE-28385:
-

 Summary: Quota estimates are too optimistic for large scans
 Key: HBASE-28385
 URL: https://issues.apache.org/jira/browse/HBASE-28385
 Project: HBase
  Issue Type: Improvement
Reporter: Ray Mattingly
 Fix For: 2.6.0


Let's say you're running a table scan with a throttle of 100MB/sec per 
RegionServer. Ideally your scans are going to pull down large results, often 
containing hundreds or thousands of blocks.

The server estimates each scan request as costing a single block of read 
capacity, and if your quota is already exhausted it evaluates the backoff 
required for that estimated consumption (1 block) to become available. This 
will often be ~1ms, causing your retries to be nearly immediate.

Obviously it will routinely take much longer than 1ms for 100MB of IO to become 
available in the given configuration, so your retries will be destined to fail. 
At worst this can cause a saturation of your server's RPC layer, and at best 
this causes erroneous exhaustion of the client's retries.

We should find a way to make these estimates a bit smarter for large scans.
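
One possible shape for a smarter estimate, assuming we can track the previous 
response size per scanner (a sketch; names are hypothetical):
{code:java}
// Sketch: estimate the next scan RPC from what the scanner returned last time,
// instead of always assuming a single block.
private long estimateNextScanCost(long prevResponseSizeBytes, long blockSizeBytes) {
  if (prevResponseSizeBytes <= 0) {
    // First invocation: fall back to the single-block assumption.
    return blockSizeBytes;
  }
  // Later invocations: assume the next response is roughly as large as the last.
  return Math.max(blockSizeBytes, prevResponseSizeBytes);
}
{code}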



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-28385) Quota estimates are too optimistic for large scans

2024-02-20 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly reassigned HBASE-28385:
-

Assignee: Ray Mattingly

> Quota estimates are too optimistic for large scans
> --
>
> Key: HBASE-28385
> URL: https://issues.apache.org/jira/browse/HBASE-28385
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ray Mattingly
>Assignee: Ray Mattingly
>Priority: Major
> Fix For: 2.6.0
>
>
> Let's say you're running a table scan with a throttle of 100MB/sec per 
> RegionServer. Ideally your scans are going to pull down large results, often 
> containing hundreds or thousands of blocks.
> The server estimates each scan request as costing a single block of read 
> capacity, and if your quota is already exhausted it evaluates the backoff 
> required for that estimated consumption (1 block) to become available. This 
> will often be ~1ms, causing your retries to be nearly immediate.
> Obviously it will routinely take much longer than 1ms for 100MB of IO to 
> become available in the given configuration, so your retries will be destined 
> to fail. At worst this can cause a saturation of your server's RPC layer, and 
> at best this causes erroneous exhaustion of the client's retries.
> We should find a way to make these estimates a bit smarter for large scans.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28370) Default user quotas are refreshing too frequently

2024-02-15 Thread Ray Mattingly (Jira)
Ray Mattingly created HBASE-28370:
-

 Summary: Default user quotas are refreshing too frequently
 Key: HBASE-28370
 URL: https://issues.apache.org/jira/browse/HBASE-28370
 Project: HBase
  Issue Type: Improvement
Reporter: Ray Mattingly


In [https://github.com/apache/hbase/pull/5666] we introduced default user 
quotas, but I accidentally called UserQuotaState's default constructor rather 
than passing in the current timestamp. The consequence is that we're constantly 
refreshing these default user quotas, and this can be a bottleneck for 
horizontal cluster scalability.

This should be a one-line fix in QuotaUtil's buildDefaultUserQuotaState method; 
the gist is sketched below.
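
A sketch (the constructor overload is assumed):
{code:java}
// Sketch: construct the default state with the current timestamp so the cache
// does not treat every default user quota as immediately stale.
UserQuotaState state = new UserQuotaState(EnvironmentEdgeManager.currentTime());
{code}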



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-28370) Default user quotas are refreshing too frequently

2024-02-15 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly reassigned HBASE-28370:
-

Assignee: Ray Mattingly

> Default user quotas are refreshing too frequently
> -
>
> Key: HBASE-28370
> URL: https://issues.apache.org/jira/browse/HBASE-28370
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ray Mattingly
>Assignee: Ray Mattingly
>Priority: Major
>
> In [https://github.com/apache/hbase/pull/5666] we introduced default user 
> quotas, but I accidentally called UserQuotaState's default constructor rather 
> than passing in the current timestamp. The consequence is that we're 
> constantly refreshing these default user quotas, and this can be a bottleneck 
> for horizontal cluster scalability.
> This should be a one-line fix in QuotaUtil's buildDefaultUserQuotaState method.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-28359) Improve quota RateLimiter synchronization

2024-02-14 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly reassigned HBASE-28359:
-

Assignee: Ray Mattingly

> Improve quota RateLimiter synchronization
> -
>
> Key: HBASE-28359
> URL: https://issues.apache.org/jira/browse/HBASE-28359
> Project: HBase
>  Issue Type: Improvement
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>
> We've been experiencing RpcThrottlingException with 0ms waitInterval. This 
> seems odd and wasteful, since the client side will immediately retry without 
> backoff. I think the problem is related to the synchronization of RateLimiter.
> The TimeBasedLimiter checkQuota method does the following:
> {code:java}
> if (!reqSizeLimiter.canExecute(estimateWriteSize + estimateReadSize)) {
>   RpcThrottlingException.throwRequestSizeExceeded(
> reqSizeLimiter.waitInterval(estimateWriteSize + estimateReadSize));
> } {code}
> Both canExecute and waitInterval are synchronized, but we're calling them 
> independently. So it's possible under high concurrency for canExecute to 
> return false, but then waitInterval returns 0 (would have been true)
> I think we should simplify the API to have a single synchronized call:
> {code:java}
> long waitInterval = reqSizeLimiter.tryAcquire(estimateWriteSize + 
> estimateReadSize);
> if (waitInterval > 0) {
>   RpcThrottlingException.throwRequestSizeExceeded(waitInterval);
> }{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28349) Atomic requests should increment read usage in quotas

2024-02-07 Thread Ray Mattingly (Jira)
Ray Mattingly created HBASE-28349:
-

 Summary: Atomic requests should increment read usage in quotas
 Key: HBASE-28349
 URL: https://issues.apache.org/jira/browse/HBASE-28349
 Project: HBase
  Issue Type: Improvement
Reporter: Ray Mattingly


Right now atomic operations are just treated as a single write from the quota 
perspective. Since an atomic operation also encompasses a read, it would make 
sense to increment readNum and readSize counts appropriately.
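
Sketched in terms of the OperationQuota bookkeeping (illustrative; the exact 
call sites and method names are assumptions):
{code:java}
// Sketch: record the implicit read of an atomic operation in addition to the
// write, so readNum/readSize reflect the work actually done.
if (isAtomic) {
  quota.addGetResult(result); // count the read half of the check-and-mutate
}
quota.addMutation(mutation);  // count the write, as before
{code}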



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-28349) Atomic requests should increment read usage in quotas

2024-02-07 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly reassigned HBASE-28349:
-

Assignee: Ray Mattingly

> Atomic requests should increment read usage in quotas
> -
>
> Key: HBASE-28349
> URL: https://issues.apache.org/jira/browse/HBASE-28349
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ray Mattingly
>Assignee: Ray Mattingly
>Priority: Major
>
> Right now atomic operations are just treated as a single write from the quota 
> perspective. Since an atomic operation also encompasses a read, it would make 
> sense to increment readNum and readSize counts appropriately.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-28346) Expose checkQuota to Coprocessor Endpoints

2024-02-06 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly reassigned HBASE-28346:
-

Assignee: Ray Mattingly

> Expose checkQuota to Coprocessor Endpoints
> --
>
> Key: HBASE-28346
> URL: https://issues.apache.org/jira/browse/HBASE-28346
> Project: HBase
>  Issue Type: Improvement
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>
> Coprocessor endpoints may do non-trivial amounts of work, yet quotas do not 
> throttle them. We can't generically apply quotas to coprocessors because we 
> have no information on what a particular endpoint might do. One thing we 
> could do is expose checkQuota to the RegionCoprocessorEnvironment. This way, 
> coprocessor authors have the tools to ensure that quotas cover their 
> implementations.
> While adding this, we can update AggregationImplementation to call checkQuota 
> since those endpoints can be quite expensive.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-27800) Add support for default user quotas using USER => 'all'

2024-02-06 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814887#comment-17814887
 ] 

Ray Mattingly commented on HBASE-27800:
---

I've decided to add support here via some new configuration options, and to 
only support a handful of user throttles. A PR is coming soon; a rough sketch 
of the shape is below.
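
For a rough idea (the property key is an assumption until the PR lands):
{code:java}
// Sketch: a cluster-wide default per-user machine read-size throttle
// (100MB/sec). The key name is an assumption until the PR is available;
// it would normally be set in hbase-site.xml.
conf.setLong("hbase.quota.default.user.machine.read.size", 100L * 1024 * 1024);
{code}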

> Add support for default user quotas using USER => 'all' 
> 
>
> Key: HBASE-27800
> URL: https://issues.apache.org/jira/browse/HBASE-27800
> Project: HBase
>  Issue Type: Improvement
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>
> If someone sets a quota with USER => 'all' (or maybe '*'), treat that as a 
> default quota for each individual user. When a request comes from a user, it 
> will lookup current QuotaState based on username. If one doesn't exist, it 
> will be pre-filled with whatever the 'all' quota was set to. Otherwise, if 
> you then define a quota for a specific user that will override whatever 
> default you have set for that user only.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-28215) Region reopen procedure should support some sort of throttling

2023-11-22 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly reassigned HBASE-28215:
-

Assignee: Ray Mattingly

> Region reopen procedure should support some sort of throttling
> --
>
> Key: HBASE-28215
> URL: https://issues.apache.org/jira/browse/HBASE-28215
> Project: HBase
>  Issue Type: Improvement
>  Components: master, proc-v2
>Reporter: Ray Mattingly
>Assignee: Ray Mattingly
>Priority: Major
>
> The mass reopening of regions caused by a table descriptor modification can 
> be quite disruptive. For latency/error sensitive workloads, like our user 
> facing traffic, we need to be very careful about when we modify table 
> descriptors, and it can be virtually impossible to do it painlessly for busy 
> tables.
> It would be nice if we supported configurable batching/throttling of 
> reopenings so that the amplitude of any disruption can be kept relatively 
> small.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28215) Region reopen procedure should support some sort of throttling

2023-11-22 Thread Ray Mattingly (Jira)
Ray Mattingly created HBASE-28215:
-

 Summary: Region reopen procedure should support some sort of 
throttling
 Key: HBASE-28215
 URL: https://issues.apache.org/jira/browse/HBASE-28215
 Project: HBase
  Issue Type: Improvement
Reporter: Ray Mattingly


The mass reopening of regions caused by a table descriptor modification can be 
quite disruptive. For latency/error sensitive workloads, like our user facing 
traffic, we need to be very careful about when we modify table descriptors, and 
it can be virtually impossible to do it painlessly for busy tables.

It would be nice if we supported configurable batching/throttling of reopenings 
so that the amplitude of any disruption can be kept relatively small.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (HBASE-28175) RpcLogDetails' Message can become corrupt before log is consumed

2023-10-25 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779686#comment-17779686
 ] 

Ray Mattingly edited comment on HBASE-28175 at 10/25/23 9:54 PM:
-

I believe I've confirmed that deep copying the Message field in the 
RpcLogDetails' constructor is an effective solution here. I'm heading on 
vacation soon, but will open up a PR here either tomorrow before I leave, or 
later next week when I'm back.


was (Author: JIRAUSER286879):
I believe I've confirmed that deep copying the RpcLogDetails Message field is 
an effective solution here. I'm heading on vacation soon, but will open up a PR 
here either tomorrow before I leave, or later next week when I'm back.

> RpcLogDetails' Message can become corrupt before log is consumed
> 
>
> Key: HBASE-28175
> URL: https://issues.apache.org/jira/browse/HBASE-28175
> Project: HBase
>  Issue Type: Bug
>Reporter: Ray Mattingly
>Assignee: Ray Mattingly
>Priority: Major
>
> The RpcLogDetails class represents a slow (or large) log event which will 
> later be consumed by the SlowLogQueueService.
> Right now the RpcLogDetails' param field points to the slow call's Message, 
> and this Message is backed by a CodedInputStream which may be overwritten 
> before the given log is consumed. This overwriting of the Message may result 
> in slow query payloads for which the metadata derived post-consumption is 
> inaccurate.
> To solve this bug I think we need to copy the Message in the RpcLogDetails 
> constructor. I have this bug reproduced in a QA environment and will test out 
> this idea and open a PR shortly if the test results are promising.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28175) RpcLogDetails' Message can become corrupt before log is consumed

2023-10-25 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779686#comment-17779686
 ] 

Ray Mattingly commented on HBASE-28175:
---

I believe I've confirmed that deep copying the RpcLogDetails Message field is 
an effective solution here. I'm heading on vacation soon, but will open up a PR 
here either tomorrow before I leave, or later next week when I'm back.

> RpcLogDetails' Message can become corrupt before log is consumed
> 
>
> Key: HBASE-28175
> URL: https://issues.apache.org/jira/browse/HBASE-28175
> Project: HBase
>  Issue Type: Bug
>Reporter: Ray Mattingly
>Assignee: Ray Mattingly
>Priority: Major
>
> The RpcLogDetails class represents a slow (or large) log event which will 
> later be consumed by the SlowLogQueueService.
> Right now the RpcLogDetails' param field points to the slow call's Message, 
> and this Message is backed by a CodedInputStream which may be overwritten 
> before the given log is consumed. This overwriting of the Message may result 
> in slow query payloads for which the metadata derived post-consumption is 
> inaccurate.
> To solve this bug I think we need to copy the Message in the RpcLogDetails 
> constructor. I have this bug reproduced in a QA environment and will test out 
> this idea and open a PR shortly if the test results are promising.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28175) RpcLogDetails' Message can become corrupt before log is consumed

2023-10-23 Thread Ray Mattingly (Jira)
Ray Mattingly created HBASE-28175:
-

 Summary: RpcLogDetails' Message can become corrupt before log is 
consumed
 Key: HBASE-28175
 URL: https://issues.apache.org/jira/browse/HBASE-28175
 Project: HBase
  Issue Type: Bug
Reporter: Ray Mattingly
Assignee: Ray Mattingly


The RpcLogDetails class represents a slow (or large) log event which will later 
be consumed by the SlowLogQueueService.

Right now the RpcLogDetails' param field points to the slow call's Message, and 
this Message is backed by a CodedInputStream which may be overwritten before 
the given log is consumed. This overwriting of the Message may result in slow 
query payloads for which the metadata derived post-consumption is inaccurate.

To solve this bug I think we need to copy the Message in the RpcLogDetails 
constructor. I have this bug reproduced in a QA environment and will test out 
this idea and open a PR shortly if the test results are promising.
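
One plausible shape for that copy (a sketch; whether a builder round-trip 
fully detaches the message from the reused buffer is exactly what the QA 
testing needs to confirm):
{code:java}
// Sketch: copy the protobuf Message at construction time so later reuse of the
// underlying CodedInputStream buffer cannot corrupt the queued log event.
this.param = param == null ? null : param.toBuilder().build();
{code}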



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28146) ServerManager's rsAdmins map should be thread safe

2023-10-11 Thread Ray Mattingly (Jira)
Ray Mattingly created HBASE-28146:
-

 Summary: ServerManager's rsAdmins map should be thread safe
 Key: HBASE-28146
 URL: https://issues.apache.org/jira/browse/HBASE-28146
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.5.5
Reporter: Ray Mattingly
Assignee: Ray Mattingly


On 2.x [the ServerManager registers admins in a 
HashMap|https://github.com/apache/hbase/blob/branch-2/hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java].
 This can result in thread safety issues — we recently observed an exception 
which caused a region to be indefinitely stuck in transition until we could 
manually intervene. We saw the following exception in the HMaster logs:
{code:java}
2023-10-11 02:20:05.213 [RSProcedureDispatcher-pool-325] ERROR 
org.apache.hadoop.hbase.master.procedure.RSProcedureDispatcher: Unexpected 
error caught, this may cause the procedure to hang forever
    java.lang.ClassCastException: class java.util.HashMap$Node cannot be cast 
to class java.util.HashMap$TreeNode (java.util.HashMap$Node and 
java.util.HashMap$TreeNode are in module java.base of loader 'bootstrap')
        at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1900) ~[?:?]
        at java.util.HashMap$TreeNode.treeify(HashMap.java:2016) ~[?:?]
        at java.util.HashMap.treeifyBin(HashMap.java:768) ~[?:?]
        at java.util.HashMap.putVal(HashMap.java:640) ~[?:?]
        at java.util.HashMap.put(HashMap.java:608) ~[?:?]
        at 
org.apache.hadoop.hbase.master.ServerManager.getRsAdmin(ServerManager.java:723){code}
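
The straightforward fix is to make the registration map thread safe, e.g. (a 
sketch against the field described above):
{code:java}
// Sketch: a ConcurrentHashMap cannot be structurally corrupted by concurrent
// getRsAdmin calls the way a plain HashMap can.
private final ConcurrentMap<ServerName, AdminService.BlockingInterface> rsAdmins =
    new ConcurrentHashMap<>();
{code}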



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-27800) Add support for default user quotas using USER => 'all'

2023-10-03 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17771554#comment-17771554
 ] 

Ray Mattingly commented on HBASE-27800:
---

A small detail: I think we should prefer 'all' to '*' as our wildcard here, 
because that precedent has already been set for RegionServer quotas.

> Add support for default user quotas using USER => 'all' 
> 
>
> Key: HBASE-27800
> URL: https://issues.apache.org/jira/browse/HBASE-27800
> Project: HBase
>  Issue Type: Improvement
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>
> If someone sets a quota with USER => 'all' (or maybe '*'), treat that as a 
> default quota for each individual user. When a request comes from a user, it 
> will lookup current QuotaState based on username. If one doesn't exist, it 
> will be pre-filled with whatever the 'all' quota was set to. Otherwise, if 
> you then define a quota for a specific user that will override whatever 
> default you have set for that user only.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (HBASE-27784) Custom quota groupings

2023-09-20 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766940#comment-17766940
 ] 

Ray Mattingly edited comment on HBASE-27784 at 9/20/23 12:16 PM:
-

It isn't exactly what's described in the issue, but I want to propose [this 
draft|https://github.com/apache/hbase/compare/master...HubSpot:hbase:HBASE-27784-draft].
 Basically we add support for a request attribute {{quota.user.override}} 
which, when configured, takes precedence when determining which user quota to 
apply to the given request. This allows us to throttle distinct requests 
from a shared connection, as demonstrated by the added unit test.
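For example, client-side usage would look roughly like this (building on the 
TableBuilder request attribute support from HBASE-27657; the attribute key is 
what this draft introduces):
{code:java}
Table table = connection.getTableBuilder(tableName, pool)
  .setRequestAttribute("quota.user.override", Bytes.toBytes("hadoop"))
  .build();
// Requests sent through this Table are throttled against the "hadoop"
// user quota rather than the connection's actual user.
{code}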

One could achieve the "quota group" idea described above by submitting hadoop 
jobs as a single user override (e.g., {{{}hadoop{}}}). It could also satisfy 
upstream caller distinctions within a proxy API's shared connection object by 
configuring the user override based on some identifying characteristic of the 
upstream caller.

It also implicitly solves the conflict between user and group quotas because 
they're one and the same here: it's the requests that are different.


was (Author: JIRAUSER286879):
It isn't exactly what's described in the issue, but I want to propose [this 
draft|https://github.com/apache/hbase/compare/master...HubSpot:hbase:HBASE-27784-draft].
 Basically we add support for a request attribute {{quota.user.override}} 
which, when configured, takes precedence when determining which user quota to 
apply to the given request. This allows us to throttle distinct requests 
from a shared connection, as demonstrated by the added unit test.

One could achieve the "quota group" idea described above by submitting hadoop 
jobs as a single user override (e.g., {{{}hadoop{}}}). It could also satisfy 
upstream caller distinctions within a proxy API's shared connection object by 
configuring the user override based on some identifying characteristic of the 
upstream caller.

> Custom quota groupings
> --
>
> Key: HBASE-27784
> URL: https://issues.apache.org/jira/browse/HBASE-27784
> Project: HBase
>  Issue Type: New Feature
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>
> Currently we provide the ability to define quotas for namespaces, tables, or 
> users. On multi-tenant clusters, users may be broken down into groups based 
> on their use-case. For us this comes down to 2 main cases:
>  # Hadoop jobs – it would be good to be able to limit all hadoop jobs in 
> aggregate
>  # Proxy APIs - this is common where upstream callers don't hit hbase 
> directly, instead they go through one of many proxy api's.  For us we have a 
> custom auth plugin which sets the username to the upstream caller name. But 
> it would still be useful to be able to limit all usage from some particular 
> proxy API in aggregate.
> I think this could build upon the idea for Connection attributes in 
> HBASE-27657. Basically when a Connection is established we can set an 
> attribute (i.e. quotaGrouping=hadoop or quotaGrouping=MyProxyAPI).  In 
> QuotaCache, we can add a {{getQuotaGroupLimiter(String groupName)}} and also 
> allow someone to define quotas using {{set_quota TYPE => THROTTLE, GROUP => 
> 'hadoop', LIMIT => '100M/sec'}}
> I need to do more investigation into whether we'd want to return a simple 
> group limiter (more similar to table/namespace handling) or treat it more 
> like the USER limiters, which return a QuotaState (so you can limit 
> by-group-by-table).
> We need to consider how GROUP quotas interact with USER quotas. If a user has 
> a quota defined, and that user is also part of a group with a quota defined, 
> does the request need to honor both quotas? Maybe we provide a GROUP_BYPASS 
> setting, similar to GLOBAL_BYPASS?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (HBASE-27784) Custom quota groupings

2023-09-19 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766940#comment-17766940
 ] 

Ray Mattingly edited comment on HBASE-27784 at 9/19/23 11:18 PM:
-

It isn't exactly what's described in the issue, but I want to propose [this 
draft|https://github.com/apache/hbase/compare/master...HubSpot:hbase:HBASE-27784-draft].
 Basically we add support for a request attribute {{quota.user.override}} 
which, when configured, takes precedence when determining which user quota to 
apply to the given request. This allows us to throttle distinct requests 
from a shared connection, as demonstrated by the added unit test.

One could achieve the "quota group" idea described above by submitting hadoop 
jobs as a single user override (e.g., {{{}hadoop{}}}). It could also satisfy 
upstream caller distinctions within a proxy API's shared connection object by 
configuring the user override based on some identifying characteristic of the 
upstream caller.


was (Author: JIRAUSER286879):
It isn't exactly what's described in the issue, but I want to propose [this 
draft|https://github.com/apache/hbase/compare/master...HubSpot:hbase:HBASE-27784-draft].
 Basically we add support for a request attribute {{quota.user.override}} 
which, when configured, takes precedence when determining which user quota to 
apply to the given request. This allows us to throttle distinct requests 
from a shared connection, as demonstrated by the added unit test.

> Custom quota groupings
> --
>
> Key: HBASE-27784
> URL: https://issues.apache.org/jira/browse/HBASE-27784
> Project: HBase
>  Issue Type: New Feature
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>
> Currently we provide the ability to define quotas for namespaces, tables, or 
> users. On multi-tenant clusters, users may be broken down into groups based 
> on their use-case. For us this comes down to 2 main cases:
>  # Hadoop jobs – it would be good to be able to limit all hadoop jobs in 
> aggregate
>  # Proxy APIs - this is common where upstream callers don't hit hbase 
> directly, instead they go through one of many proxy api's.  For us we have a 
> custom auth plugin which sets the username to the upstream caller name. But 
> it would still be useful to be able to limit all usage from some particular 
> proxy API in aggregate.
> I think this could build upon the idea for Connection attributes in 
> HBASE-27657. Basically when a Connection is established we can set an 
> attribute (i.e. quotaGrouping=hadoop or quotaGrouping=MyProxyAPI).  In 
> QuotaCache, we can add a {{getQuotaGroupLimiter(String groupName)}} and also 
> allow someone to define quotas using {{set_quota TYPE => THROTTLE, GROUP => 
> 'hadoop', LIMIT => '100M/sec'}}
> I need to do more investigation into whether we'd want to return a simple 
> group limiter (more similar to table/namespace handling) or treat it more 
> like the USER limiters, which return a QuotaState (so you can limit 
> by-group-by-table).
> We need to consider how GROUP quotas interact with USER quotas. If a user has 
> a quota defined, and that user is also part of a group with a quota defined, 
> does the request need to honor both quotas? Maybe we provide a GROUP_BYPASS 
> setting, similar to GLOBAL_BYPASS?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-27784) Custom quota groupings

2023-09-19 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766940#comment-17766940
 ] 

Ray Mattingly commented on HBASE-27784:
---

It isn't exactly what's described in the issue, but I want to propose [this 
draft|https://github.com/apache/hbase/compare/master...HubSpot:hbase:HBASE-27784-draft].
 Basically we add support for a request attribute {{quota.user.override}} 
which, when configured, takes precedence when determining which user quota to 
apply to the given request. This allows us to throttle distinct requests 
from a shared connection, as demonstrated by the added unit test.

> Custom quota groupings
> --
>
> Key: HBASE-27784
> URL: https://issues.apache.org/jira/browse/HBASE-27784
> Project: HBase
>  Issue Type: New Feature
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>
> Currently we provide the ability to define quotas for namespaces, tables, or 
> users. On multi-tenant clusters, users may be broken down into groups based 
> on their use-case. For us this comes down to 2 main cases:
>  # Hadoop jobs – it would be good to be able to limit all hadoop jobs in 
> aggregate
>  # Proxy APIs - this is common where upstream callers don't hit hbase 
> directly, instead they go through one of many proxy api's.  For us we have a 
> custom auth plugin which sets the username to the upstream caller name. But 
> it would still be useful to be able to limit all usage from some particular 
> proxy API in aggregate.
> I think this could build upon the idea for Connection attributes in 
> HBASE-27657. Basically when a Connection is established we can set an 
> attribute (i.e. quotaGrouping=hadoop or quotaGrouping=MyProxyAPI).  In 
> QuotaCache, we can add a {{getQuotaGroupLimiter(String groupName)}} and also 
> allow someone to define quotas using {{set_quota TYPE => THROTTLE, GROUP => 
> 'hadoop', LIMIT => '100M/sec'}}
> I need to do more investigation into whether we'd want to return a simple 
> group limiter (more similar to table/namespace handling) or treat it more 
> like the USER limiters, which return a QuotaState (so you can limit 
> by-group-by-table).
> We need to consider how GROUP quotas interact with USER quotas. If a user has 
> a quota defined, and that user is also part of a group with a quota defined, 
> does the request need to honor both quotas? Maybe we provide a GROUP_BYPASS 
> setting, similar to GLOBAL_BYPASS?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-28010) Connection attributes can become corrupted on the server side

2023-08-15 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly reassigned HBASE-28010:
-

Assignee: Ray Mattingly

> Connection attributes can become corrupted on the server side
> -
>
> Key: HBASE-28010
> URL: https://issues.apache.org/jira/browse/HBASE-28010
> Project: HBase
>  Issue Type: Bug
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>
> In ServerRpcConnection.processOneRpc, it calls processConnectionHeader and 
> then immediately calls callCleanupIfNeeded. The parsing of the ByteBuff into 
> the ConnectionHeader does not copy the bytes. We keep a reference to 
> ConnectionHeader for later use, but since the underlying ByteBuff gets 
> released in callCleanupIfNeeded, later requests can overwrite the memory 
> locations that the ConnectionHeader points at.
> The unit tests we added don't catch this, possibly because they don't send 
> enough requests to corrupt the buffers. It happens pretty quickly in a 
> deployed cluster.
> We need to copy the List from the ConnectionHeader into a Map 
> before the buffer is released. This probably means we should remove 
> getConnectionHeader from the RpcCall interface and instead add 
> getConnectionAttributes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-28002) Add Get, Mutate, and Multi operations to slow log params

2023-07-30 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly reassigned HBASE-28002:
-

Assignee: Ray Mattingly

> Add Get, Mutate, and Multi operations to slow log params
> 
>
> Key: HBASE-28002
> URL: https://issues.apache.org/jira/browse/HBASE-28002
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ray Mattingly
>Assignee: Ray Mattingly
>Priority: Major
>
> In https://issues.apache.org/jira/browse/HBASE-27536 we added the ability to 
> include Scan operations in the slow log params. It would be useful to include 
> more operations too. Beyond just showing the shape of the request to slow log 
> readers, this would also ensure that operation attributes can be inferred.
> There are a few complications to consider for some operation types:
>  * Mutate:
>  ** we should probably strip the columns from these puts. Otherwise we might 
> produce unpredictably large slow log payloads, and there are potentially 
> security concerns to consider
>  * Multi
>  ** we should also consider stripping columns from these requests
>  ** (configurably?) limiting the number of operations that can be included. 
> For example, maybe we only want to include 5 operations on a slow log payload 
> for a 100 operation MultiRequest for the sake of brevity 
>  ** we may want to deduplicate operation attributes. I'm not really sure how 
> we'd do this without the output being misleading



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28002) Add Get, Mutate, and Multi operations to slow log params

2023-07-30 Thread Ray Mattingly (Jira)
Ray Mattingly created HBASE-28002:
-

 Summary: Add Get, Mutate, and Multi operations to slow log params
 Key: HBASE-28002
 URL: https://issues.apache.org/jira/browse/HBASE-28002
 Project: HBase
  Issue Type: Improvement
Reporter: Ray Mattingly


In https://issues.apache.org/jira/browse/HBASE-27536 we added the ability to 
include Scan operations in the slow log params. It would be useful to include 
more operations too. Beyond just showing the shape of the request to slow log 
readers, this would also ensure that operation attributes can be inferred.

There are a few complications to consider for some operation types:
 * Mutate:
 ** we should probably strip the columns from these puts. Otherwise we might 
produce unpredictably large slow log payloads, and there are potentially 
security concerns to consider
 * Multi
 ** we should also consider stripping columns from these requests
 ** (configurably?) limiting the number of operations that can be included. For 
example, maybe we only want to include 5 operations on a slow log payload for a 
100 operation MultiRequest for the sake of brevity 
 ** we may want to deduplicate operation attributes. I'm not really sure how 
we'd do this without the output being misleading
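To make the column-stripping idea concrete, something like the sketch below 
could sit in the slow log params derivation (the helper name is hypothetical; 
{{clearColumnValue()}} is generated protobuf builder API):
{code:java}
private static String toSlowLogSafeString(ClientProtos.MutationProto mutation) {
  // Drop the cell payloads so the slow log entry stays small and doesn't
  // leak column data; the row key and other metadata are retained.
  return ProtobufUtil.getShortTextFormat(
    mutation.toBuilder().clearColumnValue().build());
}
{code}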



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28001) Add request attribute support to BufferedMutator

2023-07-28 Thread Ray Mattingly (Jira)
Ray Mattingly created HBASE-28001:
-

 Summary: Add request attribute support to BufferedMutator
 Key: HBASE-28001
 URL: https://issues.apache.org/jira/browse/HBASE-28001
 Project: HBase
  Issue Type: Improvement
Reporter: Ray Mattingly


In https://issues.apache.org/jira/browse/HBASE-27657 we added support for 
specifying connection and request attributes. One oversight was that we did not 
include support for doing so via the BufferedMutator class. We should add such 
support in a follow-up PR.
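The API could mirror what we did for TableBuilder, along these lines (the 
{{setRequestAttribute}} method on BufferedMutatorParams is the proposed 
addition, not existing API):
{code:java}
BufferedMutatorParams params = new BufferedMutatorParams(tableName)
  .setRequestAttribute("clientId", Bytes.toBytes("backfill-job")); // proposed API
try (BufferedMutator mutator = connection.getBufferedMutator(params)) {
  mutator.mutate(new Put(Bytes.toBytes("row1"))
    .addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v")));
}
{code}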



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-28001) Add request attribute support to BufferedMutator

2023-07-28 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly reassigned HBASE-28001:
-

Assignee: Ray Mattingly

> Add request attribute support to BufferedMutator
> 
>
> Key: HBASE-28001
> URL: https://issues.apache.org/jira/browse/HBASE-28001
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ray Mattingly
>Assignee: Ray Mattingly
>Priority: Major
>
> In https://issues.apache.org/jira/browse/HBASE-27657 we added support for 
> specifying connection and request attributes. One oversight was that we did 
> not include support for doing so via the BufferedMutator class. We should add 
> such support in a follow-up PR.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-27981) Add connection, request, and operation attributes to slow log

2023-07-20 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745279#comment-17745279
 ] 

Ray Mattingly commented on HBASE-27981:
---

Yeah, agreed; we could definitely add single-request operation attributes to 
the params.

> Add connection, request, and operation attributes to slow log
> -
>
> Key: HBASE-27981
> URL: https://issues.apache.org/jira/browse/HBASE-27981
> Project: HBase
>  Issue Type: Improvement
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>
> These can help users diagnose slow requests by pushing identifying 
> information into the log. It might make sense to union them into a single 
> field or put them in separate fields.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-27981) Add connection, request, and operation attributes to slow log

2023-07-20 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745211#comment-17745211
 ] 

Ray Mattingly commented on HBASE-27981:
---

It should be trivial to add request & connection attributes to the slow logs 
once HBASE-27657 is merged. I've written up [this 
branch|https://github.com/HubSpot/hbase/compare/HBASE-27657-custom-rpc-controller...HubSpot:hbase:HBASE-27981]
 as a proof of concept in the meantime.

Operation attributes will be trickier because we'll need to parse the messages 
appropriately (either duplicating some logic or significantly refactoring our 
current payload derivation), think through what we'll do with large multi 
requests that contain tons of attributes, etc. I wonder whether we're better 
off omitting that work from this ticket, or at least from the first PR.

> Add connection, request, and operation attributes to slow log
> -
>
> Key: HBASE-27981
> URL: https://issues.apache.org/jira/browse/HBASE-27981
> Project: HBase
>  Issue Type: Improvement
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>
> These can help users diagnose slow requests by pushing identifying 
> information into the log. It might make sense to union them into a single 
> field or put them in separate fields.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-27981) Add connection, request, and operation attributes to slow log

2023-07-20 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly reassigned HBASE-27981:
-

Assignee: Ray Mattingly

> Add connection, request, and operation attributes to slow log
> -
>
> Key: HBASE-27981
> URL: https://issues.apache.org/jira/browse/HBASE-27981
> Project: HBase
>  Issue Type: Improvement
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>
> These can help users diagnose slow requests by pushing identifying 
> information into the log. It might make sense to union them into a single 
> field or put them in separate fields.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-27657) Connection and Request Attributes

2023-07-17 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17743955#comment-17743955
 ] 

Ray Mattingly commented on HBASE-27657:
---

I wrote up a draft PR demonstrating a TableBuilder request attributes 
implementation here: https://github.com/apache/hbase/pull/5326
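For anyone following along, usage in the draft looks roughly like this (exact 
signatures are per the PR and may change before merge):
{code:java}
Map<String, byte[]> connectionAttributes = new HashMap<>();
connectionAttributes.put("clientTeam", Bytes.toBytes("analytics"));
Connection conn = ConnectionFactory.createConnection(conf, connectionAttributes);
Table table = conn.getTableBuilder(tableName, pool)
  .setRequestAttribute("requestId", Bytes.toBytes("req-123"))
  .build();
{code}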

> Connection and Request Attributes
> -
>
> Key: HBASE-27657
> URL: https://issues.apache.org/jira/browse/HBASE-27657
> Project: HBase
>  Issue Type: New Feature
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>
> Currently we have the ability to set Operation attributes, via 
> Get.setAttribute, etc. It would be useful to be able to set attributes at the 
> request and connection level.
> These levels can result in less duplication. For example, send some 
> attributes once per connection instead of for every one of the millions of 
> requests a connection might send. Or send once for the request, instead of 
> duplicating on every operation in a multi request.
> Additionally, the Connection and RequestHeader are more globally available on 
> the server side. Both can be accessed via RpcServer.getCurrentCall(), which 
> is useful in various integration points – coprocessors, custom queues, 
> quotas, slow log, etc. Operation attributes are harder to access because you 
> need to parse the raw Message into the appropriate type to get access to the 
> getter.
> I was thinking of adding two new methods to the Connection interface:
> - setAttribute (and getAttribute/getAttributes)
> - setRequestAttributeProvider
> Any Connection attributes would be set onto the ConnectionHeader during 
> initialization. The RequestAttributeProvider would be called when creating 
> each RequestHeader.
> An alternative to setRequestAttributeProvider would be to add this into 
> HBaseRpcController, which can already be customized via site configuration. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-27975) Region (un)assignment should have a more direct timeout

2023-07-14 Thread Ray Mattingly (Jira)
Ray Mattingly created HBASE-27975:
-

 Summary: Region (un)assignment should have a more direct timeout
 Key: HBASE-27975
 URL: https://issues.apache.org/jira/browse/HBASE-27975
 Project: HBase
  Issue Type: Improvement
Reporter: Ray Mattingly


h3. Problem

We've observed a few cases in which region (un)assignment can hang for 
significant, and sometimes seemingly indefinite, periods of time. This results 
in unpredictably long downtime which must be remediated via manually initiated 
ServerCrashProcedures.
h3. Example 1

If a RS is unable to communicate with the NameNode and it is asked to close a 
region then its RS_CLOSE_REGION thread will get stuck awaiting a NN failover. 
Due to several default configurations of options like:
 * hbase.hstore.flush.retries.number
 * hbase.server.pause
 * dfs.client.failover.max.attempts
 * dfs.client.failover.sleep.base.millis

this region unassignment attempt will hang for approximately 30 minutes before 
it allows the failure to bubble up and automatically trigger a 
ServerCrashProcedure.

One can tune the aforementioned options to reduce the TTR here, but it's not a 
very obvious/direct solution.
h3. Example 2

In rare cases our public cloud provider may supply us with machines that have 
degraded hardware. If we're unable to catch this degradation prior to startup, 
then we've observed that the degraded RegionServer process may come online; as 
a result it will be assigned regions which can often never actually be 
successfully opened. If the RegionServer's assignment handling does not fail 
intentionally, then there will be no outside intervention; the assignment will 
hang indefinitely. I've written [a unit 
test|https://github.com/apache/hbase/compare/master...HubSpot:hbase:rsit-opening-repro]
 which reproduces this behavior. On this same branch is a unit test 
demonstrating that a timeout placed on the AssignRegionHandler helps to fast 
fail and reliably trigger the necessary ServerCrashProcedure.
h3. Proposal

I want to propose that we add optional and configurable timeouts to the 
AssignRegion and UnassignRegion event handlers.

This would allow us to prevent long-running retries for these 
downtime-inducing procedures much more intentionally and clearly, and could 
consequently improve our reliability in both examples.
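To sketch the shape of the proposal (the config key and helper names are 
hypothetical), the handler-side timeout could look like:
{code:java}
long timeoutMs = conf.getLong("hbase.regionserver.region.open.timeout.ms", -1); // hypothetical key
Future<?> open = executor.submit(() -> openRegionWithRetries(region)); // existing open logic
try {
  if (timeoutMs > 0) {
    open.get(timeoutMs, TimeUnit.MILLISECONDS);
  } else {
    open.get(); // current behavior: wait as long as the retries take
  }
} catch (TimeoutException e) {
  open.cancel(true);
  // Fail fast so the master reliably triggers a ServerCrashProcedure instead
  // of leaving the region in transition indefinitely.
  reportFailedOpen(region, e); // hypothetical failure path
} catch (InterruptedException e) {
  Thread.currentThread().interrupt();
  reportFailedOpen(region, e);
} catch (ExecutionException e) {
  reportFailedOpen(region, e);
}
{code}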



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-27657) Connection and Request Attributes

2023-07-14 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17743225#comment-17743225
 ] 

Ray Mattingly commented on HBASE-27657:
---

I've updated [the design 
doc|https://docs.google.com/document/d/1cGEmUn2kAPhn_Q18DvAOhigtCbnQ8ia6oV4OvMb5DyU/edit?usp=sharing]
 to reflect our new TableBuilder interface for request attributes and have a 
branch which has proved the concept. Does this seem like a more suitable design 
to you [~zhangduo]?

> Connection and Request Attributes
> -
>
> Key: HBASE-27657
> URL: https://issues.apache.org/jira/browse/HBASE-27657
> Project: HBase
>  Issue Type: New Feature
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>
> Currently we have the ability to set Operation attributes, via 
> Get.setAttribute, etc. It would be useful to be able to set attributes at the 
> request and connection level.
> These levels can result in less duplication. For example, send some 
> attributes once per connection instead of for every one of the millions of 
> requests a connection might send. Or send once for the request, instead of 
> duplicating on every operation in a multi request.
> Additionally, the Connection and RequestHeader are more globally available on 
> the server side. Both can be accessed via RpcServer.getCurrentCall(), which 
> is useful in various integration points – coprocessors, custom queues, 
> quotas, slow log, etc. Operation attributes are harder to access because you 
> need to parse the raw Message into the appropriate type to get access to the 
> getter.
> I was thinking of adding two new methods to the Connection interface:
> - setAttribute (and getAttribute/getAttributes)
> - setRequestAttributeProvider
> Any Connection attributes would be set onto the ConnectionHeader during 
> initialization. The RequestAttributeProvider would be called when creating 
> each RequestHeader.
> An alternative to setRequestAttributeProvider would be to add this into 
> HBaseRpcController, which can already be customized via site configuration. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-27657) Connection and Request Attributes

2023-07-11 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17742194#comment-17742194
 ] 

Ray Mattingly commented on HBASE-27657:
---

Thanks for the feedback here, that's a fair criticism for sure. I'm going to 
explore whether we can add support for request header configuration in the 
TableBuilder.

> Connection and Request Attributes
> -
>
> Key: HBASE-27657
> URL: https://issues.apache.org/jira/browse/HBASE-27657
> Project: HBase
>  Issue Type: New Feature
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>
> Currently we have the ability to set Operation attributes, via 
> Get.setAttribute, etc. It would be useful to be able to set attributes at the 
> request and connection level.
> These levels can result in less duplication. For example, send some 
> attributes once per connection instead of for every one of the millions of 
> requests a connection might send. Or send once for the request, instead of 
> duplicating on every operation in a multi request.
> Additionally, the Connection and RequestHeader are more globally available on 
> the server side. Both can be accessed via RpcServer.getCurrentCall(), which 
> is useful in various integration points – coprocessors, custom queues, 
> quotas, slow log, etc. Operation attributes are harder to access because you 
> need to parse the raw Message into the appropriate type to get access to the 
> getter.
> I was thinking of adding two new methods to the Connection interface:
> - setAttribute (and getAttribute/getAttributes)
> - setRequestAttributeProvider
> Any Connection attributes would be set onto the ConnectionHeader during 
> initialization. The RequestAttributeProvider would be called when creating 
> each RequestHeader.
> An alternative to setRequestAttributeProvider would be to add this into 
> HBaseRpcController, which can already be customized via site configuration. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-27657) Connection and Request Attributes

2023-07-10 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17741604#comment-17741604
 ] 

Ray Mattingly commented on HBASE-27657:
---

Is the HBaseRpcController trivially available to clients when generating 
requests? I think inaccessibility is the main blocker for just adding 
getters/setters there.

> Connection and Request Attributes
> -
>
> Key: HBASE-27657
> URL: https://issues.apache.org/jira/browse/HBASE-27657
> Project: HBase
>  Issue Type: New Feature
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>
> Currently we have the ability to set Operation attributes, via 
> Get.setAttribute, etc. It would be useful to be able to set attributes at the 
> request and connection level.
> These levels can result in less duplication. For example, send some 
> attributes once per connection instead of for every one of the millions of 
> requests a connection might send. Or send once for the request, instead of 
> duplicating on every operation in a multi request.
> Additionally, the Connection and RequestHeader are more globally available on 
> the server side. Both can be accessed via RpcServer.getCurrentCall(), which 
> is useful in various integration points – coprocessors, custom queues, 
> quotas, slow log, etc. Operation attributes are harder to access because you 
> need to parse the raw Message into the appropriate type to get access to the 
> getter.
> I was thinking of adding two new methods to the Connection interface:
> - setAttribute (and getAttribute/getAttributes)
> - setRequestAttributeProvider
> Any Connection attributes would be set onto the ConnectionHeader during 
> initialization. The RequestAttributeProvider would be called when creating 
> each RequestHeader.
> An alternative to setRequestAttributeProvider would be to add this into 
> HBaseRpcController, which can already be customized via site configuration. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-27657) Connection and Request Attributes

2023-06-26 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737337#comment-17737337
 ] 

Ray Mattingly commented on HBASE-27657:
---

I've written up [a basic design 
doc|https://docs.google.com/document/d/1cGEmUn2kAPhn_Q18DvAOhigtCbnQ8ia6oV4OvMb5DyU/edit?usp=sharing]
 to pair with [our initial PR|https://github.com/apache/hbase/pull/5306].

> Connection and Request Attributes
> -
>
> Key: HBASE-27657
> URL: https://issues.apache.org/jira/browse/HBASE-27657
> Project: HBase
>  Issue Type: New Feature
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>
> Currently we have the ability to set Operation attributes, via 
> Get.setAttribute, etc. It would be useful to be able to set attributes at the 
> request and connection level.
> These levels can result in less duplication. For example, send some 
> attributes once per connection instead of for every one of the millions of 
> requests a connection might send. Or send once for the request, instead of 
> duplicating on every operation in a multi request.
> Additionally, the Connection and RequestHeader are more globally available on 
> the server side. Both can be accessed via RpcServer.getCurrentCall(), which 
> is useful in various integration points – coprocessors, custom queues, 
> quotas, slow log, etc. Operation attributes are harder to access because you 
> need to parse the raw Message into the appropriate type to get access to the 
> getter.
> I was thinking of adding two new methods to the Connection interface:
> - setAttribute (and getAttribute/getAttributes)
> - setRequestAttributeProvider
> Any Connection attributes would be set onto the ConnectionHeader during 
> initialization. The RequestAttributeProvider would be called when creating 
> each RequestHeader.
> An alternative to setRequestAttributeProvider would be to add this into 
> HBaseRpcController, which can already be customized via site configuration. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-27784) Custom quota groupings

2023-06-23 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly reassigned HBASE-27784:
-

Assignee: Ray Mattingly

> Custom quota groupings
> --
>
> Key: HBASE-27784
> URL: https://issues.apache.org/jira/browse/HBASE-27784
> Project: HBase
>  Issue Type: New Feature
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>
> Currently we provide the ability to define quotas for namespaces, tables, or 
> users. On multi-tenant clusters, users may be broken down into groups based 
> on their use-case. For us this comes down to 2 main cases:
>  # Hadoop jobs – it would be good to be able to limit all hadoop jobs in 
> aggregate
>  # Proxy APIs - this is common where upstream callers don't hit hbase 
> directly, instead they go through one of many proxy api's.  For us we have a 
> custom auth plugin which sets the username to the upstream caller name. But 
> it would still be useful to be able to limit all usage from some particular 
> proxy API in aggregate.
> I think this could build upon the idea for Connection attributes in 
> HBASE-27657. Basically when a Connection is established we can set an 
> attribute (i.e. quotaGrouping=hadoop or quotaGrouping=MyProxyAPI).  In 
> QuotaCache, we can add a {{getQuotaGroupLimiter(String groupName)}} and also 
> allow someone to define quotas using {{set_quota TYPE => THROTTLE, GROUP => 
> 'hadoop', LIMIT => '100M/sec'}}
> I need to do more investigation into whether we'd want to return a simple 
> group limiter (more similar to table/namespace handling) or treat it more 
> like the USER limiters, which return a QuotaState (so you can limit 
> by-group-by-table).
> We need to consider how GROUP quotas interact with USER quotas. If a user has 
> a quota defined, and that user is also part of a group with a quota defined, 
> does the request need to honor both quotas? Maybe we provide a GROUP_BYPASS 
> setting, similar to GLOBAL_BYPASS?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-27657) Connection and Request Attributes

2023-05-24 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly reassigned HBASE-27657:
-

Assignee: Ray Mattingly

> Connection and Request Attributes
> -
>
> Key: HBASE-27657
> URL: https://issues.apache.org/jira/browse/HBASE-27657
> Project: HBase
>  Issue Type: New Feature
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>
> Currently we have the ability to set Operation attributes, via 
> Get.setAttribute, etc. It would be useful to be able to set attributes at the 
> request and connection level.
> These levels can result in less duplication. For example, send some 
> attributes once per connection instead of for every one of the millions of 
> requests a connection might send. Or send once for the request, instead of 
> duplicating on every operation in a multi request.
> Additionally, the Connection and RequestHeader are more globally available on 
> the server side. Both can be accessed via RpcServer.getCurrentCall(), which 
> is useful in various integration points – coprocessors, custom queues, 
> quotas, slow log, etc. Operation attributes are harder to access because you 
> need to parse the raw Message into the appropriate type to get access to the 
> getter.
> I was thinking of adding two new methods to the Connection interface:
> - setAttribute (and getAttribute/getAttributes)
> - setRequestAttributeProvider
> Any Connection attributes would be set onto the ConnectionHeader during 
> initialization. The RequestAttributeProvider would be called when creating 
> each RequestHeader.
> An alternative to setRequestAttributeProvider would be to add this into 
> HBaseRpcController, which can already be customized via site configuration. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-27800) Add support for default user quotas using USER => 'all'

2023-05-05 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly reassigned HBASE-27800:
-

Assignee: Ray Mattingly

> Add support for default user quotas using USER => 'all' 
> 
>
> Key: HBASE-27800
> URL: https://issues.apache.org/jira/browse/HBASE-27800
> Project: HBase
>  Issue Type: Improvement
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>
> If someone sets a quota with USER => 'all' (or maybe '*'), treat that as a 
> default quota for each individual user. When a request comes from a user, it 
> will look up the current QuotaState based on username. If one doesn't exist, it 
> will be pre-filled with whatever the 'all' quota was set to. Otherwise, if 
> you then define a quota for a specific user that will override whatever 
> default you have set for that user only.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-27553) SlowLog does not include params for Mutations

2023-05-05 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719940#comment-17719940
 ] 

Ray Mattingly commented on HBASE-27553:
---

{quote}Currently it handles MutationProto, but it should be MutateRequest
{quote}
I think it does handle MutateRequest lower down: 
[https://github.com/apache/hbase/blame/master/hbase-client/src/main/java/org/apache/hadoop/hbase/shaded/protobuf/ProtobufUtil.java#L2182-L2187].
 Is that adequate?
{quote}While we are here, the CoprocessorServiceRequest (handled further down) 
has a getRegion() method, but that is not passed into the SlowLogParams either. 
We should add that too.
{quote}
Totally agreed, this should be easy
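Something like this branch in {{ProtobufUtil.getSlowLogParams}} should do it 
(the SlowLogParams constructor shape is approximate):
{code:java}
if (message instanceof CoprocessorServiceRequest) {
  CoprocessorServiceRequest csr = (CoprocessorServiceRequest) message;
  // Pass the region from the request into the params like the other branches do.
  String regionName = csr.getRegion().getValue().toStringUtf8();
  return new SlowLogParams(regionName, TextFormat.shortDebugString(csr));
}
{code}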

> SlowLog does not include params for Mutations
> -
>
> Key: HBASE-27553
> URL: https://issues.apache.org/jira/browse/HBASE-27553
> Project: HBase
>  Issue Type: Bug
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Minor
>
> SlowLog params are extracted via 
> [ProtobufUtil.getSlowLogParams|https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/shaded/protobuf/ProtobufUtil.java#L2154].
>  This method has various if/else branches for each request type, but mutation 
> (the line linked above) is incorrect. Currently it handles MutationProto, but 
> it should be MutateRequest. A MutationProto is never passed into this method, 
> only MutateRequests so any MutateRequests being passed in now will fall 
> through to the default case which contains nothing useful about the request.
> As part of fixing this, we should also ensure that we extract the region name 
> from the MutateRequest to add into the SlowLogParams object like all the 
> other requests.
> While we are here, the CoprocessorServiceRequest (handled further down) has a 
> getRegion() method, but that is not passed into the SlowLogParams either. We 
> should add that too.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-27798) Client side should back off based on wait interval in RpcThrottlingException

2023-04-24 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17715954#comment-17715954
 ] 

Ray Mattingly commented on HBASE-27798:
---

Sounds good, thanks for the input!

> Client side should back off based on wait interval in RpcThrottlingException
> 
>
> Key: HBASE-27798
> URL: https://issues.apache.org/jira/browse/HBASE-27798
> Project: HBase
>  Issue Type: Improvement
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-27798) Client side should back off based on wait interval in RpcThrottlingException

2023-04-24 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17715947#comment-17715947
 ] 

Ray Mattingly commented on HBASE-27798:
---

That sounds good to me. Should we still add the retry backoff for the 
waitInterval case?

> Client side should back off based on wait interval in RpcThrottlingException
> 
>
> Key: HBASE-27798
> URL: https://issues.apache.org/jira/browse/HBASE-27798
> Project: HBase
>  Issue Type: Improvement
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-27798) Client side should back off based on wait interval in RpcThrottlingException

2023-04-24 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17715939#comment-17715939
 ] 

Ray Mattingly commented on HBASE-27798:
---

The current retry backoff has a few inputs & steps. For example, we use [the 
{{pause}} and {{pauseForServerOverloaded}} durations in 
{{RpcRetryingCallerImpl}}|https://github.com/apache/hbase/blob/branch-2/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerImpl.java#L59-L60]
 to determine a pauseBase millis, and then multiply it by the relevant 
{{RETRY_BACKOFF}} item. 

I'm wondering how we should incorporate the {{waitInterval}} into this existing 
system; I see a few options:
 # We could consider waitInterval an addition to the pauseBase
 # We could consider waitInterval an addition to the product of pauseBase * 
retryBackoff
 # We could consider waitInterval, if present, to be a replacement for the 
pauseBase
 # We could consider waitInterval, if present, to be a replacement for the 
product of pauseBase * retryBackoff

[~bbeaudreault] do you have any thoughts/preference here?
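For concreteness, option 3 would look something like this inside the retry 
loop (names approximate to {{RpcRetryingCallerImpl}}):
{code:java}
long pauseBase = isServerOverloaded ? pauseForServerOverloaded : pause;
if (waitIntervalMs > 0) {
  // Option 3: the server's hint replaces our locally configured base pause,
  // but we still apply the standard exponential backoff multiplier.
  pauseBase = waitIntervalMs;
}
long expectedSleepMs = pauseBase
  * HConstants.RETRY_BACKOFF[Math.min(tries, HConstants.RETRY_BACKOFF.length - 1)];
{code}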

> Client side should back off based on wait interval in RpcThrottlingException
> 
>
> Key: HBASE-27798
> URL: https://issues.apache.org/jira/browse/HBASE-27798
> Project: HBase
>  Issue Type: Improvement
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-27798) Client side should back off based on wait interval in RpcThrottlingException

2023-04-19 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly reassigned HBASE-27798:
-

Assignee: Ray Mattingly

> Client side should back off based on wait interval in RpcThrottlingException
> 
>
> Key: HBASE-27798
> URL: https://issues.apache.org/jira/browse/HBASE-27798
> Project: HBase
>  Issue Type: Improvement
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-27799) RpcThrottlingException wait interval message is misleading between 0-1s

2023-04-19 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly reassigned HBASE-27799:
-

Assignee: Ray Mattingly

> RpcThrottlingException wait interval message is misleading between 0-1s
> ---
>
> Key: HBASE-27799
> URL: https://issues.apache.org/jira/browse/HBASE-27799
> Project: HBase
>  Issue Type: Improvement
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>
> When the wait interval is below 1s, it shows 0sec. We should show 
> milliseconds.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-27535) Separate slowlog thresholds for scans vs other requests

2023-04-18 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17713615#comment-17713615
 ] 

Ray Mattingly commented on HBASE-27535:
---

[https://github.com/apache/hbase/pull/5188] is ready for review

> Separate slowlog thresholds for scans vs other requests
> ---
>
> Key: HBASE-27535
> URL: https://issues.apache.org/jira/browse/HBASE-27535
> Project: HBase
>  Issue Type: Improvement
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>  Labels: slowlog
>
> Scans by their nature are able to more efficiently pull back larger response 
> sizes than gets. They also may take longer to execute than other request 
> types. We should make it possible to configure a separate threshold for 
> response size and response time for scans. This will allow us to tune down 
> the thresholds for others without adding unnecessary noise for requests which 
> are known to be slower/bigger.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-27535) Separate slowlog thresholds for scans vs other requests

2023-04-17 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly reassigned HBASE-27535:
-

Assignee: Ray Mattingly

> Separate slowlog thresholds for scans vs other requests
> ---
>
> Key: HBASE-27535
> URL: https://issues.apache.org/jira/browse/HBASE-27535
> Project: HBase
>  Issue Type: Improvement
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>  Labels: slowlog
>
> Scans by their nature are able to more efficiently pull back larger response 
> sizes than gets. They also may take longer to execute than other request 
> types. We should make it possible to configure a separate threshold for 
> response size and response time for scans. This will allow us to tune down 
> the thresholds for others without adding unnecessary noise for requests which 
> are known to be slower/bigger.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-27786) CompoundBloomFilters break with an error rate that is too high

2023-04-10 Thread Ray Mattingly (Jira)
Ray Mattingly created HBASE-27786:
-

 Summary: CompoundBloomFilters break with an error rate that is too 
high
 Key: HBASE-27786
 URL: https://issues.apache.org/jira/browse/HBASE-27786
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.5.2
Reporter: Ray Mattingly


At my company we're beginning to more heavily utilize the bloom error rate 
configuration. This is because bloom filters are a nice optimization, but for 
well distributed workloads with relatively dense data (many rows:host), we've 
found that they can cause lots of memory/GC pressure unless they can entirely 
fit in the block cache (and consequently not churn memory that's subject to GC).

Because it's easier to estimate the memory requirements of changes in existing 
bloom filters, rather than net new bloom filters, we wanted to begin with very 
high bloom error rates (and consequently small bloom filters), and then ratchet 
down as memory availability allowed.

This led to us discovering that bloom filters appear to become corrupt at a 
relatively arbitrary error rate threshold. Blooms with an error rate of 0.61 
work as expected, but produce nonsensical results with an error rate of 0.62. 
I've pushed this branch with test updates to demonstrate the deficit: 
[https://github.com/apache/hbase/compare/master...HubSpot:hbase:rmattingly/bloom-error-rate-bug]

The test changes confirm that the BloomFilterUtil works as expected, at least 
with respect to the relationship between error rate and size. You can see this 
in the output of {{{}TestBloomFilterChunk#testBloomErrorRateSizeRelationship{}}}:

{noformat}
previousErrorRate=0.01, previousSize=1048568
currentErrorRate=0.05, currentSize=682109
previousErrorRate=0.05, previousSize=682109
currentErrorRate=0.1, currentSize=524284
previousErrorRate=0.1, previousSize=524284
currentErrorRate=0.2, currentSize=366459
previousErrorRate=0.2, previousSize=366459
currentErrorRate=0.4, currentSize=208634
previousErrorRate=0.4, previousSize=208634
currentErrorRate=0.5, currentSize=157826
previousErrorRate=0.5, previousSize=157826
currentErrorRate=0.75, currentSize=65504
previousErrorRate=0.75, previousSize=65504
currentErrorRate=0.99, currentSize=2289
{noformat}
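For reference, these sizes track the standard bloom sizing formula, which 
{{BloomFilterUtil}} implements in an equivalent form (sketch below, not the 
exact HBase code):
{code:java}
// m = -n * ln(p) / (ln 2)^2 bits for n keys at target error rate p.
static long idealBloomBitSize(long maxKeys, double errorRate) {
  return (long) Math.ceil(
    maxKeys * -Math.log(errorRate) / (Math.log(2) * Math.log(2)));
}
{code}
Sizes shrink monotonically as the error rate approaches 1.0, matching the 
output above.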
 

With this in mind, the updates to {{TestCompoundBloomFilter}} tell us that the 
bug must live somewhere in the {{CompoundBloomFilter}} logic. The output 
indicates this:

 
{noformat}
2023-04-10T15:07:50,925 INFO  [Time-limited test] 
regionserver.TestCompoundBloomFilter(245): Functional bloom has error rate 0.01 
and size 1kb
...
2023-04-10T15:07:56,657 INFO  [Time-limited test] 
regionserver.TestCompoundBloomFilter(245): Functional bloom has error rate 0.61 
and size 1kb
...
java.lang.AssertionError: False positive is too high: 0.99985334 
(greater than 0.65), fake lookup is enabled. Bloom size is 4687kb
    at org.junit.Assert.fail(Assert.java:89)
    at org.junit.Assert.assertTrue(Assert.java:42)
    at 
org.apache.hadoop.hbase.regionserver.TestCompoundBloomFilter.readStoreFile(TestCompoundBloomFilter.java:243)
{noformat}
 

The bloom size change from ~1kb -> 4687kb and the total lack of precision are 
clearly not as intended, and totally in line with what we saw in our HBase 
clusters that attempted to use high bloom error rates.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-27536) Include more request information in slowlog for Scans

2023-03-28 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly updated HBASE-27536:
--
Description: 
Currently the slowlog only includes a barebones text format of the underlying 
protobuf Message fields. This is not a great UX for 2 reasons:
 # Most of the proto fields don't mirror the actual API names in our requests 
(Scan, Get, etc).
 # The chosen data is often not enough to actually infer anything about the 
request

Any of the API class's toString method would be a much better representation of 
the request. On the server side, we already have to turn the protobuf Message 
into an actual API class in order to serve the request in RSRpcServices. Given 
slow logs should be a very small percent of total requests, I think we should 
do a similar parsing in SlowLogQueueService. Or better yet, perhaps we can pass 
the already parsed request into the queue at the start to avoid the extra work. 

When hydrating a SlowLogPayload with this request information, I believe we 
should use {{Operation's toMap(int maxCols)}} method. Adding this onto the 
SlowLogPayload as a map (or list of key/values) will make it easier to consume 
via downstream automation. Alternatively we could use {{{}toJSON(){}}}.

We should also include any attributes from the queries, as those may aid 
tracing at the client level.

Edit: because of nuance related to handling multis and the adequacy of info 
available for gets/puts, we're scoping this issue down to focus on improving 
the information available on Scan slowlogs

  was:
Currently the slowlog only includes a barebones text format of the underlying 
protobuf Message fields. This is not a great UX for 2 reasons:
 # Most of the proto fields don't mirror the actual API names in our requests 
(Scan, Get, etc).
 # The chosen data is often not enough to actually infer anything about the 
request

Any of the API class's toString method would be a much better representation of 
the request. On the server side, we already have to turn the protobuf Message 
into an actual API class in order to serve the request in RSRpcServices. Given 
slow logs should be a very small percent of total requests, I think we should 
do a similar parsing in SlowLogQueueService. Or better yet, perhaps we can pass 
the already parsed request into the queue at the start to avoid the extra work. 

When hydrating a SlowLogPayload with this request information, I believe we 
should use {{Operation's toMap(int maxCols)}} method. Adding this onto the 
SlowLogPayload as a map (or list of key/values) will make it easier to consume 
via downstream automation. Alternatively we could use {{{}toJSON(){}}}.

We should also include any attributes from the queries, as those may aid 
tracing at the client level.

 


> Include more request information in slowlog for Scans
> -
>
> Key: HBASE-27536
> URL: https://issues.apache.org/jira/browse/HBASE-27536
> Project: HBase
>  Issue Type: Improvement
>Reporter: Bryan Beaudreault
>Priority: Major
>  Labels: slowlog
>
> Currently the slowlog only includes a barebones text format of the underlying 
> protobuf Message fields. This is not a great UX for 2 reasons:
>  # Most of the proto fields don't mirror the actual API names in our requests 
> (Scan, Get, etc).
>  # The chosen data is often not enough to actually infer anything about the 
> request
> Any of the API class's toString method would be a much better representation 
> of the request. On the server side, we already have to turn the protobuf 
> Message into an actual API class in order to serve the request in 
> RSRpcServices. Given slow logs should be a very small percent of total 
> requests, I think we should do a similar parsing in SlowLogQueueService. Or 
> better yet, perhaps we can pass the already parsed request into the queue at 
> the start to avoid the extra work. 
> When hydrating a SlowLogPayload with this request information, I believe we 
> should use {{Operation's toMap(int maxCols)}} method. Adding this onto the 
> SlowLogPayload as a map (or list of key/values) will make it easier to 
> consume via downstream automation. Alternatively we could use 
> {{{}toJSON(){}}}.
> We should also include any attributes from the queries, as those may aid 
> tracing at the client level.
> Edit: because of nuance related to handling multis and the adequacy of info 
> available for gets/puts, we're scoping this issue down to focus on improving 
> the information available on Scan slowlogs



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-27536) Include more request information in slowlog for Scans

2023-03-28 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly updated HBASE-27536:
--
Summary: Include more request information in slowlog for Scans  (was: 
Include more request information in slowlog)

> Include more request information in slowlog for Scans
> -
>
> Key: HBASE-27536
> URL: https://issues.apache.org/jira/browse/HBASE-27536
> Project: HBase
>  Issue Type: Improvement
>Reporter: Bryan Beaudreault
>Priority: Major
>  Labels: slowlog
>
> Currently the slowlog only includes a barebones text format of the underlying 
> protobuf Message fields. This is not a great UX for 2 reasons:
>  # Most of the proto fields don't mirror the actual API names in our requests 
> (Scan, Get, etc).
>  # The chosen data is often not enough to actually infer anything about the 
> request
> Any of the API class's toString method would be a much better representation 
> of the request. On the server side, we already have to turn the protobuf 
> Message into an actual API class in order to serve the request in 
> RSRpcServices. Given slow logs should be a very small percent of total 
> requests, I think we should do a similar parsing in SlowLogQueueService. Or 
> better yet, perhaps we can pass the already parsed request into the queue at 
> the start to avoid the extra work. 
> When hydrating a SlowLogPayload with this request information, I believe we 
> should use {{Operation's toMap(int maxCols)}} method. Adding this onto the 
> SlowLogPayload as a map (or list of key/values) will make it easier to 
> consume via downstream automation. Alternatively we could use 
> {{{}toJSON(){}}}.
> We should also include any attributes from the queries, as those may aid 
> tracing at the client level.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-26874) VerifyReplication recompare async

2023-02-15 Thread Ray Mattingly (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Mattingly reassigned HBASE-26874:
-

Assignee: Hernan Gelaf-Romer  (was: Ray Mattingly)

> VerifyReplication recompare async
> -
>
> Key: HBASE-26874
> URL: https://issues.apache.org/jira/browse/HBASE-26874
> Project: HBase
>  Issue Type: Improvement
>Reporter: Bryan Beaudreault
>Assignee: Hernan Gelaf-Romer
>Priority: Major
>
> VerifyReplication includes an option "sleepMsBeforeReCompare". This is useful 
> for helping work around replication lag. However, adding a sleep in a Hadoop 
> job can drastically slow that job down if there is anything more than a 
> small number of invalid results.
> We can mitigate this by doing the recompare in a separate thread. We can 
> limit the thread pool and fall back to doing the recompare in the main 
> thread if the thread pool is full. This way we offload some of the slowness 
> but still retain the same validation guarantees. A configuration option can 
> be added to control how many threads each mapper uses.
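A minimal sketch of that bounded-pool-with-fallback idea, assuming
{{java.util.concurrent}}'s {{CallerRunsPolicy}} is used so the mapper thread
runs the recompare itself when all workers are busy (all names below are
illustrative, not from the actual patch):

{code:java}
// Sketch: bounded recompare pool whose rejection policy falls back to
// running the task inline in the submitting (mapper) thread.
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RecomparePool {
  private final ThreadPoolExecutor executor;

  public RecomparePool(int threadsPerMapper) {
    this.executor = new ThreadPoolExecutor(
        threadsPerMapper, threadsPerMapper,
        60L, TimeUnit.SECONDS,
        new LinkedBlockingQueue<>(threadsPerMapper),  // small bounded queue
        new ThreadPoolExecutor.CallerRunsPolicy());   // full pool => run inline
  }

  /** Schedules a recompare; runs it in the caller's thread if the pool is full. */
  public void submit(Runnable recompare) {
    executor.execute(recompare);
  }

  public void shutdown() throws InterruptedException {
    executor.shutdown();
    executor.awaitTermination(1, TimeUnit.MINUTES);
  }
}
{code}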



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-27536) Include more request information in slowlog

2022-12-19 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649476#comment-17649476
 ] 

Ray Mattingly commented on HBASE-27536:
---

{quote}I believe we should use {{Operation's toMap(int maxCols)}} method
{quote}
This seems doable. [~bbeaudreault], do you have any thoughts re: what a 
reasonable default {{maxCols}} might be? Should this value be configurable, or 
should we be hesitant to add another new conf option for something like this? 
I've looked through some implementations of toMap, and my initial impression 
is that it wouldn't be too dangerous to have a relatively high default (like, 
in the hundreds?). But I wanted to get your thoughts here too.
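For the sake of discussion, a configurable cap might look something like the
sketch below; the conf key name and the default of 250 are hypothetical, not
committed values:

{code:java}
// Illustrative only: read a configurable column cap for slow log rendering.
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Scan;

final class SlowLogMaxCols {
  static Map<String, Object> describe(Configuration conf, Scan scan) {
    // Hypothetical key; a high default keeps most requests fully rendered.
    int maxCols = conf.getInt("hbase.regionserver.slowlog.max.cols", 250);
    return scan.toMap(maxCols);
  }
}
{code}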

> Include more request information in slowlog
> ---
>
> Key: HBASE-27536
> URL: https://issues.apache.org/jira/browse/HBASE-27536
> Project: HBase
>  Issue Type: Improvement
>Reporter: Bryan Beaudreault
>Priority: Major
>  Labels: slowlog
>
> Currently the slowlog only includes a barebones text format of the underlying 
> protobuf Message fields. This is not a great UX for two reasons:
>  # Most of the proto fields don't mirror the actual API names in our requests 
> (Scan, Get, etc).
>  # The chosen data is often not enough to actually infer anything about the 
> request.
> The API class's toString method would be a much better representation of the 
> request. On the server side, we already have to turn the protobuf Message 
> into an actual API class in order to serve the request in RSRpcServices. 
> Given that slow logs should be a very small percentage of total requests, I 
> think we should do similar parsing in SlowLogQueueService. Or better yet, 
> perhaps we can pass the already-parsed request into the queue at the start 
> to avoid the extra work.
> When hydrating a SlowLogPayload with this request information, I believe we 
> should use Operation's {{toMap(int maxCols)}} method. Adding this onto the 
> SlowLogPayload as a map (or list of key/values) will make it easier to 
> consume via downstream automation. Alternatively, we could use {{toJSON()}}.
> We should also include any attributes from the queries, as those may aid 
> tracing at the client level.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-27253) Make slow log configs updatable with configuration observer

2022-12-15 Thread Ray Mattingly (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17648041#comment-17648041
 ] 

Ray Mattingly commented on HBASE-27253:
---

[A solution|https://github.com/apache/hbase/pull/4926] is ready for review.

> Make slow log configs updatable with configuration observer
> ---
>
> Key: HBASE-27253
> URL: https://issues.apache.org/jira/browse/HBASE-27253
> Project: HBase
>  Issue Type: Improvement
>Reporter: Bryan Beaudreault
>Assignee: Ray Mattingly
>Priority: Major
>  Labels: slowlog
>
> It would be very useful to be able to turn the slow log on or off, change 
> thresholds, etc., on demand when diagnosing a traffic issue. It should be a 
> simple matter of moving the configs into RpcServer#onConfigurationChange.
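A minimal sketch of the idea: a {{ConfigurationObserver}} that re-reads the
slow-log settings whenever the cluster configuration is reloaded. The conf
keys below are existing HBase settings, but the holder class and field layout
are illustrative; the real change lives in RpcServer:

{code:java}
// Sketch only: refresh slow-log settings on dynamic config reload instead
// of reading them once at startup.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.conf.ConfigurationObserver;

public class SlowLogConfigHolder implements ConfigurationObserver {
  private volatile boolean slowLogEnabled;
  private volatile int warnResponseTimeMs;
  private volatile int warnResponseSizeBytes;

  @Override
  public void onConfigurationChange(Configuration newConf) {
    slowLogEnabled =
        newConf.getBoolean("hbase.regionserver.slowlog.buffer.enabled", false);
    warnResponseTimeMs = newConf.getInt("hbase.ipc.warn.response.time", 10000);
    warnResponseSizeBytes =
        newConf.getInt("hbase.ipc.warn.response.size", 100 * 1024 * 1024);
  }
}
{code}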



--
This message was sent by Atlassian Jira
(v8.20.10#820010)