[jira] [Commented] (HBASE-28084) Incremental backups should be forbidden after deleting backups
[ https://issues.apache.org/jira/browse/HBASE-28084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844462#comment-17844462 ] Ray Mattingly commented on HBASE-28084: --- {quote}Imagine I have a set of backups (Full1, Incr2, Incr3), delete the last backup (Incr3), and then create a new incremental backup (Incr4). This backup history will now show: Full1, Incr2, Incr4. {quote} In this case, shouldn't Incr4 recognize that the last existing backup is Incr2, so that Incr4 covers from the end of Incr2 to now if possible (if the WALs are still around), or throw if this is not possible? I'm concerned that this proposal would mean that any incremental backup corruption or ungraceful failure would necessitate a new full backup. A nicer UX would be to delete the bad incremental backup and try again. > Incremental backups should be forbidden after deleting backups > -- > > Key: HBASE-28084 > URL: https://issues.apache.org/jira/browse/HBASE-28084 > Project: HBase > Issue Type: Bug > Components: backuprestore >Reporter: Dieter De Paepe >Priority: Major > > Imagine I have a set of backups (Full1, Incr2, Incr3), delete the last backup > (Incr3), and then create a new incremental backup (Incr4). > This backup history will now show: Full1, Incr2, Incr4. > However, a restore of Incr4 will not contain the data that was captured in > Incr3, effectively leading to data loss. This will certainly surprise some > users. > I suggest adding some internal bookkeeping to prevent incremental backups > when the most recent backup has been deleted. -- This message was sent by Atlassian Jira (v8.20.10#820010)
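To make the alternative concrete, here is a toy sketch of the bookkeeping under discussion. The types and method are invented for illustration (this is not HBase's backup API): a new incremental is only acceptable if it starts no later than the end point of the latest surviving backup, which is what would let Incr4 pick up from the end of Incr2 after Incr3 is deleted.

```java
import java.util.List;

// Toy model, not HBase's backup API: each image records the time range it
// covers, and a new incremental is only valid if it starts no later than
// the end of the latest surviving backup (so deleting Incr3 forces Incr4
// to cover from the end of Incr2, or fail).
public final class BackupChainCheck {
  record BackupImage(String id, boolean incremental, long startTs, long endTs) {}

  // True if a new incremental starting at newStartTs leaves no gap in the chain.
  static boolean coversGap(List<BackupImage> history, long newStartTs) {
    if (history.isEmpty()) {
      return false; // nothing to build on: a full backup is required first
    }
    BackupImage latest = history.get(history.size() - 1);
    return newStartTs <= latest.endTs();
  }
}
```

With (Full1, Incr2) remaining, an incremental that starts at Incr2's end timestamp passes; one that would leave a gap fails, and the client could then either re-cover from Incr2's end (if the WALs survive) or fall back to a full backup.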
[jira] [Commented] (HBASE-28562) Ancestor calculation of backups is wrong
[ https://issues.apache.org/jira/browse/HBASE-28562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843012#comment-17843012 ] Ray Mattingly commented on HBASE-28562: --- I like the simplicity, but I think we're missing some necessary complexity. Backups may apply to some tables, but not others, and we need to fetch every ancestor, rather than just one. > Ancestor calculation of backups is wrong > > > Key: HBASE-28562 > URL: https://issues.apache.org/jira/browse/HBASE-28562 > Project: HBase > Issue Type: Bug > Components: backuprestore >Affects Versions: 2.6.0, 3.0.0 >Reporter: Dieter De Paepe >Priority: Major > Labels: pull-request-available > > This is the same issue as HBASE-25870, but I think the fix there was wrong. > This issue can prevent creation of (incremental) backups when data of > unrelated backups was damaged on backup storage. > Minimal example to reproduce from source: > * Add the following to `conf/hbase-site.xml` to enable backups: > {code:xml}
<property>
  <name>hbase.backup.enable</name>
  <value>true</value>
</property>
<property>
  <name>hbase.master.logcleaner.plugins</name>
  <value>org.apache.hadoop.hbase.master.cleaner.TimeToLiveLogCleaner,org.apache.hadoop.hbase.master.cleaner.TimeToLiveProcedureWALCleaner,org.apache.hadoop.hbase.master.cleaner.TimeToLiveMasterLocalStoreWALCleaner,org.apache.hadoop.hbase.backup.master.BackupLogCleaner</value>
</property>
<property>
  <name>hbase.procedure.master.classes</name>
  <value>org.apache.hadoop.hbase.backup.master.LogRollMasterProcedureManager</value>
</property>
<property>
  <name>hbase.procedure.regionserver.classes</name>
  <value>org.apache.hadoop.hbase.backup.regionserver.LogRollRegionServerProcedureManager</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.backup.BackupObserver</value>
</property>
<property>
  <name>hbase.fs.tmp.dir</name>
  <value>file:/tmp/hbase-tmp</value>
</property>
{code} > * Start HBase and open a shell: {{bin/start-hbase.sh}}, {{bin/hbase shell}} > * Execute the following commands ("put" & "create" commands in the hbase shell, other commands on the command line): > {code:java} > create 'experiment', 'fam' > put 'experiment', 'row1', 'fam:b', 'value1' >
bin/hbase backup create full file:/tmp/hbasebackup > Backup session backup_1714649896776 finished. Status: SUCCESS > put 'experiment', 'row2', 'fam:b', 'value2' > bin/hbase backup create incremental file:/tmp/hbasebackup > Backup session backup_1714649920488 finished. Status: SUCCESS > put 'experiment', 'row3', 'fam:b', 'value3' > bin/hbase backup create incremental file:/tmp/hbasebackup > Backup session backup_1714650054960 finished. Status: SUCCESS > (Delete the files corresponding to the first incremental backup - > backup_1714649920488 in this example) > put 'experiment', 'row4', 'fam:a', 'value4' > bin/hbase backup create full file:/tmp/hbasebackup > Backup session backup_1714650236911 finished. Status: SUCCESS > put 'experiment', 'row5', 'fam:a', 'value5' > bin/hbase backup create incremental file:/tmp/hbasebackup > Backup session backup_1714650289957 finished. Status: SUCCESS > put 'experiment', 'row6', 'fam:a', 'value6' > bin/hbase backup create incremental > file:/tmp/hbasebackup2024-05-02T13:45:27,534 ERROR [main {}] > impl.BackupManifest: file:/tmp/hbasebackup/backup_1714649920488 does not exist > 2024-05-02T13:45:27,534 ERROR [main {}] impl.TableBackupClient: Unexpected > Exception : file:/tmp/hbasebackup/backup_1714649920488 does not exist > org.apache.hadoop.hbase.backup.impl.BackupException: > file:/tmp/hbasebackup/backup_1714649920488 does not exist > at > org.apache.hadoop.hbase.backup.impl.BackupManifest.(BackupManifest.java:451) > ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT] > at > org.apache.hadoop.hbase.backup.impl.BackupManifest.(BackupManifest.java:402) > ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT] > at > org.apache.hadoop.hbase.backup.impl.BackupManager.getAncestors(BackupManager.java:331) > ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT] > at > org.apache.hadoop.hbase.backup.impl.BackupManager.getAncestors(BackupManager.java:353) > ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT] > at > 
org.apache.hadoop.hbase.backup.impl.TableBackupClient.addManifest(TableBackupClient.java:286) > ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT] > at > org.apache.hadoop.hbase.backup.impl.TableBackupClient.completeBackup(TableBackupClient.java:351) > ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT] > at > org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:314) > ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT] > at > org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:603) > ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT] > at >
[jira] [Commented] (HBASE-28562) Ancestor calculation of backups is wrong
[ https://issues.apache.org/jira/browse/HBASE-28562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843008#comment-17843008 ] Ray Mattingly commented on HBASE-28562: --- Yes, we've experienced huge backup manifests due to some bugs in the getAncestors call and its underlying BackupManifest#canCoverImage method. The BackupManifest#canCoverImage method specifies that its fullImages parameter is intended to contain only full backup images, not incremental ones. Its name implies this, and [a comment makes that clear|https://github.com/apache/hbase/blob/2c3abae18aa35e2693b64b143316817d4569d0c3/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupManifest.java#L614]: "each image of fullImages must not be an incremental image". But we pass all ancestors, including incremental images, to this method. For example: [https://github.com/apache/hbase/blob/6b672cc0717e762ecaad203714099b962c035ef0/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupManager.java#L320] And BackupManifest#canCoverImage does not assert this precondition: instead of throwing an IllegalArgumentException, it proceeds and will simply return false [if any of the given ancestors are incremental backups|https://github.com/apache/hbase/blob/2c3abae18aa35e2693b64b143316817d4569d0c3/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupManifest.java#L619]! This means that, once an incremental backup ancestor has been found, all subsequent backup images will also be considered ancestors, and this will balloon the backup manifest size. This could also be a factor in why checking the entirety of backup history is problematic for you.
We probably need to largely refactor getAncestors and/or canCoverImage > Ancestor calculation of backups is wrong > > > Key: HBASE-28562 > URL: https://issues.apache.org/jira/browse/HBASE-28562 > Project: HBase > Issue Type: Bug > Components: backuprestore >Affects Versions: 2.6.0, 3.0.0 >Reporter: Dieter De Paepe >Priority: Major > Labels: pull-request-available
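A minimal illustration of the fail-fast precondition suggested in the comment above, using invented types rather than the real BackupManifest API: reject incremental images outright instead of quietly returning false, which today causes every later backup to be treated as an ancestor.

```java
import java.util.List;

// Illustrative only; these are invented types, not the real BackupManifest
// API. The point: enforce the documented precondition that only full backup
// images are passed in, instead of silently returning false.
public final class CoverageCheck {
  record Image(String backupId, boolean incremental, List<String> tables) {}

  static boolean canCover(List<Image> fullImages, List<String> tables) {
    for (Image img : fullImages) {
      if (img.incremental()) {
        // Fail fast: a quiet `return false` here hides the caller's bug.
        throw new IllegalArgumentException(
            img.backupId() + " is incremental; only full images are allowed");
      }
    }
    // The full images cover the request if every table appears in one of them.
    return tables.stream()
        .allMatch(t -> fullImages.stream().anyMatch(i -> i.tables().contains(t)));
  }
}
```

The per-table coverage check also reflects the comment's other concern: backups may apply to some tables but not others, so coverage must be evaluated table by table.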
[jira] [Created] (HBASE-28513) Secondary replica balancing squashes all other cost considerations
Ray Mattingly created HBASE-28513: - Summary: Secondary replica balancing squashes all other cost considerations Key: HBASE-28513 URL: https://issues.apache.org/jira/browse/HBASE-28513 Project: HBase Issue Type: Improvement Reporter: Ray Mattingly I have a larger write-up available [here.|https://git.hubteam.com/gist/rmattingly/8bc9cbe7c422db12ffc9cd1825069bd7] There are a few cost functions with relatively huge default multipliers. For example, `PrimaryRegionCountSkewCostFunction` has a default multiplier of 100,000, while things like StoreFileCostFunction have a multiplier of 5. Having one multiplier at 100k while others are single digit effectively removes the latter category from balancer consideration entirely. I understand that it's critical to distribute a region's replicas across multiple hosts/racks, but I don't think we should do this at the expense of all other balancer considerations. For example, maybe we could have two types of balancer considerations: costs (as we do now) and conditionals (for the more discrete considerations, like ">1 replica of the same region should not exist on a single host"). This would allow us to prioritize replica distribution _and_ maintain consideration for things like storefile balance. -- This message was sent by Atlassian Jira (v8.20.10#820010)
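For intuition, a rough sketch of how a weighted-sum cost behaves with the default multipliers cited above; the arithmetic is illustrative only, not the StochasticLoadBalancer's actual code.

```java
// Illustrative arithmetic only, using the default multipliers cited above.
// A weighted-sum balancer cost lets a tiny primary-skew cost outweigh even
// the worst storefile imbalance.
public final class CostWeighting {
  static final double PRIMARY_SKEW_MULTIPLIER = 100_000;
  static final double STOREFILE_MULTIPLIER = 5;

  // Each cost function is assumed to return a value scaled into [0, 1].
  static double weightedCost(double primarySkewCost, double storefileCost) {
    return PRIMARY_SKEW_MULTIPLIER * primarySkewCost
        + STOREFILE_MULTIPLIER * storefileCost;
  }
}
```

Even a 0.1% primary-skew cost contributes about 100 to the weighted sum, while a maximal (1.0) storefile cost contributes only 5, so storefile balance can never win a trade-off against replica skew.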
[jira] [Resolved] (HBASE-28429) Quotas should have a configurable minimum wait interval
[ https://issues.apache.org/jira/browse/HBASE-28429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly resolved HBASE-28429. --- Resolution: Won't Fix https://issues.apache.org/jira/browse/HBASE-28453 is a better solution > Quotas should have a configurable minimum wait interval > --- > > Key: HBASE-28429 > URL: https://issues.apache.org/jira/browse/HBASE-28429 > Project: HBase > Issue Type: Improvement >Reporter: Ray Mattingly >Assignee: Ray Mattingly >Priority: Major > > At my day job we're attempting to roll out read size throttling by default for > thousands of distinct users across hundreds of multi-tenant clusters. > During our rollout we've observed that throttles with a 1 second refill > interval will yield relatively tiny wait intervals disproportionately often. > From what we've seen, wait intervals are <=5ms on approximately 20-50% of our > RpcThrottlingExceptions; this could sound theoretically promising if latency > is your top priority. But, in reality, this makes it very difficult to > configure a throttle-tolerant HBase client because retries become very prone > to near-immediate exhaustion, and throttled clients quickly saturate the > cluster's RPC layer with rapid-fire retries. > One can combat this with the FixedIntervalRateLimiter, but that's a very > heavy-handed approach from latency's perspective, and can still yield tiny > intervals that exhaust retries and erroneously fail client operations under > significant load. > With this in mind, I'm proposing that we introduce a configurable minimum > wait interval for quotas, defaulted to 0. This would make quotas much more > usable at scale from our perspective. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HBASE-28453) Support a middle ground between the Average and Fixed interval rate limiters
[ https://issues.apache.org/jira/browse/HBASE-28453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly updated HBASE-28453: -- Description: h3. Background HBase quotas support two rate limiters: a "fixed" and an "average" interval rate limiter. h4. FixedIntervalRateLimiter The fixed interval rate limiter is simpler: it has a TimeUnit, say 1 second, and it refills a resource allotment on the recurring interval. So you may get 10 resources every second, and if you exhaust all 10 resources in the first millisecond of an interval then you will need to wait 999ms to acquire even 1 more resource. h4. AverageIntervalRateLimiter The average interval rate limiter, HBase's default, allows for more flexibly timed refilling of the resource allotment. Extending our previous example, say you have a 10 reads/sec quota and you have exhausted all 10 resources within 1ms of the last full refill. If you request 1 more read then, rather than returning a 999ms wait interval indicating the next full refill time, the rate limiter will recognize that you only need to wait 99ms before 1 read can be available. After 100ms has passed in aggregate since the last full refill, it will support the refilling of 1/10th the limit to facilitate the request for 1/10th the resources. h3. The Problems with Current RateLimiters The problem with the fixed interval rate limiter is that it is too strict from a latency perspective. It results in quota limits to which we cannot fully subscribe with any consistency. The problem with the average interval rate limiter is that, in practice, it is far too optimistic. For example, a real rate limiter might limit to 100MB/sec of read IO per machine. Any multigets that come in will require only a tiny fraction of this limit; for example, a 64kb block is only 0.06% of the total. As a result, the vast majority of wait intervals end up being tiny — like <5ms. 
This can actually cause an inverse of your intention, where setting up a throttle causes a DDoS of your RPC layer via continuous throttling and ~immediate retrying. I've discussed this problem in https://issues.apache.org/jira/browse/HBASE-28429 and proposed a minimum wait interval as the solution there; after some more thinking, I believe this new rate limiter would be a less hacky solution to this deficit, so I'd like to close that Jira in favor of this one. See the attached chart where I put in place a 10k req/sec/machine throttle for this user at 10:43 to try to curb this high traffic, and it resulted in a huge spike of req/sec due to the throttle/retry loop created by the AverageIntervalRateLimiter. h3. Original Proposal: PartialIntervalRateLimiter as a Solution I've implemented a RateLimiter which allows for partial chunks of the overall interval to be refilled; by default these chunks are 10% (or 100ms of a 1s interval). I've deployed this to a test cluster at my day job and have seen this really help our ability to fully subscribe to a quota limit without executing superfluous retries. See the other attached chart which shows a cluster undergoing a rolling restart from using FixedIntervalRateLimiter to my new PartialIntervalRateLimiter and how it is then able to fully subscribe to its allotted 25MB/sec/machine read IO quota. h3. Updated Proposal: Improving FixedIntervalRateLimiter Rather than implement a new rate limiter, we can make a lower-touch change which just adds support for a refill interval that is less than the time unit on a FixedIntervalRateLimiter. This can be a no-op change for those who have not opted into the feature by having the refill interval default to the time unit. For clarity, see [my branch here|https://github.com/apache/hbase/compare/master...HubSpot:hbase:HBASE-28453] which I will PR soon was: h3. Background HBase quotas support two rate limiters: a "fixed" and an "average" interval rate limiter. h4.
FixedIntervalRateLimiter The fixed interval rate limiter is simpler: it has a TimeUnit, say 1 second, and it refills a resource allotment on the recurring interval. So you may get 10 resources every second, and if you exhaust all 10 resources in the first millisecond of an interval then you will need to wait 999ms to acquire even 1 more resource. h4. AverageIntervalRateLimiter The average interval rate limiter, HBase's default, allows for more flexibly timed refilling of the resource allotment. Extending our previous example, say you have a 10 reads/sec quota and you have exhausted all 10 resources within 1ms of the last full refill. If you request 1 more read then, rather than returning a 999ms wait interval indicating the next full refill time, the rate limiter will recognize that you only need to wait 99ms before 1 read can be available. After 100ms has passed in aggregate since the last full refill, it will support the refilling of 1/10th the limit to facilitate the request for
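The two wait-interval behaviors described in this ticket reduce to a little arithmetic. The sketch below is a simplified model, not the actual RateLimiter implementations: a 10/sec limit (1000ms interval) whose allotment was exhausted 1ms after the last full refill, with 1 more resource requested.

```java
// Simplified model of the two wait computations described above; this is
// not HBase's actual RateLimiter code.
public final class WaitIntervals {
  // Fixed interval: wait until the next full refill of the interval.
  static long fixedWaitMs(long intervalMs, long elapsedMs) {
    return intervalMs - elapsedMs;
  }

  // Average interval: wait only until enough of the interval has elapsed
  // (amount/limit of it, measured from the last full refill) to cover the
  // requested amount.
  static long averageWaitMs(long intervalMs, long elapsedMs, long limit, long amount) {
    long neededMs = intervalMs * amount / limit;
    return Math.max(0, neededMs - elapsedMs);
  }
}
```

In the worked example from the description, the fixed limiter returns 999ms for one more resource, while the average limiter returns 99ms (100ms accrues 1/10th of the limit, and 1ms has already passed).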
[jira] [Assigned] (HBASE-28453) Support a middle ground between the Average and Fixed interval rate limiters
[ https://issues.apache.org/jira/browse/HBASE-28453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly reassigned HBASE-28453: - Assignee: Ray Mattingly > Support a middle ground between the Average and Fixed interval rate limiters > > > Key: HBASE-28453 > URL: https://issues.apache.org/jira/browse/HBASE-28453 > Project: HBase > Issue Type: Improvement >Affects Versions: 2.6.0 >Reporter: Ray Mattingly >Assignee: Ray Mattingly >Priority: Major > Attachments: Screenshot 2024-03-21 at 2.08.51 PM.png, Screenshot > 2024-03-21 at 2.30.01 PM.png > > > h3. Background > HBase quotas support two rate limiters: a "fixed" and an "average" interval > rate limiter. > h4. FixedIntervalRateLimiter > The fixed interval rate limiter is simpler: it has a TimeUnit, say 1 second, > and it refills a resource allotment on the recurring interval. So you may get > 10 resources every second, and if you exhaust all 10 resources in the first > millisecond of an interval then you will need to wait 999ms to acquire even 1 > more resource. > h4. AverageIntervalRateLimiter > The average interval rate limiter, HBase's default, allows for more flexibly > timed refilling of the resource allotment. Extending our previous example, > say you have a 10 reads/sec quota and you have exhausted all 10 resources > within 1ms of the last full refill. If you request 1 more read then, rather > than returning a 999ms wait interval indicating the next full refill time, > the rate limiter will recognize that you only need to wait 99ms before 1 read > can be available. After 100ms has passed in aggregate since the last full > refill, it will support the refilling of 1/10th the limit to facilitate the > request for 1/10th the resources. > h3. The Problems with Current RateLimiters > The problem with the fixed interval rate limiter is that it is too strict > from a latency perspective. It results in quota limits to which we cannot > fully subscribe with any consistency. 
> The problem with the average interval rate limiter is that, in practice, it > is far too optimistic. For example, a real rate limiter might limit to > 100MB/sec of read IO per machine. Any multigets that come in will require > only a tiny fraction of this limit; for example, a 64kb block is only 0.06% > of the total. As a result, the vast majority of wait intervals end up being > tiny — like <5ms. This can actually cause an inverse of your intention, where > setting up a throttle causes a DDoS of your RPC layer via continuous > throttling and ~immediate retrying. I've discussed this problem in > https://issues.apache.org/jira/browse/HBASE-28429 and proposed a minimum wait > interval as the solution there; after some more thinking, I believe this new > rate limiter would be a less hacky solution to this deficit so I'd like to > close that Jira in favor of this one. > See the attached chart where I put in place a 10k req/sec/machine throttle > for this user at 10:43 to try to curb this high traffic, and it resulted in a > huge spike of req/sec due to the throttle/retry loop created by the > AverageIntervalRateLimiter. > h3. PartialIntervalRateLimiter as a Solution > I've implemented a RateLimiter which allows for partial chunks of the overall > interval to be refilled; by default these chunks are 10% (or 100ms of a 1s > interval). I've deployed this to a test cluster at my day job and have seen > this really help our ability to fully subscribe to a quota limit without > executing superfluous retries. See the other attached chart which shows a > cluster undergoing a rolling restart from using FixedIntervalRateLimiter to > my new PartialIntervalRateLimiter and how it is then able to fully subscribe > to its allotted 25MB/sec/machine read IO quota. -- This message was sent by Atlassian Jira (v8.20.10#820010)
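A sketch of the partial-chunk idea described above (illustrative only; the actual PartialIntervalRateLimiter code lives on the linked branch): the interval refills in fixed chunks, by default 10%, so a blocked caller waits only to the next chunk boundary rather than to the next full refill.

```java
// Sketch of the partial-interval refill idea, not the code on the linked
// branch: with 10% chunks of a 1s interval, refills land on 100ms
// boundaries, so waits are bounded by the chunk size rather than the
// full interval.
public final class PartialIntervalWait {
  static long waitMs(long intervalMs, long elapsedMs, double chunkFraction) {
    long chunkMs = (long) (intervalMs * chunkFraction);
    long nextBoundary = ((elapsedMs / chunkMs) + 1) * chunkMs;
    // Never wait past the full-interval refill.
    return Math.min(nextBoundary, intervalMs) - elapsedMs;
  }
}
```

A caller blocked 1ms into a 1s interval waits 99ms to the next 100ms boundary instead of 999ms for the full refill; this is the middle ground between the fixed and average limiters.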
[jira] [Commented] (HBASE-28429) Quotas should have a configurable minimum wait interval
[ https://issues.apache.org/jira/browse/HBASE-28429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829667#comment-17829667 ] Ray Mattingly commented on HBASE-28429: --- I think we should close this and instead favor https://issues.apache.org/jira/browse/HBASE-28453 > Quotas should have a configurable minimum wait interval > --- > > Key: HBASE-28429 > URL: https://issues.apache.org/jira/browse/HBASE-28429 > Project: HBase > Issue Type: Improvement >Reporter: Ray Mattingly >Assignee: Ray Mattingly >Priority: Major -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28453) Support a middle ground between the Average and Fixed interval rate limiters
Ray Mattingly created HBASE-28453: - Summary: Support a middle ground between the Average and Fixed interval rate limiters Key: HBASE-28453 URL: https://issues.apache.org/jira/browse/HBASE-28453 Project: HBase Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Ray Mattingly Attachments: Screenshot 2024-03-21 at 2.08.51 PM.png, Screenshot 2024-03-21 at 2.30.01 PM.png h3. Background HBase quotas support two rate limiters: a "fixed" and an "average" interval rate limiter. h4. FixedIntervalRateLimiter The fixed interval rate limiter is simpler: it has a TimeUnit, say 1 second, and it refills a resource allotment on the recurring interval. So you may get 10 resources every second, and if you exhaust all 10 resources in the first millisecond of an interval then you will need to wait 999ms to acquire even 1 more resource. h4. AverageIntervalRateLimiter The average interval rate limiter, HBase's default, allows for more flexibly timed refilling of the resource allotment. Extending our previous example, say you have a 10 reads/sec quota and you have exhausted all 10 resources within 1ms of the last full refill. If you request 1 more read then, rather than returning a 999ms wait interval indicating the next full refill time, the rate limiter will recognize that you only need to wait 99ms before 1 read can be available. After 100ms has passed in aggregate since the last full refill, it will support the refilling of 1/10th the limit to facilitate the request for 1/10th the resources. h3. The Problems with Current RateLimiters The problem with the fixed interval rate limiter is that it is too strict from a latency perspective. It results in quota limits to which we cannot fully subscribe with any consistency. The problem with the average interval rate limiter is that, in practice, it is far too optimistic. For example, a real rate limiter might limit to 100MB/sec of read IO per machine. 
Any multigets that come in will require only a tiny fraction of this limit; for example, a 64kb block is only 0.06% of the total. As a result, the vast majority of wait intervals end up being tiny — like <5ms. This can actually cause an inverse of your intention, where setting up a throttle causes a DDoS of your RPC layer via continuous throttling and ~immediate retrying. I've discussed this problem in https://issues.apache.org/jira/browse/HBASE-28429 and proposed a minimum wait interval as the solution there; after some more thinking, I believe this new rate limiter would be a less hacky solution to this deficit so I'd like to close that Jira in favor of this one. See the attached chart where I put in place a 10k req/sec/machine throttle for this user at 10:43 to try to curb this high traffic, and it resulted in a huge spike of req/sec due to the throttle/retry loop created by the AverageIntervalRateLimiter. h3. PartialIntervalRateLimiter as a Solution I've implemented a RateLimiter which allows for partial chunks of the overall interval to be refilled; by default these chunks are 10% (or 100ms of a 1s interval). I've deployed this to a test cluster at my day job and have seen this really help our ability to fully subscribe to a quota limit without executing superfluous retries. See the other attached chart which shows a cluster undergoing a rolling restart from using FixedIntervalRateLimiter to my new PartialIntervalRateLimiter and how it is then able to fully subscribe to its allotted 25MB/sec/machine read IO quota. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HBASE-28430) RpcThrottlingException messages should describe the throttled access pattern
[ https://issues.apache.org/jira/browse/HBASE-28430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly reassigned HBASE-28430: - Assignee: Ray Mattingly > RpcThrottlingException messages should describe the throttled access pattern > > > Key: HBASE-28430 > URL: https://issues.apache.org/jira/browse/HBASE-28430 > Project: HBase > Issue Type: Improvement >Reporter: Ray Mattingly >Assignee: Ray Mattingly >Priority: Major -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28430) RpcThrottlingException messages should describe the throttled access pattern
Ray Mattingly created HBASE-28430: - Summary: RpcThrottlingException messages should describe the throttled access pattern Key: HBASE-28430 URL: https://issues.apache.org/jira/browse/HBASE-28430 Project: HBase Issue Type: Improvement Reporter: Ray Mattingly Right now we catch RpcThrottlingExceptions and have some debug logging in [RegionServerRpcQuotaManager|https://github.com/apache/hbase/blob/98eb3e01b352684de3c647a6fda6208a657c4607/hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/RegionServerRpcQuotaManager.java#L234-L236]. This is adequate for in-depth investigation, but it is not readily visible to users. For example, at my day job we have proxy APIs which sit between HBase and many microservices. We throttle these microservices in isolation, but the RpcThrottlingExceptions appear to be indiscriminate in the stdout of the proxy API. If we added the given username, table, and namespace to RpcThrottlingException messages, then understanding the nature and specificity of any given throttle violation would be much more straightforward. Given that quotas/throttling is most useful in a multi-tenant environment, I would anticipate this being a pretty universal usability pain point. It would be a bit more complicated, but we should also consider including more information about the rate limiter which has been violated. For example, what is the currently configured read size limit that we've exceeded? -- This message was sent by Atlassian Jira (v8.20.10#820010)
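A sketch of what an enriched message might look like; the method and field names here are invented for illustration and are not the actual RpcThrottlingException API.

```java
// Hypothetical message format for the proposal above; the real
// RpcThrottlingException does not expose this method, and the fields
// shown are invented for illustration.
public final class ThrottleMessage {
  static String describe(String user, String table, String namespace,
      String violatedLimit, long waitMs) {
    return String.format(
        "user=%s, table=%s, namespace=%s exceeded %s; wait %dms before retrying",
        user, table, namespace, violatedLimit, waitMs);
  }
}
```

With a message like this, a proxy operator can attribute a throttle violation to a specific tenant and quota from the log line alone, rather than correlating indiscriminate exceptions against server-side debug logs.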
[jira] [Assigned] (HBASE-28429) Quotas should have a configurable minimum wait interval
[ https://issues.apache.org/jira/browse/HBASE-28429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly reassigned HBASE-28429: - Assignee: Ray Mattingly > Quotas should have a configurable minimum wait interval > --- > > Key: HBASE-28429 > URL: https://issues.apache.org/jira/browse/HBASE-28429 > Project: HBase > Issue Type: Improvement >Reporter: Ray Mattingly >Assignee: Ray Mattingly >Priority: Major > > At my day job we're attempting to roll out read size throttling by default for > thousands of distinct users across hundreds of multi-tenant clusters. > During our rollout we've observed that throttles with a 1 second refill > interval will yield relatively tiny wait intervals disproportionately often. > From what we've seen, wait intervals are <=5ms on approximately 20-50% of our > RpcThrottlingExceptions; this could sound theoretically promising if latency > is your top priority. But, in reality, this makes it very difficult to > configure a throttle-tolerant HBase client because retries become very prone > to near-immediate exhaustion, and throttled clients quickly saturate the > cluster's RPC layer with rapid-fire retries. > One can combat this with the FixedIntervalRateLimiter, but that's a very > heavy-handed approach from latency's perspective, and can still yield tiny > intervals that exhaust retries and erroneously fail client operations under > significant load. > With this in mind, I'm proposing that we introduce a configurable minimum > wait interval for quotas, defaulted to 0. This would make quotas much more > usable at scale from our perspective. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28429) Quotas should have a configurable minimum wait interval
Ray Mattingly created HBASE-28429: - Summary: Quotas should have a configurable minimum wait interval Key: HBASE-28429 URL: https://issues.apache.org/jira/browse/HBASE-28429 Project: HBase Issue Type: Improvement Reporter: Ray Mattingly At my day job we're attempting to roll out read size throttling by default for thousands of distinct users across hundreds of multi-tenant clusters. During our rollout we've observed that throttles with a 1 second refill interval will yield relatively tiny wait intervals disproportionately often. From what we've seen, wait intervals are <=5ms on approximately 20-50% of our RpcThrottlingExceptions; this could sound theoretically promising if latency is your top priority. But, in reality, this makes it very difficult to configure a throttle-tolerant HBase client because retries become very prone to near-immediate exhaustion, and throttled clients quickly saturate the cluster's RPC layer with rapid-fire retries. One can combat this with the FixedIntervalRateLimiter, but that's a very heavy-handed approach from latency's perspective, and can still yield tiny intervals that exhaust retries and erroneously fail client operations under significant load. With this in mind, I'm proposing that we introduce a configurable minimum wait interval for quotas, defaulted to 0. This would make quotas much more usable at scale from our perspective. -- This message was sent by Atlassian Jira (v8.20.10#820010)
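The proposed clamp is simple to sketch. The class and names below are hypothetical; in HBase the change would live in the quota wait-interval computation:

```java
// Sketch of a configurable minimum wait interval (hypothetical names).
// A computed wait of 0 means "not throttled" and is passed through
// unchanged; any positive wait is raised to at least the configured floor,
// so throttled clients stop retrying near-immediately.
public class MinimumWaitInterval {
  private final long minWaitMs;

  public MinimumWaitInterval(long minWaitMs) {
    this.minWaitMs = minWaitMs;
  }

  public long apply(long computedWaitMs) {
    if (computedWaitMs <= 0) {
      return 0; // quota available: no backoff required
    }
    return Math.max(computedWaitMs, minWaitMs);
  }
}
```

Defaulting the floor to 0 preserves today's behavior exactly, which matches the proposal's backward-compatibility intent.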
[jira] [Created] (HBASE-28385) Quota estimates are too optimistic for large scans
Ray Mattingly created HBASE-28385: - Summary: Quota estimates are too optimistic for large scans Key: HBASE-28385 URL: https://issues.apache.org/jira/browse/HBASE-28385 Project: HBase Issue Type: Improvement Reporter: Ray Mattingly Fix For: 2.6.0 Let's say you're running a table scan with a throttle of 100MB/sec per RegionServer. Ideally your scans are going to pull down large results, often containing hundreds or thousands of blocks. You will estimate each scan as costing a single block of read capacity, and if your quota is already exhausted then the server will evaluate the backoff required for your estimated consumption (1 block) to be available. This will often be ~1ms, causing your retries to basically be immediate. Obviously it will routinely take much longer than 1ms for 100MB of IO to become available in the given configuration, so your retries will be destined to fail. At worst this can cause a saturation of your server's RPC layer, and at best this causes erroneous exhaustion of the client's retries. We should find a way to make these estimates a bit smarter for large scans. -- This message was sent by Atlassian Jira (v8.20.10#820010)
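One possible shape for a smarter estimate, sketched with hypothetical names: seed the estimate from the size of the scan's previous result rather than assuming a fixed single block on every call.

```java
// Hypothetical sketch: the first next() call on a scan is estimated at one
// block (nothing better is known yet); subsequent calls are estimated from
// the observed size of the previous result, so a scan that just pulled 10MB
// is not costed at one block again.
public class ScanCostEstimator {
  private long lastResultBytes = -1; // -1: no result observed yet

  public long estimateNextCallBytes(long blockSizeBytes) {
    return lastResultBytes < 0 ? blockSizeBytes : lastResultBytes;
  }

  public void recordResult(long resultBytes) {
    lastResultBytes = resultBytes;
  }
}
```

Under the scenario above (100MB/sec limiter, multi-megabyte results), an estimate based on the previous result would yield a backoff on the order of tens to hundreds of milliseconds instead of ~1ms, so retries have a realistic chance of succeeding.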
[jira] [Assigned] (HBASE-28385) Quota estimates are too optimistic for large scans
[ https://issues.apache.org/jira/browse/HBASE-28385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly reassigned HBASE-28385: - Assignee: Ray Mattingly > Quota estimates are too optimistic for large scans > -- > > Key: HBASE-28385 > URL: https://issues.apache.org/jira/browse/HBASE-28385 > Project: HBase > Issue Type: Improvement >Reporter: Ray Mattingly >Assignee: Ray Mattingly >Priority: Major > Fix For: 2.6.0 > > > Let's say you're running a table scan with a throttle of 100MB/sec per > RegionServer. Ideally your scans are going to pull down large results, often > containing hundreds or thousands of blocks. > You will estimate each scan as costing a single block of read capacity, and > if your quota is already exhausted then the server will evaluate the backoff > required for your estimated consumption (1 block) to be available. This will > often be ~1ms, causing your retries to basically be immediate. > Obviously it will routinely take much longer than 1ms for 100MB of IO to > become available in the given configuration, so your retries will be destined > to fail. At worst this can cause a saturation of your server's RPC layer, and > at best this causes erroneous exhaustion of the client's retries. > We should find a way to make these estimates a bit smarter for large scans. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28370) Default user quotas are refreshing too frequently
Ray Mattingly created HBASE-28370: - Summary: Default user quotas are refreshing too frequently Key: HBASE-28370 URL: https://issues.apache.org/jira/browse/HBASE-28370 Project: HBase Issue Type: Improvement Reporter: Ray Mattingly In [https://github.com/apache/hbase/pull/5666] we introduced default user quotas, but I accidentally called UserQuotaState's default constructor rather than passing in the current timestamp. The consequence is that we're constantly refreshing these default user quotas, and this can be a bottleneck for horizontal cluster scalability. This should be a 1 line fix in QuotaUtil's buildDefaultUserQuotaState method. -- This message was sent by Atlassian Jira (v8.20.10#820010)
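The bug and its one-line fix can be modeled in miniature. The class below is a simplified stand-in for UserQuotaState, not the real implementation:

```java
// Simplified stand-in for the described bug: a default-constructed state
// has lastUpdate == 0, so it always looks stale and is re-fetched on every
// refresh check; constructing it with the current timestamp (the one-line
// fix) makes it fresh for a full refresh period.
public class QuotaStateSketch {
  private final long lastUpdateMs;

  public QuotaStateSketch() {
    this(0L); // the buggy path: behaves as if infinitely stale
  }

  public QuotaStateSketch(long nowMs) {
    this.lastUpdateMs = nowMs;
  }

  public boolean needsRefresh(long nowMs, long refreshPeriodMs) {
    return nowMs - lastUpdateMs >= refreshPeriodMs;
  }
}
```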
[jira] [Assigned] (HBASE-28370) Default user quotas are refreshing too frequently
[ https://issues.apache.org/jira/browse/HBASE-28370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly reassigned HBASE-28370: - Assignee: Ray Mattingly > Default user quotas are refreshing too frequently > - > > Key: HBASE-28370 > URL: https://issues.apache.org/jira/browse/HBASE-28370 > Project: HBase > Issue Type: Improvement >Reporter: Ray Mattingly >Assignee: Ray Mattingly >Priority: Major > > In [https://github.com/apache/hbase/pull/5666] we introduced default user > quotas, but I accidentally called UserQuotaState's default constructor rather > than passing in the current timestamp. The consequence is that we're > constantly refreshing these default user quotas, and this can be a bottleneck > for horizontal cluster scalability. > This should be a 1 line fix in QuotaUtil's buildDefaultUserQuotaState method. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HBASE-28359) Improve quota RateLimiter synchronization
[ https://issues.apache.org/jira/browse/HBASE-28359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly reassigned HBASE-28359: - Assignee: Ray Mattingly > Improve quota RateLimiter synchronization > - > > Key: HBASE-28359 > URL: https://issues.apache.org/jira/browse/HBASE-28359 > Project: HBase > Issue Type: Improvement >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > > We've been experiencing RpcThrottlingException with 0ms waitInterval. This > seems odd and wasteful, since the client side will immediately retry without > backoff. I think the problem is related to the synchronization of RateLimiter. > The TimeBasedLimiter checkQuota method does the following: > {code:java} > if (!reqSizeLimiter.canExecute(estimateWriteSize + estimateReadSize)) { > RpcThrottlingException.throwRequestSizeExceeded( > reqSizeLimiter.waitInterval(estimateWriteSize + estimateReadSize)); > } {code} > Both canExecute and waitInterval are synchronized, but we're calling them > independently. So it's possible under high concurrency for canExecute to > return false, but then waitInterval returns 0 (would have been true) > I think we should simplify the API to have a single synchronized call: > {code:java} > long waitInterval = reqSizeLimiter.tryAcquire(estimateWriteSize + > estimateReadSize); > if (waitInterval > 0) { > RpcThrottlingException.throwRequestSizeExceeded(waitInterval); > }{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
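The race and the proposed single-call fix can be illustrated with a toy limiter. The refill model here is deliberately trivial and hypothetical; only the synchronization shape matters:

```java
// Toy limiter illustrating the single synchronized tryAcquire() proposed
// above: the availability check and the wait computation happen under one
// lock, so no competing thread can change the answer between them (the
// race that produced 0ms wait intervals in the two-call API).
public class AtomicTryAcquireLimiter {
  private long availableBytes;
  private final long bytesPerMs; // trivial refill model, for illustration

  public AtomicTryAcquireLimiter(long availableBytes, long bytesPerMs) {
    this.availableBytes = availableBytes;
    this.bytesPerMs = bytesPerMs;
  }

  /** Returns 0 if the amount was consumed, else a positive wait hint in ms. */
  public synchronized long tryAcquire(long amount) {
    if (amount <= availableBytes) {
      availableBytes -= amount;
      return 0;
    }
    long deficit = amount - availableBytes;
    return Math.max(1, deficit / bytesPerMs); // never advise a 0ms wait
  }
}
```

Because the caller only ever sees "consumed" or a positive wait, a thrown RpcThrottlingException can no longer carry a 0ms waitInterval.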
[jira] [Created] (HBASE-28349) Atomic requests should increment read usage in quotas
Ray Mattingly created HBASE-28349: - Summary: Atomic requests should increment read usage in quotas Key: HBASE-28349 URL: https://issues.apache.org/jira/browse/HBASE-28349 Project: HBase Issue Type: Improvement Reporter: Ray Mattingly Right now atomic operations are just treated as a single write from the quota perspective. Since an atomic operation also encompasses a read, it would make sense to increment readNum and readSize counts appropriately. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HBASE-28349) Atomic requests should increment read usage in quotas
[ https://issues.apache.org/jira/browse/HBASE-28349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly reassigned HBASE-28349: - Assignee: Ray Mattingly > Atomic requests should increment read usage in quotas > - > > Key: HBASE-28349 > URL: https://issues.apache.org/jira/browse/HBASE-28349 > Project: HBase > Issue Type: Improvement >Reporter: Ray Mattingly >Assignee: Ray Mattingly >Priority: Major > > Right now atomic operations are just treated as a single write from the quota > perspective. Since an atomic operation also encompasses a read, it would make > sense to increment readNum and readSize counts appropriately. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HBASE-28346) Expose checkQuota to Coprocessor Endpoints
[ https://issues.apache.org/jira/browse/HBASE-28346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly reassigned HBASE-28346: - Assignee: Ray Mattingly > Expose checkQuota to Coprocessor Endpoints > -- > > Key: HBASE-28346 > URL: https://issues.apache.org/jira/browse/HBASE-28346 > Project: HBase > Issue Type: Improvement >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > > Coprocessor endpoints may do non-trivial amounts of work, yet quotas do not > throttle them. We can't generically apply quotas to coprocessors because we > have no information on what a particular endpoint might do. One thing we > could do is expose checkQuota to the RegionCoprocessorEnvironment. This way, > coprocessor authors have the tools to ensure that quotas cover their > implementations. > While adding this, we can update AggregationImplementation to call checkQuota > since those endpoints can be quite expensive. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HBASE-27800) Add support for default user quotas using USER => 'all'
[ https://issues.apache.org/jira/browse/HBASE-27800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814887#comment-17814887 ] Ray Mattingly commented on HBASE-27800: --- I've decided to add support here via some new configuration options, and to only support a handful of user throttles. PR coming soon > Add support for default user quotas using USER => 'all' > > > Key: HBASE-27800 > URL: https://issues.apache.org/jira/browse/HBASE-27800 > Project: HBase > Issue Type: Improvement >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > > If someone sets a quota with USER => 'all' (or maybe '*'), treat that as a > default quota for each individual user. When a request comes from a user, it > will lookup current QuotaState based on username. If one doesn't exist, it > will be pre-filled with whatever the 'all' quota was set to. Otherwise, if > you then define a quota for a specific user that will override whatever > default you have set for that user only. -- This message was sent by Atlassian Jira (v8.20.10#820010)
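The default-vs-override lookup described in this issue can be sketched with simplified types (a bare Long limit standing in for a full QuotaState):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of default user quotas: a user-specific quota overrides the
// default, and users without one fall back to the USER => 'all' setting.
// Types are simplified stand-ins for the real quota classes.
public class DefaultUserQuotas {
  private final Map<String, Long> perUser = new HashMap<>();
  private final Long defaultLimit; // from USER => 'all', or null if unset

  public DefaultUserQuotas(Long defaultLimit) {
    this.defaultLimit = defaultLimit;
  }

  public void setUserLimit(String user, long limit) {
    perUser.put(user, limit);
  }

  public Long limitFor(String user) {
    return perUser.getOrDefault(user, defaultLimit);
  }
}
```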
[jira] [Assigned] (HBASE-28215) Region reopen procedure should support some sort of throttling
[ https://issues.apache.org/jira/browse/HBASE-28215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly reassigned HBASE-28215: - Assignee: Ray Mattingly > Region reopen procedure should support some sort of throttling > -- > > Key: HBASE-28215 > URL: https://issues.apache.org/jira/browse/HBASE-28215 > Project: HBase > Issue Type: Improvement > Components: master, proc-v2 >Reporter: Ray Mattingly >Assignee: Ray Mattingly >Priority: Major > > The mass reopening of regions caused by a table descriptor modification can > be quite disruptive. For latency/error sensitive workloads, like our user > facing traffic, we need to be very careful about when we modify table > descriptors, and it can be virtually impossible to do it painlessly for busy > tables. > It would be nice if we supported configurable batching/throttling of > reopenings so that the amplitude of any disruption can be kept relatively > small. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28215) Region reopen procedure should support some sort of throttling
Ray Mattingly created HBASE-28215: - Summary: Region reopen procedure should support some sort of throttling Key: HBASE-28215 URL: https://issues.apache.org/jira/browse/HBASE-28215 Project: HBase Issue Type: Improvement Reporter: Ray Mattingly The mass reopening of regions caused by a table descriptor modification can be quite disruptive. For latency/error sensitive workloads, like our user facing traffic, we need to be very careful about when we modify table descriptors, and it can be virtually impossible to do it painlessly for busy tables. It would be nice if we supported configurable batching/throttling of reopenings so that the amplitude of any disruption can be kept relatively small. -- This message was sent by Atlassian Jira (v8.20.10#820010)
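The batching idea reduces to a simple partitioning step; the procedure-framework integration is omitted in this sketch:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of throttled reopening: split the table's regions into bounded
// batches so that only a small fraction is reopening at any moment. How
// batches are scheduled (and any inter-batch delay) is left out.
public class ReopenBatchPlanner {
  public static <T> List<List<T>> plan(List<T> regions, int batchSize) {
    List<List<T>> batches = new ArrayList<>();
    for (int i = 0; i < regions.size(); i += batchSize) {
      batches.add(new ArrayList<>(
          regions.subList(i, Math.min(i + batchSize, regions.size()))));
    }
    return batches;
  }
}
```

With a batch size of, say, 1% of the table's regions, the disruption amplitude of a table descriptor change stays small at the cost of a longer total reopen time.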
[jira] [Comment Edited] (HBASE-28175) RpcLogDetails' Message can become corrupt before log is consumed
[ https://issues.apache.org/jira/browse/HBASE-28175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779686#comment-17779686 ] Ray Mattingly edited comment on HBASE-28175 at 10/25/23 9:54 PM: - I believe I've confirmed that deep copying the Message field in the RpcLogDetails' constructor is an effective solution here. I'm heading on vacation soon, but will open up a PR here either tomorrow before I leave, or later next week when I'm back. was (Author: JIRAUSER286879): I believe I've confirmed that deep copying the RpcLogDetails Message field is an effective solution here. I'm heading on vacation soon, but will open up a PR here either tomorrow before I leave, or later next week when I'm back. > RpcLogDetails' Message can become corrupt before log is consumed > > > Key: HBASE-28175 > URL: https://issues.apache.org/jira/browse/HBASE-28175 > Project: HBase > Issue Type: Bug >Reporter: Ray Mattingly >Assignee: Ray Mattingly >Priority: Major > > The RpcLogDetails class represents a slow (or large) log event which will > later be consumed by the SlowLogQueueService. > Right now the RpcLogDetails' param field points to the slow call's Message, > and this Message is backed by a CodedInputStream which may be overwritten > before the given log is consumed. This overwriting of the Message may result > in slow query payloads for which the metadata derived post-consumption is > inaccurate. > To solve this bug I think we need to copy the Message in the RpcLogDetails > constructor. I have this bug reproduced in a QA environment and will test out > this idea and open a PR shortly if the test results are promising. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HBASE-28175) RpcLogDetails' Message can become corrupt before log is consumed
[ https://issues.apache.org/jira/browse/HBASE-28175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779686#comment-17779686 ] Ray Mattingly commented on HBASE-28175: --- I believe I've confirmed that deep copying the RpcLogDetails Message field is an effective solution here. I'm heading on vacation soon, but will open up a PR here either tomorrow before I leave, or later next week when I'm back. > RpcLogDetails' Message can become corrupt before log is consumed > > > Key: HBASE-28175 > URL: https://issues.apache.org/jira/browse/HBASE-28175 > Project: HBase > Issue Type: Bug >Reporter: Ray Mattingly >Assignee: Ray Mattingly >Priority: Major > > The RpcLogDetails class represents a slow (or large) log event which will > later be consumed by the SlowLogQueueService. > Right now the RpcLogDetails' param field points to the slow call's Message, > and this Message is backed by a CodedInputStream which may be overwritten > before the given log is consumed. This overwriting of the Message may result > in slow query payloads for which the metadata derived post-consumption is > inaccurate. > To solve this bug I think we need to copy the Message in the RpcLogDetails > constructor. I have this bug reproduced in a QA environment and will test out > this idea and open a PR shortly if the test results are promising. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28175) RpcLogDetails' Message can become corrupt before log is consumed
Ray Mattingly created HBASE-28175: - Summary: RpcLogDetails' Message can become corrupt before log is consumed Key: HBASE-28175 URL: https://issues.apache.org/jira/browse/HBASE-28175 Project: HBase Issue Type: Bug Reporter: Ray Mattingly Assignee: Ray Mattingly The RpcLogDetails class represents a slow (or large) log event which will later be consumed by the SlowLogQueueService. Right now the RpcLogDetails' param field points to the slow call's Message, and this Message is backed by a CodedInputStream which may be overwritten before the given log is consumed. This overwriting of the Message may result in slow query payloads for which the metadata derived post-consumption is inaccurate. To solve this bug I think we need to copy the Message in the RpcLogDetails constructor. I have this bug reproduced in a QA environment and will test out this idea and open a PR shortly if the test results are promising. -- This message was sent by Atlassian Jira (v8.20.10#820010)
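The proposed constructor-time copy can be sketched with a byte[] standing in for the protobuf Message:

```java
import java.util.Arrays;

// Sketch of the fix: copy the payload bytes when the log event is
// constructed, because the buffer backing the original Message may be
// recycled and overwritten before the slow-log consumer reads the event.
// (The real fix copies a protobuf Message; a byte[] stands in here.)
public class RpcLogDetailsSketch {
  private final byte[] param;

  public RpcLogDetailsSketch(byte[] sharedBuffer) {
    this.param = Arrays.copyOf(sharedBuffer, sharedBuffer.length);
  }

  public byte[] getParam() {
    return param;
  }
}
```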
[jira] [Created] (HBASE-28146) ServerManager's rsAdmins map should be thread safe
Ray Mattingly created HBASE-28146: - Summary: ServerManager's rsAdmins map should be thread safe Key: HBASE-28146 URL: https://issues.apache.org/jira/browse/HBASE-28146 Project: HBase Issue Type: Bug Affects Versions: 2.5.5 Reporter: Ray Mattingly Assignee: Ray Mattingly On 2.x [the ServerManager registers admins in a HashMap|https://github.com/apache/hbase/blob/branch-2/hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java]. This can result in thread safety issues — we recently observed an exception which caused a region to be indefinitely stuck in transition until we could manually intervene. We saw the following exception in the HMaster logs: {code:java} 2023-10-11 02:20:05.213 [RSProcedureDispatcher-pool-325] ERROR org.apache.hadoop.hbase.master.procedure.RSProcedureDispatcher: Unexpected error caught, this may cause the procedure to hang forever java.lang.ClassCastException: class java.util.HashMap$Node cannot be cast to class java.util.HashMap$TreeNode (java.util.HashMap$Node and java.util.HashMap$TreeNode are in module java.base of loader 'bootstrap') at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1900) ~[?:?] at java.util.HashMap$TreeNode.treeify(HashMap.java:2016) ~[?:?] at java.util.HashMap.treeifyBin(HashMap.java:768) ~[?:?] at java.util.HashMap.putVal(HashMap.java:640) ~[?:?] at java.util.HashMap.put(HashMap.java:608) ~[?:?] at org.apache.hadoop.hbase.master.ServerManager.getRsAdmin(ServerManager.java:723){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
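A sketch of the likely fix, assuming a swap to a concurrent map; AdminStub is a hypothetical stand-in for the real admin client type:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// A plain HashMap mutated from multiple dispatcher threads can corrupt its
// internal tree bins, producing exactly the ClassCastException in the stack
// trace above. ConcurrentHashMap with computeIfAbsent gives an atomic
// get-or-create with no external locking.
public class RsAdminRegistry {
  static class AdminStub {} // stand-in for the per-server admin client

  private final Map<String, AdminStub> rsAdmins = new ConcurrentHashMap<>();

  public AdminStub getRsAdmin(String serverName) {
    return rsAdmins.computeIfAbsent(serverName, k -> new AdminStub());
  }
}
```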
[jira] [Commented] (HBASE-27800) Add support for default user quotas using USER => 'all'
[ https://issues.apache.org/jira/browse/HBASE-27800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17771554#comment-17771554 ] Ray Mattingly commented on HBASE-27800: --- A small detail, I think we should prefer 'all' to '*' as our wildcard here because the precedent has already been set for RegionServer quotas. > Add support for default user quotas using USER => 'all' > > > Key: HBASE-27800 > URL: https://issues.apache.org/jira/browse/HBASE-27800 > Project: HBase > Issue Type: Improvement >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > > If someone sets a quota with USER => 'all' (or maybe '*'), treat that as a > default quota for each individual user. When a request comes from a user, it > will lookup current QuotaState based on username. If one doesn't exist, it > will be pre-filled with whatever the 'all' quota was set to. Otherwise, if > you then define a quota for a specific user that will override whatever > default you have set for that user only. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (HBASE-27784) Custom quota groupings
[ https://issues.apache.org/jira/browse/HBASE-27784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766940#comment-17766940 ] Ray Mattingly edited comment on HBASE-27784 at 9/20/23 12:16 PM: - It isn't exactly what's described in the issue, but I want to propose [this draft|https://github.com/apache/hbase/compare/master...HubSpot:hbase:HBASE-27784-draft]. Basically we add support for a request attribute {{quota.user.override}} which, when configured, takes precedent when determining which user quota to apply to the given request. This allows for us to throttle distinct requests from a shared connection as is demonstrated by the added unit test. One could achieve the "quota group" idea described above by submitting hadoop jobs as a single user override (e.g., {{{}hadoop{}}}). It could also satisfy upstream caller distinctions within a proxy API's shared connection object by configuring the user override based on some identifying characteristic of the upstream caller. It also implicitly solves the conflict between user and group quotas because they're one in the same here — it's the requests that are different. was (Author: JIRAUSER286879): It isn't exactly what's described in the issue, but I want to propose [this draft|https://github.com/apache/hbase/compare/master...HubSpot:hbase:HBASE-27784-draft]. Basically we add support for a request attribute {{quota.user.override}} which, when configured, takes precedent when determining which user quota to apply to the given request. This allows for us to throttle distinct requests from a shared connection as is demonstrated by the added unit test. One could achieve the "quota group" idea described above by submitting hadoop jobs as a single user override (e.g., {{{}hadoop{}}}). It could also satisfy upstream caller distinctions within a proxy API's shared connection object by configuring the user override based on some identifying characteristic of the upstream caller. 
> Custom quota groupings > -- > > Key: HBASE-27784 > URL: https://issues.apache.org/jira/browse/HBASE-27784 > Project: HBase > Issue Type: New Feature >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > > Currently we provide the ability to define quotas for namespaces, tables, or > users. On multi-tenant clusters, users may be broken down into groups based > on their use-case. For us this comes down to 2 main cases: > # Hadoop jobs – it would be good to be able to limit all hadoop jobs in > aggregate > # Proxy APIs - this is common where upstream callers don't hit hbase > directly, instead they go through one of many proxy api's. For us we have a > custom auth plugin which sets the username to the upstream caller name. But > it would still be useful to be able to limit all usage from some particular > proxy API in aggregate. > I think this could build upon the idea for Connection attributes in > HBASE-27657. Basically when a Connection is established we can set an > attribute (i.e. quotaGrouping=hadoop or quotaGrouping=MyProxyAPI). In > QuotaCache, we can add a {{getQuotaGroupLimiter(String groupName)}} and also > allow someone to define quotas using {{set_quota TYPE => THROTTLE, GROUP => > 'hadoop', LIMIT => '100M/sec'}} > I need to do more investigation into whether we'd want to return a simple > group limiter (more similar to table/namespace handling) or treat it more > like the USER limiters which returns a QuotaState (so you can limit > by-group-by-table). > We need to consider how GROUP quotas interact with USER quotas. If a user has > a quota defined, and that user is also part of a group with a quota defined, > does the request need to honor both quotas? Maybe we provide a GROUP_BYPASS > setting, similar to GLOBAL_BYPASS? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (HBASE-27784) Custom quota groupings
[ https://issues.apache.org/jira/browse/HBASE-27784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766940#comment-17766940 ] Ray Mattingly edited comment on HBASE-27784 at 9/19/23 11:18 PM: - It isn't exactly what's described in the issue, but I want to propose [this draft|https://github.com/apache/hbase/compare/master...HubSpot:hbase:HBASE-27784-draft]. Basically we add support for a request attribute {{quota.user.override}} which, when configured, takes precedent when determining which user quota to apply to the given request. This allows for us to throttle distinct requests from a shared connection as is demonstrated by the added unit test. One could achieve the "quota group" idea described above by submitting hadoop jobs as a single user override (e.g., {{{}hadoop{}}}). It could also satisfy upstream caller distinctions within a proxy API's shared connection object by configuring the user override based on some identifying characteristic of the upstream caller. was (Author: JIRAUSER286879): It isn't exactly what's described in the issue, but I want to propose [this draft|https://github.com/apache/hbase/compare/master...HubSpot:hbase:HBASE-27784-draft]. Basically we add support for a request attribute {{quota.user.override}} which, when configured, takes precedent when determining which user quota to apply to the given request. This allows for us to throttle distinct requests from a shared connection as is demonstrated by the added unit test. > Custom quota groupings > -- > > Key: HBASE-27784 > URL: https://issues.apache.org/jira/browse/HBASE-27784 > Project: HBase > Issue Type: New Feature >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > > Currently we provide the ability to define quotas for namespaces, tables, or > users. On multi-tenant clusters, users may be broken down into groups based > on their use-case. 
For us this comes down to 2 main cases: > # Hadoop jobs – it would be good to be able to limit all hadoop jobs in > aggregate > # Proxy APIs - this is common where upstream callers don't hit hbase > directly, instead they go through one of many proxy api's. For us we have a > custom auth plugin which sets the username to the upstream caller name. But > it would still be useful to be able to limit all usage from some particular > proxy API in aggregate. > I think this could build upon the idea for Connection attributes in > HBASE-27657. Basically when a Connection is established we can set an > attribute (i.e. quotaGrouping=hadoop or quotaGrouping=MyProxyAPI). In > QuotaCache, we can add a {{getQuotaGroupLimiter(String groupName)}} and also > allow someone to define quotas using {{set_quota TYPE => THROTTLE, GROUP => > 'hadoop', LIMIT => '100M/sec'}} > I need to do more investigation into whether we'd want to return a simple > group limiter (more similar to table/namespace handling) or treat it more > like the USER limiters which returns a QuotaState (so you can limit > by-group-by-table). > We need to consider how GROUP quotas interact with USER quotas. If a user has > a quota defined, and that user is also part of a group with a quota defined, > does the request need to honor both quotas? Maybe we provide a GROUP_BYPASS > setting, similar to GLOBAL_BYPASS? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HBASE-27784) Custom quota groupings
[ https://issues.apache.org/jira/browse/HBASE-27784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766940#comment-17766940 ] Ray Mattingly commented on HBASE-27784: --- It isn't exactly what's described in the issue, but I want to propose [this draft|https://github.com/apache/hbase/compare/master...HubSpot:hbase:HBASE-27784-draft]. Basically we add support for a request attribute {{quota.user.override}} which, when configured, takes precedence when determining which user quota to apply to the given request. This allows us to throttle distinct requests from a shared connection as is demonstrated by the added unit test. > Custom quota groupings > -- > > Key: HBASE-27784 > URL: https://issues.apache.org/jira/browse/HBASE-27784 > Project: HBase > Issue Type: New Feature >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > > Currently we provide the ability to define quotas for namespaces, tables, or > users. On multi-tenant clusters, users may be broken down into groups based > on their use-case. For us this comes down to 2 main cases: > # Hadoop jobs – it would be good to be able to limit all hadoop jobs in > aggregate > # Proxy APIs - this is common where upstream callers don't hit hbase > directly, instead they go through one of many proxy api's. For us we have a > custom auth plugin which sets the username to the upstream caller name. But > it would still be useful to be able to limit all usage from some particular > proxy API in aggregate. > I think this could build upon the idea for Connection attributes in > HBASE-27657. Basically when a Connection is established we can set an > attribute (i.e. quotaGrouping=hadoop or quotaGrouping=MyProxyAPI). 
In > QuotaCache, we can add a {{getQuotaGroupLimiter(String groupName)}} and also > allow someone to define quotas using {{set_quota TYPE => THROTTLE, GROUP => > 'hadoop', LIMIT => '100M/sec'}} > I need to do more investigation into whether we'd want to return a simple > group limiter (more similar to table/namespace handling) or treat it more > like the USER limiters which returns a QuotaState (so you can limit > by-group-by-table). > We need to consider how GROUP quotas interact with USER quotas. If a user has > a quota defined, and that user is also part of a group with a quota defined, > does the request need to honor both quotas? Maybe we provide a GROUP_BYPASS > setting, similar to GLOBAL_BYPASS? -- This message was sent by Atlassian Jira (v8.20.10#820010)
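The quota.user.override behavior from the linked draft can be sketched with the attribute plumbing simplified to a plain Map:

```java
import java.util.Map;

// Sketch of the draft's idea: when a request carries the
// quota.user.override attribute, that value is used for quota lookup
// instead of the authenticated username. Attribute plumbing is simplified
// to a plain Map; the real code reads per-request attributes.
public class QuotaUserResolver {
  static final String OVERRIDE_ATTR = "quota.user.override";

  public static String effectiveQuotaUser(String authenticatedUser,
                                          Map<String, String> requestAttributes) {
    return requestAttributes.getOrDefault(OVERRIDE_ATTR, authenticatedUser);
  }
}
```

This is how a group effect falls out: all Hadoop jobs setting the override to the same value (e.g. "hadoop") share one user quota, regardless of which authenticated user submitted them.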
[jira] [Assigned] (HBASE-28010) Connection attributes can become corrupted on the server side
[ https://issues.apache.org/jira/browse/HBASE-28010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly reassigned HBASE-28010: - Assignee: Ray Mattingly > Connection attributes can become corrupted on the server side > - > > Key: HBASE-28010 > URL: https://issues.apache.org/jira/browse/HBASE-28010 > Project: HBase > Issue Type: Bug >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > > In ServerRpcConnection.processOneRpc, it calls processConnectionHeader and > then immediately calls callCleanupIfNeeded. The parsing of the ByteBuff into > the ConnectionHeader does not copy the bytes. We keep a reference to > ConnectionHeader for later use, but since the underlying ByteBuff gets > released in callCleanupIfNeeded, later requests can overwrite the memory > locations that the ConnectionHeader points at. > The unit tests we added don't catch this, possibly because they don't send > enough requests to corrupt the buffers. It happens pretty quickly in a > deployed cluster. > We need to copy the List<NameBytesPair> from the ConnectionHeader into a Map<String, byte[]> > before the buffer is released. This probably means we should remove > getConnectionHeader from the RpcCall interface and instead add > getConnectionAttributes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
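The bug and the fix described above can be modeled in a few lines of Python: a parsed header that holds views into a pooled buffer silently changes once the buffer is recycled, while an eager copy survives. The functions and the (name, offset, length) layout are illustrative, not HBase's actual parsing code.

```python
# Sketch of the corruption: the "lazy" parse keeps views into the pooled
# buffer (no copy), the "copied" parse materializes the bytes up front.
def parse_attributes_lazy(buf, pairs):
    # pairs: list of (name, offset, length) into the pooled buffer
    return {name: memoryview(buf)[off:off + ln] for name, off, ln in pairs}

def parse_attributes_copied(buf, pairs):
    # the fix: copy the bytes before the buffer can be recycled
    return {name: bytes(buf[off:off + ln]) for name, off, ln in pairs}

buf = bytearray(b"group=hadoop....")
lazy = parse_attributes_lazy(buf, [("group", 6, 6)])
safe = parse_attributes_copied(buf, [("group", 6, 6)])
buf[6:12] = b"XXXXXX"  # buffer reused by a later request
print(bytes(lazy["group"]))  # b'XXXXXX' -- corrupted
print(safe["group"])         # b'hadoop' -- still intact
```

The same effect happens server-side with a released ByteBuff: the ConnectionHeader keeps pointing at memory that later requests overwrite.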
[jira] [Assigned] (HBASE-28002) Add Get, Mutate, and Multi operations to slow log params
[ https://issues.apache.org/jira/browse/HBASE-28002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly reassigned HBASE-28002: - Assignee: Ray Mattingly > Add Get, Mutate, and Multi operations to slow log params > > > Key: HBASE-28002 > URL: https://issues.apache.org/jira/browse/HBASE-28002 > Project: HBase > Issue Type: Improvement >Reporter: Ray Mattingly >Assignee: Ray Mattingly >Priority: Major > > In https://issues.apache.org/jira/browse/HBASE-27536 we added the ability to > include Scan operations in the slow log params. It would be useful to include > more operations too. Beyond just showing the shape of the request to slow log > readers, this would also ensure that operation attributes can be inferred. > There are a few complications to consider for some operation types: > * Mutate: > ** we should probably strip the columns from these puts. Otherwise we might > produce unpredictably large slow log payloads, and there are potentially > security concerns to consider > * Multi > ** we should also consider stripping columns from these requests > ** (configurably?) limiting the number of operations that can be included. > For example, maybe we only want to include 5 operations on a slow log payload > for a 100 operation MultiRequest for the sake of brevity > ** we may want to deduplicate operation attributes. I'm not really sure how > we'd do this without the output being misleading -- This message was sent by Atlassian Jira (v8.20.10#820010)
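The Multi handling discussed above (strip cell values, cap the number of operations recorded) could look roughly like this. The 5-operation cap, field names, and summary shape are all illustrative guesses, not HBase's actual slow-log format.

```python
# Hypothetical sketch: record only the shape of a MultiRequest in the slow
# log -- a bounded number of operations, with columns/values stripped.
MAX_OPS_IN_SLOW_LOG = 5  # illustrative; the issue suggests making this configurable

def summarize_multi(operations):
    shown = [
        {"type": op["type"], "row": op["row"]}  # columns/values stripped
        for op in operations[:MAX_OPS_IN_SLOW_LOG]
    ]
    return {"ops": shown, "total_ops": len(operations),
            "truncated": len(operations) > MAX_OPS_IN_SLOW_LOG}

batch = [{"type": "PUT", "row": f"row{i}", "value": "big"} for i in range(100)]
summary = summarize_multi(batch)
print(summary["total_ops"], len(summary["ops"]), summary["truncated"])  # 100 5 True
```

Keeping the total count alongside the truncated sample avoids misleading readers about the real size of the request.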
[jira] [Created] (HBASE-28002) Add Get, Mutate, and Multi operations to slow log params
Ray Mattingly created HBASE-28002: - Summary: Add Get, Mutate, and Multi operations to slow log params Key: HBASE-28002 URL: https://issues.apache.org/jira/browse/HBASE-28002 Project: HBase Issue Type: Improvement Reporter: Ray Mattingly In https://issues.apache.org/jira/browse/HBASE-27536 we added the ability to include Scan operations in the slow log params. It would be useful to include more operations too. Beyond just showing the shape of the request to slow log readers, this would also ensure that operation attributes can be inferred. There are a few complications to consider for some operation types: * Mutate: ** we should probably strip the columns from these puts. Otherwise we might produce unpredictably large slow log payloads, and there are potentially security concerns to consider * Multi ** we should also consider stripping columns from these requests ** (configurably?) limiting the number of operations that can be included. For example, maybe we only want to include 5 operations on a slow log payload for a 100 operation MultiRequest for the sake of brevity ** we may want to deduplicate operation attributes. I'm not really sure how we'd do this without the output being misleading -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28001) Add request attribute support to BufferedMutator
Ray Mattingly created HBASE-28001: - Summary: Add request attribute support to BufferedMutator Key: HBASE-28001 URL: https://issues.apache.org/jira/browse/HBASE-28001 Project: HBase Issue Type: Improvement Reporter: Ray Mattingly In https://issues.apache.org/jira/browse/HBASE-27657 we added support for specifying connection and request attributes. One oversight was including support for doing so via the BufferedMutator class. We should add such support in a follow up PR. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HBASE-28001) Add request attribute support to BufferedMutator
[ https://issues.apache.org/jira/browse/HBASE-28001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly reassigned HBASE-28001: - Assignee: Ray Mattingly > Add request attribute support to BufferedMutator > > > Key: HBASE-28001 > URL: https://issues.apache.org/jira/browse/HBASE-28001 > Project: HBase > Issue Type: Improvement >Reporter: Ray Mattingly >Assignee: Ray Mattingly >Priority: Major > > In https://issues.apache.org/jira/browse/HBASE-27657 we added support for > specifying connection and request attributes. One oversight was including > support for doing so via the BufferedMutator class. We should add such > support in a follow up PR. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HBASE-27981) Add connection, request, and operation attributes to slow log
[ https://issues.apache.org/jira/browse/HBASE-27981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745279#comment-17745279 ] Ray Mattingly commented on HBASE-27981: --- Yeah agreed, we could definitely add single request operation attributes to the params > Add connection, request, and operation attributes to slow log > - > > Key: HBASE-27981 > URL: https://issues.apache.org/jira/browse/HBASE-27981 > Project: HBase > Issue Type: Improvement >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > > These can help users diagnose slow requests by pushing identifying > information into the log. It might make sense to union them into a single > field or put them in separate fields. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HBASE-27981) Add connection, request, and operation attributes to slow log
[ https://issues.apache.org/jira/browse/HBASE-27981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745211#comment-17745211 ] Ray Mattingly commented on HBASE-27981: --- It should be trivial to add request & connection attributes to the slow logs once HBASE-27657 is merged. I've written up [this branch|https://github.com/HubSpot/hbase/compare/HBASE-27657-custom-rpc-controller...HubSpot:hbase:HBASE-27981] as a proof of concept in the meantime. Operation attributes will be trickier: we'll need to parse the messages appropriately (either reiterating some logic or significantly refactoring our current payload derivation), think through what we'll do with large multi requests that contain tons of attributes, etc. I wonder whether we're better off omitting that work from this ticket, or at least from the first PR. > Add connection, request, and operation attributes to slow log > - > > Key: HBASE-27981 > URL: https://issues.apache.org/jira/browse/HBASE-27981 > Project: HBase > Issue Type: Improvement >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > > These can help users diagnose slow requests by pushing identifying > information into the log. It might make sense to union them into a single > field or put them in separate fields. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HBASE-27981) Add connection, request, and operation attributes to slow log
[ https://issues.apache.org/jira/browse/HBASE-27981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly reassigned HBASE-27981: - Assignee: Ray Mattingly > Add connection, request, and operation attributes to slow log > - > > Key: HBASE-27981 > URL: https://issues.apache.org/jira/browse/HBASE-27981 > Project: HBase > Issue Type: Improvement >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > > These can help users diagnose slow requests by pushing identifying > information into the log. It might make sense to union them into a single > field or put them in separate fields. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HBASE-27657) Connection and Request Attributes
[ https://issues.apache.org/jira/browse/HBASE-27657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17743955#comment-17743955 ] Ray Mattingly commented on HBASE-27657: --- I wrote up a draft PR demonstrating a TableBuilder request attributes implementation here: https://github.com/apache/hbase/pull/5326 > Connection and Request Attributes > - > > Key: HBASE-27657 > URL: https://issues.apache.org/jira/browse/HBASE-27657 > Project: HBase > Issue Type: New Feature >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > > Currently we have the ability to set Operation attributes, via > Get.setAttribute, etc. It would be useful to be able to set attributes at the > request and connection level. > These levels can result in less duplication. For example, send some > attributes once per connection instead of for every one of the millions of > requests a connection might send. Or send once for the request, instead of > duplicating on every operation in a multi request. > Additionally, the Connection and RequestHeader are more globally available on > the server side. Both can be accessed via RpcServer.getCurrentCall(), which > is useful in various integration points – coprocessors, custom queues, > quotas, slow log, etc. Operation attributes are harder to access because you > need to parse the raw Message into the appropriate type to get access to the > getter. > I was thinking adding two new methods to Connection interface: > - setAttribute (and getAttribute/getAttributes) > - setRequestAttributeProvider > Any Connection attributes would be set onto the ConnectionHeader during > initialization. The RequestAttributeProvider would be called when creating > each RequestHeader. > An alternative to setRequestAttributeProvider would be to add this into > HBaseRpcController, which can already be customized via site configuration. -- This message was sent by Atlassian Jira (v8.20.10#820010)
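The two hooks proposed in the issue description (static attributes sent once in the ConnectionHeader, plus a provider invoked for each RequestHeader) can be sketched as a toy model. This mirrors the proposal's shape only; the class and method names are illustrative and not the API HBase ultimately shipped.

```python
# Toy model of the proposed Connection-level API: setAttribute for
# connection-scoped attributes, setRequestAttributeProvider for per-request
# attributes computed at RequestHeader build time.
class Connection:
    def __init__(self):
        self._conn_attrs = {}                    # sent once, on connection setup
        self._req_attr_provider = lambda: {}     # called per RequestHeader

    def set_attribute(self, name, value):
        self._conn_attrs[name] = value

    def set_request_attribute_provider(self, provider):
        self._req_attr_provider = provider

    def build_request_header(self):
        return {"request_attributes": dict(self._req_attr_provider())}

conn = Connection()
conn.set_attribute("quotaGrouping", b"hadoop")   # e.g. the grouping idea from HBASE-27784
seq = iter(range(100))
conn.set_request_attribute_provider(lambda: {"req.id": str(next(seq)).encode()})
print(conn.build_request_header())  # {'request_attributes': {'req.id': b'0'}}
print(conn.build_request_header())  # {'request_attributes': {'req.id': b'1'}}
```

The split matters because connection attributes cost one send per connection, while the provider lets per-request values (trace ids, caller tags) vary without duplicating them on every operation.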
[jira] [Created] (HBASE-27975) Region (un)assignment should have a more direct timeout
Ray Mattingly created HBASE-27975: - Summary: Region (un)assignment should have a more direct timeout Key: HBASE-27975 URL: https://issues.apache.org/jira/browse/HBASE-27975 Project: HBase Issue Type: Improvement Reporter: Ray Mattingly h3. Problem We've observed a few cases in which region (un)assignment can hang for significant, and sometimes seemingly indefinite, periods of time. This results in unpredictably long downtime which must be remediated via manually initiated ServerCrashProcedures. h3. Example 1 If a RS is unable to communicate with the NameNode and is asked to close a region, then its RS_CLOSE_REGION thread will get stuck awaiting a NN failover. Due to several default configurations of options like: * hbase.hstore.flush.retries.number * hbase.server.pause * dfs.client.failover.max.attempts * dfs.client.failover.sleep.base.millis * dfs.client.failover.max.attempts this region unassignment attempt will hang for approximately 30 minutes before it allows the failure to bubble up and automatically trigger a ServerCrashProcedure. One can tune the aforementioned options to reduce the TTR here, but it's not a very obvious/direct solution. h3. Example 2 In rare cases our public cloud provider may supply us with machines that have degraded hardware. If we're unable to catch this degradation prior to startup, then we've observed that the degraded RegionServer process may come online; as a result it will be assigned regions which can often never actually be successfully opened. If the RegionServer's assignment handling does not intentionally fail fast, then there will be no outside intervention; the assignment will hang indefinitely. I've written [a unit test|https://github.com/apache/hbase/compare/master...HubSpot:hbase:rsit-opening-repro] which reproduces this behavior. On this same branch is a unit test demonstrating that a timeout placed on the AssignRegionHandler helps it fail fast and reliably trigger the necessary ServerCrashProcedure. h3. 
Proposal I want to propose that we add optional and configurable timeouts to the AssignRegion and UnassignRegion event handlers. This would allow us to much more directly prevent long-running retries for these downtime-inducing procedures, and could consequently improve our reliability in both examples. -- This message was sent by Atlassian Jira (v8.20.10#820010)
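The proposed timeout can be sketched generically: run the handler on a worker and fail fast when it doesn't finish in time, rather than letting retries run for ~30 minutes. This is a language-neutral illustration in Python, not HBase's event-handler code; the handler and timeout value are made up.

```python
# Generic sketch: bound a possibly-hung handler with an explicit timeout.
import concurrent.futures
import time

def run_with_timeout(handler, timeout_sec):
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(handler)
        return future.result(timeout=timeout_sec)
    finally:
        pool.shutdown(wait=False)  # don't block on the hung worker

def hung_close_region():
    time.sleep(2)  # e.g. stuck awaiting a NameNode failover

try:
    run_with_timeout(hung_close_region, timeout_sec=0.1)
except concurrent.futures.TimeoutError:
    # in HBase terms, this is where the failure would bubble up and
    # trigger a ServerCrashProcedure instead of hanging
    print("region close timed out -> escalate to ServerCrashProcedure")
```

The key design point is that the timeout makes the failure explicit and immediate, so the recovery machinery fires deterministically instead of depending on tuned retry knobs.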
[jira] [Commented] (HBASE-27657) Connection and Request Attributes
[ https://issues.apache.org/jira/browse/HBASE-27657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17743225#comment-17743225 ] Ray Mattingly commented on HBASE-27657: --- I've updated [the design doc|https://docs.google.com/document/d/1cGEmUn2kAPhn_Q18DvAOhigtCbnQ8ia6oV4OvMb5DyU/edit?usp=sharing] to reflect our new TableBuilder interface for request attributes and have a branch which has proved the concept. Does this seem like a more suitable design to you [~zhangduo]? > Connection and Request Attributes > - > > Key: HBASE-27657 > URL: https://issues.apache.org/jira/browse/HBASE-27657 > Project: HBase > Issue Type: New Feature >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > > Currently we have the ability to set Operation attributes, via > Get.setAttribute, etc. It would be useful to be able to set attributes at the > request and connection level. > These levels can result in less duplication. For example, send some > attributes once per connection instead of for every one of the millions of > requests a connection might send. Or send once for the request, instead of > duplicating on every operation in a multi request. > Additionally, the Connection and RequestHeader are more globally available on > the server side. Both can be accessed via RpcServer.getCurrentCall(), which > is useful in various integration points – coprocessors, custom queues, > quotas, slow log, etc. Operation attributes are harder to access because you > need to parse the raw Message into the appropriate type to get access to the > getter. > I was thinking adding two new methods to Connection interface: > - setAttribute (and getAttribute/getAttributes) > - setRequestAttributeProvider > Any Connection attributes would be set onto the ConnectionHeader during > initialization. The RequestAttributeProvider would be called when creating > each RequestHeader. 
> An alternative to setRequestAttributeProvider would be to add this into > HBaseRpcController, which can already be customized via site configuration. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HBASE-27657) Connection and Request Attributes
[ https://issues.apache.org/jira/browse/HBASE-27657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17742194#comment-17742194 ] Ray Mattingly commented on HBASE-27657: --- Thanks for the feedback here, that's a fair criticism for sure. I'm going to explore whether we can add support for request header configuration in the TableBuilder > Connection and Request Attributes > - > > Key: HBASE-27657 > URL: https://issues.apache.org/jira/browse/HBASE-27657 > Project: HBase > Issue Type: New Feature >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > > Currently we have the ability to set Operation attributes, via > Get.setAttribute, etc. It would be useful to be able to set attributes at the > request and connection level. > These levels can result in less duplication. For example, send some > attributes once per connection instead of for every one of the millions of > requests a connection might send. Or send once for the request, instead of > duplicating on every operation in a multi request. > Additionally, the Connection and RequestHeader are more globally available on > the server side. Both can be accessed via RpcServer.getCurrentCall(), which > is useful in various integration points – coprocessors, custom queues, > quotas, slow log, etc. Operation attributes are harder to access because you > need to parse the raw Message into the appropriate type to get access to the > getter. > I was thinking adding two new methods to Connection interface: > - setAttribute (and getAttribute/getAttributes) > - setRequestAttributeProvider > Any Connection attributes would be set onto the ConnectionHeader during > initialization. The RequestAttributeProvider would be called when creating > each RequestHeader. > An alternative to setRequestAttributeProvider would be to add this into > HBaseRpcController, which can already be customized via site configuration. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HBASE-27657) Connection and Request Attributes
[ https://issues.apache.org/jira/browse/HBASE-27657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17741604#comment-17741604 ] Ray Mattingly commented on HBASE-27657: --- Is the HBaseRpcController trivially available to clients when generating requests? I think inaccessibility is the main blocker for just adding getters/setters there > Connection and Request Attributes > - > > Key: HBASE-27657 > URL: https://issues.apache.org/jira/browse/HBASE-27657 > Project: HBase > Issue Type: New Feature >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > > Currently we have the ability to set Operation attributes, via > Get.setAttribute, etc. It would be useful to be able to set attributes at the > request and connection level. > These levels can result in less duplication. For example, send some > attributes once per connection instead of for every one of the millions of > requests a connection might send. Or send once for the request, instead of > duplicating on every operation in a multi request. > Additionally, the Connection and RequestHeader are more globally available on > the server side. Both can be accessed via RpcServer.getCurrentCall(), which > is useful in various integration points – coprocessors, custom queues, > quotas, slow log, etc. Operation attributes are harder to access because you > need to parse the raw Message into the appropriate type to get access to the > getter. > I was thinking adding two new methods to Connection interface: > - setAttribute (and getAttribute/getAttributes) > - setRequestAttributeProvider > Any Connection attributes would be set onto the ConnectionHeader during > initialization. The RequestAttributeProvider would be called when creating > each RequestHeader. > An alternative to setRequestAttributeProvider would be to add this into > HBaseRpcController, which can already be customized via site configuration. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HBASE-27657) Connection and Request Attributes
[ https://issues.apache.org/jira/browse/HBASE-27657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737337#comment-17737337 ] Ray Mattingly commented on HBASE-27657: --- I've written up [a basic design doc|https://docs.google.com/document/d/1cGEmUn2kAPhn_Q18DvAOhigtCbnQ8ia6oV4OvMb5DyU/edit?usp=sharing] to pair with [our initial PR|https://github.com/apache/hbase/pull/5306]. > Connection and Request Attributes > - > > Key: HBASE-27657 > URL: https://issues.apache.org/jira/browse/HBASE-27657 > Project: HBase > Issue Type: New Feature >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > > Currently we have the ability to set Operation attributes, via > Get.setAttribute, etc. It would be useful to be able to set attributes at the > request and connection level. > These levels can result in less duplication. For example, send some > attributes once per connection instead of for every one of the millions of > requests a connection might send. Or send once for the request, instead of > duplicating on every operation in a multi request. > Additionally, the Connection and RequestHeader are more globally available on > the server side. Both can be accessed via RpcServer.getCurrentCall(), which > is useful in various integration points – coprocessors, custom queues, > quotas, slow log, etc. Operation attributes are harder to access because you > need to parse the raw Message into the appropriate type to get access to the > getter. > I was thinking adding two new methods to Connection interface: > - setAttribute (and getAttribute/getAttributes) > - setRequestAttributeProvider > Any Connection attributes would be set onto the ConnectionHeader during > initialization. The RequestAttributeProvider would be called when creating > each RequestHeader. > An alternative to setRequestAttributeProvider would be to add this into > HBaseRpcController, which can already be customized via site configuration. 
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HBASE-27784) Custom quota groupings
[ https://issues.apache.org/jira/browse/HBASE-27784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly reassigned HBASE-27784: - Assignee: Ray Mattingly > Custom quota groupings > -- > > Key: HBASE-27784 > URL: https://issues.apache.org/jira/browse/HBASE-27784 > Project: HBase > Issue Type: New Feature >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > > Currently we provide the ability to define quotas for namespaces, tables, or > users. On multi-tenant clusters, users may be broken down into groups based > on their use-case. For us this comes down to 2 main cases: > # Hadoop jobs – it would be good to be able to limit all hadoop jobs in > aggregate > # Proxy APIs - this is common where upstream callers don't hit hbase > directly, instead they go through one of many proxy api's. For us we have a > custom auth plugin which sets the username to the upstream caller name. But > it would still be useful to be able to limit all usage from some particular > proxy API in aggregate. > I think this could build upon the idea for Connection attributes in > HBASE-27657. Basically when a Connection is established we can set an > attribute (i.e. quotaGrouping=hadoop or quotaGrouping=MyProxyAPI). In > QuotaCache, we can add a {{getQuotaGroupLimiter(String groupName)}} and also > allow someone to define quotas using {{set_quota TYPE => THROTTLE, GROUP => > 'hadoop', LIMIT => '100M/sec'}} > I need to do more investigation into whether we'd want to return a simple > group limiter (more similar to table/namespace handling) or treat it more > like the USER limiters which returns a QuotaState (so you can limit > by-group-by-table). > We need to consider how GROUP quotas interact with USER quotas. If a user has > a quota defined, and that user is also part of a group with a quota defined, > does the request need to honor both quotas? Maybe we provide a GROUP_BYPASS > setting, similar to GLOBAL_BYPASS? 
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HBASE-27657) Connection and Request Attributes
[ https://issues.apache.org/jira/browse/HBASE-27657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly reassigned HBASE-27657: - Assignee: Ray Mattingly > Connection and Request Attributes > - > > Key: HBASE-27657 > URL: https://issues.apache.org/jira/browse/HBASE-27657 > Project: HBase > Issue Type: New Feature >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > > Currently we have the ability to set Operation attributes, via > Get.setAttribute, etc. It would be useful to be able to set attributes at the > request and connection level. > These levels can result in less duplication. For example, send some > attributes once per connection instead of for every one of the millions of > requests a connection might send. Or send once for the request, instead of > duplicating on every operation in a multi request. > Additionally, the Connection and RequestHeader are more globally available on > the server side. Both can be accessed via RpcServer.getCurrentCall(), which > is useful in various integration points – coprocessors, custom queues, > quotas, slow log, etc. Operation attributes are harder to access because you > need to parse the raw Message into the appropriate type to get access to the > getter. > I was thinking adding two new methods to Connection interface: > - setAttribute (and getAttribute/getAttributes) > - setRequestAttributeProvider > Any Connection attributes would be set onto the ConnectionHeader during > initialization. The RequestAttributeProvider would be called when creating > each RequestHeader. > An alternative to setRequestAttributeProvider would be to add this into > HBaseRpcController, which can already be customized via site configuration. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HBASE-27800) Add support for default user quotas using USER => 'all'
[ https://issues.apache.org/jira/browse/HBASE-27800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly reassigned HBASE-27800: - Assignee: Ray Mattingly > Add support for default user quotas using USER => 'all' > > > Key: HBASE-27800 > URL: https://issues.apache.org/jira/browse/HBASE-27800 > Project: HBase > Issue Type: Improvement >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > > If someone sets a quota with USER => 'all' (or maybe '*'), treat that as a > default quota for each individual user. When a request comes from a user, it > will look up the current QuotaState based on username. If one doesn't exist, it > will be pre-filled with whatever the 'all' quota was set to. If > you then define a quota for a specific user, that will override whatever > default you have set, for that user only. -- This message was sent by Atlassian Jira (v8.20.10#820010)
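The lookup described above is a simple fallback: an explicit per-user quota wins, otherwise the 'all' default applies. A minimal sketch, with a plain dict standing in for the quota table and the 'all' key taken from the proposal:

```python
# Hypothetical sketch of default-user-quota resolution.
DEFAULT_USER = "all"

def get_user_quota(quotas, username):
    if username in quotas:
        return quotas[username]       # explicit per-user quota overrides
    return quotas.get(DEFAULT_USER)   # else fall back to the 'all' default

quotas = {"all": "100req/sec", "etl-user": "10req/sec"}
print(get_user_quota(quotas, "alice"))     # 100req/sec (seeded from the default)
print(get_user_quota(quotas, "etl-user"))  # 10req/sec (explicit override)
```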
[jira] [Commented] (HBASE-27553) SlowLog does not include params for Mutations
[ https://issues.apache.org/jira/browse/HBASE-27553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719940#comment-17719940 ] Ray Mattingly commented on HBASE-27553: --- {quote}Currently it handles MutationProto, but it should be MutateRequest {quote} I think it does handle MutateRequest lower down: [https://github.com/apache/hbase/blame/master/hbase-client/src/main/java/org/apache/hadoop/hbase/shaded/protobuf/ProtobufUtil.java#L2182-L2187]. Is that adequate? {quote}While we are here, the CoprocessorServiceRequest (handled further down) has a getRegion() method, but that is not passed into the SlowLogParams either. We should add that too. {quote} Totally agreed, this should be easy > SlowLog does not include params for Mutations > - > > Key: HBASE-27553 > URL: https://issues.apache.org/jira/browse/HBASE-27553 > Project: HBase > Issue Type: Bug >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Minor > > SlowLog params are extracted via > [ProtobufUtil.getSlowLogParams|https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/shaded/protobuf/ProtobufUtil.java#L2154]. > This method has various if/else branches for each request type, but mutation > (the line linked above) is incorrect. Currently it handles MutationProto, but > it should be MutateRequest. A MutationProto is never passed into this method, > only MutateRequests so any MutateRequests being passed in now will fall > through to the default case which contains nothing useful about the request. > As part of fixing this, we should also ensure that we extract the region name > from the MutateRequest to add into the SlowLogParams object like all the > other requests. > While we are here, the CoprocessorServiceRequest (handled further down) has a > getRegion() method, but that is not passed into the SlowLogParams either. We > should add that too. -- This message was sent by Atlassian Jira (v8.20.10#820010)
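The dispatch bug described in the issue is a type-matching problem: params are extracted per request type, and a MutateRequest has to hit its own branch or it falls through to the useless default. A toy Python dispatch, with dicts standing in for the protobuf messages and made-up field names:

```python
# Sketch: per-request-type extraction of slow-log params, including the two
# fixes discussed -- handling the MutateRequest wrapper and pulling the
# region out of CoprocessorServiceRequest.
def slow_log_params(request):
    kind = request["type"]
    if kind == "MutateRequest":
        # handle the wrapper type, not the inner MutationProto
        return f"region={request['region']} row={request['mutation']['row']}"
    if kind == "CoprocessorServiceRequest":
        return f"region={request['region']} service={request['service']}"
    return ""  # default branch: nothing useful about the request

req = {"type": "MutateRequest", "region": "r1", "mutation": {"row": "k1"}}
print(slow_log_params(req))  # region=r1 row=k1
```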
[jira] [Commented] (HBASE-27798) Client side should back off based on wait interval in RpcThrottlingException
[ https://issues.apache.org/jira/browse/HBASE-27798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17715954#comment-17715954 ] Ray Mattingly commented on HBASE-27798: --- sounds good, thanks for the input! > Client side should back off based on wait interval in RpcThrottlingException > > > Key: HBASE-27798 > URL: https://issues.apache.org/jira/browse/HBASE-27798 > Project: HBase > Issue Type: Improvement >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HBASE-27798) Client side should back off based on wait interval in RpcThrottlingException
[ https://issues.apache.org/jira/browse/HBASE-27798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17715947#comment-17715947 ] Ray Mattingly commented on HBASE-27798: --- That sounds good to me. Should we still add the retry backoff for the waitInterval case? > Client side should back off based on wait interval in RpcThrottlingException > > > Key: HBASE-27798 > URL: https://issues.apache.org/jira/browse/HBASE-27798 > Project: HBase > Issue Type: Improvement >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HBASE-27798) Client side should back off based on wait interval in RpcThrottlingException
[ https://issues.apache.org/jira/browse/HBASE-27798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17715939#comment-17715939 ] Ray Mattingly commented on HBASE-27798: --- The current retry backoff has a few inputs & steps. For example, we use [the {{pause}} and {{pauseForServerOverloaded}} durations in {{RpcRetryingCallerImpl}}|https://github.com/apache/hbase/blob/branch-2/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerImpl.java#L59-L60] to determine a pauseBase millis, and then multiply it by the relevant {{RETRY_BACKOFF}} item. I'm wondering how we should incorporate the {{waitInterval}} into this existing system; I see a few options: # We could consider waitInterval an addition to the pauseBase # We could consider waitInterval an addition to the product of pauseBase * retryBackoff # We could consider waitInterval, if present, to be a replacement for the pauseBase # We could consider waitInterval, if present, to be a replacement for the product of pauseBase * retryBackoff [~bbeaudreault] do you have any thoughts/preference here? > Client side should back off based on wait interval in RpcThrottlingException > > > Key: HBASE-27798 > URL: https://issues.apache.org/jira/browse/HBASE-27798 > Project: HBase > Issue Type: Improvement >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
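The last option above (waitInterval replaces pauseBase * retryBackoff when present) can be sketched concretely. The backoff multipliers follow the shape of HConstants' {{RETRY_BACKOFF}}, but treat the exact values and the function here as illustrative rather than HBase's implementation:

```python
# Sketch: exponential client backoff, with the server-supplied waitInterval
# (from RpcThrottlingException) overriding the computed pause when present.
RETRY_BACKOFF = [1, 2, 3, 5, 10, 20, 40, 100]  # per-try multipliers

def next_pause_ms(pause_base_ms, tries, wait_interval_ms=0):
    if wait_interval_ms > 0:
        # the throttle told us exactly how long until quota refills,
        # so honor it instead of the generic exponential schedule
        return wait_interval_ms
    idx = min(tries, len(RETRY_BACKOFF) - 1)
    return pause_base_ms * RETRY_BACKOFF[idx]

print(next_pause_ms(100, 3))                        # 500
print(next_pause_ms(100, 3, wait_interval_ms=750))  # 750
```

The appeal of the replacement option is that the throttle's wait interval is authoritative: sleeping less just earns another RpcThrottlingException, and sleeping by an unrelated exponential schedule wastes time.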
[jira] [Assigned] (HBASE-27798) Client side should back off based on wait interval in RpcThrottlingException
[ https://issues.apache.org/jira/browse/HBASE-27798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly reassigned HBASE-27798: - Assignee: Ray Mattingly > Client side should back off based on wait interval in RpcThrottlingException > > > Key: HBASE-27798 > URL: https://issues.apache.org/jira/browse/HBASE-27798 > Project: HBase > Issue Type: Improvement >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HBASE-27799) RpcThrottlingException wait interval message is misleading between 0-1s
[ https://issues.apache.org/jira/browse/HBASE-27799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly reassigned HBASE-27799: - Assignee: Ray Mattingly > RpcThrottlingException wait interval message is misleading between 0-1s > --- > > Key: HBASE-27799 > URL: https://issues.apache.org/jira/browse/HBASE-27799 > Project: HBase > Issue Type: Improvement >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > > When the wait interval is below 1s, it shows 0sec. We should show > milliseconds. -- This message was sent by Atlassian Jira (v8.20.10#820010)
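A minimal sketch of the message fix: render sub-second waits in milliseconds instead of truncating to "0sec". The helper below is illustrative only, not the actual RpcThrottlingException message-building code:

```java
// Format a wait interval for the throttling exception message.
// Below one second, integer division by 1000 yields the misleading "0sec";
// emitting milliseconds in that range keeps the message meaningful.
class WaitIntervalFormat {
    static String format(long waitIntervalMs) {
        if (waitIntervalMs < 1000) {
            return waitIntervalMs + "ms";       // e.g. 250 -> "250ms", not "0sec"
        }
        return (waitIntervalMs / 1000) + "sec"; // whole seconds, as before
    }
}
```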
[jira] [Commented] (HBASE-27535) Separate slowlog thresholds for scans vs other requests
[ https://issues.apache.org/jira/browse/HBASE-27535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17713615#comment-17713615 ] Ray Mattingly commented on HBASE-27535: --- [https://github.com/apache/hbase/pull/5188] is ready for review > Separate slowlog thresholds for scans vs other requests > --- > > Key: HBASE-27535 > URL: https://issues.apache.org/jira/browse/HBASE-27535 > Project: HBase > Issue Type: Improvement >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > Labels: slowlog > > Scans by their nature are able to more efficiently pull back larger response > sizes than gets. They also may take longer to execute than other request > types. We should make it possible to configure a separate threshold for > response size and response time for scans. This will allow us to tune down > the thresholds for others without adding unnecessary noise for requests which > are known to be slower/bigger. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HBASE-27535) Separate slowlog thresholds for scans vs other requests
[ https://issues.apache.org/jira/browse/HBASE-27535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly reassigned HBASE-27535: - Assignee: Ray Mattingly > Separate slowlog thresholds for scans vs other requests > --- > > Key: HBASE-27535 > URL: https://issues.apache.org/jira/browse/HBASE-27535 > Project: HBase > Issue Type: Improvement >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > Labels: slowlog > > Scans by their nature are able to more efficiently pull back larger response > sizes than gets. They also may take longer to execute than other request > types. We should make it possible to configure a separate threshold for > response size and response time for scans. This will allow us to tune down > the thresholds for others without adding unnecessary noise for requests which > are known to be slower/bigger. -- This message was sent by Atlassian Jira (v8.20.10#820010)
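The threshold-selection logic this describes can be sketched as follows. Field names here are illustrative placeholders, not the configuration keys actually added by the issue; a sentinel of -1 marks "scan threshold unset, fall back to the general one":

```java
// Per-request-type slowlog threshold selection: scans consult a dedicated
// threshold when configured, all other request types keep the general one.
class SlowLogThresholds {
    final long warnResponseTimeMs;      // general threshold for all requests
    final long scanWarnResponseTimeMs;  // -1 means "not set, use general"

    SlowLogThresholds(long generalMs, long scanMs) {
        this.warnResponseTimeMs = generalMs;
        this.scanWarnResponseTimeMs = scanMs;
    }

    long thresholdFor(boolean isScan) {
        if (isScan && scanWarnResponseTimeMs >= 0) {
            return scanWarnResponseTimeMs;
        }
        return warnResponseTimeMs;
    }
}
```

With this shape, operators can tune the general threshold down aggressively while giving scans a looser, separately configured budget.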
[jira] [Created] (HBASE-27786) CompoundBloomFilters break with an error rate that is too high
Ray Mattingly created HBASE-27786: - Summary: CompoundBloomFilters break with an error rate that is too high Key: HBASE-27786 URL: https://issues.apache.org/jira/browse/HBASE-27786 Project: HBase Issue Type: Bug Affects Versions: 2.5.2 Reporter: Ray Mattingly At my company we're beginning to more heavily utilize the bloom error rate configuration. This is because bloom filters are a nice optimization, but for well distributed workloads with relatively dense data (many rows:host), we've found that they can cause lots of memory/GC pressure unless they can entirely fit in the block cache (and consequently not churn memory that's subject to GC). Because it's easier to estimate the memory requirements of changes in existing bloom filters, rather than net new bloom filters, we wanted to begin with very high bloom error rates (and consequently small bloom filters), and then ratchet down as memory availability allowed. This led to us discovering that bloom filters appear to become corrupt at a relatively arbitrary error rate threshold. Blooms with an error rate of 0.61 work as expected, but produce nonsensical results with an error rate of 0.62. I've pushed this branch with test updates to demonstrate the deficit: [https://github.com/apache/hbase/compare/master...HubSpot:hbase:rmattingly/bloom-error-rate-bug] The test changes confirm that the BloomFilterUtil works as expected, at least with respect to its error rate : size relationship. 
You can see this in the output of {{{}TestBloomFilterChunk#testBloomErrorRateSizeRelationship{}}}: {noformat}
previousErrorRate=0.01, previousSize=1048568 currentErrorRate=0.05, currentSize=682109
previousErrorRate=0.05, previousSize=682109 currentErrorRate=0.1, currentSize=524284
previousErrorRate=0.1, previousSize=524284 currentErrorRate=0.2, currentSize=366459
previousErrorRate=0.2, previousSize=366459 currentErrorRate=0.4, currentSize=208634
previousErrorRate=0.4, previousSize=208634 currentErrorRate=0.5, currentSize=157826
previousErrorRate=0.5, previousSize=157826 currentErrorRate=0.75, currentSize=65504
previousErrorRate=0.75, previousSize=65504 currentErrorRate=0.99, currentSize=2289
{noformat}
With this in mind, the updates to {{TestCompoundBloomFilter}} tell us that the bug must live somewhere in the {{CompoundBloomFilter}} logic. The output indicates this: {noformat}
2023-04-10T15:07:50,925 INFO [Time-limited test] regionserver.TestCompoundBloomFilter(245): Functional bloom has error rate 0.01 and size 1kb
...
2023-04-10T15:07:56,657 INFO [Time-limited test] regionserver.TestCompoundBloomFilter(245): Functional bloom has error rate 0.61 and size 1kb
...
java.lang.AssertionError: False positive is too high: 0.99985334 (greater than 0.65), fake lookup is enabled. Bloom size is 4687kb
  at org.junit.Assert.fail(Assert.java:89)
  at org.junit.Assert.assertTrue(Assert.java:42)
  at org.apache.hadoop.hbase.regionserver.TestCompoundBloomFilter.readStoreFile(TestCompoundBloomFilter.java:243)
{noformat}
The bloom size change from ~1kb -> 4687kb and total lack of precision is clearly not as intended, and totally in line with what we saw in our HBase clusters that attempted to use high bloom error rates. -- This message was sent by Atlassian Jira (v8.20.10#820010)
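The error rate : size relationship the first test output confirms is the standard bloom-filter sizing formula, m = -n * ln(p) / (ln 2)^2: as the target error rate p grows, the required bit count shrinks monotonically. The sketch below mirrors that math only; it is not HBase's BloomFilterUtil code itself:

```java
// Ideal bloom-filter size in bits for n keys at target false-positive rate p:
//   m = -n * ln(p) / (ln 2)^2
// A healthy implementation should therefore never produce a *larger* filter
// for a *higher* error rate, which is exactly what the failing test observed
// (~1kb at rate 0.61 jumping to 4687kb at rate 0.62).
class BloomMath {
    static long idealBloomBits(long n, double errorRate) {
        double ln2 = Math.log(2);
        return (long) Math.ceil(-n * Math.log(errorRate) / (ln2 * ln2));
    }
}
```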
[jira] [Updated] (HBASE-27536) Include more request information in slowlog for Scans
[ https://issues.apache.org/jira/browse/HBASE-27536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly updated HBASE-27536: -- Description: Currently the slowlog only includes a barebones text format of the underlying protobuf Message fields. This is not a great UX for 2 reasons: # Most of the proto fields don't mirror the actual API names in our requests (Scan, Get, etc). # The chosen data is often not enough to actually infer anything about the request Any of the API class's toString method would be a much better representation of the request. On the server side, we already have to turn the protobuf Message into an actual API class in order to serve the request in RSRpcServices. Given slow logs should be a very small percent of total requests, I think we should do a similar parsing in SlowLogQueueService. Or better yet, perhaps we can pass the already parsed request into the queue at the start to avoid the extra work. When hydrating a SlowLogPayload with this request information, I believe we should use {{Operation's toMap(int maxCols)}} method. Adding this onto the SlowLogPayload as a map (or list of key/values) will make it easier to consume via downstream automation. Alternatively we could use {{{}toJSON(){}}}. We should also include any attributes from the queries, as those may aid tracing at the client level. Edit: because of nuance related to handling multis and the adequacy of info available for gets/puts, we're scoping this issue down to focus on improving the information available on Scan slowlogs was: Currently the slowlog only includes a barebones text format of the underlying protobuf Message fields. This is not a great UX for 2 reasons: # Most of the proto fields don't mirror the actual API names in our requests (Scan, Get, etc). # The chosen data is often not enough to actually infer anything about the request Any of the API class's toString method would be a much better representation of the request. 
On the server side, we already have to turn the protobuf Message into an actual API class in order to serve the request in RSRpcServices. Given slow logs should be a very small percent of total requests, I think we should do a similar parsing in SlowLogQueueService. Or better yet, perhaps we can pass the already parsed request into the queue at the start to avoid the extra work. When hydrating a SlowLogPayload with this request information, I believe we should use {{Operation's toMap(int maxCols)}} method. Adding this onto the SlowLogPayload as a map (or list of key/values) will make it easier to consume via downstream automation. Alternatively we could use {{{}toJSON(){}}}. We should also include any attributes from the queries, as those may aid tracing at the client level. > Include more request information in slowlog for Scans > - > > Key: HBASE-27536 > URL: https://issues.apache.org/jira/browse/HBASE-27536 > Project: HBase > Issue Type: Improvement >Reporter: Bryan Beaudreault >Priority: Major > Labels: slowlog > > Currently the slowlog only includes a barebones text format of the underlying > protobuf Message fields. This is not a great UX for 2 reasons: > # Most of the proto fields don't mirror the actual API names in our requests > (Scan, Get, etc). > # The chosen data is often not enough to actually infer anything about the > request > Any of the API class's toString method would be a much better representation > of the request. On the server side, we already have to turn the protobuf > Message into an actual API class in order to serve the request in > RSRpcServices. Given slow logs should be a very small percent of total > requests, I think we should do a similar parsing in SlowLogQueueService. Or > better yet, perhaps we can pass the already parsed request into the queue at > the start to avoid the extra work. > When hydrating a SlowLogPayload with this request information, I believe we > should use {{Operation's toMap(int maxCols)}} method. 
Adding this onto the > SlowLogPayload as a map (or list of key/values) will make it easier to > consume via downstream automation. Alternatively we could use > {{{}toJSON(){}}}. > We should also include any attributes from the queries, as those may aid > tracing at the client level. > Edit: because of nuance related to handling multis and the adequacy of info > available for gets/puts, we're scoping this issue down to focus on improving > the information available on Scan slowlogs -- This message was sent by Atlassian Jira (v8.20.10#820010)
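The truncation behavior implied by {{toMap(int maxCols)}} can be illustrated with a standalone mock. This is not HBase's Operation class; it only shows the idea of capping how many column entries land in the structured slowlog payload while still recording that truncation happened:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Mock of a structured, truncated request view for a slowlog payload:
// at most maxCols column names are included, and an extra "totalColumns"
// entry signals when the list was cut short.
class RequestMapSketch {
    static Map<String, Object> toMap(List<String> columns, int maxCols) {
        Map<String, Object> map = new LinkedHashMap<>();
        int shown = Math.min(columns.size(), maxCols);
        map.put("columns", columns.subList(0, shown));
        if (columns.size() > maxCols) {
            map.put("totalColumns", columns.size()); // signal truncation
        }
        return map;
    }
}
```

A map like this (or its JSON rendering) is far easier for downstream automation to consume than the raw protobuf text format.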
[jira] [Updated] (HBASE-27536) Include more request information in slowlog for Scans
[ https://issues.apache.org/jira/browse/HBASE-27536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly updated HBASE-27536: -- Summary: Include more request information in slowlog for Scans (was: Include more request information in slowlog) > Include more request information in slowlog for Scans > - > > Key: HBASE-27536 > URL: https://issues.apache.org/jira/browse/HBASE-27536 > Project: HBase > Issue Type: Improvement >Reporter: Bryan Beaudreault >Priority: Major > Labels: slowlog > > Currently the slowlog only includes a barebones text format of the underlying > protobuf Message fields. This is not a great UX for 2 reasons: > # Most of the proto fields don't mirror the actual API names in our requests > (Scan, Get, etc). > # The chosen data is often not enough to actually infer anything about the > request > Any of the API class's toString method would be a much better representation > of the request. On the server side, we already have to turn the protobuf > Message into an actual API class in order to serve the request in > RSRpcServices. Given slow logs should be a very small percent of total > requests, I think we should do a similar parsing in SlowLogQueueService. Or > better yet, perhaps we can pass the already parsed request into the queue at > the start to avoid the extra work. > When hydrating a SlowLogPayload with this request information, I believe we > should use {{Operation's toMap(int maxCols)}} method. Adding this onto the > SlowLogPayload as a map (or list of key/values) will make it easier to > consume via downstream automation. Alternatively we could use > {{{}toJSON(){}}}. > We should also include any attributes from the queries, as those may aid > tracing at the client level. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HBASE-26874) VerifyReplication recompare async
[ https://issues.apache.org/jira/browse/HBASE-26874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Mattingly reassigned HBASE-26874: - Assignee: Hernan Gelaf-Romer (was: Ray Mattingly) > VerifyReplication recompare async > - > > Key: HBASE-26874 > URL: https://issues.apache.org/jira/browse/HBASE-26874 > Project: HBase > Issue Type: Improvement >Reporter: Bryan Beaudreault >Assignee: Hernan Gelaf-Romer >Priority: Major > > VerifyReplication includes an option "sleepMsBeforeReCompare". This is useful > for helping work around replication lag. However, adding a sleep in a hadoop > job can drastically slow that job down if there is anything more than a small > number of invalid results. > We can mitigate this by doing the recompare in a separate thread. We can > limit the thread pool and fallback to doing the recompare in the main thread > if the thread pool is full. This way we offload some of the slowness but > still retain the same validation guarantees. A configuration can be added to > control how many threads per mapper. -- This message was sent by Atlassian Jira (v8.20.10#820010)
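The "limited pool with fallback to the main thread" behavior described here maps directly onto a standard JDK pattern: a bounded {{ThreadPoolExecutor}} with {{CallerRunsPolicy}}, which runs a rejected task on the submitting thread. The sketch below illustrates that pattern under assumed names; it is not the actual VerifyReplication change:

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// A small, bounded recompare pool. With a SynchronousQueue there is no task
// queue at all: once every worker is busy, execute() is rejected and
// CallerRunsPolicy runs the recompare on the submitting (main) thread,
// preserving the same validation guarantees while offloading most of the
// sleep-induced slowness.
class RecompareExecutorSketch {
    static ThreadPoolExecutor newRecomparePool(int threadsPerMapper) {
        return new ThreadPoolExecutor(
            threadsPerMapper, threadsPerMapper,
            0L, TimeUnit.MILLISECONDS,
            new SynchronousQueue<>(),
            new ThreadPoolExecutor.CallerRunsPolicy());
    }
}
```

The pool size would come from the per-mapper thread configuration the description proposes.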
[jira] [Commented] (HBASE-27536) Include more request information in slowlog
[ https://issues.apache.org/jira/browse/HBASE-27536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649476#comment-17649476 ] Ray Mattingly commented on HBASE-27536: --- {quote}I believe we should use {{Operation's toMap(int maxCols)}} method {quote} This seems doable. [~bbeaudreault] do you have any thoughts re: what a reasonable default {{maxCols}} might be? Should this value be configurable, or should we be hesitant to add another new conf option for something like this? I've looked through some implementations of toMap and my initial impression is that it wouldn't be too dangerous to have a relatively high default (like, in the hundreds?). But wanted to get your thoughts here too. > Include more request information in slowlog > --- > > Key: HBASE-27536 > URL: https://issues.apache.org/jira/browse/HBASE-27536 > Project: HBase > Issue Type: Improvement >Reporter: Bryan Beaudreault >Priority: Major > Labels: slowlog > > Currently the slowlog only includes a barebones text format of the underlying > protobuf Message fields. This is not a great UX for 2 reasons: > # Most of the proto fields don't mirror the actual API names in our requests > (Scan, Get, etc). > # The chosen data is often not enough to actually infer anything about the > request > Any of the API class's toString method would be a much better representation > of the request. On the server side, we already have to turn the protobuf > Message into an actual API class in order to serve the request in > RSRpcServices. Given slow logs should be a very small percent of total > requests, I think we should do a similar parsing in SlowLogQueueService. Or > better yet, perhaps we can pass the already parsed request into the queue at > the start to avoid the extra work. > When hydrating a SlowLogPayload with this request information, I believe we > should use {{Operation's toMap(int maxCols)}} method. 
Adding this onto the > SlowLogPayload as a map (or list of key/values) will make it easier to > consume via downstream automation. Alternatively we could use > {{{}toJSON(){}}}. > We should also include any attributes from the queries, as those may aid > tracing at the client level. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HBASE-27253) Make slow log configs updatable with configuration observer
[ https://issues.apache.org/jira/browse/HBASE-27253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17648041#comment-17648041 ] Ray Mattingly commented on HBASE-27253: --- [A solution|https://github.com/apache/hbase/pull/4926] is ready for review. > Make slow log configs updatable with configuration observer > --- > > Key: HBASE-27253 > URL: https://issues.apache.org/jira/browse/HBASE-27253 > Project: HBase > Issue Type: Improvement >Reporter: Bryan Beaudreault >Assignee: Ray Mattingly >Priority: Major > Labels: slowlog > > It would be very useful to be able to turn slow log on or off, change > thresholds, etc on demand as needed when diagnosing a traffic issue. Should > be a simple matter of moving the configs into RpcServer#onConfigurationChange -- This message was sent by Atlassian Jira (v8.20.10#820010)
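The observer pattern the description refers to can be sketched standalone. This mock stands in for the real hook, {{RpcServer#onConfigurationChange}}; the volatile field and map-based config are illustrative simplifications, and {{hbase.ipc.warn.response.time}} is used here only as an example key:

```java
import java.util.Map;

// On a live configuration reload, re-read the slowlog threshold instead of
// requiring a server restart. The volatile field makes the updated value
// visible to handler threads without extra locking.
class SlowLogConfigSketch {
    private volatile long warnResponseTimeMs = 10000; // illustrative default

    // Invoked when the configuration observer is notified of a change.
    void onConfigurationChange(Map<String, String> conf) {
        String v = conf.get("hbase.ipc.warn.response.time");
        if (v != null) {
            warnResponseTimeMs = Long.parseLong(v);
        }
    }

    long getWarnResponseTimeMs() {
        return warnResponseTimeMs;
    }
}
```

Moving the threshold reads into such a hook is what lets operators flip slowlog behavior on demand while diagnosing a traffic issue.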