[jira] [Created] (HBASE-28608) Client meta operation timeout does not default to the correct value

2024-05-21 Thread Daniel Roudnitsky (Jira)
Daniel Roudnitsky created HBASE-28608:
-

 Summary: Client meta operation timeout does not default to the 
correct value
 Key: HBASE-28608
 URL: https://issues.apache.org/jira/browse/HBASE-28608
 Project: HBase
  Issue Type: Bug
  Components: Client
Reporter: Daniel Roudnitsky
Assignee: Daniel Roudnitsky


The default for the client meta operation timeout, 
{{hbase.client.meta.operation.timeout}}, was intended to be the configured 
client operation timeout, but it instead falls back to the default client 
operation timeout of 20 minutes. This defeats the purpose of the meta operation 
timeout when {{hbase.client.operation.timeout}} is set below 20 minutes and 
{{hbase.client.meta.operation.timeout}} is not set explicitly. From "Timeout 
settings" in the HBase reference guide:
{panel}
A higher-level timeout is hbase.client.operation.timeout which is valid for 
each client call. When an RPC call fails for instance for a timeout due to 
hbase.rpc.timeout it will be retried until hbase.client.operation.timeout is 
reached. Client operation timeout for system tables can be fine tuned by 
setting hbase.client.meta.operation.timeout configuration value. When this is 
not set its value will use hbase.client.operation.timeout
{panel}
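
The difference between the documented fallback and the reported one can be sketched as follows. This is a minimal model that uses a plain map in place of the client configuration. The class and method names are hypothetical and this is not HBase's actual lookup code; only the two property names and the 20 minute default come from this report.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the configuration lookup; not HBase's actual code.
final class MetaTimeoutDefaults {
    static final String OPERATION_TIMEOUT_KEY = "hbase.client.operation.timeout";
    static final String META_OPERATION_TIMEOUT_KEY = "hbase.client.meta.operation.timeout";
    // 20 minutes, the default client operation timeout cited in this report.
    static final long DEFAULT_OPERATION_TIMEOUT_MS = 20L * 60 * 1000;

    // Returns the configured value for key, or fallback when the key is unset.
    static long getLong(Map<String, String> conf, String key, long fallback) {
        String value = conf.get(key);
        return value == null ? fallback : Long.parseLong(value);
    }

    // Reported behavior: an unset meta timeout falls back to the hard-coded default.
    static long reportedMetaTimeout(Map<String, String> conf) {
        return getLong(conf, META_OPERATION_TIMEOUT_KEY, DEFAULT_OPERATION_TIMEOUT_MS);
    }

    // Documented behavior: an unset meta timeout falls back to the configured
    // operation timeout, which itself falls back to the hard-coded default.
    static long documentedMetaTimeout(Map<String, String> conf) {
        long operationTimeout =
            getLong(conf, OPERATION_TIMEOUT_KEY, DEFAULT_OPERATION_TIMEOUT_MS);
        return getLong(conf, META_OPERATION_TIMEOUT_KEY, operationTimeout);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(OPERATION_TIMEOUT_KEY, "60000"); // operation timeout of 1 minute
        System.out.println("reported:   " + reportedMetaTimeout(conf) + " ms");
        System.out.println("documented: " + documentedMetaTimeout(conf) + " ms");
    }
}
```

With only the operation timeout set, the two methods disagree. Once the meta timeout is set explicitly, both return that explicit value.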



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28560) Region quotas: Split/merge procedure rollback can lead to inaccurate account of region counts

2024-04-29 Thread Daniel Roudnitsky (Jira)
Daniel Roudnitsky created HBASE-28560:
-

 Summary: Region quotas: Split/merge procedure rollback can lead to 
inaccurate account of region counts
 Key: HBASE-28560
 URL: https://issues.apache.org/jira/browse/HBASE-28560
 Project: HBase
  Issue Type: Bug
Affects Versions: 3.0.0-beta-2
Reporter: Daniel Roudnitsky
Assignee: Daniel Roudnitsky


When region quotas are enabled, HMaster keeps an in-memory account of region 
counts through NamespaceStateManager. Region counts in NamespaceStateManager 
are incremented or decremented in the early stages of split/merge procedures 
(SPLIT_TABLE_REGION_PRE_OPERATION and MERGE_TABLE_REGIONS_PRE_MERGE_OPERATION), 
before any region is offlined. If the procedure is rolled back after the region 
count change has been made, the rollback does not revert that change, so 
NamespaceStateManager never reflects that the expected split/merge did not 
succeed. This leaves NamespaceStateManager with an inaccurate count, believing 
that there are more or fewer regions than actually exist.
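
The drift can be illustrated with a small counter model. This is a sketch under stated assumptions: RegionCountLedger and its methods are hypothetical stand-ins for the per-namespace count, not HBase classes, and the "with revert" methods show only one possible form of compensation rather than a proposed patch.

```java
// Hypothetical stand-in for the per-namespace region count; not HBase's actual class.
final class RegionCountLedger {
    private int trackedRegions;

    RegionCountLedger(int initialRegions) {
        this.trackedRegions = initialRegions;
    }

    int trackedRegions() {
        return trackedRegions;
    }

    // Early accounting step of a split: one parent is expected to become two daughters.
    void onSplitPreOperation() {
        trackedRegions += 1;
    }

    // Early accounting step of a two-region merge: two regions are expected to become one.
    void onMergePreOperation() {
        trackedRegions -= 1;
    }

    // Reported behavior: rollback leaves the tracked count untouched.
    void rollbackWithoutRevert() {
        // intentionally empty
    }

    // One possible compensation: undo the delta applied by the split pre-operation step.
    void rollbackSplitWithRevert() {
        trackedRegions -= 1;
    }

    // One possible compensation: undo the delta applied by the merge pre-operation step.
    void rollbackMergeWithRevert() {
        trackedRegions += 1;
    }

    public static void main(String[] args) {
        RegionCountLedger ledger = new RegionCountLedger(4); // 4 regions actually exist
        ledger.onSplitPreOperation();
        ledger.rollbackWithoutRevert(); // split never happened, count still changed
        System.out.println("tracked after rolled-back split: " + ledger.trackedRegions());
    }
}
```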





[jira] [Created] (HBASE-28559) Region quotas: Multi-region merge causes inaccurate accounting of region counts

2024-04-29 Thread Daniel Roudnitsky (Jira)
Daniel Roudnitsky created HBASE-28559:
-

 Summary: Region quotas: Multi-region merge causes inaccurate 
accounting of region counts
 Key: HBASE-28559
 URL: https://issues.apache.org/jira/browse/HBASE-28559
 Project: HBase
  Issue Type: Bug
  Components: Quotas
Affects Versions: 3.0.0-beta-2
Reporter: Daniel Roudnitsky
Assignee: Daniel Roudnitsky


There is support for merging more than two regions in one merge procedure with 
multi-region merge, but if region quotas are enabled, [NamespaceAuditor assumes 
that every merge is a two region 
merge|https://github.com/apache/hbase/blob/branch-3/hbase-server/src/main/java/org/apache/hadoop/hbase/namespace/NamespaceAuditor.java#L128-L129].
This causes inaccurate in-memory accounting of region counts in 
NamespaceStateManager, leading MasterQuotaManager to believe there are more 
regions than actually exist when multi-region merge is used. 

To demonstrate the issue:
1. Start with a table with 3 regions in a namespace with a region quota limit 
of 3.
2. Merge all 3 regions, leaving 1 region. NamespaceAuditor assumes it was a 
2-region merge and believes the number of regions to be 2.
3. Split a region. The number of regions is now 2, but NamespaceAuditor 
believes it to be 3.
4. Attempt another region split. It fails because NamespaceAuditor believes the 
namespace to be at its region limit of 3 when there are actually only 2 
regions. 
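
The four steps above can be traced with a small simulation. It is only a sketch: MultiRegionMergeQuotaSim is a hypothetical model that tracks a believed count next to the real count, and it assumes a split is refused whenever the believed count is already at the limit. It does not use NamespaceAuditor or any other HBase class.

```java
// Hypothetical simulation of the accounting described above; not HBase's actual code.
final class MultiRegionMergeQuotaSim {
    private final int regionLimit;
    private int trackedRegions; // what the quota accounting believes
    private int actualRegions;  // what really exists

    MultiRegionMergeQuotaSim(int regionLimit, int initialRegions) {
        this.regionLimit = regionLimit;
        this.trackedRegions = initialRegions;
        this.actualRegions = initialRegions;
    }

    int trackedRegions() {
        return trackedRegions;
    }

    int actualRegions() {
        return actualRegions;
    }

    // Merges regionsToMerge regions into one. The accounting always assumes
    // a two-region merge, so the tracked count drops by exactly one.
    void merge(int regionsToMerge) {
        if (regionsToMerge < 2 || regionsToMerge > actualRegions) {
            throw new IllegalArgumentException("invalid merge size: " + regionsToMerge);
        }
        actualRegions -= regionsToMerge - 1;
        trackedRegions -= 1;
    }

    // Splits one region. Returns false when the tracked count says the
    // namespace is already at its region limit.
    boolean split() {
        if (trackedRegions + 1 > regionLimit) {
            return false;
        }
        trackedRegions += 1;
        actualRegions += 1;
        return true;
    }

    public static void main(String[] args) {
        MultiRegionMergeQuotaSim sim = new MultiRegionMergeQuotaSim(3, 3); // step 1
        sim.merge(3);                                                      // step 2
        boolean firstSplit = sim.split();                                  // step 3
        boolean secondSplit = sim.split();                                 // step 4
        System.out.println("first split allowed:  " + firstSplit);
        System.out.println("second split allowed: " + secondSplit);
        System.out.println("tracked=" + sim.trackedRegions()
            + " actual=" + sim.actualRegions());
    }
}
```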





[jira] [Created] (HBASE-28533) Region split failure due to region quota limit leaves Hmaster's in memory state for the region in SPLITTING after procedure rollback

2024-04-17 Thread Daniel Roudnitsky (Jira)
Daniel Roudnitsky created HBASE-28533:
-

 Summary: Region split failure due to region quota limit leaves 
Hmaster's in memory state for the region in SPLITTING after procedure rollback
 Key: HBASE-28533
 URL: https://issues.apache.org/jira/browse/HBASE-28533
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Affects Versions: 2.5.8
 Environment: HBase Version 2.5.8, 
r37444de6531b1bdabf2e445c83d0268ab1a6f919, Thu Feb 29 15:37:32 PST 2024
Reporter: Daniel Roudnitsky


When a SplitTableRegionProcedure is run for a region whose namespace is at its 
maximum region quota limit, the split procedure fails and rolls back, and 
HMaster's in-memory RegionStateNode for the region is left in the SPLITTING 
state. HMaster then refuses to start any subsequent merge/split/move procedure 
for that region because it believes the region is not OPEN. This persists until 
the master is restarted and the in-memory record of region states is reset.

In the first step of the split procedure, SPLIT_TABLE_REGION_PREPARE, the 
parent region's RegionStateNode state is set to SPLITTING, and the transition 
is not written to the meta table. In the next step, 
SPLIT_TABLE_REGION_PRE_OPERATION, the region quota check is done, 
QuotaExceededException is thrown, and the procedure ends in the ROLLEDBACK 
state without reverting the RegionStateNode to OPEN. HMaster is left believing 
the region is SPLITTING according to its in-memory RegionStates, while the 
region is still online on the assigned region server and OPEN according to meta.
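
The sequence can be modeled as a tiny state machine. This is a sketch only: SplitRollbackSim and its methods are hypothetical and are not the actual SplitTableRegionProcedure or RegionStateNode code. It also assumes that a master restart rebuilds the in-memory state from meta.

```java
// Hypothetical model of the failing split path; not HBase's actual classes.
final class SplitRollbackSim {
    enum RegionState { OPEN, SPLITTING }

    // The master's in-memory view of the region.
    private RegionState inMemoryState = RegionState.OPEN;
    // The persisted view; the failing split never writes to it in this scenario.
    private final RegionState metaState = RegionState.OPEN;

    RegionState inMemoryState() {
        return inMemoryState;
    }

    RegionState metaState() {
        return metaState;
    }

    // A new merge/split/move is only accepted while the in-memory state is OPEN.
    boolean canStartProcedure() {
        return inMemoryState == RegionState.OPEN;
    }

    // Models a split attempted while the namespace is at its region quota limit.
    // Always returns false because the quota check fails.
    boolean runSplitAtQuotaLimit() {
        if (!canStartProcedure()) {
            return false;
        }
        // Prepare step: in-memory state changes, meta is not updated.
        inMemoryState = RegionState.SPLITTING;
        // Pre-operation step: the quota check fails and the procedure rolls back.
        // As reported, the rollback does not restore the in-memory state to OPEN.
        return false;
    }

    // Assumption: a restarted master rebuilds its in-memory state from meta.
    void restartMaster() {
        inMemoryState = metaState;
    }

    public static void main(String[] args) {
        SplitRollbackSim region = new SplitRollbackSim();
        region.runSplitAtQuotaLimit();
        System.out.println("after rollback: in-memory=" + region.inMemoryState()
            + " meta=" + region.metaState());
        region.restartMaster();
        System.out.println("after restart: can start procedure="
            + region.canStartProcedure());
    }
}
```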

To reproduce in HBase shell:

{code:java}
> create_namespace 'test_ns', {'hbase.namespace.quota.maxregions'=> 2}
> create 'test_ns:test_table', 'f1', {NUMREGIONS => 2, SPLITALGO => 'UniformSplit'}
> region_a = 
> region_b = 

> split region_a, 'x'
# HMaster will report: 
pid=405, state=ROLLEDBACK, 
exception=org.apache.hadoop.hbase.quotas.QuotaExceededException via 
master-split-regions:org.apache.hadoop.hbase.quotas.QuotaExceededException: 
Region split not possible for : as quota limits are exceeded ; 
SplitTableRegionProcedure table=test_ns:test_table, parent=...

> merge_region region_a, region_b
ERROR: org.apache.hadoop.hbase.exceptions.MergeRegionException: 
org.apache.hadoop.hbase.client.DoNotRetryRegionException:  is not 
OPEN; state=SPLITTING

> stop_master # trigger hmaster failover 
> merge_region region_a, region_b # merge now succeeds
{code}


