[jira] [Created] (HBASE-28608) Client meta operation timeout does not default to the correct value
Daniel Roudnitsky created HBASE-28608:
-----------------------------------------

Summary: Client meta operation timeout does not default to the correct value
Key: HBASE-28608
URL: https://issues.apache.org/jira/browse/HBASE-28608
Project: HBase
Issue Type: Bug
Components: Client
Reporter: Daniel Roudnitsky
Assignee: Daniel Roudnitsky

The default for the client meta operation timeout {{hbase.client.meta.operation.timeout}} was intended to be the configured client operation timeout, but it is instead the default client operation timeout of 20 minutes. This defeats the purpose of the meta operation timeout if one has {{hbase.client.operation.timeout}} < 20 minutes and does not explicitly set {{hbase.client.meta.operation.timeout}}.

From "Timeout settings" in the HBase reference guide:
{panel}
A higher-level timeout is hbase.client.operation.timeout which is valid for each client call. When an RPC call fails for instance for a timeout due to hbase.rpc.timeout it will be retried until hbase.client.operation.timeout is reached. Client operation timeout for system tables can be fine tuned by setting hbase.client.meta.operation.timeout configuration value. When this is not set its value will use hbase.client.operation.timeout
{panel}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
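The difference between the reported and the intended default can be modeled in a few lines. This is a minimal sketch, not HBase client code: the class `MetaTimeoutSketch`, its methods, and the use of a plain `Map` as a stand-in for the client configuration are all hypothetical. Only the two property names and the 20 minute default come from the report above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the two defaulting strategies; not the HBase implementation.
final class MetaTimeoutSketch {
  static final String OPERATION_TIMEOUT_KEY = "hbase.client.operation.timeout";
  static final String META_OPERATION_TIMEOUT_KEY = "hbase.client.meta.operation.timeout";
  // 20 minutes, the default client operation timeout named in the report.
  static final long DEFAULT_OPERATION_TIMEOUT_MS = 20L * 60 * 1000;

  // Reads a long from the stand-in configuration, or returns the fallback if the key is unset.
  static long getLong(Map<String, String> conf, String key, long fallback) {
    String value = conf.get(key);
    return value == null ? fallback : Long.parseLong(value);
  }

  // Reported behavior: an unset meta timeout falls back to the 20 minute constant.
  static long reportedMetaTimeout(Map<String, String> conf) {
    return getLong(conf, META_OPERATION_TIMEOUT_KEY, DEFAULT_OPERATION_TIMEOUT_MS);
  }

  // Intended behavior: an unset meta timeout falls back to the configured operation timeout.
  static long intendedMetaTimeout(Map<String, String> conf) {
    long operationTimeout = getLong(conf, OPERATION_TIMEOUT_KEY, DEFAULT_OPERATION_TIMEOUT_MS);
    return getLong(conf, META_OPERATION_TIMEOUT_KEY, operationTimeout);
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(OPERATION_TIMEOUT_KEY, "30000"); // 30 seconds, meta timeout left unset
    System.out.println(reportedMetaTimeout(conf)); // prints 1200000
    System.out.println(intendedMetaTimeout(conf)); // prints 30000
  }
}
```

With a 30 second operation timeout and no explicit meta timeout, the first strategy still yields 20 minutes, which is the mismatch described in the issue.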
[jira] [Created] (HBASE-28560) Region quotas: Split/merge procedure rollback can lead to inaccurate account of region counts
Daniel Roudnitsky created HBASE-28560:
-----------------------------------------

Summary: Region quotas: Split/merge procedure rollback can lead to inaccurate account of region counts
Key: HBASE-28560
URL: https://issues.apache.org/jira/browse/HBASE-28560
Project: HBase
Issue Type: Bug
Affects Versions: 3.0.0-beta-2
Reporter: Daniel Roudnitsky
Assignee: Daniel Roudnitsky

When region quotas are enabled, HMaster keeps an in-memory account of region counts through NamespaceStateManager. Region counts in NamespaceStateManager are incremented/decremented in the early stages of split/merge procedures, in SPLIT_TABLE_REGION_PRE_OPERATION/MERGE_TABLE_REGIONS_PRE_MERGE_OPERATION, before any region is offlined. If the split/merge procedure is rolled back after the region count change in NamespaceStateManager has been made, the rollback does not revert that change to reflect that the expected split/merge never succeeded. This leaves NamespaceStateManager with an inaccurate account of the number of regions, believing that there are more or fewer regions than actually exist.
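The drift described above can be illustrated with a toy counter. This is a sketch under stated assumptions, not the NamespaceStateManager code: the class `SplitRollbackSketch` and its methods are hypothetical, and the only behavior taken from the report is that the count changes in the pre-operation step and is not reverted on rollback.

```java
// Hypothetical model of a tracked region count across a failed split; not HBase code.
final class SplitRollbackSketch {
  private int trackedRegions;

  SplitRollbackSketch(int initialRegions) {
    this.trackedRegions = initialRegions;
  }

  int trackedRegions() {
    return trackedRegions;
  }

  // Simulates a split that fails after the pre-operation step and is rolled back.
  // The number of regions that actually exist never changes in this scenario.
  void failedSplit(boolean revertCountOnRollback) {
    trackedRegions++; // pre-operation step: the split is expected to add one region
    // ... a later step fails here and the procedure rolls back ...
    if (revertCountOnRollback) {
      trackedRegions--; // compensation that the report says is missing
    }
  }

  public static void main(String[] args) {
    SplitRollbackSketch withoutRevert = new SplitRollbackSketch(2);
    withoutRevert.failedSplit(false);
    System.out.println(withoutRevert.trackedRegions()); // prints 3, although 2 regions exist

    SplitRollbackSketch withRevert = new SplitRollbackSketch(2);
    withRevert.failedSplit(true);
    System.out.println(withRevert.trackedRegions()); // prints 2
  }
}
```

A rolled-back merge would drift in the opposite direction, since the merge path decrements the tracked count instead of incrementing it.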
[jira] [Created] (HBASE-28559) Region quotas: Multi-region merge causes inaccurate accounting of region counts
Daniel Roudnitsky created HBASE-28559:
-----------------------------------------

Summary: Region quotas: Multi-region merge causes inaccurate accounting of region counts
Key: HBASE-28559
URL: https://issues.apache.org/jira/browse/HBASE-28559
Project: HBase
Issue Type: Bug
Components: Quotas
Affects Versions: 3.0.0-beta-2
Reporter: Daniel Roudnitsky
Assignee: Daniel Roudnitsky

Multi-region merge supports merging more than two regions in one merge procedure, but if region quotas are enabled, [NamespaceAuditor assumes that every merge is a two region merge|https://github.com/apache/hbase/blob/branch-3/hbase-server/src/main/java/org/apache/hadoop/hbase/namespace/NamespaceAuditor.java#L128-L129]. This causes inaccurate in-memory accounting of region counts in NamespaceStateManager, leading MasterQuotaManager to believe there are more regions than actually exist if multi-region merge is used.

To demonstrate the issue:
1. Start with a table with 3 regions in a namespace with a region quota limit of 3.
2. Merge all 3 regions, leaving 1 region. NamespaceAuditor assumes it was a 2 region merge and believes the number of regions to be 2.
3. Split a region. The number of regions is now 2, and NamespaceAuditor believes it to be 3.
4. Attempt another region split, which will fail because NamespaceAuditor believes the namespace to be at its region limit of 3 when there are actually only 2 regions.
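The four steps above can be replayed with a toy model that tracks the actual and the believed region counts side by side. This is a sketch, not the NamespaceAuditor implementation: the class `RegionCountSketch` and its methods are hypothetical, and the quota check is assumed to reject a split once the tracked count has reached the limit.

```java
// Hypothetical model of region count accounting under a namespace quota; not HBase code.
final class RegionCountSketch {
  private final int maxRegions;
  private int tracked; // what the quota accounting believes
  private int actual;  // how many regions really exist

  RegionCountSketch(int maxRegions, int initialRegions) {
    this.maxRegions = maxRegions;
    this.tracked = initialRegions;
    this.actual = initialRegions;
  }

  int tracked() { return tracked; }
  int actual() { return actual; }

  // Accounting as described in the report: every merge is treated as a two region merge.
  void mergeAssumingTwoRegions(int regionsMerged) {
    actual -= regionsMerged - 1;
    tracked -= 1;
  }

  // Accounting that uses the real number of merged regions.
  void mergeCountingAllRegions(int regionsMerged) {
    actual -= regionsMerged - 1;
    tracked -= regionsMerged - 1;
  }

  // Assumed quota check: a split is rejected once the tracked count has reached the limit.
  boolean trySplit() {
    if (tracked >= maxRegions) {
      return false;
    }
    tracked++;
    actual++;
    return true;
  }

  public static void main(String[] args) {
    RegionCountSketch ns = new RegionCountSketch(3, 3); // step 1
    ns.mergeAssumingTwoRegions(3);                      // step 2: actual 1, tracked 2
    System.out.println(ns.trySplit());                  // step 3: prints true (actual 2, tracked 3)
    System.out.println(ns.trySplit());                  // step 4: prints false
    System.out.println(ns.actual() + " actual, " + ns.tracked() + " tracked"); // prints "2 actual, 3 tracked"
  }
}
```

Replaying the same steps with `mergeCountingAllRegions` keeps both counts equal, and the second split is allowed.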
[jira] [Created] (HBASE-28533) Region split failure due to region quota limit leaves Hmaster's in memory state for the region in SPLITTING after procedure rollback
Daniel Roudnitsky created HBASE-28533:
-----------------------------------------

Summary: Region split failure due to region quota limit leaves Hmaster's in memory state for the region in SPLITTING after procedure rollback
Key: HBASE-28533
URL: https://issues.apache.org/jira/browse/HBASE-28533
Project: HBase
Issue Type: Bug
Components: Region Assignment
Affects Versions: 2.5.8
Environment: HBase Version 2.5.8, r37444de6531b1bdabf2e445c83d0268ab1a6f919, Thu Feb 29 15:37:32 PST 2024
Reporter: Daniel Roudnitsky

When a SplitTableRegionProcedure is run for a region whose namespace is at its maximum region quota limit, the split procedure fails and rolls back, and HMaster's in-memory RegionStateNode for the region is left in a SPLITTING state. HMaster then refuses to start any subsequent merge/split/move procedures for that region because it believes the region is not OPEN. This persists until HMaster is restarted and the in-memory record of region states is reset.

In the first step of the split procedure, SPLIT_TABLE_REGION_PREPARE, the parent region's RegionStateNode state is set to SPLITTING, and the transition is not written to the meta table. In the next step, SPLIT_TABLE_REGION_PRE_OPERATION, the region quota check is done, QuotaExceededException is thrown, and the procedure ends in ROLLEDBACK state without reverting the RegionStateNode back to OPEN. HMaster is left believing the region is in a SPLITTING state according to its in-memory RegionStates, while the region is still online on the assigned region server and OPEN according to meta.
To reproduce in HBase shell:
{code:java}
> create_namespace 'test_ns', {'hbase.namespace.quota.maxregions'=> 2}
> create 'test_ns:test_table', 'f1', {NUMREGIONS => 2, SPLITALGO => 'UniformSplit'}
> region_a =
> region_b =
> split region_a, 'x'
# HMaster will report: pid=405, state=ROLLEDBACK, exception=org.apache.hadoop.hbase.quotas.QuotaExceededException via master-split-regions:org.apache.hadoop.hbase.quotas.QuotaExceededException: Region split not possible for : as quota limits are exceeded ; SplitTableRegionProcedure table=test_ns:test_table, parent=...
> merge_region region_a, region_b
ERROR: org.apache.hadoop.hbase.exceptions.MergeRegionException: org.apache.hadoop.hbase.client.DoNotRetryRegionException: is not OPEN; state=SPLITTING
> stop_master # trigger hmaster failover
> merge_region region_a, region_b # merge now succeeds
{code}