[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193038#comment-15193038 ]
Jianwei Cui commented on HBASE-15433:
-------------------------------------
{quote}
When QEE is thrown we will still end up in updating the region quota which is
not really required, may be we can avoid that.
{quote}
Yes, we should catch QEE first and not update the quota information in that
case, as you suggested above.
{quote}
Also suggest to rename currentRegionCount to tableRegionCount and
updatedRegionCount to snapshotRegionCount for better understanding. Please add
more comments like why are we doing this way.
{quote}
Good suggestions, will update the patch.
{quote}
If this throws exception then there will be another issue, because now the
snapshot has been successfully restored but in the catch clause we are updating
the table region count in namespace quota.
{quote}
Good find. Here, {{checkAndUpdateNamespaceRegionQuota}} should succeed, because
it only reduces the region count for the table. If it does throw an exception,
something unexpected must have happened, and calling
{{checkAndUpdateNamespaceRegionQuota}} again in the catch clause may also fail.
Could we log an error message in the QEE catch clause and rethrow the exception
directly? The code here could then be updated as:
{code}
int tableRegionCount = -1;
try {
  // Table already exists. Check and update the region quota for this table's namespace.
  // The table is disabled, so its region count won't change during restoreSnapshot.
  tableRegionCount = getRegionCountOfTable(tableName);
  int snapshotRegionCount = manifest.getRegionManifestsMap().size();
  // Update the region count before restoreSnapshot if snapshotRegionCount is larger. If we
  // updated the region count to a smaller value before restoreSnapshot and restoreSnapshot
  // failed, we might be unable to reset the region count to its original value, because
  // other tables in the same namespace could consume the region count quota in the
  // meantime (e.g. through a region split or a table create).
  if (tableRegionCount > 0 && tableRegionCount < snapshotRegionCount) {
    checkAndUpdateNamespaceRegionQuota(snapshotRegionCount, tableName);
  }
  restoreSnapshot(snapshot, snapshotTableDesc);
  // Update the region count after restoreSnapshot succeeds if snapshotRegionCount is
  // smaller. This step should not fail because it reduces the region count for the table.
  if (tableRegionCount > 0 && tableRegionCount > snapshotRegionCount) {
    checkAndUpdateNamespaceRegionQuota(snapshotRegionCount, tableName);
  }
} catch (QuotaExceededException e) {
  LOG.error("Exception occurred while restoring the snapshot " + snapshot.getName()
      + " as table " + tableName.getNameAsString(), e);
  // If QEE is thrown before restoreSnapshot, the quota information has not been updated,
  // so we rethrow the exception directly. If QEE is thrown after restoreSnapshot, there
  // must be an unexpected reason, and we also rethrow the exception directly.
  throw e;
} catch (IOException e) {
  if (tableRegionCount > 0) {
    // Reset the region count for the table.
    checkAndUpdateNamespaceRegionQuota(tableRegionCount, tableName);
  }
  LOG.error("Exception occurred while restoring the snapshot " + snapshot.getName()
      + " as table " + tableName.getNameAsString(), e);
  throw e;
}
{code}
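To make the ordering easier to reason about, here is a minimal standalone sketch of the same idea outside HBase. It is only a toy model under stated assumptions: {{ToyNamespaceQuota}}, {{ToyQuotaExceededException}}, {{RestoreStep}} and {{restoreWithQuota}} are hypothetical names invented for this illustration, not HBase APIs, and the quota is reduced to a single per-namespace region limit.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Toy model of the proposed ordering. All names are hypothetical, not HBase APIs. */
class QuotaOrderingSketch {

  /** Stand-in for QEE, modelled here as an IOException subtype (an assumption). */
  static class ToyQuotaExceededException extends IOException {
    ToyQuotaExceededException(String msg) { super(msg); }
  }

  /** Tracks per-table region counts against one namespace-wide limit. */
  static class ToyNamespaceQuota {
    private final int maxRegions;
    private final Map<String, Integer> regionsByTable = new HashMap<>();

    ToyNamespaceQuota(int maxRegions) { this.maxRegions = maxRegions; }

    int count(String table) { return regionsByTable.getOrDefault(table, 0); }

    int total() {
      int sum = 0;
      for (int c : regionsByTable.values()) sum += c;
      return sum;
    }

    /** Sets the table's region count, or throws if the namespace limit would be exceeded. */
    void checkAndUpdate(String table, int newCount) throws ToyQuotaExceededException {
      int projected = total() - count(table) + newCount;
      if (projected > maxRegions) {
        throw new ToyQuotaExceededException(
            "region quota exceeded: " + projected + " > " + maxRegions);
      }
      regionsByTable.put(table, newCount);
    }
  }

  /** Placeholder for the real restore step, which may fail with an IOException. */
  interface RestoreStep { void run() throws IOException; }

  static void restoreWithQuota(ToyNamespaceQuota quota, String table,
      int snapshotRegionCount, RestoreStep restore) throws IOException {
    int tableRegionCount = quota.count(table);
    try {
      if (tableRegionCount < snapshotRegionCount) {
        // Growing: reserve the extra regions before the restore starts.
        quota.checkAndUpdate(table, snapshotRegionCount);
      }
      restore.run();
      if (tableRegionCount > snapshotRegionCount) {
        // Shrinking: release regions only after the restore has succeeded.
        quota.checkAndUpdate(table, snapshotRegionCount);
      }
    } catch (ToyQuotaExceededException e) {
      // The quota was not modified, so there is nothing to roll back.
      throw e;
    } catch (IOException e) {
      // The restore failed: put the original count back. This never raises the total,
      // because the count was only changed here if it had been increased.
      quota.checkAndUpdate(table, tableRegionCount);
      throw e;
    }
  }

  public static void main(String[] args) throws IOException {
    ToyNamespaceQuota quota = new ToyNamespaceQuota(10);
    quota.checkAndUpdate("t1", 4);
    quota.checkAndUpdate("t2", 4);
    try {
      // 8 regions for t1 plus 4 for t2 would exceed the limit of 10.
      restoreWithQuota(quota, "t1", 8, () -> { });
    } catch (ToyQuotaExceededException e) {
      System.out.println("rejected, t1 still has " + quota.count("t1") + " regions");
    }
  }
}
```

In this model the quota only ever moves to the larger of the two counts while the restore is in flight, so a rollback always lowers or keeps the total and cannot itself hit the limit.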
What's your opinion about this issue? [~ashish singhi]
> SnapshotManager#restoreSnapshot not update table and region count quota
> correctly when encountering exception
> -------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-15433
> URL: https://issues.apache.org/jira/browse/HBASE-15433
> Project: HBase
> Issue Type: Bug
> Components: snapshots
> Affects Versions: 2.0.0
> Reporter: Jianwei Cui
> Attachments: HBASE-15433-trunk-v1.patch, HBASE-15433-trunk-v2.patch,
> HBASE-15433-trunk.patch
>
>
> In SnapshotManager#restoreSnapshot, the table and region quota will be
> checked and updated as:
> {code}
> try {
>   // Table already exist. Check and update the region quota for this table namespace
>   checkAndUpdateNamespaceRegionQuota(manifest, tableName);
>   restoreSnapshot(snapshot, snapshotTableDesc);
> } catch (IOException e) {
>   this.master.getMasterQuotaManager().removeTableFromNamespaceQuota(tableName);
>   LOG.error("Exception occurred while restoring the snapshot " + snapshot.getName()
>       + " as table " + tableName.getNameAsString(), e);
>   throw e;
> }
> {code}
> 'checkAndUpdateNamespaceRegionQuota' will fail if the regions in the snapshot
> would exceed the region count quota, and the table will then be removed from
> the namespace quota in the 'catch' block. This decreases the current table
> count and region count, so a subsequent table creation or region split will
> succeed even if the actual quota is exceeded.