[ 
https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193038#comment-15193038
 ] 

Jianwei Cui commented on HBASE-15433:
-------------------------------------

{quote}
When QEE is thrown we will still end up in updating the region quota which is 
not really required, may be we can avoid that.
{quote}
Yes, we should catch QEE first and skip the quota update in that case, as you 
suggested above.
{quote}
Also suggest to rename currentRegionCount to tableRegionCount and 
updatedRegionCount to snapshotRegionCount for better understanding. Please add 
more comments like why are we doing this way.
{quote}
Good suggestions, will update the patch.

{quote}
If this throws exception then there will be another issue, because now the 
snapshot has been successfully restored but in the catch clause we are updating 
the table region count in namespace quota.
{quote}
Good find. Here, {{checkAndUpdateNamespaceRegionQuota}} should succeed, because 
it only reduces the region count for the table. If it does throw an exception, 
the cause must be something unexpected, and calling 
{{checkAndUpdateNamespaceRegionQuota}} again in the catch clause may also fail. 
How about logging an error message in the QEE catch clause and rethrowing the 
exception directly? The code here could then be updated as:
{code}
      int tableRegionCount = -1;
      try {
        // Table already exists. Check and update the region quota for this table's namespace.
        // The table is disabled, so its region count won't change during restoreSnapshot.
        tableRegionCount = getRegionCountOfTable(tableName);
        int snapshotRegionCount = manifest.getRegionManifestsMap().size();

        // Update the region count before restoreSnapshot only if snapshotRegionCount is larger.
        // If we lowered the region count before restoreSnapshot and restoreSnapshot then failed,
        // we might be unable to reset the count to its original value, because other tables in
        // the same namespace could consume the freed quota in the meantime (for example through
        // a region split or a table creation).
        if (tableRegionCount > 0 && tableRegionCount < snapshotRegionCount) {
          checkAndUpdateNamespaceRegionQuota(snapshotRegionCount, tableName);
        }

        restoreSnapshot(snapshot, snapshotTableDesc);

        // Update the region count after restoreSnapshot succeeds if snapshotRegionCount is
        // smaller. This step should not fail because it reduces the table's region count.
        if (tableRegionCount > 0 && tableRegionCount > snapshotRegionCount) {
          checkAndUpdateNamespaceRegionQuota(snapshotRegionCount, tableName);
        }
      } catch (QuotaExceededException e) {
        LOG.error("Exception occurred while restoring the snapshot " + snapshot.getName()
            + " as table " + tableName.getNameAsString(), e);
        // If QEE is thrown before restoreSnapshot, the quota information has not been updated,
        // so we rethrow directly. If QEE is thrown after restoreSnapshot, something unexpected
        // happened, and we also rethrow directly.
        throw e;
      } catch (IOException e) {
        if (tableRegionCount > 0) {
          // Reset the region count for the table.
          checkAndUpdateNamespaceRegionQuota(tableRegionCount, tableName);
        }
        LOG.error("Exception occurred while restoring the snapshot " + snapshot.getName()
            + " as table " + tableName.getNameAsString(), e);
        throw e;
      }
{code}
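To make the ordering argument easier to check, here is a minimal, self-contained sketch. It is a toy model and not HBase code: {{NamespaceQuota}}, {{QuotaExceeded}}, {{RestoreStep}} and {{demo}} are hypothetical names made up for illustration, and the quota is assumed to be a single namespace-wide region limit. It only shows that raising the count before the restore and lowering it after a successful restore keeps the recorded count consistent in the failure cases.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Toy model of the proposed ordering; all names here are hypothetical. */
final class QuotaOrderingSketch {

  /** Stand-in for a quota-exceeded error (assumed to be an IOException subtype). */
  static final class QuotaExceeded extends IOException {
    QuotaExceeded(String msg) { super(msg); }
  }

  /** Tracks per-table region counts against one namespace-wide limit. */
  static final class NamespaceQuota {
    private final int maxRegions;
    private final Map<String, Integer> regionsByTable = new HashMap<>();

    NamespaceQuota(int maxRegions) { this.maxRegions = maxRegions; }

    int regionsOf(String table) { return regionsByTable.getOrDefault(table, 0); }

    int total() {
      int sum = 0;
      for (int n : regionsByTable.values()) sum += n;
      return sum;
    }

    /** Sets the table's count, rejecting the update if the limit would be exceeded. */
    void checkAndUpdate(String table, int newCount) throws QuotaExceeded {
      int newTotal = total() - regionsOf(table) + newCount;
      if (newTotal > maxRegions) {
        throw new QuotaExceeded("limit " + maxRegions + " exceeded: " + newTotal);
      }
      regionsByTable.put(table, newCount);
    }
  }

  /** Placeholder for the actual restore work, which may fail. */
  interface RestoreStep { void run() throws IOException; }

  static void restore(NamespaceQuota quota, String table, int snapshotRegions, RestoreStep step)
      throws IOException {
    int tableRegions = quota.regionsOf(table);
    try {
      if (tableRegions < snapshotRegions) {
        quota.checkAndUpdate(table, snapshotRegions); // grow before restoring
      }
      step.run();
      if (tableRegions > snapshotRegions) {
        quota.checkAndUpdate(table, snapshotRegions); // shrink only after success
      }
    } catch (QuotaExceeded e) {
      throw e; // count was not changed, nothing to undo
    } catch (IOException e) {
      quota.checkAndUpdate(table, tableRegions); // reset to the original count
      throw e;
    }
  }

  /** Runs three scenarios and returns t1's recorded region count after each. */
  static int[] demo() {
    try {
      NamespaceQuota quota = new NamespaceQuota(10);
      quota.checkAndUpdate("t1", 6);
      quota.checkAndUpdate("t2", 3);
      int[] seen = new int[3];

      // 1. Smaller snapshot, restore fails: nothing was lowered, so t1 keeps 6.
      try {
        restore(quota, "t1", 2, () -> { throw new IOException("simulated failure"); });
      } catch (IOException expected) {
        // expected in this scenario
      }
      seen[0] = quota.regionsOf("t1");

      // 2. Larger snapshot that does not fit (8 + 3 > 10): rejected up front.
      try {
        restore(quota, "t1", 8, () -> { });
      } catch (QuotaExceeded expected) {
        // expected in this scenario
      }
      seen[1] = quota.regionsOf("t1");

      // 3. Smaller snapshot, restore succeeds: count is lowered afterwards.
      restore(quota, "t1", 2, () -> { });
      seen[2] = quota.regionsOf("t1");
      return seen;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(demo())); // prints [6, 6, 2]
  }
}
```

In the first two scenarios the recorded count for {{t1}} stays at 6, so the namespace never looks emptier than it really is; only the successful restore in the third scenario lowers it to 2.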
What's your opinion about this issue? [~ashish singhi]

> SnapshotManager#restoreSnapshot not update table and region count quota 
> correctly when encountering exception
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-15433
>                 URL: https://issues.apache.org/jira/browse/HBASE-15433
>             Project: HBase
>          Issue Type: Bug
>          Components: snapshots
>    Affects Versions: 2.0.0
>            Reporter: Jianwei Cui
>         Attachments: HBASE-15433-trunk-v1.patch, HBASE-15433-trunk-v2.patch, 
> HBASE-15433-trunk.patch
>
>
> In SnapshotManager#restoreSnapshot, the table and region quota will be 
> checked and updated as:
> {code}
>       try {
>         // Table already exist. Check and update the region quota for this table namespace
>         checkAndUpdateNamespaceRegionQuota(manifest, tableName);
>         restoreSnapshot(snapshot, snapshotTableDesc);
>       } catch (IOException e) {
>         this.master.getMasterQuotaManager().removeTableFromNamespaceQuota(tableName);
>         LOG.error("Exception occurred while restoring the snapshot " + snapshot.getName()
>             + " as table " + tableName.getNameAsString(), e);
>         throw e;
>       }
> {code}
> 'checkAndUpdateNamespaceRegionQuota' fails if the regions in the snapshot 
> would push the namespace over its region count quota; the 'catch' block then 
> removes the table from the namespace quota. This lowers the recorded table 
> count and region count, so a subsequent table creation or region split will 
> succeed even though the actual quota is exceeded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
