[ https://issues.apache.org/jira/browse/HBASE-8310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663128#comment-13663128 ]
Jerry He commented on HBASE-8310:
---------------------------------
[~mbertozzi]
Thanks for the explanation. We can probably live with the 'imperfect'
synchronous client for now. But I am still worried about the large mismatch
between the timeout values: 1 min for the client and 10 mins for the table lock.
On the other hand, I looked at the latest code again. This is what we do in
SnapshotManager#snapshotTable():
{code}
try {
  handler.prepare();
  this.executorService.submit(handler);
  this.snapshotHandlers.put(snapshot.getTable(), handler);
} catch (Exception e) {
{code}
The handler is not put in the map until after the lock attempt in prepare()
returns. While prepare() is waiting on the table lock, another snapshot request
for the same table can come in and will not be rejected.
Should we move the put() up, before prepare()? Or is there anything I missed?
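To illustrate what moving it up could look like, here is a minimal standalone
sketch. It is not the actual SnapshotManager code: SnapshotRegistrationSketch,
Handler and isRegistered are hypothetical names, and the handler runs inline
here instead of being submitted to the executor service.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical stand-in for SnapshotManager; all names are illustrative. */
public class SnapshotRegistrationSketch {

  /** Hypothetical stand-in for TakeSnapshotHandler. */
  public interface Handler {
    void prepare() throws Exception; // may block while waiting for the table lock
    void run();
  }

  private final ConcurrentMap<String, Handler> snapshotHandlers =
      new ConcurrentHashMap<String, Handler>();

  /**
   * Returns true if the request was accepted, false if another snapshot
   * of the same table is already registered.
   */
  public boolean snapshotTable(String table, Handler handler) throws Exception {
    // Register before prepare(), so that a concurrent request for the same
    // table is rejected even while this one is still waiting on the lock.
    if (snapshotHandlers.putIfAbsent(table, handler) != null) {
      return false;
    }
    try {
      handler.prepare();
      handler.run(); // assumption: the real code submits to an executor instead
      return true;
    } catch (Exception e) {
      // Undo the registration, otherwise the table would stay blocked
      // after a failed prepare().
      snapshotHandlers.remove(table, handler);
      throw e;
    }
  }

  /** True while a handler is registered for the given table. */
  public boolean isRegistered(String table) {
    return snapshotHandlers.containsKey(table);
  }
}
```

With this ordering a second request for the same table is rejected right away
instead of queuing up behind the table lock. The cost is that the failure path
has to remove the entry again.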
> HBase snapshot timeout default values and TableLockManger timeout
> -----------------------------------------------------------------
>
> Key: HBASE-8310
> URL: https://issues.apache.org/jira/browse/HBASE-8310
> Project: HBase
> Issue Type: Bug
> Components: snapshots
> Affects Versions: 0.95.0
> Reporter: Jerry He
> Assignee: Jerry He
> Priority: Minor
> Fix For: 0.98.0, 0.95.2, 0.94.9
>
>
> There are a few timeout values and defaults being used by HBase snapshot.
> DEFAULT_MAX_WAIT_TIME (60000 milli sec, 1 min) for client response
> TIMEOUT_MILLIS_DEFAULT (60000 milli sec, 1 min) for Procedure timeout
> SNAPSHOT_TIMEOUT_MILLIS_DEFAULT (60000 milli sec, 1 min) for region server
> subprocedure
> There are also other timeouts involved, for example,
> DEFAULT_TABLE_WRITE_LOCK_TIMEOUT_MS (10 mins) for
> TakeSnapshotHandler#prepare()
> We could have this case:
> The user issues a sync snapshot request, waits for 1 min, and gets an
> exception.
> In the meantime the snapshot handler is blocked on the table lock, and the
> snapshot may continue to finish after 10 mins.
> But the user will probably re-issue the snapshot request during the 10 mins.
> This is a little confusing and messy when this happens.
> To be more reasonable, we should either increase the DEFAULT_MAX_WAIT_TIME or
> decrease the table lock waiting time.