[
https://issues.apache.org/jira/browse/HDDS-11470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tsz-wo Sze updated HDDS-11470:
------------------------------
Description:
When OM fails to installCheckpoint (e.g. HDDS-10300), it should not reply
"Completed INSTALL_SNAPSHOT".
In the code below, when there is an exception, it just prints an error message
and continues to reply "Completed INSTALL_SNAPSHOT".
{code}
//OzoneManager.installCheckpoint
try {
  time = Time.monotonicNow();
  dbBackup = replaceOMDBWithCheckpoint(lastAppliedIndex,
      oldDBLocation, checkpointLocation);
  term = checkpointTrxnInfo.getTerm();
  lastAppliedIndex = checkpointTrxnInfo.getTransactionIndex();
  LOG.info("Replaced DB with checkpoint from OM: {}, term: {}, " +
      "index: {}, time: {} ms", leaderId, term, lastAppliedIndex,
      Time.monotonicNow() - time);
} catch (Exception e) {
  LOG.error("Failed to install Snapshot from {} as OM failed to replace" +
      " DB with downloaded checkpoint. Reloading old OM state.",
      leaderId, e);
}
{code}
was:
When OM failed to installCheckpoint (e.g. HDDS-10300), it should not reply
"Completed INSTALL_SNAPSHOT".
{code}
{code}
> OM should not reply Completed INSTALL_SNAPSHOT when installCheckpoint failed
> ----------------------------------------------------------------------------
>
> Key: HDDS-11470
> URL: https://issues.apache.org/jira/browse/HDDS-11470
> Project: Apache Ozone
> Issue Type: Bug
> Components: OM HA
> Reporter: Tsz-wo Sze
> Priority: Major
>
> When OM fails to installCheckpoint (e.g. HDDS-10300), it should not reply
> "Completed INSTALL_SNAPSHOT".
> In the code below, when there is an exception, it just prints an error message
> and continues to reply "Completed INSTALL_SNAPSHOT".
> {code}
> //OzoneManager.installCheckpoint
> try {
>   time = Time.monotonicNow();
>   dbBackup = replaceOMDBWithCheckpoint(lastAppliedIndex,
>       oldDBLocation, checkpointLocation);
>   term = checkpointTrxnInfo.getTerm();
>   lastAppliedIndex = checkpointTrxnInfo.getTransactionIndex();
>   LOG.info("Replaced DB with checkpoint from OM: {}, term: {}, " +
>       "index: {}, time: {} ms", leaderId, term, lastAppliedIndex,
>       Time.monotonicNow() - time);
> } catch (Exception e) {
>   LOG.error("Failed to install Snapshot from {} as OM failed to replace" +
>       " DB with downloaded checkpoint. Reloading old OM state.",
>       leaderId, e);
> }
> {code}
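
The behavior requested above (reply "Completed INSTALL_SNAPSHOT" only when the
DB replacement actually succeeded) can be illustrated with a minimal,
self-contained sketch. This is not the OzoneManager code and not a proposed
patch: the class InstallSnapshotSketch, the DbReplacer interface, the reply()
helper and the "Failed INSTALL_SNAPSHOT" string are all hypothetical names
introduced for illustration. Only the "Completed INSTALL_SNAPSHOT" string is
taken from the description.

```java
import java.io.IOException;

public class InstallSnapshotSketch {

  /** Hypothetical stand-in for the DB swap step; not a real Ozone API. */
  interface DbReplacer {
    void replace() throws IOException;
  }

  // The success string is quoted in the issue description.
  static final String COMPLETED = "Completed INSTALL_SNAPSHOT";
  // Assumption: the failure string is invented for this sketch.
  static final String FAILED = "Failed INSTALL_SNAPSHOT";

  /**
   * Logs the error as before, but also reports the outcome to the caller
   * instead of swallowing it.
   */
  static boolean installCheckpoint(DbReplacer replacer) {
    try {
      replacer.replace();
      return true;
    } catch (Exception e) {
      System.err.println("Failed to install snapshot: " + e.getMessage());
      return false;
    }
  }

  /** Chooses the reply from the outcome rather than always reporting success. */
  static String reply(DbReplacer replacer) {
    return installCheckpoint(replacer) ? COMPLETED : FAILED;
  }

  public static void main(String[] args) {
    // Success path: the replacement does nothing and does not throw.
    System.out.println(reply(() -> { }));
    // Failure path: the replacement throws, so the reply must not be "Completed".
    System.out.println(reply(() -> { throw new IOException("disk full"); }));
  }
}
```

Returning a boolean is only one option. Rethrowing the exception, or returning
a null/empty result that the caller checks, would serve the same purpose. The
point of the sketch is that the caller needs some signal that the catch block
was taken.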