[
https://issues.apache.org/jira/browse/HDDS-12377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Attila Doroszlai updated HDDS-12377:
------------------------------------
Summary: Improve error handling of OM background tasks processing in case
of abrupt crash of Recon (was: Ozone Recon - Improve error handling of OM
background tasks processing in case of abrupt crash of Recon)
> Improve error handling of OM background tasks processing in case of abrupt
> crash of Recon
> -----------------------------------------------------------------------------------------
>
> Key: HDDS-12377
> URL: https://issues.apache.org/jira/browse/HDDS-12377
> Project: Apache Ozone
> Issue Type: Task
> Components: Ozone Recon
> Reporter: Devesh Kumar Singh
> Assignee: Devesh Kumar Singh
> Priority: Major
>
> If Recon has applied incremental OM DB updates but crashes before consuming
> the corresponding events (due to some unexpected error, or because the CU
> restarted Recon at that moment), then on restart Recon does not try to
> consume those events again. In this edge case the OM DB updates are missed
> by the tasks. There are 2 possible solutions to close this gap:
> * On restart, check whether the lastSequence number of the incremental DB
> update task matches the lastUpdatedSeq number of each underlying task. If
> it does not match, run reprocess for that task.
> * Alternatively, maintain a lastUpdatedSequence number with each event
> consumption and resume applying from there on restart. This may not be
> worth the complex handling it requires for such an edge case.
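
The first option above could be sketched roughly as follows. This is only an
illustration of the restart check, not Recon's actual code: the class name,
the method name, and the map of per-task sequence numbers are hypothetical
and stand in for whatever Recon really persists for each task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names are taken from the Recon code base.
final class ReconTaskRestartCheck {

  private ReconTaskRestartCheck() { }

  /**
   * Returns the names of the tasks whose recorded lastUpdatedSeq differs from
   * the lastSequence of the incremental DB update task. These are the tasks
   * that may have missed events because of a crash and should be reprocessed.
   *
   * @param deltaLastSequence  last sequence number applied to the local OM DB
   * @param taskLastUpdatedSeq last sequence number recorded per task name
   */
  static List<String> tasksNeedingReprocess(
      long deltaLastSequence, Map<String, Long> taskLastUpdatedSeq) {
    List<String> stale = new ArrayList<>();
    for (Map.Entry<String, Long> entry : taskLastUpdatedSeq.entrySet()) {
      Long seq = entry.getValue();
      // A task with no recorded sequence number is treated as stale as well.
      if (seq == null || seq != deltaLastSequence) {
        stale.add(entry.getKey());
      }
    }
    return stale;
  }
}
```

On restart, the caller would run reprocess for every task name returned and
let the remaining tasks continue with incremental events. This keeps the
handling simple at the cost of a full reprocess for the affected tasks.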
--
This message was sent by Atlassian Jira
(v8.20.10#820010)