[jira] [Created] (HDDS-11688) Ozone Recon - Improve processing reliability of OM DB events by Recon' background tasks

Devesh Kumar Singh (Jira) Tue, 12 Nov 2024 07:27:40 -0800

Devesh Kumar Singh created HDDS-11688:
-----------------------------------------


             Summary: Ozone Recon - Improve processing reliability of OM DB 
events by Recon' background tasks
                 Key: HDDS-11688
                 URL: https://issues.apache.org/jira/browse/HDDS-11688
             Project: Apache Ozone
          Issue Type: Task
          Components: Ozone Recon
            Reporter: Devesh Kumar Singh
            Assignee: Devesh Kumar Singh


When OM DB events are synced periodically and incrementally into Recon, 
Recon processes each set of events through several tasks to derive insights 
about the OM DB data, and each task processes the OM DB events sequentially. 
It is therefore important to know, for each task, how many events it has 
processed and how many remain, relative to the current OM DB sequence number 
that Recon has pulled from OM DB. Currently Recon processes all events per 
task, and if any event fails, Recon marks the whole task as failed and 
retries (re-runs) the task another 2 times with the same set of events.

Below are the steps:
1. The task tries to process the incremental set of events.
2. If the task fails in step #1, it is retried with the same set of events. If 
it succeeds, we are all good.
3. If step #2 fails again with the same set of events, the task runs a 
reprocess against the full records of the respective OM DB table.
4. The issue is that if step #3 also fails at any point, the synced 
incremental events are currently ignored and Recon proceeds to wait for the 
next periodic sync of events from OM DB. This edge case needs to be handled 
more diligently and efficiently to make Recon data more reliable.
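
The four steps above can be sketched roughly as follows. This is only an 
illustration of the control flow as described in this issue: the class 
CurrentFlowSketch, the Task interface, and the Outcome values are 
hypothetical names, not Recon's actual classes, and events are modelled as 
plain strings.

```java
import java.util.List;

/** Hypothetical sketch of the current flow; not Recon's real API. */
final class CurrentFlowSketch {

  enum Outcome { DELTA_OK, DELTA_RETRY_OK, FULL_REPROCESS_OK, EVENTS_DROPPED }

  /** Minimal stand-in for a Recon background task (assumed shape). */
  interface Task {
    /** Process an incremental batch of events; true on success. */
    boolean process(List<String> events);

    /** Rebuild from the full OM DB table; true on success. */
    boolean reprocess();
  }

  static Outcome run(Task task, List<String> events) {
    if (task.process(events)) {
      return Outcome.DELTA_OK;            // step 1: first attempt
    }
    if (task.process(events)) {
      return Outcome.DELTA_RETRY_OK;      // step 2: retry with same events
    }
    if (task.reprocess()) {
      return Outcome.FULL_REPROCESS_OK;   // step 3: full-table reprocess
    }
    // step 4: nothing remembers the failure, so this batch is effectively
    // ignored until the next periodic sync.
    return Outcome.EVENTS_DROPPED;
  }
}
```

The last branch is the edge case this issue is about: once run returns 
EVENTS_DROPPED, no state is kept that would cause the next run to repair the 
task's data.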

 

Proposed way to handle:

If a task failed in its last run, then in its next run let the task process 
the full OM DB snapshot to bring the processed data back to a normalized state.
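
One possible shape for this proposal is sketched below. It is an assumption 
about how the idea might look, not the implementation attached to this issue: 
FailureAwareRunnerSketch, Task, and Path are hypothetical names, and the 
failure flag is kept in an in-memory map purely for illustration (a real 
implementation would presumably need to persist it across Recon restarts).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the proposed handling; not Recon's real API. */
final class FailureAwareRunnerSketch {

  enum Path { DELTA, FULL }

  /** Minimal stand-in for a Recon background task (assumed shape). */
  interface Task {
    String name();
    boolean process(List<String> events);  // incremental events
    boolean reprocess();                   // full OM DB snapshot
  }

  // Assumption: per-task "last run failed" flag, in memory only.
  private final Map<String, Boolean> lastRunFailed = new HashMap<>();

  /** Runs the task once and returns which path was taken. */
  Path runOnce(Task task, List<String> events) {
    boolean failedBefore = lastRunFailed.getOrDefault(task.name(), false);

    // A task that failed last time skips the delta and rebuilds from the
    // full snapshot; otherwise it processes the incremental events.
    boolean ok = failedBefore ? task.reprocess() : task.process(events);

    // Remember the result so the next run can react to it.
    lastRunFailed.put(task.name(), !ok);
    return failedBefore ? Path.FULL : Path.DELTA;
  }
}
```

With this shape, a failed run no longer silently loses its events: the 
failure is recorded, and the next run rebuilds from the full snapshot before 
the task goes back to incremental processing.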



