wenbingshen opened a new issue, #4025:
URL: https://github.com/apache/bookkeeper/issues/4025

   **BP**
   
   This is the master ticket for tracking BP-63:
   Proposal PR - #3964 
   
   ### Motivation
   Currently, the Bookie can reschedule Auditor check tasks in several ways. 
The auditorBookieTask is excluded here because it has its own mechanism for 
triggering re-execution. This BP covers the three remaining tasks, 
AuditorCheckAllLedgersTask, AuditorPlacementPolicyCheckTask, and 
AuditorReplicasCheckTask. The existing ways to reschedule them are:
   
   1: The Bookie stores three execution times in ZooKeeper: 
checkallledgersctime, placementpolicycheckctime, and replicascheckctime. 
Updating these times lets us adjust how often the auditor tasks run, but 
triggering task execution still requires restarting the Auditor process or 
re-running the Auditor election.
   
   2: The ForceAuditorChecksCmd tool is built on the same underlying logic as 
the first point, so restarting the Auditor or re-running the election is still 
necessary to trigger task execution.
   
   3: The Decommission and RecoveryBookie tools tend to focus on executing 
recovery logic and only check and recover a specific subset of Bookie services.
   
   The above methods are complex and unreliable ways to reschedule the 
Auditor check tasks in a cluster.
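
   To illustrate why updating a ctime alone does not trigger execution, here is 
a minimal sketch. It assumes the Auditor computes each task's initial delay 
only once, when it schedules the task at startup, from the stored ctime and the 
configured interval. The class and method names are hypothetical and this is 
not the actual Auditor code.

```java
// Hypothetical helper for illustration; not part of BookKeeper.
final class AuditorDelaySketch {

    /**
     * Assumed startup-time calculation of a check task's first run.
     *
     * @param lastCtimeMs  last execution time read from ZooKeeper, or -1 if none is recorded
     * @param nowMs        current wall-clock time in milliseconds
     * @param intervalSecs configured check interval in seconds
     * @return seconds to wait before the first run
     */
    static long initialDelaySecs(long lastCtimeMs, long nowMs, long intervalSecs) {
        if (lastCtimeMs == -1) {
            return 0; // never executed: run immediately
        }
        // Clamp to zero in case clocks are not strictly ordered.
        long elapsedSecs = Math.max(0, (nowMs - lastCtimeMs) / 1000);
        // Overdue tasks run immediately; otherwise wait out the remainder.
        return elapsedSecs > intervalSecs ? 0 : intervalSecs - elapsedSecs;
    }
}
```

   Under this assumption, a ctime written after startup is not read again until 
the next restart or election, which is the gap this BP addresses.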
   
   ### Proposal
   
   Therefore, I propose further optimizing the rescheduling of Auditor tasks.
   
   1: The Auditor monitors the persistent znode path 
/ZK_LEDGERS_ROOT_PATH/underreplication/scheduleAuditor.
   2: Users modify the task ctime using the ForceAuditorChecksCmd tool and 
forcefully create the above znode path using the force parameter.
   3: The Auditor registers a callback on the scheduleAuditor node that 
reschedules the aforementioned three tasks.
   4: After the Auditor completes rescheduling the tasks, the scheduleAuditor 
node is deleted.
   5: When the Auditor starts, it deletes any old scheduleAuditor node so that 
a stale flag left over from an earlier run does not cause confusion.
   
   This way, we can trigger the scheduling and execution of Auditor tasks 
through an online interface without relying on service restart or re-election.
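
   The proposed flow can be sketched as follows. This is a minimal in-memory 
simulation, not the proposed implementation: FakeZk stands in for a ZooKeeper 
client, AuditorSketch for the Auditor, and "/ledgers" is assumed as the value 
of ZK_LEDGERS_ROOT_PATH. All class and method names are hypothetical. The ctime 
update from step 2 is not modeled, only the creation of the trigger node.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory stand-in for ZooKeeper; a real Auditor would use ZooKeeper watches. */
final class FakeZk {
    private final Set<String> nodes = ConcurrentHashMap.newKeySet();
    private final Map<String, Runnable> watchers = new ConcurrentHashMap<>();

    /** Registers a callback fired when the path is created (persistent here for simplicity). */
    void watch(String path, Runnable onCreated) { watchers.put(path, onCreated); }

    boolean exists(String path) { return nodes.contains(path); }

    void delete(String path) { nodes.remove(path); }

    /** Creates the node and notifies the watcher, if any, only when the node is new. */
    void create(String path) {
        if (nodes.add(path)) {
            Runnable watcher = watchers.get(path);
            if (watcher != null) {
                watcher.run();
            }
        }
    }
}

/** Hypothetical Auditor that reacts to the scheduleAuditor trigger node. */
final class AuditorSketch {
    // Assumes ZK_LEDGERS_ROOT_PATH is "/ledgers".
    static final String TRIGGER = "/ledgers/underreplication/scheduleAuditor";

    private final FakeZk zk;
    /** Records which tasks were rescheduled, in order. */
    final List<String> rescheduled = new ArrayList<>();

    AuditorSketch(FakeZk zk) { this.zk = zk; }

    void start() {
        zk.delete(TRIGGER);                 // step 5: drop any stale flag first
        zk.watch(TRIGGER, this::onTrigger); // steps 1 and 3: monitor the path
    }

    private void onTrigger() {
        // Step 3: reschedule the three check tasks (recorded here instead of executed).
        rescheduled.add("AuditorCheckAllLedgersTask");
        rescheduled.add("AuditorPlacementPolicyCheckTask");
        rescheduled.add("AuditorReplicasCheckTask");
        zk.delete(TRIGGER);                 // step 4: clear the flag when done
    }
}
```

   In this sketch, calling `zk.create(AuditorSketch.TRIGGER)` plays the role of 
ForceAuditorChecksCmd with the force parameter: the running Auditor reschedules 
the three tasks and removes the node, with no restart or election involved.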
   
   ### Compatibility, Deprecation, and Migration Plan
   
   There are no compatibility issues. This BP introduces a new trigger flag 
that does not affect the original logic and does not involve any changes to 
other existing public APIs. There is no deprecation or migration plan.
   

