hzh0425 opened a new issue, #5585:
URL: https://github.com/apache/rocketmq/issues/5585

   # Background
   
   The DLedger Controller mode has been available since RocketMQ 5.0 and 
provides automatic master-slave switching.
   
   The principle is as follows:
   
   
![img](https://camo.githubusercontent.com/85dc7386e8030f9b921267269bb9dcd5a2d9805609d7c8431e2bc2c97002cc57/68747470733a2f2f73312e617831782e636f6d2f323032322f30372f30312f6a51726e38732e706e67)
   
   DLedger Controller is a strongly consistent metadata center built on DLedger 
(Raft), which stores the metadata related to master election.
   
   However, DLedger previously had no Snapshot capability, so when the 
Controller restarts, it needs to replay all the logs in DLedger to restore the 
state it had before shutdown. Replaying all the logs can make the restart take 
a long time, especially when there are a lot of logs.
   
   Recently, we added the Snapshot capability to DLedger, thanks to @ for his 
contribution.
   
   The basic principle is as follows. Suppose DLedger creates a snapshot that 
covers the logs before LogIndex = 6. This Snapshot contains the state machine 
state as of that point. When restarting, we only need to load this snapshot 
and replay logs 6 and 7, instead of replaying logs 1 ~ 7, which can save a lot 
of time.
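
   The restart path described above can be sketched as follows. This is a toy 
illustration, not DLedger code: `CounterStateMachine`, `CounterSnapshot` and 
`SnapshotReplayDemo` are hypothetical names, and the "state" is just a running 
sum of the applied entries.

```java
import java.util.List;

// Hypothetical toy state machine: its state is the sum of all applied entries.
class CounterStateMachine {
    long sum = 0;          // state machine state
    long lastApplied = 0;  // 1-based index of the last applied log entry
    int replayed = 0;      // how many entries were replayed during restore

    void apply(long index, long value) {
        sum += value;
        lastApplied = index;
        replayed++;
    }
}

// Hypothetical snapshot: the state as of lastIncludedIndex.
class CounterSnapshot {
    final long lastIncludedIndex;
    final long sum;

    CounterSnapshot(long lastIncludedIndex, long sum) {
        this.lastIncludedIndex = lastIncludedIndex;
        this.sum = sum;
    }
}

class SnapshotReplayDemo {
    // Restore state: load the snapshot if there is one, then replay only the
    // entries that come after it. With no snapshot, replay the whole log.
    static CounterStateMachine restore(CounterSnapshot snapshot, List<Long> log) {
        CounterStateMachine sm = new CounterStateMachine();
        long start = 1;
        if (snapshot != null) {
            sm.sum = snapshot.sum;
            sm.lastApplied = snapshot.lastIncludedIndex;
            start = snapshot.lastIncludedIndex + 1;
        }
        for (long i = start; i <= log.size(); i++) {
            sm.apply(i, log.get((int) (i - 1)));
        }
        return sm;
    }
}
```

   With a seven-entry log and a snapshot covering entries 1 ~ 5, `restore` 
replays two entries instead of seven and ends in the same state.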
   
   
   
   
![image-20221124132327142](https://hzh-pic.oss-cn-beijing.aliyuncs.com/img/image-20221124132327142.png)
   
   
   
   
   
   # Implementation
   
   The Snapshot function has been implemented in DLedger, and we currently 
require the following improvements in the Controller module:
   
   - Design a ControllerSnapshotFile and its file format.
   - Construct SnapshotGenerator for creating and reading SnapshotFile.
   - Implement the onSnapshotLoad and onSnapshotSave APIs in 
ControllerStateMachine.
   
   
   
   ## MetadataManager 
   
   MetadataManager represents a metadata manager in DLedgerController, and can 
also be considered the in-memory state machine of DLedgerController. 
DLedgerController applies events to the MetadataManager.
   
   Currently, there is only one MetadataManager in DLedgerController, that is, 
ReplicasInfoManager. Other MetadataManagers, such as TopicManager, may be 
added later, so our SnapshotFile needs to be able to carry the data of 
multiple MetadataManagers.
   
   Each concrete MetadataManager needs to implement the 
SnapshotAbleMetadataManager interface and define its own Snapshot format.
   
   ```
   public interface SnapshotAbleMetadataManager {
       /**
        * Encode the metadata contained in this MetadataManager
        * @return encoded metadata
        */
       byte[] encodeMetadata();
   
   
       /**
        *
        * According to the param data, load metadata into the MetadataManager
        * @param data encoded metadata
        */
       void loadMetadata(byte[] data);
   
   
       /**
        * Get the type of this MetadataManager
        * @return MetadataManagerType
        */
       MetadataManagerType getMetadataManagerType();
   }
   ```
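
   To illustrate how a manager might satisfy this contract, here is a minimal 
sketch. `ToyMetadataManager`, its broker-to-master map, the `TOY_MANAGER` enum 
value and the `DataOutputStream` encoding are all assumptions made for the 
example; the real ReplicasInfoManager and MetadataManagerType may look 
different. The interface is repeated so the sketch compiles on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

// Placeholder enum value; the actual values are not given in this proposal.
enum MetadataManagerType { TOY_MANAGER }

interface SnapshotAbleMetadataManager {
    byte[] encodeMetadata();
    void loadMetadata(byte[] data);
    MetadataManagerType getMetadataManagerType();
}

// Hypothetical manager holding brokerName -> master address.
class ToyMetadataManager implements SnapshotAbleMetadataManager {
    private final Map<String, String> masterByBroker = new TreeMap<>();

    void put(String brokerName, String masterAddress) {
        masterByBroker.put(brokerName, masterAddress);
    }

    String get(String brokerName) {
        return masterByBroker.get(brokerName);
    }

    int size() {
        return masterByBroker.size();
    }

    @Override
    public byte[] encodeMetadata() {
        // Assumed format: entry count, then (key, value) pairs as UTF strings.
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(masterByBroker.size());
            for (Map.Entry<String, String> entry : masterByBroker.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeUTF(entry.getValue());
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void loadMetadata(byte[] data) {
        // Loading replaces the current in-memory state with the snapshot.
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            masterByBroker.clear();
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String brokerName = in.readUTF();
                masterByBroker.put(brokerName, in.readUTF());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public MetadataManagerType getMetadataManagerType() {
        return MetadataManagerType.TOY_MANAGER;
    }
}
```

   Clearing the map in `loadMetadata` is a design choice of this sketch: after 
a load, the manager reflects exactly what the snapshot contained.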
   
   
   
   ## SnapshotFile Format
   
   SnapshotFile consists of the following two parts:
   
   - SnapshotFileHeader: records the version, CRC and other information
   - Sections array: composed of multiple Sections, each Section records the 
metadata of a MetadataManager.
   
   The specific format is as follows:
   
   ```
   Snapshot File Header:
   - Version: Char
   - TotalSections: Int (number of sections)
   - Magic: Int
   - CRC: Int (CRC checksum)
   - Reserved: Long (reserved for future expansion)
   
   Sections:
   - SectionHeader:
     - SectionType: Char (the type of MetadataManager)
     - Length: Int (total length of SectionBody)
   - SectionBody: bytes
   ```
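
   One way to read and write this layout is sketched below with 
`java.nio.ByteBuffer`. Several details are assumptions, because the proposal 
does not fix them: the `MAGIC` and `VERSION` values are made up, the CRC is 
taken to be a CRC32 over all section bytes (header excluded), and 
`SnapshotFileCodec` is a hypothetical class name.

```java
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.CRC32;

class SnapshotFileCodec {
    static final char VERSION = 1;            // assumed value
    static final int MAGIC = 0x534E4150;      // assumed value ("SNAP")
    // Version (Char) + TotalSections, Magic, CRC (Int each) + reserved (Long)
    static final int HEADER_SIZE = Character.BYTES + Integer.BYTES * 3 + Long.BYTES;

    // sections maps SectionType -> SectionBody
    static byte[] encode(Map<Character, byte[]> sections) {
        int sectionsSize = 0;
        for (byte[] body : sections.values()) {
            sectionsSize += Character.BYTES + Integer.BYTES + body.length;
        }
        ByteBuffer sectionBuf = ByteBuffer.allocate(sectionsSize);
        for (Map.Entry<Character, byte[]> entry : sections.entrySet()) {
            sectionBuf.putChar(entry.getKey());          // SectionType
            sectionBuf.putInt(entry.getValue().length);  // Length
            sectionBuf.put(entry.getValue());            // SectionBody
        }
        byte[] sectionBytes = sectionBuf.array();
        CRC32 crc = new CRC32();
        crc.update(sectionBytes);

        return ByteBuffer.allocate(HEADER_SIZE + sectionBytes.length)
            .putChar(VERSION)
            .putInt(sections.size())
            .putInt(MAGIC)
            .putInt((int) crc.getValue())
            .putLong(0L)                                 // reserved field
            .put(sectionBytes)
            .array();
    }

    static Map<Character, byte[]> decode(byte[] file) {
        ByteBuffer in = ByteBuffer.wrap(file);
        char version = in.getChar();
        int totalSections = in.getInt();
        int magic = in.getInt();
        int storedCrc = in.getInt();
        in.getLong();                                    // skip reserved field
        if (version != VERSION || magic != MAGIC) {
            throw new IllegalArgumentException("Unrecognized snapshot header");
        }
        CRC32 crc = new CRC32();
        crc.update(file, HEADER_SIZE, file.length - HEADER_SIZE);
        if ((int) crc.getValue() != storedCrc) {
            throw new IllegalArgumentException("Snapshot CRC mismatch");
        }
        Map<Character, byte[]> sections = new LinkedHashMap<>();
        for (int i = 0; i < totalSections; i++) {
            char type = in.getChar();
            byte[] body = new byte[in.getInt()];
            in.get(body);
            sections.put(type, body);
        }
        return sections;
    }
}
```

   Under these assumptions the header is 22 bytes, and each section adds 6 
bytes of header plus its body.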
   
   
   
   ### ReplicasInfoManager Snapshot Format
   
   The metadata in ReplicasInfoManager that needs to be stored is:
   
   ```
   private final Map<String/* brokerName */, BrokerInfo> replicaInfoTable;
   private final Map<String/* brokerName */, SyncStateInfo> syncStateSetInfoTable;
   ```
   
   Its Snapshot Format is as follows:
   
   ```
   - ReplicaInfoTableLength: Int
   - ReplicaInfoTableBytes
   - SyncStateSetInfoTableLength: Int
   - SyncStateSetInfoTableBytes
   ```
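
   A length-prefixed body like this could be packed and unpacked as sketched 
below. How each table is serialized into bytes is not specified here, so the 
sketch takes the two byte arrays as given; `ReplicasInfoSectionCodec` is a 
hypothetical name.

```java
import java.nio.ByteBuffer;

class ReplicasInfoSectionCodec {
    // Layout: [length][replicaInfoTable bytes][length][syncStateSetInfoTable bytes]
    static byte[] encode(byte[] replicaInfoTableBytes, byte[] syncStateSetInfoTableBytes) {
        int size = Integer.BYTES * 2
            + replicaInfoTableBytes.length
            + syncStateSetInfoTableBytes.length;
        return ByteBuffer.allocate(size)
            .putInt(replicaInfoTableBytes.length)
            .put(replicaInfoTableBytes)
            .putInt(syncStateSetInfoTableBytes.length)
            .put(syncStateSetInfoTableBytes)
            .array();
    }

    // Returns {replicaInfoTableBytes, syncStateSetInfoTableBytes}.
    static byte[][] decode(byte[] sectionBody) {
        ByteBuffer in = ByteBuffer.wrap(sectionBody);
        byte[] replicaInfoTableBytes = new byte[in.getInt()];
        in.get(replicaInfoTableBytes);
        byte[] syncStateSetInfoTableBytes = new byte[in.getInt()];
        in.get(syncStateSetInfoTableBytes);
        return new byte[][] {replicaInfoTableBytes, syncStateSetInfoTableBytes};
    }
}
```

   The explicit length prefixes let the reader split the body without knowing 
anything about how the tables themselves are encoded.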
   
   
   
   ## Whole snapshot process
   
   
![image-20221124153407628](https://hzh-pic.oss-cn-beijing.aliyuncs.com/img/image-20221124153407628.png)
   
   The Snapshot Save process is as follows:
   
   - When DLedger triggers Snapshot, it will call the onSnapshotSave interface 
of DLedgerControllerStateMachine.
   - onSnapshotSave builds the SnapshotGenerator.
   - SnapshotGenerator creates a SectionBuilder for each MetadataManager, 
which is responsible for building the metadata snapshot of that 
MetadataManager.
   - SnapshotGenerator creates SnapshotFile, populates Header and Sections.
   - SnapshotGenerator passes SnapshotFile to DLedger.
   - Snapshot ends.
   
   
   
   The Snapshot Load process is similar to the Save process.
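
   The symmetry between the two directions can be sketched as follows: on 
save, each registered manager contributes one section body, and on load, each 
body is routed back to the manager with the matching section type. All names 
here (`SectionSource`, `SnapshotGeneratorSketch`, `TextManager`) are 
hypothetical, `SectionSource` is a trimmed stand-in for 
SnapshotAbleMetadataManager with a char section type, and file header 
handling is left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Trimmed stand-in for SnapshotAbleMetadataManager (assumption of this sketch).
interface SectionSource {
    byte[] encodeMetadata();
    void loadMetadata(byte[] data);
    char sectionType();
}

class SnapshotGeneratorSketch {
    private final Map<Character, SectionSource> managers = new LinkedHashMap<>();

    void register(SectionSource manager) {
        managers.put(manager.sectionType(), manager);
    }

    // Save direction: one section body per registered manager.
    Map<Character, byte[]> buildSections() {
        Map<Character, byte[]> sections = new LinkedHashMap<>();
        for (SectionSource manager : managers.values()) {
            sections.put(manager.sectionType(), manager.encodeMetadata());
        }
        return sections;
    }

    // Load direction: route each body to the manager with the same type.
    void loadSections(Map<Character, byte[]> sections) {
        for (Map.Entry<Character, byte[]> entry : sections.entrySet()) {
            SectionSource manager = managers.get(entry.getKey());
            if (manager == null) {
                throw new IllegalStateException(
                    "No manager registered for section type " + (int) entry.getKey());
            }
            manager.loadMetadata(entry.getValue());
        }
    }
}

// Toy manager whose metadata is a single string.
class TextManager implements SectionSource {
    private final char type;
    String text = "";

    TextManager(char type) {
        this.type = type;
    }

    @Override
    public byte[] encodeMetadata() {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public void loadMetadata(byte[] data) {
        text = new String(data, StandardCharsets.UTF_8);
    }

    @Override
    public char sectionType() {
        return type;
    }
}
```

   Failing on an unknown section type is one possible policy; skipping unknown 
sections would be an alternative if forward compatibility matters more.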
   
   
   
   ## Steps
   
   - [ ]  Update the version of DLedger in RocketMQ.
   - [ ]  Build the SnapshotAbleMetadataManager interface.
   - [ ]  Let ReplicasInfoManager implement the SnapshotAbleMetadataManager 
interface to support metadata snapshots.
   - [ ]  Build SnapshotGenerator and SectionBuilder.
   - [ ]  Implement the onSnapshotLoad and onSnapshotSave APIs in 
ControllerStateMachine.
   - [ ]  Add more integration tests to prove the correctness of Controller 
Snapshot.
   

