LeslieKid commented on code in PR #141: URL: https://github.com/apache/horaedb-docs/pull/141#discussion_r1818170643
##########
content/en/docs/design/compaction_offload.md:
##########
@@ -0,0 +1,94 @@
+---
+title: "Compaction Offload"
+---
+
+**Note: This feature is still in development.**
+
+This chapter discusses compaction offload, which is designed to separate the compaction workload from the local horaedb nodes and delegate it to external compaction nodes.
+
+## Overview
+
+```plaintext
+┌───────────────────────────────────────────────────────────────────────┐
+│                                                                       │
+│                           HoraeMeta Cluster                           │
+│                                                                       │
+└───────────────────────────────────────────────────────────────────────┘
+      ▲               ▲                     |                   |
+      │               │Fetch compaction     │Monitor compaction │
+      │               │node info            │node               │
+      |               │                     ▼                   ▼
+┌────────────┐  ┌────────────┐            ┌────────────┐  ┌────────────┐
+│            │  │            │Offload Task│            │  │            │
+│  HoraeDB   │  │  HoraeDB   │ ◀───────▶  │ Compaction │  │ Compaction │
+│            │  │            │ Ret Result │    Node    │  │    Node    │
+└────────────┘  └────────────┘            └────────────┘  └────────────┘
+      |               |                         |               |
+      │               │   Update the SSTable    │               │
+      │               │                         │               │
+      ▼               ▼                         ▼               ▼
+┌───────────────────────────────────────────────────────────────────────┐
+│                                                                       │
+│                            Object Storage                             │
+│                                                                       │
+└───────────────────────────────────────────────────────────────────────┘
+```
+
+The diagram above describes the architecture of cluster for compaction offload, where some key concepts need to be explained:
+
+- `Compaction Node`: Takes responsibility to handle offloaded compaction tasks. The compaction node receives the compaction task and performs the actual merging of SSTables, then sends back the task result to HoraeDB.
+- `HoraeMeta Cluster`: HoraeMeta acts as a compaction nodes manager in the compaction offload scenario. It monitors the compaction nodes cluster and schedule the compaction nodes.
+
+The procedure of compaction based above architecture diagram is:
+
+1. HoraeDB triggers compaction procedure under some conditions and then generates the compaction task.
+2. HoraeDB fetchs the information of suitable compaction node from the HoraeMeta.
+3. HoraeDB distributes the compaction task to the remote compaction node, according to the information fetch from HoraeMeta.
+4. Compaction node executes the task and send the result back to the HoraeDB.
+5. HoraeDB receives the result and updates the manifest.
+
+We can see that the compaction offload architecture in HoraeDB is different from the traditional one. It separates the compaction task distribution from HoraeMeta and moves it into HoraeDB, which reduces the load of HoraeMeta to lower the risk of single points of failure.

Review Comment:
Thanks @jiacai2050. The statement here is indeed incorrect. fixed.
