AMashenkov commented on code in PR #4325:
URL: https://github.com/apache/ignite-3/pull/4325#discussion_r1745742445


##########
modules/catalog-compaction/README.md:
##########
@@ -1 +1,101 @@
 # Catalog compaction module
+
+> **_NOTE_** Compaction has been moved to a separate module to eliminate circular dependencies,
+as it requires some components that may themselves depend on the catalog module. Please
+refer to the catalog's module [readme](../catalog/README.md) to see more details about
+catalog service and update log.
+
+## Overview
+
+During schema changes, the catalog update log stores incremental updates. Each update
+increases the catalog version. Over time, the log may grow to a humongous size. To
+address this, snapshotting was introduced to UpdateLog. Snapshotting means replacing
+incremental updates with a snapshot.
+
+But different components can refer to a specific version of the catalog. Until they
+finish their work with this version, it cannot be truncated.
+
+This module introduces [CatalogCompactionRunner](src/main/java/org/apache/ignite/internal/catalog/compaction/CatalogCompactionRunner.java)
+component, which is responsible for periodically performing catalog compaction ensuring that
+dropped versions of the catalog are no longer needed by any component in the cluster.
+
+## Compaction restrictions
+
+1. Catalog must not be compacted by version which activation time is greater than or equal to earliest
+   active transaction in the cluster.
+2. Catalog must not be compacted by version which can be required to replay the raft log during recovery.
+3. Index building is stick with a specific catalog version. This version cannot be truncated until
+   the index build is complete.
+4. Rebalance is stick with a specific catalog version. This version cannot be truncated until the rebalance
+   is complete.
+
+## Coordinator
+
+Compaction is performed from single node called compaction coordinator.
+To simplify the selection of the coordinator, it was decided to consider it to be the same node
+as the leader of the metastorage group.
+Therefore, when the metastorage group leader changes, the compaction coordinator also changes.
+
+The [ElectionListener](../metastorage/src/main/java/org/apache/ignite/internal/metastorage/impl/ElectionListener.java)
+interface was introduced to listen for metastore leader elections.
+
+## Triggering factors
+
+The process is initiated by one of the following events:
+
+1. `low watermark` has been changed
+2. the compaction coordinator has been changed
+
+## Overall process description
+
+When compaction is triggered, two parallel processes are started
+
+- The first process (let's call it "**replicas update**") updates all replication groups
+  with determined minimum begin time among all active RW transactions in the cluster.
+  This time will be used by compaction process (see below) to comply with the restriction
+  about raft log recovery.
+- The second one ("**compaction**") determines the minimum required version of the catalog
+  and performs compaction.
+
+![Replicas update](tech-notes/processes.png) 
+
+> **_NOTE_** Even though the compaction process depends on the result of the replicas
+update process, they are executed in parallel, so the compaction in the current iteration
+will see the result of the replicas update process obtained in one of the previous iterations.

Review Comment:
   I think this should be merged into the previous paragraph.
   
   If my understanding is right:
   - the first process calculates the minimal TX begin time, which will be used later, on the next compaction trigger;
   - the second process uses the result (TX begin time) of the previous run of the first process.
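
   To make sure we mean the same thing, here is a minimal sketch of how I read the note. It is only a model, not the actual `CatalogCompactionRunner`: the class name, `onTrigger`, `NOT_AVAILABLE` and the `AtomicLong` field are all hypothetical and exist only for illustration. It assumes compaction reads whatever the replicas update published before the current trigger, while both processes run in parallel.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical model of one compaction iteration; not the real CatalogCompactionRunner. */
class CompactionIterationSketch {
    /** Marker meaning that no "replicas update" has completed yet (assumed value). */
    static final long NOT_AVAILABLE = -1L;

    /** Minimum TX begin time published by the most recently completed "replicas update". */
    private final AtomicLong publishedMinTxBeginTime = new AtomicLong(NOT_AVAILABLE);

    /**
     * Runs one iteration and returns the minimum TX begin time that the
     * "compaction" process observed during this iteration.
     */
    long onTrigger(long currentMinTxBeginTime) {
        // Snapshot taken before either process starts, so compaction can only
        // observe a value published by an earlier iteration.
        long seenByCompaction = publishedMinTxBeginTime.get();

        // "Replicas update": publishes the freshly determined minimum begin time.
        CompletableFuture<Void> replicasUpdate = CompletableFuture.runAsync(
                () -> publishedMinTxBeginTime.set(currentMinTxBeginTime));

        // "Compaction": works with the value captured at trigger time.
        CompletableFuture<Long> compaction = CompletableFuture.supplyAsync(
                () -> seenByCompaction);

        // Both processes run in parallel; the iteration ends when both are done.
        CompletableFuture.allOf(replicasUpdate, compaction).join();
        return compaction.join();
    }
}
```

   In this model the first trigger finds nothing to use, and every later trigger uses the value computed by the trigger before it. If that matches the intended behaviour, the note could be folded into the description of the first process.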


