AMashenkov commented on code in PR #4325:
URL: https://github.com/apache/ignite-3/pull/4325#discussion_r1745666400


##########
modules/catalog-compaction/README.md:
##########
@@ -1 +1,101 @@
 # Catalog compaction module
+
+> **_NOTE_** Compaction has been moved to a separate module to eliminate 
circular dependencies,
+as it requires some components that may themselves depend on the catalog 
module. Please 
+refer to the catalog's module [readme](../catalog/README.md) to see more 
details about 
+catalog service and update log.
+
+## Overview
+
+During schema changes, the catalog update log stores incremental updates. Each 
update
+increases the catalog version. Over time, the log may grow to a humongous  
size. To 
+address this, snapshotting was introduced to UpdateLog. Snapshotting means 
replacing 
+incremental updates with a snapshot.
+
+But different components can refer to a specific version of the catalog. Until 
they 
+finish their work with this version, it cannot be truncated.
+
+This module introduces 
[CatalogCompactionRunner](src/main/java/org/apache/ignite/internal/catalog/compaction/CatalogCompactionRunner.java)
 
+component, which is responsible for periodically performing catalog compaction 
ensuring that
+dropped versions of the catalog are no longer needed by any component in the 
cluster.
+
+## Compaction restrictions
+
+1. Catalog must not be compacted by version which activation time is greater 
than or equal to earliest

Review Comment:
   You may have two catalog versions, v1 and v2, with activation times 1 and 3.
   A transaction with a begin timestamp of 2 will use catalog version v1.
   So v1 must not be compacted.
   
   I'd say we can compact the catalog up to, but excluding, the highest version 
whose activation time is less than or equal to the begin timestamp of the 
earliest active transaction.
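
To make the boundary concrete, here is a minimal sketch of the rule. All names are hypothetical and not taken from the Ignite code base; it assumes catalog versions are indexed by activation time in a `TreeMap` and that the earliest active transaction's begin timestamp is already known.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Ignite: illustrates the compaction bound only.
final class CompactionBoundSketch {
    /**
     * Returns the lowest catalog version that must be kept: the highest version
     * whose activation time is <= the earliest active transaction begin timestamp.
     * Versions strictly below the returned one are candidates for compaction.
     * Returns -1 if no version is active at that timestamp (nothing to compact).
     */
    static int lowestRequiredVersion(TreeMap<Long, Integer> versionByActivationTime,
                                     long earliestTxBeginTs) {
        // floorEntry picks the entry with the greatest key <= the given timestamp.
        Map.Entry<Long, Integer> entry = versionByActivationTime.floorEntry(earliestTxBeginTs);
        return entry == null ? -1 : entry.getValue();
    }

    public static void main(String[] args) {
        TreeMap<Long, Integer> versions = new TreeMap<>();
        versions.put(1L, 1); // v1 activates at time 1
        versions.put(3L, 2); // v2 activates at time 3

        // A transaction that began at time 2 still reads v1, so v1 must be kept.
        System.out.println(lowestRequiredVersion(versions, 2L));
    }
}
```

With the example from above (v1 at 1, v2 at 3, transaction at 2), the helper returns v1, so nothing below v1 exists to compact and v1 itself stays.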



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
