[ 
https://issues.apache.org/jira/browse/IGNITE-12263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denis A. Magda updated IGNITE-12263:
------------------------------------

Dev list discussion: 
http://apache-ignite-developers.2346864.n4.nabble.com/How-to-free-up-space-on-disc-after-removing-entries-from-IgniteCache-with-enabled-PDS-td39839.html

> Introduce native persistence compaction operation
> -------------------------------------------------
>
>                 Key: IGNITE-12263
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12263
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Alexey Goncharuk
>            Priority: Critical
>
> Currently, Ignite native persistence does not shrink storage files after 
> key-value pairs are removed.
> The causes of this behavior are:
>  * The absence of a mechanism that allows Ignite to track the highest 
> non-empty page position in a partition file
>  * The absence of a mechanism which allows Ignite to select a page closest to 
> the file beginning for write
>  * The absence of a mechanism which allows Ignite to move a key-value pair 
> from page to page during defragmentation
> As an initial change, I suggest introducing a new node startup mode, which 
> will run a defragmentation procedure allowing the node to shrink storage 
> files. The procedure will not mutate the logical state of a partition, 
> so a subsequent historical rebalance can quickly bring the node up to date. 
> Since the procedure will run during the node startup (during the final 
> stages of recovery), there will be no concurrent load, thus the entries can 
> be freely moved from page to page with no tricky synchronization.
> If the procedure is applied during a whole-cluster restart, then all nodes 
> will be defragmented simultaneously, allowing for a quicker parallel 
> defragmentation at the cost of downtime.
> The procedure should accept an optional list of cache groups, so that only 
> the selected cache groups are defragmented.
> A sketch of the actions taken during the run for each partition selected 
> for defragmentation:
>  * Partition pages are preloaded to memory if possible to avoid excessive 
> page replacement. During the scan, the high-water mark (HWM) of the written 
> data is detected (empty pages are skipped)
>  * Page references in a free list are sorted so that the pages closest to 
> the file start are picked first
>  * The partition is scanned in reverse order, key-value pairs are moved 
> closer to the file start, HWM is updated accordingly. This step is 
> particularly open for various optimizations because different strategies will 
> work well for different fragmentation patterns.
>  * After the scan iteration is completed, the file size can be updated 
> according to the HWM
> As a further improvement, this partition defragmentation procedure can be 
> later run in online mode, after proper cache update protocol changes are 
> designed.
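
The per-partition steps quoted above can be illustrated with a toy in-memory model. This is only a rough sketch under simplifying assumptions, not Ignite code: the class PartitionCompactionSketch, the constant PAGE_CAPACITY and all method names are hypothetical, a "page" is just a small queue of string entries, and a sorted set stands in for the sorted free list.

```java
import java.util.Deque;
import java.util.List;
import java.util.TreeSet;

/** Toy model of one partition file: a list of fixed-capacity pages (hypothetical, not Ignite API). */
public class PartitionCompactionSketch {
    /** Entries that fit into one page in this toy model (assumption). */
    public static final int PAGE_CAPACITY = 4;

    /** Returns the index of the highest non-empty page (the HWM), or -1 if every page is empty. */
    public static int highWaterMark(List<Deque<String>> pages) {
        for (int i = pages.size() - 1; i >= 0; i--)
            if (!pages.get(i).isEmpty())
                return i;
        return -1;
    }

    /** Moves entries toward the file start and returns the HWM after compaction. */
    public static int compact(List<Deque<String>> pages) {
        int hwm = highWaterMark(pages);

        // Stand-in for the sorted free list: pages with spare room, lowest index first.
        TreeSet<Integer> free = new TreeSet<>();
        for (int i = 0; i <= hwm; i++)
            if (pages.get(i).size() < PAGE_CAPACITY)
                free.add(i);

        // Reverse scan: drain the pages farthest from the file start first.
        for (int src = hwm; src > 0; src--) {
            Deque<String> from = pages.get(src);
            while (!from.isEmpty()) {
                Integer dst = free.isEmpty() ? null : free.first();
                if (dst == null || dst >= src)
                    return highWaterMark(pages); // no free slot closer to the file start
                Deque<String> to = pages.get(dst);
                to.addLast(from.pollLast());
                if (to.size() == PAGE_CAPACITY)
                    free.remove(dst);
            }
            free.remove(src); // a drained page must not become a move target
        }
        return highWaterMark(pages);
    }

    /** File size implied by the HWM: every page up to and including the HWM is kept. */
    public static long fileSizeBytes(int hwm, int pageSizeBytes) {
        return (long) (hwm + 1) * pageSizeBytes;
    }
}
```

In this model only the physical placement of entries changes; the set of entries stays the same, which mirrors the requirement that the logical state of the partition is not mutated. A real implementation would also have to update index and free-list links for every moved entry, which the sketch leaves out.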



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
