[
https://issues.apache.org/jira/browse/IGNITE-12263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Denis A. Magda updated IGNITE-12263:
------------------------------------
Dev list discussion:
http://apache-ignite-developers.2346864.n4.nabble.com/How-to-free-up-space-on-disc-after-removing-entries-from-IgniteCache-with-enabled-PDS-td39839.html
> Introduce native persistence compaction operation
> -------------------------------------------------
>
> Key: IGNITE-12263
> URL: https://issues.apache.org/jira/browse/IGNITE-12263
> Project: Ignite
> Issue Type: Improvement
> Reporter: Alexey Goncharuk
> Priority: Critical
>
> Currently, Ignite native persistence does not shrink storage files after
> key-value pairs are removed.
> The causes of this behavior are:
> * The absence of a mechanism that allows Ignite to track the highest
> non-empty page position in a partition file
> * The absence of a mechanism that allows Ignite to select the page closest
> to the beginning of the file for writes
> * The absence of a mechanism that allows Ignite to move a key-value pair
> from page to page during defragmentation
> As an initial change, I suggest introducing a new node startup mode that
> runs a defragmentation procedure allowing the node to shrink its storage
> files. The procedure will not mutate the logical state of a partition, so a
> subsequent historical rebalance can quickly bring the node up to date. Since
> the procedure will run during node startup (during the final stages of
> recovery), there will be no concurrent load, so entries can be moved freely
> from page to page without tricky synchronization.
> If the procedure is applied during a whole-cluster restart, all nodes
> will be defragmented simultaneously, allowing for quicker parallel
> defragmentation at the cost of downtime.
> The procedure should accept an optional list of cache groups, so that an
> arbitrary set of cache groups can be selected for defragmentation.
> An outline of the actions taken during the run for each partition selected
> for defragmentation:
> * Partition pages are preloaded into memory if possible to avoid excessive
> page replacement. During the scan, the high-water mark (HWM) of the written
> data is detected (empty pages are skipped)
> * Page references in the free list are sorted so that the pages closest to
> the file start are picked first
> * The partition is scanned in reverse order and key-value pairs are moved
> closer to the file start; the HWM is updated accordingly. This step is
> particularly open to optimization because different strategies will
> work well for different fragmentation patterns.
> * After the scan iteration is completed, the file size can be updated
> according to the HWM
> As a further improvement, this partition defragmentation procedure can
> later be run in online mode, once the necessary cache update protocol
> changes are designed.
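
For illustration only, the per-partition steps in the description can be
modelled on a toy in-memory partition. This is a minimal sketch, not Ignite
code: the class PartitionCompactionSketch, its methods, the page model (a
list of small queues of keys) and the fixed per-page capacity are all
hypothetical, and real partition files also carry headers, indexes and
metadata pages that the model ignores.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.TreeSet;

/** Toy model of one partition: a list of pages, each holding a few keys. */
class PartitionCompactionSketch {
    /** Hypothetical helper: builds a page that holds the given keys. */
    static Deque<String> page(String... keys) {
        return new ArrayDeque<>(Arrays.asList(keys));
    }

    /** Returns the number of leading pages to keep, skipping empty tail pages. */
    static int highWaterMark(List<Deque<String>> pages, int upperBound) {
        int hwm = upperBound;
        while (hwm > 0 && pages.get(hwm - 1).isEmpty())
            hwm--;
        return hwm;
    }

    /**
     * Moves entries towards the start of the partition and drops the empty
     * tail. Returns the new HWM, expressed as a page count.
     */
    static int compact(List<Deque<String>> pages, int capacity) {
        // Step 1: scan the partition and detect the HWM of written data.
        int hwm = highWaterMark(pages, pages.size());

        // Step 2: "free list" ordered by page index, lowest index first.
        TreeSet<Integer> free = new TreeSet<>();
        for (int i = 0; i < hwm; i++)
            if (pages.get(i).size() < capacity)
                free.add(i);

        // Step 3: reverse scan, moving entries into the lowest free pages.
        for (int src = hwm - 1; src > 0; src--) {
            Deque<String> from = pages.get(src);
            while (!from.isEmpty() && !free.isEmpty() && free.first() < src) {
                int dst = free.first();
                Deque<String> to = pages.get(dst);
                to.addLast(from.pollLast());
                if (to.size() == capacity)
                    free.remove(dst);
            }
            // No page with free space lies below src: nothing more can move.
            if (!from.isEmpty())
                break;
        }

        // Step 4: recompute the HWM and "truncate the file" to it.
        hwm = highWaterMark(pages, hwm);
        pages.subList(hwm, pages.size()).clear();
        return hwm;
    }
}
```

With a capacity of 2 and pages [a] [] [b c] [] [d] [], compact() packs the
four keys into the first two pages and returns an HWM of 2. In a real file
the new size would then be derived from the HWM and the page size. The sketch
uses the simplest strategy (always fill the lowest free page); as the
description notes, other strategies may suit other fragmentation patterns.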