Ivan Bessonov created IGNITE-24877:
--------------------------------------
Summary: Dirty pages count calculation is wrong
Key: IGNITE-24877
URL: https://issues.apache.org/jira/browse/IGNITE-24877
Project: Ignite
Issue Type: Bug
Reporter: Ivan Bessonov
In the checkpointer, we have separate entities for "dirty pages" and "dirty
partitions". We mark a partition dirty by calling {{markPartitionAsDirty}}.
Bad things happen if some pages are updated but their partition was not
explicitly marked as dirty. In that case we write more pages to the storage
than we initially calculated. The reason is simple: every delta file also has
a meta page, and that page is rarely marked dirty explicitly.
This can happen in tests. The probability of it happening in a real
deployment is low, because we constantly update safe time in the replicator's
state machine. Nonetheless, this is a ticking time bomb.
This is dangerous because of throttling. Throttling estimates the time
until checkpoint completion by comparing the dirty pages count with the
written pages count. If the latter is larger, we get a negative time
estimate, which breaks everything.
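
To illustrate the failure mode, here is a minimal sketch of such an estimate.
It is not the actual throttling code: the class name, the method name and the
linear extrapolation are assumptions made only for this example.

```java
/** Hypothetical sketch; not the real Ignite throttling implementation. */
final class CheckpointEtaSketch {
    /**
     * Extrapolates the time left in the checkpoint from the progress so far.
     * All parameter names are assumptions made for this example.
     */
    static long remainingNanos(long dirtyPages, long writtenPages, long elapsedNanos) {
        if (writtenPages <= 0) {
            return Long.MAX_VALUE; // No progress yet, nothing to extrapolate from.
        }

        long nanosPerPage = elapsedNanos / writtenPages;

        // Goes negative as soon as more pages are written than were counted.
        return (dirtyPages - writtenPages) * nanosPerPage;
    }

    public static void main(String[] args) {
        // Consistent counters: 60 pages left at 100_000 ns per page.
        System.out.println(remainingNanos(100, 40, 4_000_000L));

        // Four uncounted meta pages were written on top of the 100 counted pages.
        System.out.println(remainingNanos(100, 104, 10_400_000L));
    }
}
```

With consistent counters the estimate is positive. Once the uncounted meta
pages push the written count past the dirty count, the estimate drops below
zero.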
Rather than working around such bugs, I suggest calculating the dirty pages
count properly, by accounting for all dirty partitions, including those that
were not explicitly marked as dirty.
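
A rough sketch of what such a calculation could look like follows. It is not
a patch: the data structures, the names and the assumption that each dirty
partition contributes exactly one meta page (index 0 here) are hypothetical
and only illustrate the idea.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; the real checkpointer data structures differ. */
final class DirtyPagesCountSketch {
    /** Assumed index of the partition meta page, for this example only. */
    static final int META_PAGE_IDX = 0;

    /**
     * Counts the pages a checkpoint would write, treating a partition as dirty
     * if it was explicitly marked or if it merely has an entry in the dirty pages map.
     */
    static int pagesToWrite(
            Map<Integer, Set<Integer>> dirtyPageIdxsByPartition,
            Set<Integer> explicitlyDirtyPartitions
    ) {
        // Union of explicitly marked partitions and partitions with dirty pages.
        Set<Integer> partitions = new HashSet<>(explicitlyDirtyPartitions);
        partitions.addAll(dirtyPageIdxsByPartition.keySet());

        int total = 0;

        for (Integer partitionId : partitions) {
            Set<Integer> pages = new HashSet<>();
            Set<Integer> dirty = dirtyPageIdxsByPartition.get(partitionId);

            if (dirty != null) {
                pages.addAll(dirty);
            }

            // The meta page is written for every dirty partition; the set
            // avoids counting it twice if it was already marked dirty.
            pages.add(META_PAGE_IDX);

            total += pages.size();
        }

        return total;
    }
}
```

Under these assumptions, the count can no longer fall below the number of
pages actually written merely because {{markPartitionAsDirty}} was not called
for a partition.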