[
https://issues.apache.org/jira/browse/IGNITE-23105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kirill Tkalenko updated IGNITE-23105:
-------------------------------------
Description:
{{CheckpointProgressImpl#onStartPartitionProcessing}} and
{{CheckpointProgressImpl#onFinishPartitionProcessing}} don't work as intended
for several reasons:
* There's a race: {{onFinish}} can be called before {{onStart}} is called in a
concurrent thread. This might happen if there's only a handful of dirty pages
in each partition and there is more than one checkpoint thread. Basically,
this protection doesn't work.
* Even if that particular race didn't exist, this code still wouldn't work,
because some of the pages could be added to the {{pageIdsToRetry}} map. That
map is processed later, when {{writePages}} is finished, meaning that we mark
unfinished partitions as finished.
* Due to the aforementioned bugs, I didn't bother including these methods in
{{drainCheckpointBuffers}}. As a result, this method requires a fix too.
*Upd:*
The first and second problems will be solved within IGNITE-23115, once the
pages of a partition are written by only one thread.
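For illustration only, the first problem can be replayed deterministically with a simplified model. This is a sketch under assumptions, not the actual {{CheckpointProgressImpl}} code: the {{Tracker}} class, its per-partition counter and the {{looks-finished}} event are hypothetical stand-ins for whatever the real implementation uses to decide that a partition has no writes in flight.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class PartitionProcessingRaceSketch {
    /** Hypothetical stand-in for per-partition tracking; NOT the Ignite implementation. */
    static class Tracker {
        /** Number of writers currently processing each partition (assumed design). */
        private final Map<Integer, AtomicInteger> inFlight = new ConcurrentHashMap<>();

        /** Event log; a plain list is enough because the replay below is single-threaded. */
        final List<String> events = new ArrayList<>();

        void onStart(int partId) {
            inFlight.computeIfAbsent(partId, k -> new AtomicInteger()).incrementAndGet();
            events.add("start:" + partId);
        }

        void onFinish(int partId) {
            // When the counter drops to zero, the partition is treated as fully written.
            if (inFlight.get(partId).decrementAndGet() == 0) {
                events.add("looks-finished:" + partId);
            }
        }
    }

    /** Replays one problematic interleaving of two checkpoint threads on partition 0. */
    static List<String> replayInterleaving() {
        Tracker tracker = new Tracker();

        // Thread A takes a small batch of partition 0 pages and finishes quickly.
        tracker.onStart(0);
        tracker.onFinish(0); // counter hits zero: partition 0 already looks finished

        // Thread B only now reaches its own pages of the same partition.
        tracker.onStart(0);
        tracker.onFinish(0);

        return tracker.events;
    }

    public static void main(String[] args) {
        System.out.println(replayInterleaving());
    }
}
```

In this model partition 0 is reported as finished twice, and the first report arrives while thread B still has pages of that partition to write. Anything waiting on that signal, such as partition destruction, could proceed too early.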
was:
{{CheckpointProgressImpl#onStartPartitionProcessing}} and
{{CheckpointProgressImpl#onFinishPartitionProcessing}} don't work as intended
for several reasons:
* There's a race: {{onFinish}} can be called before {{onStart}} is called in a
concurrent thread. This might happen if there's only a handful of dirty pages
in each partition and there is more than one checkpoint thread. Basically,
this protection doesn't work.
* Even if that particular race didn't exist, this code still wouldn't work,
because some of the pages could be added to the {{pageIdsToRetry}} map. That
map is processed later, when {{writePages}} is finished, meaning that we mark
unfinished partitions as finished.
* Due to the aforementioned bugs, I didn't bother including these methods in
{{drainCheckpointBuffers}}. As a result, this method requires a fix too.
> Data race in aipersist partition destruction
> --------------------------------------------
>
> Key: IGNITE-23105
> URL: https://issues.apache.org/jira/browse/IGNITE-23105
> Project: Ignite
> Issue Type: Bug
> Reporter: Ivan Bessonov
> Assignee: Kirill Tkalenko
> Priority: Major
> Labels: ignite-3
>
> {{CheckpointProgressImpl#onStartPartitionProcessing}} and
> {{CheckpointProgressImpl#onFinishPartitionProcessing}} don't work as intended
> for several reasons:
> * There's a race: {{onFinish}} can be called before {{onStart}} is called in
> a concurrent thread. This might happen if there's only a handful of dirty
> pages in each partition and there is more than one checkpoint thread.
> Basically, this protection doesn't work.
> * Even if that particular race didn't exist, this code still wouldn't work,
> because some of the pages could be added to the {{pageIdsToRetry}} map. That
> map is processed later, when {{writePages}} is finished, meaning that we mark
> unfinished partitions as finished.
> * Due to the aforementioned bugs, I didn't bother including these methods in
> {{drainCheckpointBuffers}}. As a result, this method requires a fix too.
> *Upd:*
> The first and second problems will be solved within IGNITE-23115, once the
> pages of a partition are written by only one thread.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)