[ 
https://issues.apache.org/jira/browse/IGNITE-23105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirill Tkalenko updated IGNITE-23105:
-------------------------------------
    Description: 
{{CheckpointProgressImpl#onStartPartitionProcessing}} and 
{{CheckpointProgressImpl#onFinishPartitionProcessing}} don't work as intended 
for several reasons:
 * There's a race: we could call {{onFinish}} before {{onStart}} is called in a 
concurrent thread. This might happen if there's only a handful of dirty pages 
in each partition and there is more than one checkpoint thread. Basically, 
this protection doesn't work.
 * Even if that particular race didn't exist, this code still wouldn't work, 
because some of the pages could be added to the {{pageIdsToRetry}} map. That 
map will be processed later, when {{writePages}} is finished, meaning that we 
mark unfinished partitions as finished.
 * Due to the aforementioned bugs, I didn't bother including these methods in 
{{drainCheckpointBuffers}}. As a result, this method requires a fix too.

*Upd:*
The first and second problems will be solved in IGNITE-23115, once the pages 
of a partition are written by only one thread.
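
To illustrate the first problem, here is a minimal sketch of the interleaving. It assumes the per-partition bookkeeping behaves like a counter that completes a future when it drops to zero. The {{Tracker}} class and its methods are hypothetical stand-ins, not the actual {{CheckpointProgressImpl}} code, and the two checkpoint threads are replayed sequentially so the outcome is deterministic.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

class PartitionProcessingSketch {
    /** Hypothetical stand-in for per-partition bookkeeping; NOT the real Ignite class. */
    static final class Tracker {
        private final AtomicInteger inFlight = new AtomicInteger();
        private final CompletableFuture<Void> done = new CompletableFuture<>();

        /** A checkpoint thread is about to write pages of the partition. */
        void onStart() {
            inFlight.incrementAndGet();
        }

        /** A checkpoint thread finished writing; the partition is reported done at zero. */
        void onFinish() {
            if (inFlight.decrementAndGet() == 0) {
                done.complete(null);
            }
        }

        boolean isDone() {
            return done.isDone();
        }
    }

    /**
     * Replays the problematic ordering of two writers of the same partition:
     * writer A starts and finishes before writer B has called onStart.
     * Returns whether the partition was reported finished before B started.
     */
    static boolean finishedBeforeSecondWriterStarted() {
        Tracker tracker = new Tracker();

        tracker.onStart();   // writer A begins
        tracker.onFinish();  // writer A ends, the counter drops to zero

        // Anyone waiting for the partition already sees it as finished here.
        boolean premature = tracker.isDone();

        tracker.onStart();   // writer B begins only now, still holding dirty pages
        tracker.onFinish();

        return premature;
    }

    public static void main(String[] args) {
        System.out.println("Reported finished before second writer started: "
                + finishedBeforeSecondWriterStarted());
    }
}
```

Under these assumptions, a waiter such as partition destruction could proceed while the second writer still has pages of that partition to write. Restricting each partition to a single writer thread removes this ordering.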


  was:
{{CheckpointProgressImpl#onStartPartitionProcessing}} and 
{{CheckpointProgressImpl#onFinishPartitionProcessing}} don't work as intended 
for several reasons:
 * There's a race: we could call {{onFinish}} before {{onStart}} is called in a 
concurrent thread. This might happen if there's only a handful of dirty pages 
in each partition and there is more than one checkpoint thread. Basically, 
this protection doesn't work.
 * Even if that particular race didn't exist, this code still wouldn't work, 
because some of the pages could be added to the {{pageIdsToRetry}} map. That 
map will be processed later, when {{writePages}} is finished, meaning that we 
mark unfinished partitions as finished.
 * Due to the aforementioned bugs, I didn't bother including these methods in 
{{drainCheckpointBuffers}}. As a result, this method requires a fix too.


> Data race in aipersist partition destruction
> --------------------------------------------
>
>                 Key: IGNITE-23105
>                 URL: https://issues.apache.org/jira/browse/IGNITE-23105
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Ivan Bessonov
>            Assignee: Kirill Tkalenko
>            Priority: Major
>              Labels: ignite-3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
