[jira] [Commented] (IGNITE-12081) Page replacement can reload invalid page during checkpoint

2019-08-18 Thread Dmitriy Pavlov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-12081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909952#comment-16909952
 ] 

Dmitriy Pavlov commented on IGNITE-12081:
-

The blockers are not related to this fix (they always appear because TC Bot 
does not compare against build problem occurrences in the base branch).

The fix looks good to me. Dmitriy, thank you for your contribution and for 
preparing the fix for 2.7.6.

Merged to 2.7.6 (via ./apply-pull-request.sh 6787 -tb ignite-2.7.6).

> Page replacement can reload invalid page during checkpoint
> --
>
> Key: IGNITE-12081
> URL: https://issues.apache.org/jira/browse/IGNITE-12081
> Project: Ignite
>  Issue Type: Bug
>Reporter: Dmitriy Govorukhin
>Assignee: Dmitriy Govorukhin
>Priority: Critical
> Fix For: 2.7.6
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There is a race between {{writeCheckpointPages}} and the page replacement 
> process:
>  * The checkpointer thread begins a checkpoint.
>  * The checkpointer thread calls {{getPageForCheckpoint()}}, which copies the 
> page content *and clears the dirty flag*.
>  * Page replacement looks for a page to replace and chooses this page, so the 
> page is thrown away.
>  * Before the copied page is written back to the store, the page is acquired 
> again.
> As a result, an older copy of the page is brought back into memory, which 
> causes all kinds of corruption exceptions and assertions.
> The attached unit test demonstrates the issue. It is likely that all 
> baselines starting from 2.4 are affected.
> As part of this ticket, we must add more unit tests for the checkpointing 
> protocol invariants we rely on.
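
The interleaving described in the ticket can be replayed in a single thread 
with a toy model. This is only a sketch under stated assumptions: every class, 
field and method name below is hypothetical (only the idea of 
getPageForCheckpoint() comes from the ticket), and the "guard" shown is one 
possible way to close the window, not necessarily what the actual fix does.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the race. All names are hypothetical, not Ignite's API. */
class CheckpointRaceSketch {
    /** Durable page store: pageId -> last written content. */
    final Map<Integer, String> store = new HashMap<>();
    /** Pages currently held in memory: pageId -> content. */
    final Map<Integer, String> memory = new HashMap<>();
    /** Pages modified since they were last copied for a checkpoint. */
    final Set<Integer> dirty = new HashSet<>();
    /** Pages copied for a checkpoint but not yet written to the store. */
    final Set<Integer> pendingWrite = new HashSet<>();
    /** Assumed guard: refuse to replace pages that await a checkpoint write. */
    final boolean guard;

    CheckpointRaceSketch(boolean guard) {
        this.guard = guard;
    }

    /** Loads the page from the store if it is not in memory. */
    String acquire(int pageId) {
        if (!memory.containsKey(pageId))
            memory.put(pageId, store.get(pageId));
        return memory.get(pageId);
    }

    /** Updates the in-memory page and marks it dirty. */
    void update(int pageId, String content) {
        memory.put(pageId, content);
        dirty.add(pageId);
    }

    /** Copies the page content and clears the dirty flag. */
    String getPageForCheckpoint(int pageId) {
        dirty.remove(pageId);
        pendingWrite.add(pageId);
        return memory.get(pageId);
    }

    /** Simplified page replacement: only clean pages are thrown away. */
    boolean tryReplace(int pageId) {
        if (dirty.contains(pageId))
            return false;
        if (guard && pendingWrite.contains(pageId))
            return false;
        memory.remove(pageId);
        return true;
    }

    /** The checkpointer finally writes its copy to the store. */
    void writeCheckpointPage(int pageId, String copy) {
        store.put(pageId, copy);
        pendingWrite.remove(pageId);
    }

    /** Replays the steps from the ticket; returns what the reader sees. */
    static String run(boolean guard) {
        CheckpointRaceSketch s = new CheckpointRaceSketch(guard);
        s.store.put(1, "v1");
        s.acquire(1);                            // page is in memory as v1
        s.update(1, "v2");                       // page becomes dirty
        String copy = s.getPageForCheckpoint(1); // copy taken, flag cleared
        s.tryReplace(1);                         // replacement picks the page
        String seen = s.acquire(1);              // acquired before the write
        s.writeCheckpointPage(1, copy);          // write arrives too late
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("without guard: " + run(false)); // v1 (stale)
        System.out.println("with guard:    " + run(true));  // v2
    }
}
```

Without the guard the reader gets the stale "v1" from the store while the 
newer "v2" exists only in the checkpointer's copy; with the guard the page 
stays in memory until its copy has been written.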



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (IGNITE-12081) Page replacement can reload invalid page during checkpoint

2019-08-18 Thread Ignite TC Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-12081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909948#comment-16909948
 ] 

Ignite TC Bot commented on IGNITE-12081:


{panel:title=Branch: [ignite-2.7.6_12081] Base: [ignite-2.7.6] : Possible 
Blockers (4)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}Platform C++ (Linux Clang){color} [[tests 0 Exit Code , Failure 
on metric |https://ci.ignite.apache.org/viewLog.html?buildId=454]]

{color:#d04437}Platform .NET (Inspections)*{color} [[tests 0 Failure on metric 
|https://ci.ignite.apache.org/viewLog.html?buildId=456]]

{color:#d04437}Platform C++ (Linux)*{color} [[tests 0 Exit Code , Failure on 
metric |https://ci.ignite.apache.org/viewLog.html?buildId=458]]

{color:#d04437}Platform C++ (Win x64 / Release){color} [[tests 0 
BuildFailureOnMessage 
|https://ci.ignite.apache.org/viewLog.html?buildId=4511124]]

{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=4510431buildTypeId=IgniteTests24Java8_RunAll]






[jira] [Commented] (IGNITE-12081) Page replacement can reload invalid page during checkpoint

2019-08-17 Thread Dmitriy Govorukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-12081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909794#comment-16909794
 ] 

Dmitriy Govorukhin commented on IGNITE-12081:
-

[~dpavlov] Please review my changes.



