[ 
https://issues.apache.org/jira/browse/IGNITE-8167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16428409#comment-16428409
 ] 

ASF GitHub Bot commented on IGNITE-8167:
----------------------------------------

GitHub user amelius0712 opened a pull request:

    https://github.com/apache/ignite/pull/3771

    IGNITE-8167: Fix inconsistent last record pointer in case of recovery from corrupted WAL

    Let's look at this piece of code from GridCacheDatabaseSharedManager.readCheckpointAndRestoreMemory:
    
    `
                WALPointer restore = restoreMemory(status);
    
                // First, bring memory to the last consistent checkpoint state if needed.
                // This method should return a pointer to the last valid record in the WAL.
    
                cctx.wal().resumeLogging(restore);
    `
    If `restore == null`, logging resumes from absolute WAL index 0.
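
The failure mode described above can be illustrated with a minimal, self-contained sketch. Every name in it (`WalResumeSketch`, `Pointer`, `naiveResumeIndex`, `guardedResumeIndex`) is hypothetical and is not part of the Ignite API. It also assumes that the caller has some known-good fallback pointer available, and it does not claim to reproduce the change made in this pull request.

```java
/** Hypothetical sketch only; none of these names are Ignite APIs. */
final class WalResumeSketch {
    /** Stand-in for a WAL pointer that carries only an absolute segment index. */
    static final class Pointer {
        final long index;

        Pointer(long index) {
            this.index = index;
        }
    }

    /** Naive resume: a null pointer silently maps to absolute index 0. */
    static long naiveResumeIndex(Pointer restore) {
        return restore == null ? 0L : restore.index;
    }

    /**
     * Guarded resume: when no valid record pointer was found, use a
     * caller-supplied fallback (an assumption of this sketch) instead of 0,
     * and fail loudly if there is nothing to fall back to.
     */
    static long guardedResumeIndex(Pointer restore, Pointer fallback) {
        if (restore != null)
            return restore.index;

        if (fallback == null)
            throw new IllegalStateException("No valid WAL pointer to resume from");

        return fallback.index;
    }
}
```

With `restore == null` and a fallback at index 42, the naive variant yields 0 while the guarded variant yields 42.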

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Synesis-LLC/ignite ignite-8167

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/ignite/pull/3771.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3771
    
----
commit 40b9c8e227783f8d90fff5f2db4688e63be3dd37
Author: Pavel Sapezhko <pavel.sapezhko@...>
Date:   2018-04-06T14:36:23Z

    IGNITE-8167: Fix inconsistent last record pointer in case of recovery from corrupted WAL

----


> Recovery after crash sometimes leads to starting from beginning absolute wal 
> segment index
> ------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-8167
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8167
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.4
>         Environment: Doesn't matter. We saw this behavior in a k8s deployment 
> as well as in a local deployment, with any WAL mode.
>            Reporter: Pavel Sapezhko
>            Priority: Major
>             Fix For: 2.5
>
>
> When we try to restore after a crash using the WAL, we sometimes find 
> corrupted WAL records, which can lead to logging starting from the beginning 
> absolute WAL index. As a result, the WAL archiver thread breaks with an 
> assertion error (although the Ignite instance keeps working; I think we need 
> to discuss whether we really want that), and on the next restart we can see 
> the "Wal history is too short" message.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
