[ 
https://issues.apache.org/jira/browse/IGNITE-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16818481#comment-16818481
 ] 

Andrey Gura commented on IGNITE-11687:
--------------------------------------

[~agoncharuk] I've investigated the problem more deeply. While the code snippet 
you pointed to is incorrect and must be fixed, the test never executes it 
because MMAP mode is switched on by default. I think the 
{{FileWriteHandleImpl#addRecord()}} method is the root of the problem. See the 
following code snippet:

{code:java}
                    fillBuffer(buf, rec);

                    if (mmap) {
                        // written field must grow only, but segment with greater position can be serialized
                        // earlier than segment with smaller position.
                        while (true) {
                            long written0 = written;

                            if (seg.position() > written0) {
                                if (WRITTEN_UPD.compareAndSet(this, written0, seg.position()))
                                    break;
                            }
                            else
                                break;
                        }
                    }

                    return ptr;
{code}

On a {{wal.replay()}} call, the WAL iterator reads the {{hnd.written}} field 
value while some WAL record located before that position is still not fully 
serialized, so the iterator's end position can cover an incomplete record. 
What do you think?
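
To illustrate the race, here is a minimal sketch that contrasts a "maximum position" counter with a "contiguous prefix" watermark. This is not Ignite code: {{ContiguousWatermark}}, {{complete()}} and {{get()}} are hypothetical names, and the sketch uses a plain lock rather than the CAS loop above. It only shows one possible way to make sure a reader never sees an end position that lies beyond an unfinished record.

```java
import java.util.TreeMap;

/** Hypothetical helper (not part of Ignite): tracks the end of the fully serialized prefix. */
final class ContiguousWatermark {
    /** Completed ranges that cannot be published yet: start offset -> end offset. */
    private final TreeMap<Long, Long> pending = new TreeMap<>();

    /** Every record located before this offset is fully serialized. */
    private long safeEnd;

    /** Marks the record occupying [start, end) as fully serialized. */
    synchronized void complete(long start, long end) {
        pending.put(start, end);

        // Advance only while a completed range begins exactly at the watermark.
        Long next;

        while ((next = pending.remove(safeEnd)) != null)
            safeEnd = next;
    }

    /** Returns an end position that is safe to hand to a concurrent reader. */
    synchronized long get() {
        return safeEnd;
    }

    public static void main(String[] args) {
        ContiguousWatermark wm = new ContiguousWatermark();
        long maxWritten = 0; // Models a field that only grows to the maximum position.

        // Record B at [10, 20) finishes before record A at [0, 10).
        wm.complete(10, 20);
        maxWritten = Math.max(maxWritten, 20);

        // maxWritten is 20 although [0, 10) is still unfinished; the watermark stays at 0.
        System.out.println("max=" + maxWritten + ", safeEnd=" + wm.get());

        // Record A finishes; now the whole prefix [0, 20) is readable.
        wm.complete(0, 10);
        System.out.println("max=" + maxWritten + ", safeEnd=" + wm.get());
    }
}
```

With the "maximum" approach a reader that samples the position between the two completions would try to read the unfinished record at offset 0, which matches the CRC failure described in this ticket.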

> Concurrent WAL replay & log may fail with CRC error on read
> -----------------------------------------------------------
>
>                 Key: IGNITE-11687
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11687
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Alexey Goncharuk
>            Assignee: Andrey Gura
>            Priority: Critical
>             Fix For: 2.8
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The cause is the way {{end}} is calculated for WAL iterator:
> {code}
> if (hnd != null)
>     end = hnd.position();
> {code}
> {code}
>     @Override public FileWALPointer position() {
>         lock.lock();
>         try {
>             return new FileWALPointer(getSegmentId(), (int)written, 0);
>         }
>         finally {
>             lock.unlock();
>         }
>     }
> {code}
> Consider a partially written entry. In this case, {{written}} has already 
> been updated, so a concurrent WAL replay will attempt to read the 
> incompletely written record, and since {{end}} is not null, the iterator will 
> fail with a CRC error.
> The issue can be reproduced, though rarely, by {{IgniteWalSerializerVersionTest}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
