Alexey Goncharuk created IGNITE-5772:
----------------------------------------

             Summary: Race between WAL segment rollover and concurrent log
                 Key: IGNITE-5772
                 URL: https://issues.apache.org/jira/browse/IGNITE-5772
             Project: Ignite
          Issue Type: Bug
          Components: cache
    Affects Versions: 2.1
            Reporter: Alexey Goncharuk
            Assignee: Alexey Goncharuk
             Fix For: 2.2


The WAL log() and close() methods are synchronized as follows:
log: read head, check stop flag, CAS head
close: set stop flag, CAS head to a fake record.
This guarantees that after close() is called, no further records are 
appended to the closed segment.
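
The protocol above can be sketched roughly as follows. This is a toy model, 
not the actual Ignite code: the class SegmentSketch, the Rec and FakeRec 
types, and the field names are all hypothetical and only illustrate the 
ordering of the reads and the CAS.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

/** Toy model of a WAL segment handle; names are hypothetical, not Ignite's classes. */
class SegmentSketch {
    /** A record in the segment's linked chain. */
    static class Rec {
        final Rec prev;

        Rec(Rec prev) { this.prev = prev; }
    }

    /** Marker record that carries no data. */
    static final class FakeRec extends Rec {
        FakeRec(Rec prev) { super(prev); }
    }

    final AtomicReference<Rec> head = new AtomicReference<>(new FakeRec(null));
    final AtomicBoolean stop = new AtomicBoolean();

    /** log: read head, check stop flag, CAS head. Returns false once the segment is closed. */
    boolean log() {
        for (;;) {
            Rec h = head.get();

            if (stop.get())
                return false;

            // Fails if head changed since it was read; the retry re-checks the stop flag.
            if (head.compareAndSet(h, new Rec(h)))
                return true;
        }
    }

    /** close: set stop flag, then CAS head to a fake record. */
    void close() {
        stop.set(true);

        for (;;) {
            Rec h = head.get();

            if (head.compareAndSet(h, new FakeRec(h)))
                return;
        }
    }
}
```

In this sketch, a log() call that read head before close() either wins its 
CAS before the fake record is installed, or loses it and sees the stop flag 
on retry.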
Now consider three threads doing the following operations:
T1: flush(); T2: rollOver(); T3: log();
The sequence of events:
1) T1 does a CAS of head to FakeRecord
2) T3 reads head as FakeRecord, reads stop flag as false
3) T2 attempts to rollOver: CAS stop to true, call flushOrWait(null), call 
flush(null). Since the head is an instance of FakeRecord, flush(null) 
immediately returns false. T2 then waits for written bytes and proceeds
4) T3 successfully does a CAS of head to non-fake record
5) T2 proceeds with rollOver, signals next available and asserts on head.
The invariant above is broken when T2 does not CAS a fake record during 
rollover, which allows T3 to append an entry to the closed segment. The 
solution is to change the code so that the CAS is always attempted on close, 
even if the current head is already a FakeRecord.
Alternatively, we can introduce another type of fake record that seals the 
WAL segment queue.
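
The interleaving and the first proposed fix can be replayed deterministically 
in a single thread. The sketch below is an assumption-laden model, not the 
Ignite implementation: RolloverRaceReplay, Rec, FakeRec, closeSkippingCas() 
and closeAlwaysCas() are hypothetical names, and "flush(null) returns false 
on a FakeRecord head" is modeled simply as skipping the CAS.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

/** Single-threaded replay of the reported interleaving; all names are hypothetical. */
class RolloverRaceReplay {
    static class Rec { }

    static final class FakeRec extends Rec { }

    final AtomicReference<Rec> head = new AtomicReference<>(new Rec());
    final AtomicBoolean stop = new AtomicBoolean();

    /** Close as in the report: the CAS is skipped when head is already fake. */
    void closeSkippingCas() {
        stop.set(true);

        Rec h = head.get();

        if (!(h instanceof FakeRec))
            head.compareAndSet(h, new FakeRec());
    }

    /** Proposed fix: always replace head with a fresh fake record. */
    void closeAlwaysCas() {
        stop.set(true);

        for (;;) {
            Rec h = head.get();

            if (head.compareAndSet(h, new FakeRec()))
                return;
        }
    }

    /** Replays steps 1-4; returns true if T3 appended to the closed segment. */
    static boolean replay(boolean alwaysCas) {
        RolloverRaceReplay seg = new RolloverRaceReplay();

        // 1) T1 (flush) does a CAS of head to a fake record.
        Rec h0 = seg.head.get();
        seg.head.compareAndSet(h0, new FakeRec());

        // 2) T3 (log) reads head and reads the stop flag as false.
        Rec seenByT3 = seg.head.get();
        boolean stoppedForT3 = seg.stop.get();

        // 3) T2 (rollOver) closes the segment.
        if (alwaysCas)
            seg.closeAlwaysCas();
        else
            seg.closeSkippingCas();

        // 4) T3 resumes and attempts its CAS against the head it read earlier.
        return !stoppedForT3 && seg.head.compareAndSet(seenByT3, new Rec());
    }

    public static void main(String[] args) {
        System.out.println("skip CAS when head is fake: appended after close = " + replay(false));
        System.out.println("always CAS on close:        appended after close = " + replay(true));
    }
}
```

With the CAS skipped, T3's compareAndSet still matches the fake record it 
read in step 2 and succeeds. With the unconditional CAS, head is a different 
object by step 4, so T3's CAS fails and a retry would observe the stop flag.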



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
