Chen Luo created ASTERIXDB-2666:
-----------------------------------

             Summary: One-Phase Log Replay Approach
                 Key: ASTERIXDB-2666
                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2666
             Project: Apache AsterixDB
          Issue Type: Wish
          Components: STO - Storage, TX - Transactions
            Reporter: Chen Luo


AsterixDB currently uses a classical two-phase log replay approach during 
recovery by first identifying committed writes and then applying these commit 
writes to LSM-trees. This is a stardard approach for general-purpose 
transaction processing systems, but for AsterixDB, we can design something 
better.

AsterixDB uses a record-level transaction model where each write is committed 
as soon as possible by "entity commit". To exploit this property, we can design 
a one-phase log replay approach as follows:
* Start from the log head based on the low watermark LSN
* Whenever we see an update log record, store that log record in memory (for 
each job)
* Whenever we see an entity commit or abort record, redo the corresponding 
update log record immediately and remove it from memory

The key property here is that the window between an update log record and a 
commit log record is very short - we commit on a frame basis. Thus, this will 
speed up the recovery process by only using one log read pass and avoiding 
store all entity commits in memory. We only need a small amount of memory, 
based on the window between updates and commits, during the recovery proces.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to