Aleksey Plekhanov created IGNITE-20697:
------------------------------------------
Summary: Move physical records from WAL to another storage
Key: IGNITE-20697
URL: https://issues.apache.org/jira/browse/IGNITE-20697
Project: Ignite
Issue Type: Improvement
Reporter: Aleksey Plekhanov
Assignee: Aleksey Plekhanov
Currently, physical records take up most of the WAL size. But physical records
in WAL files are required only for crash recovery, and these records are useful
only for a short period of time (since the last checkpoint).
The size of the physical records written per checkpoint interval is larger than
the total size of all pages modified between checkpoints, since we need to store
a page snapshot record for each modified page and, if a page is modified more
than once between checkpoints, page delta records as well.
We process each WAL file several times in the normal workflow (without crashes):
1) We write records to WAL files
2) We copy WAL files to the archive
3) We compact WAL files (remove physical records + compress)
So, in total, we write all physical records twice and read them twice.
To reduce the disk workload, we can move physical records to another storage
and stop writing them to WAL files.
To provide the same crash recovery guarantees, we can write modified pages twice
during a checkpoint: first to a delta file and then to the page storage. In this
case, if we crash during a write to the page storage, we can recover any page
from the delta file (instead of from the WAL, as we do now).
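The double write described above can be illustrated with a minimal in-memory
sketch. This is not Ignite code: DeltaFile, checkpoint, recover and the
crash_after parameter are hypothetical names used only for illustration, the
page store is modeled as a plain dict, and the "complete" flag stands in for
whatever fsync/end-marker mechanism a real delta file would need.

```python
class DeltaFile:
    """Hypothetical stand-in for the proposed delta file."""

    def __init__(self):
        self.pages = {}        # page_id -> final page bytes
        self.complete = False  # stands in for "fully written and fsynced"


def checkpoint(dirty, delta, store, crash_after=None):
    """Write every dirty page twice: to the delta file, then to the page store.

    crash_after (assumption, for simulation only): after this many successful
    page-store writes, the next page is written torn and the checkpoint stops.
    Returns True if the checkpoint finished, False if it "crashed".
    """
    # First write: final state of each modified page goes to the delta file.
    delta.pages = dict(dirty)
    delta.complete = True

    # Second write: the same pages go to the page store.
    for written, page_id in enumerate(sorted(dirty)):
        data = dirty[page_id]
        if written == crash_after:
            half = len(data) // 2
            store[page_id] = data[:half] + bytes(len(data) - half)  # torn page
            return False
        store[page_id] = data

    # Checkpoint finished, so the delta file is no longer needed.
    delta.pages.clear()
    delta.complete = False
    return True


def recover(delta, store):
    """On restart, re-apply pages from a complete delta file to the page store."""
    if delta.complete:
        store.update(delta.pages)
        delta.pages.clear()
        delta.complete = False
    # An incomplete delta file is ignored: in that case the page store was not
    # touched yet and still holds the previous checkpoint's state.
```

For example, calling checkpoint(dirty, delta, store, crash_after=1) with two
dirty pages leaves the second page torn in the store, and a subsequent
recover(delta, store) restores both pages to their final state.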
This proposal has pros and cons.
Pros:
- Smaller amount of stored data (we don't store page delta records, only the
final state of the page)
- Reduced disk workload (we additionally write all modified pages once,
instead of 2 writes and 2 reads of a larger amount of data)
- Potentially reduced latency (instead of writing physical records
synchronously during data modification, we write only logical records to the
WAL, and physical pages are written by checkpointer threads)
Cons:
- Increased checkpoint duration (we have to write twice as much data during a
checkpoint)
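As a rough back-of-the-envelope model of the disk workload trade-off above, the
sketch below compares the physical-record I/O of both schemes. All parameters
and numbers are assumptions chosen for illustration, not measurements: record
headers and compression are ignored, and the regular write to the page storage
is left out because it happens in both schemes.

```python
def physical_io_bytes(pages, page_size, mods_per_page, delta_size):
    """Estimate extra I/O per checkpoint interval for both schemes.

    pages         - number of pages modified between checkpoints
    page_size     - page size in bytes (one snapshot record per modified page)
    mods_per_page - average number of modifications per page
    delta_size    - assumed average size of one page delta record in bytes
    """
    # Current scheme: one snapshot plus a delta record for each later change.
    wal_physical = pages * (page_size + (mods_per_page - 1) * delta_size)
    # These records are written twice (WAL, archive) and read twice
    # (copy to archive, compaction).
    current = {"written": 2 * wal_physical, "read": 2 * wal_physical}
    # Proposed scheme: each modified page is written once to the delta file.
    proposed = {"written": pages * page_size, "read": 0}
    return current, proposed


current, proposed = physical_io_bytes(
    pages=1000, page_size=4096, mods_per_page=3, delta_size=100
)
print(current)   # {'written': 8592000, 'read': 8592000}
print(proposed)  # {'written': 4096000, 'read': 0}
```

Under these assumed numbers the proposed scheme writes less than half as much
extra data and reads none in the normal workflow, but the extra write moves
into the checkpoint itself, which is why the checkpoint duration is expected
to grow.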
Let's try it and benchmark.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)