I know I'm being pretty hand-wavy here since I don't know all of the
details of the WAL implementation and I'm not fluent in the terminology,
but it seems like checkpoint starvation shouldn't have to let the WAL grow
without bound, provided the individual read and write processes don't hold
long-running transactions.

As I understand it, writers always append to the WAL and use a view of the
database that includes the main file plus any changes in the WAL. Readers
use the main file plus any changes up through a particular point in the
WAL: the point of the last completed transaction prior to the start of the
read operation.

A checkpoint can move data from the WAL into the main file provided it
doesn't move anything beyond the limit point for any client. (Only reader
clients matter here, since writers use changes through the end of the WAL.)
Checkpoint starvation (and continued WAL expansion) occurs because the
checkpointer fails to make it to the end of the WAL: it is blocked by some
reader's change limit.
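
To make the problem concrete, here is a toy model of how I picture it. This
is only a sketch of my understanding, not SQLite's actual code: the names
checkpoint_limit and simulate are made up, frames are just counted, and I
assume the WAL can only be restarted from the beginning when everything has
been copied and no reader is active.

```python
def checkpoint_limit(wal_len, reader_marks):
    """Highest frame count a checkpoint may copy into the main file.

    Each reader mark is the WAL length that reader saw when it started.
    With no readers, the checkpoint may run to the end of the WAL.
    """
    return min(reader_marks, default=wal_len)


def simulate(rounds):
    """Short, overlapping readers: some reader is always active.

    Returns (wal_len, backfilled) after the given number of rounds.
    """
    wal_len = 0        # frames currently in the WAL
    backfilled = 0     # frames already copied into the main file
    readers = []       # marks of the active readers, oldest first
    for _ in range(rounds):
        wal_len += 1                 # a writer commits one frame
        readers.append(wal_len)      # a new reader starts at the current end
        if len(readers) > 1:
            readers.pop(0)           # the previous reader finishes
        wal_len += 1                 # another commit while a reader is active
        backfilled = checkpoint_limit(wal_len, readers)
        # Assumed rule: restart the WAL only when it is fully copied
        # and nobody is reading from it.
        if backfilled == wal_len and not readers:
            wal_len = backfilled = 0
    return wal_len, backfilled
```

In this model no reader lives longer than one round, yet the checkpointer
always stops one frame short of the end, so the WAL never gets restarted and
grows by two frames per round.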

My thinking is that this could be addressed by splitting the WAL into two
files full of changes that logically follow each other. Assuming readers
move forward frequently enough, they should eventually move their limit
point from the first file into the second. At this point, when the
checkpointer finishes transferring changes for the first WAL file into the
main database file, it can swap the first WAL file to follow the second WAL
file as empty space for future changes.
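
Here is a rough sketch of the two-file idea, again as a toy model rather
than anything resembling real SQLite internals. Everything in it is
hypothetical: the function name, the seg_limit parameter (how many frames
the first file holds before writers move to the second), and the workload
of short, overlapping readers. Reader marks are counted in frames from the
start of the older file.

```python
def simulate_two_segments(rounds, seg_limit=4):
    """Return the peak number of frames held across both WAL files."""
    old = new = 0      # frames in the older and the newer WAL file
    readers = []       # marks of the active readers, oldest first
    peak = 0
    for _ in range(rounds):
        for step in ("write+read", "write"):
            # Append to the older file until it is full, then to the newer.
            if new == 0 and old < seg_limit:
                old += 1
            else:
                new += 1
            if step == "write+read":
                readers.append(old + new)   # new reader sees all frames
                if len(readers) > 1:
                    readers.pop(0)          # the previous reader finishes
        peak = max(peak, old + new)
        # The checkpointer may copy frames up to the oldest reader mark.
        limit = min(readers, default=old + new)
        if old >= seg_limit and limit >= old:
            # The whole older file is in the main database and no reader
            # still needs it: recycle it as empty space after the newer one.
            readers = [mark - old for mark in readers]
            old, new = new, 0
    return peak
```

With seg_limit=4 and this workload the peak stays at 6 frames no matter how
many rounds are run, whereas a single WAL under the same workload would hold
two more frames every round. The swap only happens when every reader's mark
has moved past the first file, which is the "readers move forward frequently
enough" assumption.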

Would this work, and if so, is there a reason this hasn't been done?

Mark
