I know I'm being pretty hand-wavy here, since I don't know all of the details of the WAL implementation and I'm not fluent in the terminology, but it seems like checkpoint starvation shouldn't have to let the WAL grow without bound, provided the individual reader and writer processes don't hold long-running transactions.
As I understand it, writers always append to the WAL and use a view of the database that includes the main file plus any changes in the WAL. Readers use the main file plus any changes up through a particular point in the WAL: the point of the last completed transaction prior to the start of the read operation. A checkpoint can move data from the WAL into the main file, provided it doesn't move anything beyond the limit point of any client. (In practice that means any reader, since writers use changes through the end of the WAL.) Checkpoint starvation, and with it continued WAL expansion, occurs because the checkpointer never makes it to the end of the WAL: it is always blocked by some reader's limit point.
My thinking is that this could be addressed by splitting the WAL into two files full of changes that logically follow each other. Assuming readers move forward frequently enough, they should eventually move their limit points from the first file into the second. At that point, once the checkpointer finishes transferring the changes in the first WAL file into the main database file, it can swap the first WAL file to follow the second one as empty space for future changes.
Would this work, and if so, is there a reason this hasn't been done?
Mark
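P.S. To make the idea concrete, here is a rough toy model in Python. It is only a sketch of the scheme as I described it above, not of how SQLite's WAL actually works: the class `TwoFileWal`, its fields, the `segment_limit` threshold for switching files, and the `simulate` helper are all names I made up for illustration, and frames are just (sequence number, page, value) tuples held in memory.

```python
from dataclasses import dataclass, field


@dataclass
class TwoFileWal:
    """Toy model of the proposed two-file WAL (hypothetical, not SQLite's design)."""
    segment_limit: int = 4                        # frames before writers switch files
    main: dict = field(default_factory=dict)      # main database file: page -> value
    old: list = field(default_factory=list)       # older WAL file: (seq, page, value)
    new: list = field(default_factory=list)       # newer WAL file, logically follows `old`
    last_seq: int = 0                             # sequence number of the newest frame
    done_seq: int = 0                             # newest frame already copied to `main`
    readers: dict = field(default_factory=dict)   # reader id -> limit point (a seq)
    next_reader: int = 0

    def append(self, page, value):
        """Writer: append one frame, moving to the second file once the first is full."""
        self.last_seq += 1
        use_old = not self.new and len(self.old) < self.segment_limit
        (self.old if use_old else self.new).append((self.last_seq, page, value))

    def begin_read(self):
        """Reader: fix the limit point at the newest frame that exists right now."""
        self.next_reader += 1
        self.readers[self.next_reader] = self.last_seq
        return self.next_reader

    def end_read(self, rid):
        del self.readers[rid]

    def read(self, rid, page):
        """Newest version of `page` at or below the reader's limit point."""
        mark = self.readers[rid]
        for seq, p, value in reversed(self.old + self.new):
            if p == page and seq <= mark:
                return value
        return self.main.get(page)

    def checkpoint(self):
        """Copy frames up to the lowest reader limit point, then recycle the old file."""
        limit = min(self.readers.values(), default=self.last_seq)
        for seq, page, value in self.old + self.new:
            if self.done_seq < seq <= limit:
                self.main[page] = value
                self.done_seq = seq
        # The old file can be reused once all of it is in `main` and writers
        # have already moved on to the second file.
        if self.old and self.new and self.old[-1][0] <= self.done_seq:
            self.old, self.new = self.new, []


def simulate(steps=200, segment_limit=4):
    """Overlapping short readers: there is never a moment with zero readers."""
    wal = TwoFileWal(segment_limit=segment_limit)
    current = wal.begin_read()
    peak = 0
    for i in range(steps):
        wal.append(page=i % 3, value=i)
        peak = max(peak, len(wal.old) + len(wal.new))
        upcoming = wal.begin_read()   # next reader starts before the previous one ends
        wal.end_read(current)
        current = upcoming
        wal.checkpoint()
    return peak                       # largest combined WAL size seen, in frames
```

In this model, `simulate()` keeps at least one reader open at all times, yet the combined size of the two files stays small because each reader's limit point keeps advancing. If a single reader is opened and never closed, nothing can be checkpointed past its limit point and the second file grows with every write, which is the long-running-transaction case I excluded above.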