Hi I plan to submit a proposal for implementing support for read-only queries during wal replay as a "Google Summer of Code 2007" project.
I've been browsing the postgres source-code for the last few days, and came up with the following plan for a implementation. I'd be very interested in any feedback on the propsoal - especially of the "you overlooked this an that, it can never work that way" kind ;-) greetings, Florian Pflug Implementing read-only quries during wal archive replay ------------------------------------------------------- Submitter: Florian Pflug <[EMAIL PROTECTED]> Abstract: Implementing full support for read-only queries during wal archive replay is splitted into multiple parts, where each part offeres additional functionality over what postgres provides now. This makes tackling this as a "Google Summer of Code 2007" project feasable, and guarantees that at least some progress is made, even if solving the whole problem turns out to be harder then previously thought. Parts/Milestones of the implementation: A) Allow postgres to be started in read-only mode. After initial wal recovery, postgres doesn't perform writes anymore. All transactions started are implicitly in readonly mode. All transactions will be assigned dummy transaction ids, which never make it into the clog. B) Split StartupXLOG into two steps. The first (Recovery) will process only enough wal to bring the system into a consistent state, while the second one (Replay) replays the archive until it finds no more wal segments. This replay happens in chunks, such that after a chunk all *_safe_restartpoint functions return true. C) Combine A) and B), in the simplest possible way. Introduce a global R/W lock, which is taken by the Replay part of B) in write mode before replaying a chunk, then released, and immediatly reaquired before replaying the next chunk. The startup sequence is modified to do only the Recovery part where is is doing StartupXLOG now, and to lauch an extra process (similar to bgwriter) to do the second (Replay) part in the background. The system is then started up in read-only mode, with the addition that the global R/W lock is taken in read mode before starting any transaction. Thus, while a transaction is running, no archive replay happens. Benefits: *) Part A) alone might be of value for some people in the embedded world, or people who want to distribute software the use postgres. You could e.g. distribute a CD with a large, read-only database, and your application would just need to start postmaster to be able to query it directly from the CD. *) Read-only hot standby is a rather simple way to do load-balancing, if your application doesn't depend on the data being absolutely up-to-date. *) Even if this isn't used for load-balancing, it gives the DBA an easy way to check how far a PITR slave is lagging behind, therefore making PITR replication more user-friendly. Open Questions/Problems *) How do read-only transactions obtain a snapshot? Is it sufficient to just create an "empty" snapshot for them, meaning that they'll always look at the clog to obtain a transaction's state? *) How many places to attempt to issue writes? How hard is it to silence them all while in read-only mode. *) How does the user interface look like? I'm currently leaning towards a postgresql.conf setting read_only=yes. This would put postgres into read-only mode, and if a recovery.conf is present, archive replay would run as a background process. Limitations: *) The replaying process might be starved, letting the slave fall further and further behind the master. Only true if the slave executes a lot of queries, though. *) Postgres would continue to run in read-only mode, even after finishing archive recovery. A restart would be needed to switch it into read-write mode again. (I probably wouldn't be too hard to do that switch without a restart, but it seems better to tackle this after the basic features are working) ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings