Hi,
Here's some feedback, this is a feature that would be very useful to a
project I am currently working on. 

Doug

On Fri, 2007-02-23 at 17:34 +0100, Florian G. Pflug wrote:
> Hi
> 
> I plan to submit a proposal for implementing support for
> read-only queries during wal replay as a "Google Summer of Code 2007"
> project.
> 
> I've been browsing the postgres source-code for the last few days,
> and came up with the following plan for a implementation.
> 
> I'd be very interested in any feedback on the propsoal - especially
> of the "you overlooked this an that, it can never work that way" kind ;-)
> 
> greetings, Florian Pflug
> 
> Implementing read-only quries during wal archive replay
> -------------------------------------------------------
> 
> Submitter: Florian Pflug <[EMAIL PROTECTED]>
> 
> Abstract:
> Implementing full support for read-only queries during
> wal archive replay is splitted into multiple parts, where
> each part offeres additional functionality over what
> postgres provides now. This makes tackling this as a
> "Google Summer of Code 2007" project feasable, and guarantees
> that at least some progress is made, even if solving the
> whole problem turns out to be harder then previously
> thought.
> 
> Parts/Milestones of the implementation:
> A) Allow postgres to be started in read-only mode. After
>     initial wal recovery, postgres doesn't perform writes
>     anymore. All transactions started are implicitly in
>     readonly mode. All transactions will be assigned dummy
>     transaction ids, which never make it into the clog.
> B) Split StartupXLOG into two steps. The first (Recovery) will process
>     only enough wal to bring the system into a consistent state,
>     while the second one (Replay) replays the archive until it finds no
>     more wal segments. This replay happens in chunks, such that
>     after a chunk all *_safe_restartpoint functions return true.
> C) Combine A) and B), in the simplest possible way.
>     Introduce a global R/W lock, which is taken by the Replay part
>     of B) in write mode before replaying a chunk, then released,
>     and immediatly reaquired before replaying the next chunk.
>     The startup sequence is modified to do only the Recovery part
>     where is is doing StartupXLOG now, and to lauch an extra process
>     (similar to bgwriter) to do the second (Replay) part in the background.
>     The system is then started up in read-only mode, with the addition
>     that the global R/W lock is taken in read mode before starting any
>     transaction. Thus, while a transaction is running, no archive replay
>     happens.
> 
> Benefits:
> *) Part A) alone might be of value for some people in the embedded world,
>     or people who want to distribute software the use postgres. You could
>     e.g. distribute a CD with a large, read-only database, and your 
> application
>     would just need to start postmaster to be able to query it directly from
>     the CD.
> *) Read-only hot standby is a rather simple way to do load-balancing, if
>     your application doesn't depend on the data being absolutely up-to-date.
> *) Even if this isn't used for load-balancing, it gives the DBA an
>     easy way to check how far a PITR slave is lagging behind, therefore
>     making PITR replication more user-friendly.
> 
> Open Questions/Problems
> *) How do read-only transactions obtain a snapshot? Is it sufficient
>     to just create an "empty" snapshot for them, meaning that they'll
>     always look at the clog to obtain a transaction's state?
> *) How many places to attempt to issue writes? How hard is it to
>     silence them all while in read-only mode.
> *) How does the user interface look like? I'm currently leaning towards
>     a postgresql.conf setting read_only=yes. This would put postgres
>     into read-only mode, and if a recovery.conf is present, archive
>     replay would run as a background process.
> 
> Limitations:
> *) The replaying process might be starved, letting the slave fall
>     further and further behind the master. Only true if the slave
>     executes a lot of queries, though.
> *) Postgres would continue to run in read-only mode, even after finishing
>     archive recovery. A restart would be needed to switch it into read-write
>     mode again. (I probably wouldn't be too hard to do that switch without
>     a restart, but it seems better to tackle this after the basic features
>     are working)
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
> 

Reply via email to