Thank you both for your input... this is indeed the type of analysis I was looking for. Now I have to read and understand them more carefully. For now, thanks again, and hopefully I can trouble you both with some follow-ups in the future.
Cheers
Medi

On 8/16/07, Chander Ganesan <[EMAIL PROTECTED]> wrote:
>
> Hannes Dorbath wrote:
> > On 15.08.2007 21:30, Medi Montaseri wrote:
> >> I am looking for some suggestions to optimize the following
> >> problem/solution.
> >>
> >> Given two nodes A and B (two computers) in an active-passive mode,
> >> where A is running the PG engine, B is simply standing by, and there
> >> is a common (twin-tailed) storage (from pg_ctl's point of view,
> >> -D /common/data), I am looking for a faster solution during the
> >> takeover where A has crashed and B is to start the PG engine and
> >> run with it.
> >>
> >> My current solution is to start the PG engine, which should take
> >> little time to read the configuration files and /common/data and
> >> fork a few children. But I am still interested in optimizing this
> >> start-up cost.
> >>
> >> For example, would I gain anything by starting the PG engine on
> >> both A and B, but on B pointing it to /common/dummy, and during the
> >> takeover somehow telling it to read from /common/data instead? For
> >> example, having two postmaster.conf files or PGDATA directories and
> >> then using pg_ctl reload.
> >
> > Starting up PostgreSQL should be very fast, given no recovery to be
> > done and decent hardware.
> >
> > PostgreSQL does not fork a lot unless it is accepting new
> > connections, and if reading a config file is slow on your system,
> > something else is broken.
> >
> > In an active/passive setup you should be able to switch over in
> > under 3 seconds. If there was a lot of load on the failed node, the
> > recovery time on the new active node increases. The only thing you
> > can do about that is getting faster disks.
>
> Your startup time in the event of a failure will be predicated on the
> number of WAL files that need to be replayed in order to perform
> auto-recovery.
> For example, if you've set your checkpoint_segments to some high
> number, PostgreSQL will need to replay those WAL files to ensure that
> transactions that were committed to disk are correctly represented in
> your "on disk" object data (tables, indexes, etc.).
>
> Note that in the case of a "graceful" shutdown, you'll find that
> PostgreSQL doesn't need to replay WAL files (since it checkpoints
> prior to shutting down), and as such the startup time is pretty fast.
>
> You can decrease the amount of time it takes to recover by decreasing
> the number of segments per checkpoint; however, this may result in
> more frequent checkpointing and, as a result, a reduction in overall
> performance on your active node.
>
> Basically, you're balancing the "cost" of auto crash recovery against
> the frequency of checkpointing.
>
> An alternative (and perhaps faster) method of failover would be to
> have a cluster in "warm standby" mode (this wouldn't rely on a shared
> pgdata directory). In such cases you would only have to wait for the
> last few WAL files to be replayed in order to recover. With a high
> checkpoint_segments number this would let you recover quickly from
> failures, but it would introduce PITR overhead (copying WAL files as
> they fill up). That's a pretty small cost, though, and you may
> already be paying it if you do PITR backups.
>
> There is no way to have PostgreSQL "switch" data directories to speed
> up startup.
>
> --
> Chander Ganesan
> Open Technology Group, Inc.
> One Copley Parkway, Suite 210
> Morrisville, NC 27560
> 919-463-0999/866-229-3386
> http://www.otg-nc.com
>
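To make the recovery-time vs. checkpoint-frequency trade-off above concrete, here is a postgresql.conf sketch for a PostgreSQL 8.x-era server. The settings are real, but the values are illustrative assumptions for a fast-failover bias, not tuned recommendations:

```conf
# postgresql.conf (PostgreSQL 8.x era; values are illustrative only)

# Fewer WAL segments between checkpoints means less WAL to replay
# after a crash, so the standby starts faster after takeover; the
# cost is more frequent checkpoints and more I/O on the active node.
checkpoint_segments = 3       # low value biases toward fast recovery

# Also force a checkpoint at least this often, even under light load.
checkpoint_timeout = 300      # seconds
```

Raising checkpoint_segments shifts the balance the other way: better steady-state write throughput, longer replay after a crash.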

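For reference, a minimal sketch of the warm-standby arrangement Chander describes, assuming an 8.x-era setup with a shared WAL archive directory. The path /shared/wal_archive is made up for illustration, and pg_standby stands in for any restore command that waits for the next archived WAL file to appear:

```conf
# On the primary: postgresql.conf
# Copy each completed WAL segment into the shared archive.
archive_command = 'cp %p /shared/wal_archive/%f'

# On the standby: recovery.conf
# pg_standby blocks until the requested WAL file arrives, keeping the
# standby in continuous recovery until failover is triggered.
restore_command = 'pg_standby /shared/wal_archive %f %p'
```

At failover time the standby only has to replay whatever WAL has not yet been applied, rather than everything since the last checkpoint on the failed node.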