Re: [PERFORM] Best replication solution?

2009-04-09 Thread Jeff


On Apr 8, 2009, at 4:46 PM, Dimitri Fontaine wrote:



$ londiste.py setup.ini provider add schema.table
$ londiste.py setup.ini subscriber add schema.table



That is nice.  One could probably do that for slony too.

I may try some tests out with londiste.. I'm always open to new  
(ideally, better) things.


This could happen in Londiste too, just set pgq_lazy_fetch to a  
reasonable value and Londiste will use a cursor to fetch the events,  
lowering the load. Events are just tuples in an INSERT only table,  
which when not used anymore is TRUNCATEd away. PGQ will use 3 tables  
where to store events and will rotate its choice of where to insert  
new envents, allowing to use TRUNCATE rather than DELETE. And  
PostgreSQL is quite efficient to manage this :)

 
http://wiki.postgresql.org/wiki/Londiste_Tutorial#Londiste_is_eating_all_my_CPU_and_lag_is_raising



Well, Slony always uses a cursor to fetch, the problem is it may have  
to slog through millions of rows to find the new data - I've analyzed  
the queries and there isn't much it can do -  lots of calls to the  
xxid_ functions to determine whats to be used, whats not to be used.   
When all slaves have a sync event ack'd it is free to be removed by  
the cleanup routine which is run every few minutes.




Oh and some people asked what Londiste with failover and DDL would  
look like. Here's what the API being cooked looks like at the moment:

$ londiste setup.ini execute myddl.script.sql

$ londiste conf/londiste_db3.ini change-provider --provider=rnode1
$ londiste conf/londiste_db1.ini switchover --target=rnode2



ok, so londiste can't do failover yet, or is it just somewhat  
convoluted at this point?


--
Jeff Trout j...@jefftrout.com
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Best replication solution?

2009-04-08 Thread Mark Kirkwood

Andrew Sullivan wrote:


I should have stated that differently.  First, you're right that if
you don't know where to look or what to look for, you can easily be
unaware of nodes being out of sync.  What's not a problem with Slony
is that the nodes can get out of internally consistent sync state: if
you have a node that is badly lagged, at least it represents, for
sure, an actual point in time of the origin set's history.  Some of
the replication systems aren't as careful about this, and it's
possible to get the replica into a state that never happened on the
origin.  That's much worse, in my view.

In addition, it is not possible that Slony's system tables report the
replica as being up to date without them actually being so, because
the system tables are updated in the same transaction as the data is
sent.  It's hard to read those tables, however, because you have to
check every node and understand all the states.

  


Yes, and nicely explained!


(on Londiste DDL + slave chaining)...
Well, those particular features -- which are indeed the source of much
of the complexity in Slony -- were planned in from the beginning.
Londiste aimed to be simpler, so it would be interesting to see
whether those features could be incorporated without the same
complication.



  

Yeah, that's the challenge!

Personally I would like DDL to be possible without any special wrappers 
or precautions, as the usual (accidental) breakage I end up looking at 
in Slony is because someone (or an app's upgrade script) has performed 
an ALTER TABLE directly on the master schema...


Cheers

Mark


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Best replication solution?

2009-04-08 Thread Marinos Yannikos

Heikki Linnakangas wrote:

Lists wrote:
Server is a dual core xeon 3GB ram and 2 mirrors of 15k SAS drives (1 
for most data, 1 for wal and a few tables and indexes)


In total all databases on the server are about 10G on disk (about 2GB 
in pgdump format).


I'd suggest buying as much RAM as you can fit into the server. RAM is 
cheap, and with a database of that size more cache could have a dramatic 
effect.


I'll second this. Although it doesn't really answer the original 
question, you have to keep in mind that for read-intensive workloads, 
caching will give you the biggest benefit by far, orders of magnitude 
more than replication solutions unless you want to spend a lot of $ on 
hardware (which I take it you don't if you are reluctant to add new 
disks). Keeping the interesting parts of the DB completely in RAM makes 
a big difference, common older (P4-based) Xeon boards can usually be 
upgraded to 12-16GB RAM, newer ones to anywhere between 16 and 192GB ...


As for replication solutions - Slony I wouldn't recommend (tried it for 
workloads with large writes - bad idea), but PgQ looks very solid and 
you could either use Londiste or build your own very fast non-RDBMS 
slaves using PgQ by keeping the data in an optimized format for your 
queries (e.g. if you don't need joins - use TokyoCabinet/Berkeley DB).


Regards,
 Marinos


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Best replication solution?

2009-04-08 Thread Jeff


On Apr 7, 2009, at 1:18 PM, Andrew Sullivan wrote:


I should have stated that differently.  First, you're right that if
you don't know where to look or what to look for, you can easily be
unaware of nodes being out of sync.  What's not a problem with Slony


_$cluster.sl_status on the origin is a very handy tool to see your  
slaves, how many sync's behind they are and whatnot.  Maybe I'm lucky,  
but I haven't got into a funky state that didn't cause my alarms that  
watch sl_status to go nuts.



Complexity seems to be the major evil here.


Yes.  Slony is massively complex.



Configuring slony by hand using slonik commands does suck horribly.
But the included altperl tools that come with it, along with  
slon_tools.conf removes a HUGE amount of that suck.


To add a table with a pk you edit slon_tools.conf and add something  
along the lines of:


someset = {
set_id = 5,
table_id = 5,
pkeyedtables = [ tacos, burritos, gorditas ]
}

then you just run

[create tables on slave(s)]
slonik_create_set someset;
slonik_subscribe_set 1 2;

there are other handy scripts in there as well for failing over,  
adding tables, merging, etc. that hide a lot of the suck.  Especially  
the suck of adding a node and creating the store paths.


I'm running slony on a rather write intensive system, works fine, just  
make sure you've got beefy IO.  One sucky thing though is if a slave  
is down sl_log can grow very large (I've had it get over 30M rows, the  
slave was only down for hours) and this causes major cpu churn while  
the queries slon issues sift through tons of data.  But, to be fair,  
that'll hurt any replication system.


--
Jeff Trout j...@jefftrout.com
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Best replication solution?

2009-04-08 Thread Dimitri Fontaine

Hi,

Ok I need to answer some more :)

Le 8 avr. 09 à 20:20, Jeff a écrit :
To add a table with a pk you edit slon_tools.conf and add something  
along the lines of:


someset = {
set_id = 5,
table_id = 5,
pkeyedtables = [ tacos, burritos, gorditas ]
}

then you just run

[create tables on slave(s)]
slonik_create_set someset;
slonik_subscribe_set 1 2;


 $ londiste.py setup.ini provider add schema.table
 $ londiste.py setup.ini subscriber add schema.table

Note both of those commands are to be run from the same host (often  
enough, the slave), if you have more than one slave, issue the second  
of them only on the remaining ones.


there are other handy scripts in there as well for failing over,  
adding tables, merging, etc. that hide a lot of the suck.   
Especially the suck of adding a node and creating the store paths.


There's no set in Londiste, so you just don't manage them. You add  
tables to queues (referencing the provider in fact) and the subscriber  
is free to subscribe to only a subset of the provider queue's tables.  
And any table could participate into more than one queue at any time  
too, of course.


I'm running slony on a rather write intensive system, works fine,  
just make sure you've got beefy IO.  One sucky thing though is if a  
slave is down sl_log can grow very large (I've had it get over 30M  
rows, the slave was only down for hours) and this causes major cpu  
churn while the queries slon issues sift through tons of data.  But,  
to be fair, that'll hurt any replication system.


This could happen in Londiste too, just set pgq_lazy_fetch to a  
reasonable value and Londiste will use a cursor to fetch the events,  
lowering the load. Events are just tuples in an INSERT only table,  
which when not used anymore is TRUNCATEd away. PGQ will use 3 tables  
where to store events and will rotate its choice of where to insert  
new envents, allowing to use TRUNCATE rather than DELETE. And  
PostgreSQL is quite efficient to manage this :)

  
http://wiki.postgresql.org/wiki/Londiste_Tutorial#Londiste_is_eating_all_my_CPU_and_lag_is_raising


Oh and some people asked what Londiste with failover and DDL would  
look like. Here's what the API being cooked looks like at the moment:

 $ londiste setup.ini execute myddl.script.sql

 $ londiste conf/londiste_db3.ini change-provider --provider=rnode1
 $ londiste conf/londiste_db1.ini switchover --target=rnode2

But I'm not the one who should be unveiling all of this, which is  
currently being prepared to reach alpha soon'ish.


Regards,
--
dim


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Best replication solution?

2009-04-07 Thread Dimitri Fontaine
On Monday 06 April 2009 14:35:30 Andrew Sullivan wrote:
  *SkyTools/Londiste* - Don't know anything special about it.

 I've been quite impressed by the usability.  It's not quite as
 flexible as Slony, but it has the same theory of operation.  The
 documentation is not as voluminous, although it's also much handier as
 reference material than Slony's (which is, in my experience, a little
 hard to navigate if you don't already know the system pretty well).

As a londiste user I find it really trustworthy solution, and very easy to use 
and understand. We made some recent efforts on documentation front:
  http://wiki.postgresql.org/wiki/SkyTools
  http://wiki.postgresql.org/wiki/Londiste_Tutorial

Regards,
-- 
dim


signature.asc
Description: This is a digitally signed message part.


Re: [PERFORM] Best replication solution?

2009-04-07 Thread Mark Kirkwood

Andrew Sullivan wrote:

On Sun, Apr 05, 2009 at 11:36:33AM -0700, Lists wrote:

  
*Slony-I* - I've used this in the past, but it's a huge pain to work  
with, caused serious performance issues under heavy load due to long  
running transactions (may not be the case anymore, it's been a while  
since I used it on a large database with many writes), and doesn't seem  
very reliable (I've had replication break on me multiple times).



 
I'm also somewhat puzzled by the claim of unreliability:

most of the actual replication failures I've ever seen under Slony are
due to operator error (these are trivial to induce, alas --
aforementioned pain to work with again).  Slony is baroque and
confusing, but it's specifically designed to fail in safe ways (which
is not true of some of the other systems: several of them have modes
in which it's possible to have systems out of sync with each other,
but with no way to detect as much.  IMO, that's much worse, so we
designed Slony to fail noisily if it was going to fail at all).  

  


From my experience - gained from unwittingly being in the wrong place 
at the wrong time and so being volunteered into helping people with 
Slony failures - it seems to be quite possible to have nodes out of sync 
and not be entirely aware of it - in addition to there being numerous 
ways to shoot yourself in the foot via operator error. Complexity seems 
to be the major evil here.


I've briefly experimented with Londiste, and it is certainly much 
simpler to administer. Currently it lacks a couple of features Slony has 
(chained slaves and partial DDL support), but I'll be following its 
development closely - because if these can be added - whilst keeping the 
operator overhead (and the foot-gun) small, then this looks like a winner.


regards

Mark


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Best replication solution?

2009-04-07 Thread Ivan Voras
Lists wrote:
 I'm currently running 32bit FreeBSD so I can't really add more ram (PAE
 doesn't work well under FreeBSD from what I've read) 

That's probably left-over from the time many drivers were not 64-bit
friendly. I've yet to see a new configuration that doesn't work with PAE
(also, the default PAE configuration file is too conservative. Drivers
that work on amd64 should work on PAE without problems). In any case,
it's easy to try it - you can always boot the kernel.old.





signature.asc
Description: OpenPGP digital signature


Re: [PERFORM] Best replication solution?

2009-04-07 Thread Andrew Sullivan
On Tue, Apr 07, 2009 at 10:31:02PM +1200, Mark Kirkwood wrote:

 From my experience - gained from unwittingly being in the wrong place at 
 the wrong time and so being volunteered into helping people with Slony 
 failures - it seems to be quite possible to have nodes out of sync and 
 not be entirely aware of it 

I should have stated that differently.  First, you're right that if
you don't know where to look or what to look for, you can easily be
unaware of nodes being out of sync.  What's not a problem with Slony
is that the nodes can get out of internally consistent sync state: if
you have a node that is badly lagged, at least it represents, for
sure, an actual point in time of the origin set's history.  Some of
the replication systems aren't as careful about this, and it's
possible to get the replica into a state that never happened on the
origin.  That's much worse, in my view.

In addition, it is not possible that Slony's system tables report the
replica as being up to date without them actually being so, because
the system tables are updated in the same transaction as the data is
sent.  It's hard to read those tables, however, because you have to
check every node and understand all the states.

  Complexity seems to be the major evil here.

Yes.  Slony is massively complex.

 simpler to administer. Currently it lacks a couple of features Slony has  
 (chained slaves and partial DDL support), but I'll be following its  
 development closely - because if these can be added - whilst keeping the  
 operator overhead (and the foot-gun) small, then this looks like a 
 winner.

Well, those particular features -- which are indeed the source of much
of the complexity in Slony -- were planned in from the beginning.
Londiste aimed to be simpler, so it would be interesting to see
whether those features could be incorporated without the same
complication.

A

-- 
Andrew Sullivan
a...@crankycanuck.ca

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Best replication solution?

2009-04-06 Thread Heikki Linnakangas

Lists wrote:
Server is a dual core xeon 3GB ram and 2 mirrors of 15k SAS drives (1 
for most data, 1 for wal and a few tables and indexes)


In total all databases on the server are about 10G on disk (about 2GB in 
pgdump format).


I'd suggest buying as much RAM as you can fit into the server. RAM is 
cheap, and with a database of that size more cache could have a dramatic 
effect.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Best replication solution?

2009-04-06 Thread Andrew Sullivan
On Sun, Apr 05, 2009 at 11:36:33AM -0700, Lists wrote:

 *Slony-I* - I've used this in the past, but it's a huge pain to work  
 with, caused serious performance issues under heavy load due to long  
 running transactions (may not be the case anymore, it's been a while  
 since I used it on a large database with many writes), and doesn't seem  
 very reliable (I've had replication break on me multiple times).

It is indeed a pain to work with, but I find it hard to believe that
it is the actual source of performance issues.  What's more likely
true is that it wasn't tuned to your write load -- that _will_ cause
performance issues.  Of course, tuning it is a major pain, as
mentioned.  I'm also somewhat puzzled by the claim of unreliability:
most of the actual replication failures I've ever seen under Slony are
due to operator error (these are trivial to induce, alas --
aforementioned pain to work with again).  Slony is baroque and
confusing, but it's specifically designed to fail in safe ways (which
is not true of some of the other systems: several of them have modes
in which it's possible to have systems out of sync with each other,
but with no way to detect as much.  IMO, that's much worse, so we
designed Slony to fail noisily if it was going to fail at all).  

 *Mammoth Replicator* - This is open source now, is it any good? It  
 sounds like it's trigger based like Slony. Is it based on Slony, or  
 simply use a similar solution?

It's completely unrelated, and it doesn't use triggers.  I think the
people programming it are first-rate.  Last I looked at it, I felt a
little uncomfortable with certain design choices, which seemed to me
to be a little hacky.  They were all on the TODO list, though.

 *SkyTools/Londiste* - Don't know anything special about it.

I've been quite impressed by the usability.  It's not quite as
flexible as Slony, but it has the same theory of operation.  The
documentation is not as voluminous, although it's also much handier as
reference material than Slony's (which is, in my experience, a little
hard to navigate if you don't already know the system pretty well).

A

-- 
Andrew Sullivan
a...@crankycanuck.ca

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Best replication solution?

2009-04-06 Thread Lists
I'm currently running 32bit FreeBSD so I can't really add more ram (PAE 
doesn't work well under FreeBSD from what I've read) and there are 
enough writes that more ram won't solve the problem completely.


However I will add plenty more ram next time I rebuild it.


Heikki Linnakangas wrote:

Lists wrote:
Server is a dual core xeon 3GB ram and 2 mirrors of 15k SAS drives (1 
for most data, 1 for wal and a few tables and indexes)


In total all databases on the server are about 10G on disk (about 2GB 
in pgdump format).


I'd suggest buying as much RAM as you can fit into the server. RAM is 
cheap, and with a database of that size more cache could have a 
dramatic effect.





--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Best replication solution?

2009-04-06 Thread Lists

Andrew Sullivan wrote:

On Sun, Apr 05, 2009 at 11:36:33AM -0700, Lists wrote:

  
*Slony-I* - I've used this in the past, but it's a huge pain to work  
with, caused serious performance issues under heavy load due to long  
running transactions (may not be the case anymore, it's been a while  
since I used it on a large database with many writes), and doesn't seem  
very reliable (I've had replication break on me multiple times).



It is indeed a pain to work with, but I find it hard to believe that
it is the actual source of performance issues.  What's more likely
true is that it wasn't tuned to your write load -- that _will_ cause
performance issues.  
Can you point me in the direction of the documentation for tuning it? I 
don't see anything in the documentation for tuning for write load.



Of course, tuning it is a major pain, as
mentioned.  I'm also somewhat puzzled by the claim of unreliability:
most of the actual replication failures I've ever seen under Slony are
due to operator error (these are trivial to induce, alas --
aforementioned pain to work with again).  
Recently I had a problem with duplicate key errors on the slave, which 
shouldn't be possible since they keys are the same.

I've just noticed in the documentation that

   The Duplicate Key Violation
   http://www.slony.info/documentation/faq.html#DUPKEY bug has helped
   track down a number of rather obscure PostgreSQL race conditions, so
   that in modern versions of Slony-I and PostgreSQL, there should be
   little to worry about.

so that may no longer be an issue. However I experienced with this the 
latest Slony (as of late last year) and Postgresql 8.3.


Also the dupe key error linked appears to be duplicate key of slony 
meta-data were as this was a duplicate key of one of my table's primary 
key.

Slony is baroque and
confusing, but it's specifically designed to fail in safe ways (which
is not true of some of the other systems: several of them have modes
in which it's possible to have systems out of sync with each other,
but with no way to detect as much.  IMO, that's much worse, so we
designed Slony to fail noisily if it was going to fail at all).  
  

An error is better than silently failing, but of course neither is optimal.

The slony project could really benefit from a simpler user interface and 
simpler documentation. It's integration into pgadminIII is a good step, 
but even with that it is still a bit of a pain so I hope it continues to 
improve in ease of use.


Being powerful and flexable is good, but ease of use with sensible 
defaults for complex items that can be easily overridden is even better.


  
*Mammoth Replicator* - This is open source now, is it any good? It  
sounds like it's trigger based like Slony. Is it based on Slony, or  
simply use a similar solution?



It's completely unrelated, and it doesn't use triggers.  I think the
people programming it are first-rate.  Last I looked at it, I felt a
little uncomfortable with certain design choices, which seemed to me
to be a little hacky.  They were all on the TODO list, though.

  

*SkyTools/Londiste* - Don't know anything special about it.



I've been quite impressed by the usability.  It's not quite as
flexible as Slony, but it has the same theory of operation.  The
documentation is not as voluminous, although it's also much handier as
reference material than Slony's (which is, in my experience, a little
hard to navigate if you don't already know the system pretty well).

A

  

Thanks, I'll look into both of those as well.



[PERFORM] Best replication solution?

2009-04-05 Thread Lists
I am looking to setup replication of my postgresql database, primarily 
for performance reasons.


The searching I've done shows a lot of different options, can anyone 
give suggestions about which one(s) are best? I've read the archives, 
but there seems to be more replication solutions since the last thread 
on this subject and it seems to change frequently.


I'd really like a solution that replicates DDL, but very few do so I 
think I'm out of luck for that. I can live without it.
Multi-master support would be nice too, but also seems to cause too many 
problems so it looks like I'll have to do without it too.



*Slony-I* - I've used this in the past, but it's a huge pain to work 
with, caused serious performance issues under heavy load due to long 
running transactions (may not be the case anymore, it's been a while 
since I used it on a large database with many writes), and doesn't seem 
very reliable (I've had replication break on me multiple times).


*Mammoth Replicator* - This is open source now, is it any good? It 
sounds like it's trigger based like Slony. Is it based on Slony, or 
simply use a similar solution?


*pgpool* - Won't work for us reliably for replication because we have 
some triggers and stored procedures that write data.


*PGCluster* - Sounds cool, but based on the mailing list traffic and the 
last news post on the site being from 2005, development seems to be near 
dead. Also, no releases seems to make it beyond the RC stage -- for 
multi-master stability is particularly important for data integrity.


*PGReplicator - *Don't know anything special about it.
*
Bucardo* - Don't know anything special about it.

*Postgres-R* - Don't know anything special about it.

*SkyTools/Londiste* - Don't know anything special about it.


Re: [PERFORM] Best replication solution?

2009-04-05 Thread Lists
I have a high traffic database with high volumes of reads, and moderate 
volumes of writes. Millions of queries a day.


Running the latest version of Postgresql 8.2.x (I want to upgrade to 
8.3, but the dump/reload requires an unacceptable amount of downtime)


Server is a dual core xeon 3GB ram and 2 mirrors of 15k SAS drives (1 
for most data, 1 for wal and a few tables and indexes)


In total all databases on the server are about 10G on disk (about 2GB in 
pgdump format).



The IO on the disks is being maxed out and I don't have the budget to 
add more disks at this time. The web server has a raid10 of sata drives 
with some io bandwidth to spare so I would like to replicate all data 
over, and send some read queries to that server -- in particular the 
very IO intensive FTI based search queries.



ries van Twisk wrote:

Dr Mr No Name,

what replication solution is the best depends on your requirements.
May be you can tell a bit more what your situation is?
Since you didn't gave us to much information about your requirements 
it's hard to give you any advice.


Ries

On Apr 5, 2009, at 1:36 PM, Lists wrote:

I am looking to setup replication of my postgresql database, 
primarily for performance reasons.


The searching I've done shows a lot of different options, can anyone 
give suggestions about which one(s) are best? I've read the archives, 
but there seems to be more replication solutions since the last 
thread on this subject and it seems to change frequently.


I'd really like a solution that replicates DDL, but very few do so I 
think I'm out of luck for that. I can live without it.
Multi-master support would be nice too, but also seems to cause too 
many problems so it looks like I'll have to do without it too.



*Slony-I* - I've used this in the past, but it's a huge pain to work 
with, caused serious performance issues under heavy load due to long 
running transactions (may not be the case anymore, it's been a while 
since I used it on a large database with many writes), and doesn't 
seem very reliable (I've had replication break on me multiple times).


*Mammoth Replicator* - This is open source now, is it any good? It 
sounds like it's trigger based like Slony. Is it based on Slony, or 
simply use a similar solution?


*pgpool* - Won't work for us reliably for replication because we have 
some triggers and stored procedures that write data.


*PGCluster* - Sounds cool, but based on the mailing list traffic and 
the last news post on the site being from 2005, development seems to 
be near dead. Also, no releases seems to make it beyond the RC stage 
-- for multi-master stability is particularly important for data 
integrity.


*PGReplicator - *Don't know anything special about it.
*
Bucardo* - Don't know anything special about it.

*Postgres-R* - Don't know anything special about it.

*SkyTools/Londiste* - Don't know anything special about it.











Re: [PERFORM] Best replication solution?

2009-04-05 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


 Running the latest version of Postgresql 8.2.x (I want to upgrade to
 8.3, but the dump/reload requires an unacceptable amount of downtime)

You can use Slony or Bucardo to ugrade in place. Both will incur some
overhead and more overall complexity than a dump/reload, but going to
8.3 is well worth it (and will bring your IO down).

 The IO on the disks is being maxed out and I don't have the budget to
 add more disks at this time. The web server has a raid10 of sata drives
 with some io bandwidth to spare so I would like to replicate all data
 over, and send some read queries to that server -- in particular the
 very IO intensive FTI based search queries.

Sounds like a good solution for a table-based, read-only-slaves solutions,
especially if you only need enough of the schema to perform some of the
more intense queries. Again, Slony and Bucardo are probably the best fit.
All this assumes that the tables in question have some sort of unique key,
you aren't using large objects, or changing DDL frequently. I'd give Slony a
second try and Bucardo a first one on your QA/test cluster and see how
they work out for you. You could even make the read-only slaves 8.3, since
they will be starting from scratch.

Of course, if the underlying problem replication is trying to solve is too
much search traffic (e.g. select queries) on the main database, there are other
solutions you could consider (e.g. external search such as Sphinx or SOLR,
caching solutions such as Squid or Varnish, moving the slaves to the cloud, 
etc.)

- --
Greg Sabino Mullane g...@turnstep.com
End Point Corporation
PGP Key: 0x14964AC8 200904052158
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAknZZMgACgkQvJuQZxSWSsjbcgCfWqTUEDGlDqAnLaCAhcJlSLCk
EVMAni0oCevrnMdZ2Fuw8Tysaxp3q+/U
=0vu6
-END PGP SIGNATURE-



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance