[PERFORM] problem with alter table add constraint......

2009-04-06 Thread roopasatish
I have an issue with an ADD FOREIGN KEY constraint that goes into a
waiting state and blocks other queries as well.

ALTER TABLE ONLY holding_positions ADD CONSTRAINT
holding_positions_stock_id_fkey FOREIGN KEY (stock_id)
  REFERENCES stocks (stock_id) MATCH SIMPLE
  ON UPDATE NO ACTION ON DELETE NO ACTION;

The holding_positions table has no data in it as yet.


ps aux | grep postgres
postgres  5855  0.8  1.6 346436 271452 ? Ss   16:34   0:07
postgres: abc stocks 192.100.100.111(60308) SELECT waiting
 postgres  6134  0.0  0.0 346008 4184 ?   Ss   16:44   0:00
postgres: xyz stocks 192.100.100.222(34604) ALTER TABLE waiting



Any suggestions would be appreciated.

Roopa


Re: [PERFORM] Best replication solution?

2009-04-06 Thread Heikki Linnakangas

Lists wrote:
Server is a dual-core Xeon with 3GB RAM and 2 mirrors of 15k SAS drives (1 
for most data, 1 for WAL and a few tables and indexes)


In total all databases on the server are about 10G on disk (about 2GB in 
pgdump format).


I'd suggest buying as much RAM as you can fit into the server. RAM is 
cheap, and with a database of that size more cache could have a dramatic 
effect.
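
The knobs that put that RAM to use are shared_buffers and 
effective_cache_size; hypothetical values for an upgraded box, not a 
recommendation for this particular workload:

-- in postgresql.conf (shared_buffers requires a restart):
--   shared_buffers = 1GB
--   effective_cache_size = 2GB
-- effective_cache_size can also be tried per session:
set effective_cache_size = '2GB';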


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [PERFORM] problem with alter table add constraint......

2009-04-06 Thread Robert Haas
On Mon, Apr 6, 2009 at 2:54 AM, roopasatish roopaben...@gmail.com wrote:

 I have an issue with an ADD FOREIGN KEY constraint that goes into a
 waiting state and blocks other queries as well.

 ALTER TABLE ONLY holding_positions ADD CONSTRAINT
 holding_positions_stock_id_fkey FOREIGN KEY (stock_id)
   REFERENCES stocks (stock_id) MATCH SIMPLE
   ON UPDATE NO ACTION ON DELETE NO ACTION;

 The holding_positions table has no data in it as yet.


 ps aux | grep postgres
 postgres  5855  0.8  1.6 346436 271452 ? Ss   16:34   0:07
 postgres: abc stocks 192.100.100.111(60308) SELECT waiting
 postgres  6134  0.0  0.0 346008 4184 ?   Ss   16:44   0:00
 postgres: xyz stocks 192.100.100.222(34604) ALTER TABLE waiting



 Any suggestions would be appreciated.

You need to look at what locks they're waiting for.

select locktype, database, relation::regclass, page, tuple,
virtualxid, transactionid, classid, objid, objsubid,
virtualtransaction, pid, mode, granted from pg_locks;
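
If the relation column alone doesn't make the blocker obvious, a
self-join sketch along these lines pairs each ungranted relation lock
with the granted lock it conflicts with:

select w.pid as waiting_pid, w.mode as wanted_mode,
       h.pid as holding_pid, h.mode as held_mode,
       w.relation::regclass
from pg_locks w
join pg_locks h on h.granted
               and h.pid <> w.pid
               and h.database = w.database
               and h.relation = w.relation
where not w.granted
  and w.locktype = 'relation';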

...Robert



Re: [PERFORM] plpgsql arrays

2009-04-06 Thread Matthew Wakeling

On Fri, 3 Apr 2009, Tom Lane wrote:

Simon Riggs si...@2ndquadrant.com writes:

On Fri, 2009-04-03 at 10:04 -0400, Tom Lane wrote:

I don't actually believe that a standard merge join algorithm will work
with an intransitive join condition ...



I think it's a common enough problem that having a non-standard join
algorithm written for that case would be interesting indeed.


Sounds like a great PhD thesis topic.


I agree it'd be very cool to have a non-standard join algorithm for this 
built into Postgres. However it is nowhere near complicated enough for a 
PhD thesis topic.


I'm just putting the finishing touches on a plpgsql implementation - in 
order to perform the join on an asymmetric set of ranges, you just need to 
keep two separate history lists as you sweep through the two incoming 
streams. This would be sufficient for range constraints.
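
For illustration, here is a minimal plpgsql sketch of that sweep (only
the shape of it, not my actual code), assuming hypothetical tables
stream_a(id, s, e) and stream_b(id, s, e) holding integer ranges [s, e]:

create type rng as (id int, s int, e int);

create or replace function overlap_join(out a_id int, out b_id int)
returns setof record as $$
declare
    a_rows rng[]; b_rows rng[];
    hist_a rng[] := '{}'; hist_b rng[] := '{}';
    keep rng[];
    i int := 1; j int := 1;
    na int; nb int;
    r rng;
    take_a boolean;
begin
    -- load both streams ordered by range start
    a_rows := array(select row(id, s, e)::rng from stream_a order by s);
    b_rows := array(select row(id, s, e)::rng from stream_b order by s);
    na := coalesce(array_upper(a_rows, 1), 0);
    nb := coalesce(array_upper(b_rows, 1), 0);

    while i <= na or j <= nb loop
        -- pull the next range from whichever stream starts first
        take_a := j > nb or (i <= na and (a_rows[i]).s <= (b_rows[j]).s);
        if take_a then
            r := a_rows[i]; i := i + 1;
            -- prune the other history: ranges ending before r.s can
            -- never overlap r or anything that starts later
            keep := '{}';
            for k in 1 .. coalesce(array_upper(hist_b, 1), 0) loop
                if (hist_b[k]).e >= r.s then keep := keep || hist_b[k]; end if;
            end loop;
            hist_b := keep;
            -- every survivor started at or before r.s, so it overlaps r
            for k in 1 .. coalesce(array_upper(hist_b, 1), 0) loop
                a_id := r.id; b_id := (hist_b[k]).id; return next;
            end loop;
            hist_a := hist_a || r;
        else
            r := b_rows[j]; j := j + 1;
            keep := '{}';
            for k in 1 .. coalesce(array_upper(hist_a, 1), 0) loop
                if (hist_a[k]).e >= r.s then keep := keep || hist_a[k]; end if;
            end loop;
            hist_a := keep;
            for k in 1 .. coalesce(array_upper(hist_a, 1), 0) loop
                a_id := (hist_a[k]).id; b_id := r.id; return next;
            end loop;
            hist_b := hist_b || r;
        end if;
    end loop;
    return;
end;
$$ language plpgsql;

The pruning step is what keeps each history list small when ranges are
short relative to the sweep, and each pair is emitted exactly once
because it is only produced when its later-starting member arrives.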


Matthew

--
"Surely the value of C++ is zero, but C's value is now 1?"
 -- map36, commenting on the "No, C++ isn't equal to D. 'C' is undeclared
 [...] C++ should really be called 1" response to "C++ -- shouldn't it
 be called D?"



Re: [PERFORM] difficulties with time based queries

2009-04-06 Thread Robert Haas
On Sun, Apr 5, 2009 at 11:35 PM, Rainer Mager rai...@vanten.com wrote:
 -Original Message-
 From: Tom Lane [mailto:t...@sss.pgh.pa.us]
 Rainer Mager rai...@vanten.com writes:
  From: Tom Lane [mailto:t...@sss.pgh.pa.us]
  Hmm ... it's pretty unusual to see the index fetch portion of a bitmap
  scan take the bulk of the runtime.  Usually that part is fast and where
  the pain comes is in fetching from the heap.  I wonder whether that
  index has become bloated.  How big are the table and the index
  physically?  (Look at pg_class.relpages, or if you want a really
  accurate number try pg_relation_size().)

  Can you give me some more info on how to look at these stats?

 Since you've got 8.3 it's easy: select pg_relation_size('tablename')
 (or indexname).  The result is in bytes, so you might want to
 divide by 1K or 1M to keep the number readable.

 Ok, nice and simple...I like it:

 The result for the table ad_log is 30,063 MB. The result for the index,
 ad_log_date_all, is 17,151 MB. I guess this roughly makes sense since the
 index is on 4 fields and the table only has 6 fields.

 For the particular query I'm trying to optimize at the moment I believe I
 should be able to use an index that references only 2 fields, which, I
 imagine, should reduce the time needed to read it. I'll play with this a bit
 and see what happens.

Even if your query could use an index on four fields, a lot of times it
won't be the winning strategy, because it means reading a lot more
data from the disk.  Plus, all of these huge indices are competing for
RAM with data from the table itself.  You might want to think about
getting rid of all of the indices with more than 1 or 2 columns.
ad_log_unique is probably huge and it seems like it's probably not
improving your data integrity as much as you might think...
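
For example (column names are guesses, since the schema wasn't posted),
a narrower replacement could be created and measured before dropping
the old index:

create index ad_log_date_channel on ad_log (start_date, channel);
select pg_size_pretty(pg_relation_size('ad_log_date_channel'));
drop index ad_log_date_all;  -- only once the new index proves out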

...Robert



Re: [PERFORM] problem with alter table add constraint......

2009-04-06 Thread Albe Laurenz
roopasatish wrote:
 I have an issue with an ADD FOREIGN KEY constraint that 
 goes into a waiting state and blocks other queries as well.
  
 ALTER TABLE ONLY holding_positions ADD CONSTRAINT 
 holding_positions_stock_id_fkey FOREIGN KEY (stock_id)
   REFERENCES stocks (stock_id) MATCH SIMPLE
   ON UPDATE NO ACTION ON DELETE NO ACTION;
  
 The holding_positions table has no data in it as yet.

Look in pg_catalog.pg_locks for a second transaction that
holds a lock on the table holding_positions.

How many backends do you see in pg_stat_activity that
are running or in a transaction?

Any other backend that is in a transaction that has e.g.
selected from the table will block the ALTER TABLE.
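
A contrived illustration; in one session:

BEGIN;
SELECT * FROM holding_positions LIMIT 1;
-- transaction left open: AccessShareLock is kept until COMMIT/ROLLBACK

An ALTER TABLE holding_positions in another session now queues behind
that lock, and further queries queue behind the ALTER TABLE.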

Yours,
Laurenz Albe



Re: [PERFORM] Forcing seq_scan off for large table joined with tiny table yields improved performance

2009-04-06 Thread Mario Splivalo

Scott Marlowe wrote:


It's not really solved, it's just a happy coincidence that the current
plan runs well.  In order to keep the query planner making good
choices you need to increase stats target for the field in the index
above.  The easiest way to do so is to do this:

alter database mydb set default_statistics_target=100;

and run analyze again:

analyze;


So, I removed the index on field_name, set default_statistics_target 
to 100, analyzed, and the results are the same:


                                   QUERY PLAN
--------------------------------------------------------------------------------
 Hash Join  (cost=1.04..58676.31 rows=218048 width=68) (actual time=0.067..12268.394 rows=6 loops=1)
   Hash Cond: ((u.field_name)::text = (t.key)::text)
   ->  Seq Scan on photo_info_data u  (cost=0.00..47500.30 rows=2398530 width=50) (actual time=0.013..6426.611 rows=2398446 loops=1)
   ->  Hash  (cost=1.02..1.02 rows=2 width=18) (actual time=0.015..0.015 rows=2 loops=1)
         ->  Seq Scan on t_query_data t  (cost=0.00..1.02 rows=2 width=18) (actual time=0.002..0.006 rows=2 loops=1)
 Total runtime: 12268.459 ms
(6 rows)

I even changed default_statistics_target to 1000:


 Hash Join  (cost=1.04..58580.29 rows=208561 width=67) (actual time=0.054..12434.283 rows=6 loops=1)
   Hash Cond: ((u.field_name)::text = (t.key)::text)
   ->  Seq Scan on photo_info_data u  (cost=0.00..47499.46 rows=2398446 width=49) (actual time=0.012..6129.923 rows=2398446 loops=1)
   ->  Hash  (cost=1.02..1.02 rows=2 width=18) (actual time=0.015..0.015 rows=2 loops=1)
         ->  Seq Scan on t_query_data t  (cost=0.00..1.02 rows=2 width=18) (actual time=0.002..0.004 rows=2 loops=1)
 Total runtime: 12434.338 ms
(6 rows)


Even when I run this query, I get a sequential scan:

explain analyze select * from photo_info_data where field_name = 
'f-spot' or field_name = 'shutter';


                                   QUERY PLAN
--------------------------------------------------------------------------------
 Seq Scan on photo_info_data  (cost=0.00..59491.69 rows=1705 width=49) (actual time=0.018..1535.963 rows=6 loops=1)
   Filter: (((field_name)::text = 'f-spot'::text) OR ((field_name)::text = 'shutter'::text))
 Total runtime: 1536.010 ms
(3 rows)

These are the counts of the values 'f-spot' and 'shutter' in the 
field_name column of the photo_info_data table:


xmltest=# select field_name, count(*) from photo_info_data where 
field_name in ('f-spot', 'shutter') group by field_name;

 field_name | count
------------+-------
 f-spot     |     3
 shutter    |     3
(2 rows)


Maybe my test data is poor? As I've mentioned, photo_info_data has a 
little over 2,300,000 rows. And this is the complete 'distribution' of the data:


xmltest=# select field_name, count(*) from photo_info_data group by 
field_name order by count(*) desc;

   field_name   | count
----------------+--------
 field_Xx1      |     35
 field_Xx2      | 332447
 field_Xx3      | 297414
 field_Xx4      | 262394
 field_Xx5      | 227396
 field_Xx6      | 192547
 field_Xx7      | 157612
 field_Xx8      | 122543
 field_Xx9      |  87442
 field_Xx10     |  52296
 field_1        |      5
 field_2        |  47389
 field_3        |  42412
 field_4        |  37390
 field_5        |  32366
 field_6        |  27238
 field_7        |  22360
 field_Xx11     |  17589
 field_8        |  17412
 field_9        |  12383
 field_10       |   7386
 field_11       |   2410
 f-spot         |      3
 shutter        |      3
 focal          |      3
 flash          |      3
 m_city         |      3
 person         |      3
 iso            |      2
(29 rows)

No matter what field_name value I enter in the WHERE condition, the planner 
chooses a sequential scan. Only when I add a separate index on field_name 
does the planner choose an index scan or bitmap index scan.
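
For reference, the per-column form of the statistics change, scoped to 
just this column, would be:

alter table photo_info_data alter column field_name set statistics 1000;
analyze photo_info_data;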


Mike



Re: [PERFORM] difficulties with time based queries

2009-04-06 Thread Matthew Wakeling

On Mon, 6 Apr 2009, Rainer Mager wrote:

The data should be mostly ordered by date. It is all logged in semi-realtime
such that 99% will be logged within an hour of the timestamp. Also, as
stated above, during this query it was about 1.2 MB/s, which I know isn't
great. I admit this isn't the best hardware in the world, but I would expect
better than that for linear queries.


Might you have an unbalanced index tree? Reindexing would also solve that 
problem.
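
For example (note that REINDEX locks out writes to the table while it
runs):

reindex index ad_log_date_all;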


Matthew

--
There are only two kinds of programming languages: those people always
bitch about and those nobody uses. (Bjarne Stroustrup)



Re: [PERFORM] Best replication solution?

2009-04-06 Thread Andrew Sullivan
On Sun, Apr 05, 2009 at 11:36:33AM -0700, Lists wrote:

 *Slony-I* - I've used this in the past, but it's a huge pain to work  
 with, caused serious performance issues under heavy load due to long  
 running transactions (may not be the case anymore, it's been a while  
 since I used it on a large database with many writes), and doesn't seem  
 very reliable (I've had replication break on me multiple times).

It is indeed a pain to work with, but I find it hard to believe that
it is the actual source of performance issues.  What's more likely
true is that it wasn't tuned to your write load -- that _will_ cause
performance issues.  Of course, tuning it is a major pain, as
mentioned.  I'm also somewhat puzzled by the claim of unreliability:
most of the actual replication failures I've ever seen under Slony are
due to operator error (these are trivial to induce, alas --
aforementioned pain to work with again).  Slony is baroque and
confusing, but it's specifically designed to fail in safe ways (which
is not true of some of the other systems: several of them have modes
in which it's possible to have systems out of sync with each other,
but with no way to detect as much.  IMO, that's much worse, so we
designed Slony to fail noisily if it was going to fail at all).  

 *Mammoth Replicator* - This is open source now, is it any good? It  
 sounds like it's trigger-based like Slony. Is it based on Slony, or  
 does it simply use a similar approach?

It's completely unrelated, and it doesn't use triggers.  I think the
people programming it are first-rate.  Last I looked at it, I felt a
little uncomfortable with certain design choices, which seemed to me
to be a little hacky.  They were all on the TODO list, though.

 *SkyTools/Londiste* - Don't know anything special about it.

I've been quite impressed by the usability.  It's not quite as
flexible as Slony, but it has the same theory of operation.  The
documentation is not as voluminous, although it's also much handier as
reference material than Slony's (which is, in my experience, a little
hard to navigate if you don't already know the system pretty well).

A

-- 
Andrew Sullivan
a...@crankycanuck.ca



Re: [PERFORM] plpgsql arrays

2009-04-06 Thread Matthew Wakeling

On Fri, 3 Apr 2009, Simon Riggs wrote:

On Fri, 2009-04-03 at 10:04 -0400, Tom Lane wrote:

Matthew Wakeling matt...@flymine.org writes:

On Fri, 3 Apr 2009, Robert Haas wrote:

Why not just use SQL to do the join?


Because the merge condition is:

WHERE l1.start <= l2.end AND l2.start <= l1.end

and merge joins in postgres only currently cope with the case where the
merge condition is an equals relationship.


I don't actually believe that a standard merge join algorithm will work
with an intransitive join condition ...


I think it's a common enough problem that having a non-standard join
algorithm written for that case would be interesting indeed.


I'm currently trying to persuade my boss to give me time to do some work 
to implement this in Postgres. It's not something I will be able to start 
right away, but maybe in a little while.


I'm currently seeing this as being able to mark overlap operators (&& 
in quite a few data types) as OVERLAP_MERGES, and have the planner be 
able to use the new merge join algorithm. So it wouldn't help with the 
exact query above, but would if I rewrote it to use the bioseg or spatial 
data types' overlap operators.


I will need a little help as I am not incredibly familiar with the 
Postgres innards. Would someone be able to help with that?


Matthew

--
Existence is a convenient concept to designate all of the files that an
executable program can potentially process.   -- Fortran77 standard



Re: [PERFORM] problem with alter table add constraint......

2009-04-06 Thread Tom Lane
Albe Laurenz laurenz.a...@wien.gv.at writes:
 roopasatish wrote:
 I have an issue with an ADD FOREIGN KEY constraint that 
 goes into a waiting state and blocks other queries as well.
 
 ALTER TABLE ONLY holding_positions ADD CONSTRAINT 
 holding_positions_stock_id_fkey FOREIGN KEY (stock_id)
 REFERENCES stocks (stock_id) MATCH SIMPLE
 ON UPDATE NO ACTION ON DELETE NO ACTION;
 
 The holding_positions table has no data in it as yet.

 Look in pg_catalog.pg_locks for a second transaction that
 holds a lock on the table holding_positions.

This statement also needs to get a lock on the referenced table stocks.
An open transaction that has referenced either table will block it.
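
A quick check covering both tables at once:

select pid, relation::regclass, mode, granted
from pg_locks
where relation in ('holding_positions'::regclass, 'stocks'::regclass);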

regards, tom lane



Re: [PERFORM] Forcing seq_scan off for large table joined with tiny table yields improved performance

2009-04-06 Thread Scott Marlowe
On Mon, Apr 6, 2009 at 6:20 AM, Mario Splivalo
mario.spliv...@megafon.hr wrote:
 Scott Marlowe wrote:

 It's not really solved, it's just a happy coincidence that the current
 plan runs well.  In order to keep the query planner making good
 choices you need to increase stats target for the field in the index
 above.  The easiest way to do so is to do this:

 alter database mydb set default_statistics_target=100;

 and run analyze again:

 analyze;

 So, I removed the index on field_name, set default_statistics_target
 to 100, analyzed, and the results are the same:

Why did you remove the index?



Re: [PERFORM] plpgsql arrays

2009-04-06 Thread Robert Haas
On Mon, Apr 6, 2009 at 8:52 AM, Matthew Wakeling matt...@flymine.org wrote:
 On Fri, 3 Apr 2009, Simon Riggs wrote:

 On Fri, 2009-04-03 at 10:04 -0400, Tom Lane wrote:

 Matthew Wakeling matt...@flymine.org writes:

 On Fri, 3 Apr 2009, Robert Haas wrote:

 Why not just use SQL to do the join?

 Because the merge condition is:

 WHERE l1.start <= l2.end AND l2.start <= l1.end

 and merge joins in postgres only currently cope with the case where the
 merge condition is an equals relationship.

 I don't actually believe that a standard merge join algorithm will work
 with an intransitive join condition ...

 I think it's a common enough problem that having a non-standard join
 algorithm written for that case would be interesting indeed.

 I'm currently trying to persuade my boss to give me time to do some work to
 implement this in Postgres. It's not something I will be able to start right
 away, but maybe in a little while.

 I'm currently seeing this as being able to mark overlap operators (&& in
 quite a few data types) as OVERLAP_MERGES, and have the planner be able to
 use the new merge join algorithm. So it wouldn't help with the exact query
 above, but would if I rewrote it to use the bioseg or spatial data types'
 overlap operators.

 I will need a little help as I am not incredibly familiar with the Postgres
 innards. Would someone be able to help with that?

I can help review if you post a patch, even if it's WIP.  But you
should post it to -hackers, not here.

...Robert



Re: [PERFORM] Forcing seq_scan off for large table joined with tiny table yields improved performance

2009-04-06 Thread Mario Splivalo

Scott Marlowe wrote:

On Mon, Apr 6, 2009 at 6:20 AM, Mario Splivalo
mario.spliv...@megafon.hr wrote:

Scott Marlowe wrote:

It's not really solved, it's just a happy coincidence that the current
plan runs well.  In order to keep the query planner making good
choices you need to increase stats target for the field in the index
above.  The easiest way to do so is to do this:

alter database mydb set default_statistics_target=100;

and run analyze again:

analyze;

So, I removed the index on field_name, set default_statistics_target
to 100, analyzed, and the results are the same:


Why did you remove the index?



Because I already have an index on that column: the index needed to 
enforce the PK constraint. Here is the original DDL for the table:


CREATE TABLE photo_info_data
(
  photo_id integer NOT NULL,
  field_name character varying NOT NULL,
  field_value character varying,
  CONSTRAINT photo_info_data_pk PRIMARY KEY (photo_id, field_name)
);

CREATE INDEX photo_info_data_ix_field_value
  ON photo_info_data USING btree (field_value);

So, there is an index on (photo_id, field_name). Postgres is using the 
index for photo_id (...WHERE photo_id = 12345) but not for field_name 
(...WHERE field_name = 'f-spot'). When I add an extra index on field_name:


CREATE INDEX photo_info_data_ix__field_name
  ON photo_info_data USING btree (field_name);

Then that index is used.

Mike



Re: [PERFORM] Forcing seq_scan off for large table joined with tiny table yields improved performance

2009-04-06 Thread Scott Marlowe
On Mon, Apr 6, 2009 at 8:37 AM, Mario Splivalo
mario.spliv...@megafon.hr wrote:
 Scott Marlowe wrote:

 On Mon, Apr 6, 2009 at 6:20 AM, Mario Splivalo
 mario.spliv...@megafon.hr wrote:

 Scott Marlowe wrote:

 It's not really solved, it's just a happy coincidence that the current
 plan runs well.  In order to keep the query planner making good
 choices you need to increase stats target for the field in the index
 above.  The easiest way to do so is to do this:

 alter database mydb set default_statistics_target=100;

 and run analyze again:

 analyze;

 So, I removed the index on field_name, set default_statistics_target
 to 100, analyzed, and the results are the same:

 Why did you remove the index?


 Because I already have an index on that column: the index needed to enforce
 the PK constraint. Here is the original DDL for the table:

 CREATE TABLE photo_info_data
 (
  photo_id integer NOT NULL,
  field_name character varying NOT NULL,
  field_value character varying,
  CONSTRAINT photo_info_data_pk PRIMARY KEY (photo_id, field_name)
 );

 CREATE INDEX photo_info_data_ix_field_value
  ON photo_info_data USING btree (field_value);

 So, there is an index on (photo_id, field_name). Postgres is using the index
 for photo_id (...WHERE photo_id = 12345) but not for field_name (...WHERE
 field_name = 'f-spot'). When I add an extra index on field_name:

 CREATE INDEX photo_info_data_ix__field_name
  ON photo_info_data USING btree (field_name);

 Then that index is used.

On older versions of pgsql, the second of two terms in a multicolumn
index can't be used alone.  On newer versions it can, but it is much
less efficient than a single-column index, or than when the term is
the first one rather than the second.



Re: [PERFORM] Forcing seq_scan off for large table joined with tiny table yields improved performance

2009-04-06 Thread Mario Splivalo

Scott Marlowe wrote:

CREATE INDEX photo_info_data_ix_field_value
 ON photo_info_data USING btree (field_value);

So, there is an index on (photo_id, field_name). Postgres is using the index
for photo_id (...WHERE photo_id = 12345) but not for field_name (...WHERE
field_name = 'f-spot'). When I add an extra index on field_name:

CREATE INDEX photo_info_data_ix__field_name
 ON photo_info_data USING btree (field_name);

Then that index is used.


On older versions of pgsql, the second of two terms in a multicolumn
index can't be used alone.  On newer versions it can, but it is much
less efficient than a single-column index, or than when the term is
the first one rather than the second.


I'm using 8.3.7. So you'd also suggest keeping that extra (somewhat 
redundant) index on field_name, since I need the PK on (photo_id, field_name)?


Mike



Re: [PERFORM] Forcing seq_scan off for large table joined with tiny table yields improved performance

2009-04-06 Thread Scott Marlowe
On Mon, Apr 6, 2009 at 8:50 AM, Mario Splivalo
mario.spliv...@megafon.hr wrote:
 Scott Marlowe wrote:

 CREATE INDEX photo_info_data_ix_field_value
  ON photo_info_data USING btree (field_value);

 So, there is an index on (photo_id, field_name). Postgres is using the index
 for photo_id (...WHERE photo_id = 12345) but not for field_name (...WHERE
 field_name = 'f-spot'). When I add an extra index on field_name:

 CREATE INDEX photo_info_data_ix__field_name
  ON photo_info_data USING btree (field_name);

 Then that index is used.

 On older versions of pgsql, the second of two terms in a multicolumn
 index can't be used alone.  On newer versions it can, but it is much
 less efficient than a single-column index, or than when the term is
 the first one rather than the second.

 I'm using 8.3.7. So you'd also suggest keeping that extra (somewhat
 redundant) index on field_name, since I need the PK on (photo_id, field_name)?

Either that or reverse the terms in the pk.
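
A sketch of that reversal (it rebuilds the PK index, taking an
exclusive lock on the table while it runs):

alter table photo_info_data drop constraint photo_info_data_pk;
alter table photo_info_data add constraint photo_info_data_pk
  primary key (field_name, photo_id);

Queries on photo_id alone would then be the ones wanting a separate index.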

Also, you might want to look at adjusting random_page_cost to
something around 1.5 to 2.0.
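
e.g., following the same pattern as the statistics-target change earlier:

alter database mydb set random_page_cost = 1.5;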



Re: [PERFORM] Best replication solution?

2009-04-06 Thread Lists
I'm currently running 32-bit FreeBSD, so I can't really add more RAM (PAE 
doesn't work well under FreeBSD from what I've read), and there are 
enough writes that more RAM won't solve the problem completely.


However I will add plenty more RAM next time I rebuild it.


Heikki Linnakangas wrote:

Lists wrote:
Server is a dual-core Xeon with 3GB RAM and 2 mirrors of 15k SAS drives (1 
for most data, 1 for WAL and a few tables and indexes)


In total all databases on the server are about 10G on disk (about 2GB 
in pgdump format).


I'd suggest buying as much RAM as you can fit into the server. RAM is 
cheap, and with a database of that size more cache could have a 
dramatic effect.







Re: [PERFORM] Best replication solution?

2009-04-06 Thread Lists

Andrew Sullivan wrote:

On Sun, Apr 05, 2009 at 11:36:33AM -0700, Lists wrote:

  
*Slony-I* - I've used this in the past, but it's a huge pain to work  
with, caused serious performance issues under heavy load due to long  
running transactions (may not be the case anymore, it's been a while  
since I used it on a large database with many writes), and doesn't seem  
very reliable (I've had replication break on me multiple times).



It is indeed a pain to work with, but I find it hard to believe that
it is the actual source of performance issues.  What's more likely
true is that it wasn't tuned to your write load -- that _will_ cause
performance issues.  
Can you point me in the direction of the documentation for tuning it? I 
don't see anything in the documentation about tuning for write load.



Of course, tuning it is a major pain, as
mentioned.  I'm also somewhat puzzled by the claim of unreliability:
most of the actual replication failures I've ever seen under Slony are
due to operator error (these are trivial to induce, alas --
aforementioned pain to work with again).  
Recently I had a problem with duplicate key errors on the slave, which 
shouldn't be possible since the keys are the same.

I've just noticed in the documentation that

   The Duplicate Key Violation bug
   (http://www.slony.info/documentation/faq.html#DUPKEY) has helped
   track down a number of rather obscure PostgreSQL race conditions, so
   that in modern versions of Slony-I and PostgreSQL, there should be
   little to worry about.

so that may no longer be an issue. However, I experienced this with the 
latest Slony (as of late last year) and PostgreSQL 8.3.


Also, the duplicate key error linked there appears to involve Slony 
metadata, whereas mine was a duplicate key on one of my tables' primary 
keys.

Slony is baroque and
confusing, but it's specifically designed to fail in safe ways (which
is not true of some of the other systems: several of them have modes
in which it's possible to have systems out of sync with each other,
but with no way to detect as much.  IMO, that's much worse, so we
designed Slony to fail noisily if it was going to fail at all).  
  

An error is better than silently failing, but of course neither is optimal.

The Slony project could really benefit from a simpler user interface and 
simpler documentation. Its integration into pgAdmin III is a good step, 
but even with that it is still a bit of a pain, so I hope it continues to 
improve in ease of use.


Being powerful and flexible is good, but ease of use, with sensible 
defaults for complex items that can be easily overridden, is even better.


  
*Mammoth Replicator* - This is open source now, is it any good? It  
sounds like it's trigger-based like Slony. Is it based on Slony, or  
does it simply use a similar approach?



It's completely unrelated, and it doesn't use triggers.  I think the
people programming it are first-rate.  Last I looked at it, I felt a
little uncomfortable with certain design choices, which seemed to me
to be a little hacky.  They were all on the TODO list, though.

  

*SkyTools/Londiste* - Don't know anything special about it.



I've been quite impressed by the usability.  It's not quite as
flexible as Slony, but it has the same theory of operation.  The
documentation is not as voluminous, although it's also much handier as
reference material than Slony's (which is, in my experience, a little
hard to navigate if you don't already know the system pretty well).

A

  

Thanks, I'll look into both of those as well.