Re: [PERFORM] Architecting a database

2010-06-28 Thread Dimitri Fontaine
t...@exquisiteimages.com writes:
 I am wondering how I should architect this in PostgreSQL. Should I follow
 a similar strategy and have a separate database for each client and one
 database that contains the global data? 

As others have said already, there are more problems to foresee in doing
so than there are advantages. If you must separate data for security
concerns, your situation would be much more comfortable using schemas.

If it's all about performance, see about partitioning the data, and
maybe not even on the client id but monthly, for example, depending on the
queries you run in your application.
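A minimal sketch of the schema-per-client layout suggested above (all names here are illustrative, not from the original post):

```sql
-- One database, one schema per client, plus a shared schema for global data.
CREATE SCHEMA client_acme;
CREATE SCHEMA client_globex;
CREATE SCHEMA global;

-- Per-client tables are isolated by schema...
CREATE TABLE client_acme.orders (id serial PRIMARY KEY, amount numeric);

-- ...and access can be locked down per client role.
CREATE ROLE acme LOGIN;
GRANT USAGE ON SCHEMA client_acme TO acme;
GRANT SELECT, INSERT, UPDATE, DELETE
    ON ALL TABLES IN SCHEMA client_acme TO acme;
```

Cross-client reporting then stays possible with ordinary joins inside the one database, which is the main advantage over a database-per-client design.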

Regards,
-- 
dim

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] order by slowing down a query by 80 times

2010-06-28 Thread Rajesh Kumar Mallah
Dear List,

Just removing the "order by co_name" clause reduces the query time dramatically,
from ~9 sec to 63 ms. Can anyone please help?

Regds
Rajesh Kumar Mallah.


explain analyze SELECT * from ( SELECT
a.profile_id,a.userid,a.amount,a.category_id,a.catalog_id,a.keywords,b.co_name
from general.catalogs a join general.profile_master b using(profile_id) where
1=1 and co_name_vec @@ to_tsquery('manufacturer') and b.co_name is not
null and a.ifmain is true ) as c order by co_name
limit 25 offset 0;


 Limit  (cost=0.00..3659.13 rows=25 width=129) (actual time=721.075..9241.105 rows=25 loops=1)
   ->  Nested Loop  (cost=0.00..1215772.28 rows=8307 width=476) (actual time=721.073..9241.050 rows=25 loops=1)
         ->  Nested Loop  (cost=0.00..1208212.37 rows=8307 width=476) (actual time=721.052..9240.037 rows=25 loops=1)
               ->  Nested Loop  (cost=0.00..1204206.26 rows=6968 width=472) (actual time=721.032..9239.516 rows=25 loops=1)
                     ->  Nested Loop  (cost=0.00..1154549.19 rows=6968 width=471) (actual time=721.012..9236.523 rows=25 loops=1)
                           ->  Index Scan using profile_master_co_name on profile_master b  (cost=0.00..1125295.59 rows=6968 width=25) (actual time=0.097..9193.154 rows=2212 loops=1)
                                 Filter: ((co_name IS NOT NULL) AND ((co_name_vec)::tsvector @@ to_tsquery('manufacturer'::text)))
                           ->  Index Scan using catalog_master_profile_id_fkindex on catalog_master  (cost=0.00..4.19 rows=1 width=446) (actual time=0.016..0.016 rows=0 loops=2212)
                                 Index Cond: (catalog_master.profile_id = b.profile_id)
                                 Filter: ((catalog_master.hide IS FALSE) AND ((catalog_master.hosting_status)::text = 'ACTIVE'::text))
                     ->  Index Scan using profile_master_profile_id_pkey on profile_master  (cost=0.00..7.11 rows=1 width=9) (actual time=0.105..0.105 rows=1 loops=25)
                           Index Cond: (profile_master.profile_id = catalog_master.profile_id)
               ->  Index Scan using catalog_categories_pkey_catalog_id_category_id on catalog_categories  (cost=0.00..0.56 rows=1 width=8) (actual time=0.014..0.015 rows=1 loops=25)
                     Index Cond: (catalog_categories.catalog_id = catalog_master.catalog_id)
                     Filter: (catalog_categories.ifmain IS TRUE)
         ->  Index Scan using web_category_master_pkey on web_category_master  (cost=0.00..0.90 rows=1 width=4) (actual time=0.034..0.036 rows=1 loops=25)
               Index Cond: (web_category_master.category_id = catalog_categories.category_id)
               Filter: ((web_category_master.status)::text = 'ACTIVE'::text)
 Total runtime: 9241.304 ms

explain analyze SELECT * from   ( SELECT
a.profile_id,a.userid,a.amount,a.category_id,a.catalog_id,a.keywords,b.co_name
from general.catalogs a join general.profile_master b using(profile_id)
where  1=1  and co_name_vec @@   to_tsquery('manufacturer')  and  b.co_name
is not null and a.ifmain is true ) as c  limit 25 offset 0;

                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=0.00..358.85 rows=25 width=476) (actual time=0.680..63.176 rows=25 loops=1)
   ->  Nested Loop  (cost=0.00..119238.58 rows=8307 width=476) (actual time=0.677..63.139 rows=25 loops=1)
         ->  Nested Loop  (cost=0.00..111678.66 rows=8307 width=476) (actual time=0.649..62.789 rows=25 loops=1)
               ->  Nested Loop  (cost=0.00..107672.56 rows=6968 width=472) (actual time=0.626..62.436 rows=25 loops=1)
                     ->  Nested Loop  (cost=0.00..58015.49 rows=6968 width=471) (actual time=0.606..62.013 rows=25 loops=1)
                           ->  Index Scan using profile_master_co_name_vec on profile_master b  (cost=0.00..28761.89 rows=6968 width=25) (actual time=0.071..50.576 rows=1160 loops=1)
                                 Index Cond: ((co_name_vec)::tsvector @@ to_tsquery('manufacturer'::text))
                                 Filter: (co_name IS NOT NULL)
                           ->  Index Scan using catalog_master_profile_id_fkindex on catalog_master  (cost=0.00..4.19 rows=1 width=446) (actual time=0.008..0.008 rows=0 loops=1160)
                                 Index Cond: (catalog_master.profile_id = b.profile_id)
                                 Filter: ((catalog_master.hide IS FALSE) AND ((catalog_master.hosting_status)::text = 'ACTIVE'::text))
                     ->  Index Scan using profile_master_profile_id_pkey on profile_master  (cost=0.00..7.11 rows=1 width=9) (actual time=0.012..0.012 rows=1 loops=25)
                           Index Cond: (profile_master.profile_id = catalog_master.profile_id)
               ->  Index Scan using catalog_categories_pkey_catalog_id_category_id on catalog_categories  (cost=0.00..0.56 rows=1 width=8) (actual time=0.010..0.011 rows=1 loops=25)
                     Index Cond: (catalog_categories.catalog_id =

Re: [PERFORM] order by slowing down a query by 80 times

2010-06-28 Thread Yeb Havinga

Rajesh Kumar Mallah wrote:

Dear List,

just by removing the order by co_name reduces the query time dramatically
from  ~ 9 sec  to 63 ms. Can anyone please help.
The 63 ms query result is probably useless since it returns a limit of 
25 rows from an unordered result. It is not surprising that this is fast.


The pain is here:
Index Scan using profile_master_co_name on profile_master b  
(cost=0.00..1125295.59 rows=6968 width=25) (actual time=0.097..9193.154 
rows=2212 loops=1)
Filter: ((co_name IS NOT NULL) AND 
((co_name_vec)::tsvector @@ to_tsquery('manufacturer'::text)))



It looks like seq_scans are disabled, since the index scan has only a 
filter expression but not an index cond.


regards,
Yeb Havinga




Re: [PERFORM] order by slowing down a query by 80 times

2010-06-28 Thread Andres Freund
On Monday 28 June 2010 13:39:27 Yeb Havinga wrote:
 It looks like seq_scans are disabled, since the index scan has only a 
 filter expression but not an index cond.
Or it's using it to get an ordered result...

Andres



Re: [PERFORM] order by slowing down a query by 80 times

2010-06-28 Thread Rajesh Kumar Mallah
On Mon, Jun 28, 2010 at 5:09 PM, Yeb Havinga yebhavi...@gmail.com wrote:

 Rajesh Kumar Mallah wrote:

 Dear List,

 just by removing the order by co_name reduces the query time dramatically
 from  ~ 9 sec  to 63 ms. Can anyone please help.

 The 63 ms query result is probably useless since it returns a limit of 25
 rows from an unordered result. It is not surprising that this is fast.

 The pain is here:

 Index Scan using profile_master_co_name on profile_master b
  (cost=0.00..1125295.59 rows=6968 width=25) (actual time=0.097..9193.154
 rows=2212 loops=1)
Filter: ((co_name IS NOT NULL) AND
 ((co_name_vec)::tsvector @@ to_tsquery('manufacturer'::text)))


 It looks like seq_scans are disabled, since the index scan has only a
 filter expression but not an index cond.



seq_scans are NOT explicitly disabled. The two queries differ only in the
order by clause.

regds
Rajesh Kumar Mallah.


Re: [PERFORM] order by slowing down a query by 80 times

2010-06-28 Thread Kevin Grittner
Rajesh Kumar Mallah mallah.raj...@gmail.com wrote:
 
 just by removing the order by co_name reduces the query time
 dramatically from  ~ 9 sec  to 63 ms. Can anyone please help.
 
The reason is that one query allows it to return *any* 25 rows,
while the other query requires it to find a *specific* set of 25
rows.  It happens to be faster to just grab any old set of rows than
to find particular ones.
 
If the actual results you need are the ones sorted by name, then
forget the other query and focus on how you can retrieve the desired
results more quickly.  One valuable piece of information would be to
know how many rows the query would return without the limit.  It's
also possible that your costing factors may need adjustment.  Or you
may need to get finer-grained statistics -- the optimizer thought it
would save time to use an index in the sequence you wanted, but it
had to scan through 2212 rows to find 25 rows which matched the
selection criteria.  It might well have been faster to use a table
scan and sort than to follow the index like that.
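One way to gather the piece of information Kevin asks for (the row count without the limit) is a plain count over the same predicate; a sketch using the names from the query upthread:

```sql
-- How many rows would the query return without the LIMIT?
SELECT count(*)
FROM general.catalogs a
JOIN general.profile_master b USING (profile_id)
WHERE co_name_vec @@ to_tsquery('manufacturer')
  AND b.co_name IS NOT NULL
  AND a.ifmain IS TRUE;
```

If that count is large relative to the 25-row limit, the planner's bet on the ordered index scan terminating early was not unreasonable; if it is small, the statistics are likely off.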
 
-Kevin



Re: [PERFORM] order by slowing down a query by 80 times

2010-06-28 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Rajesh Kumar Mallah mallah.raj...@gmail.com wrote:
 just by removing the order by co_name reduces the query time
 dramatically from  ~ 9 sec  to 63 ms. Can anyone please help.
 
 The reason is that one query allows it to return *any* 25 rows,
 while the other query requires it to find a *specific* set of 25
 rows.  It happens to be faster to just grab any old set of rows than
 to find particular ones.

I'm guessing that most of the cost is in repeated evaluations of the
filter clause
(co_name_vec)::tsvector @@ to_tsquery('manufacturer'::text)

There are two extremely expensive functions involved there (cast to
tsvector and to_tsquery) and they're being done over again at what
I suspect is practically every table row.  The unordered query is
so much faster because it stops after having evaluated the text
search clause just a few times.

The way to make this go faster is to set up the actually recommended
infrastructure for full text search, namely create an index on
(co_name_vec)::tsvector (either directly or using an auxiliary tsvector
column).  If you don't want to maintain such an index, fine, but don't
expect full text search queries to be quick.
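Concretely, either form Tom describes can be set up like this (a sketch; the index names are assumptions, and the expression form mirrors the `(co_name_vec)::tsvector` cast shown in the plans above):

```sql
-- Option 1: expression index directly on the cast used in the queries
CREATE INDEX profile_master_co_name_tsv_gin
    ON general.profile_master USING gin (((co_name_vec)::tsvector));

-- Option 2: auxiliary tsvector column, kept current by a trigger or
-- application code, with a GIN index on the column itself
ALTER TABLE general.profile_master ADD COLUMN co_name_tsv tsvector;
UPDATE general.profile_master
    SET co_name_tsv = to_tsvector('english', coalesce(co_name, ''));
CREATE INDEX profile_master_co_name_tsv_idx
    ON general.profile_master USING gin (co_name_tsv);
```

With either index in place, the `@@ to_tsquery(...)` clause can become an index condition instead of a per-row filter.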

regards, tom lane



Re: [PERFORM] pgbench results on a new server

2010-06-28 Thread Craig James

On 6/25/10 12:03 PM, Greg Smith wrote:

Craig James wrote:

I've got a new server and want to make sure it's running well.


Any changes to the postgresql.conf file? Generally you need at least a
moderate shared_buffers (1GB or so at a minimum) and checkpoint_segments
(32 or higher) in order for the standard pgbench test to give good results.


max_connections = 500
shared_buffers = 1000MB
work_mem = 128MB
synchronous_commit = off
full_page_writes = off
wal_buffers = 256kB
checkpoint_segments = 30
effective_cache_size = 4GB

For fun I ran it with the installation defaults, and it never got above 1475 
TPS.


pgbench -c20 -t 5000 -U test
tps = 5789
pgbench -c30 -t  -U test
tps = 6961
pgbench -c40 -t 2500 -U test
tps = 2945


General numbers are OK, the major drop going from 30 to 40 clients is
larger than it should be. I'd suggest running the 40 client count one
again to see if that's consistent.


It is consistent.  When I run pgbench from a different server, I get this:

   pgbench -c40 -t 2500 -U test
   tps = 7999

   pgbench -c100 -t 1000 -U test
   tps = 6693

Craig



Re: [PERFORM] order by slowing down a query by 80 times

2010-06-28 Thread Rajesh Kumar Mallah
Dear Tom/Kevin/List

Thanks for the insight. I will check the suggestions more closely and post
the results.

regds
Rajesh Kumar Mallah.


Re: [PERFORM] order by slowing down a query by 80 times

2010-06-28 Thread Rajesh Kumar Mallah
The way to make this go faster is to set up the actually recommended
 infrastructure for full text search, namely create an index on
 (co_name_vec)::tsvector (either directly or using an auxiliary tsvector
 column).  If you don't want to maintain such an index, fine, but don't
 expect full text search queries to be quick.

regards, tom lane




Dear Tom/List ,

co_name_vec is actually the auxiliary tsvector column, maintained via an
update trigger, and the index that you suggested is there. Consider this
simplified version: when we order by co_name, the index on co_name_vec is
not used; another index is used instead.
 tradein_clients= explain analyze SELECT  profile_id from
general.profile_master b  where  1=1  and co_name_vec @@   to_tsquery
('manufacturer')   order by co_name  limit 25;

QUERY PLAN
---
 Limit  (cost=0.00..3958.48 rows=25 width=25) (actual time=0.045..19.847 rows=25 loops=1)
   ->  Index Scan using profile_master_co_name on profile_master b  (cost=0.00..1125315.59 rows=7107 width=25) (actual time=0.043..19.818 rows=25 loops=1)
         Filter: ((co_name_vec)::tsvector @@ to_tsquery('manufacturer'::text))
 Total runtime: 19.894 ms
(4 rows)

tradein_clients= explain analyze SELECT  profile_id from
general.profile_master b  where  1=1  and co_name_vec @@   to_tsquery
('manufacturer')limit 25;

QUERY PLAN
---
 Limit  (cost=0.00..101.18 rows=25 width=4) (actual time=0.051..0.632 rows=25 loops=1)
   ->  Index Scan using profile_master_co_name_vec on profile_master b  (cost=0.00..28761.89 rows=7107 width=4) (actual time=0.049..0.593 rows=25 loops=1)
         Index Cond: ((co_name_vec)::tsvector @@ to_tsquery('manufacturer'::text))
 Total runtime: 0.666 ms
(4 rows)

tradein_clients=
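The trigger-maintained auxiliary column described above is usually wired up with the built-in trigger function; a sketch with assumed trigger and column names (the plans show co_name_vec being cast to tsvector, so the actual setup may differ in detail):

```sql
-- Keep co_name_vec in sync with co_name on every insert/update.
CREATE TRIGGER profile_master_co_name_vec_trg
    BEFORE INSERT OR UPDATE ON general.profile_master
    FOR EACH ROW
    EXECUTE PROCEDURE tsvector_update_trigger(co_name_vec,
                                              'pg_catalog.english',
                                              co_name);
```

The point of the thread remains: the index exists and is usable, so the planner's choice between the two index scans is a costing question, not a missing-infrastructure one.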


Re: [PERFORM] order by slowing down a query by 80 times

2010-06-28 Thread Tom Lane
Rajesh Kumar Mallah mallah.raj...@gmail.com writes:
 co_name_vec is actually the auxiliary tsvector column that is maintained
 via an update trigger, and the index that you suggested is there.

Well, in that case it's just a costing/statistics issue.  The planner is
probably estimating there are more tsvector matches than there really
are, which causes it to think the in-order indexscan will terminate
earlier than it really will, so it goes for that instead of a full scan
and sort.  If this is 8.4 then increasing the statistics target for the
co_name_vec column should help that.  In previous versions I'm not sure
how much you can do about it other than raise random_page_cost, which is
likely to be a net loss overall.
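On 8.4, the statistics-target adjustment Tom suggests looks like this (the 1000 here is only an illustrative value; the default target is much lower):

```sql
-- Raise the per-column statistics target, then refresh the statistics.
ALTER TABLE general.profile_master
    ALTER COLUMN co_name_vec SET STATISTICS 1000;
ANALYZE general.profile_master;
```

Better selectivity estimates for the tsvector match should make the planner's early-termination estimate for the ordered index scan more realistic.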

regards, tom lane



Re: [PERFORM] cpu bound postgresql setup.

2010-06-28 Thread Bruce Momjian
Benjamin Krajmalnik wrote:
 Rajesh,
 
 I had a similar situation a few weeks ago whereby performance all of a
 sudden decreased.
 The one tunable which resolved the problem in my case was increasing the
 number of checkpoint segments.
 After increasing them, everything went back to its normal state.

Did you get a db server log message suggesting increasing that
setting? I hope so.
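The tunable in question lives in postgresql.conf; a sketch with illustrative values (not the ones from Benjamin's system):

```
# postgresql.conf -- illustrative values
checkpoint_segments = 32           # default is 3; the server logs a hint
                                   # when checkpoints occur too frequently
checkpoint_completion_target = 0.7 # spread checkpoint I/O out over time
```

More segments mean fewer, less bursty checkpoints at the cost of more disk space for WAL and longer crash recovery.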

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +



Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-28 Thread Bruce Momjian
Bruce Momjian wrote:
 Tom Lane wrote:
  Dimitri Fontaine dfonta...@hi-media.com writes:
   Josh Berkus j...@agliodbs.com writes:
   a) Eliminate WAL logging entirely
   b) Eliminate checkpointing
   c) Turn off the background writer
   d) Have PostgreSQL refuse to restart after a crash and instead call an
   external script (for reprovisioning)
  
   Well I guess I'd prefer a per-transaction setting, allowing to bypass
   WAL logging and checkpointing.
  
  Not going to happen; this is all or nothing.
  
   Forcing the backend to care itself for
   writing the data I'm not sure is a good thing, but if you say so.
  
  Yeah, I think proposal (c) is likely to be a net loss.
  
  (a) and (d) are probably simple, if by reprovisioning you mean
  rm -rf $PGDATA; initdb.  Point (b) will be a bit trickier because
  there are various housekeeping activities tied into checkpoints.
  I think you can't actually remove checkpoints altogether, just
  skip the flush-dirty-pages part.
 
 Based on this thread, I have developed the following documentation patch
 that outlines the performance enhancements possible if durability is not
 required.  The patch also documents that synchronous_commit = false has
 potential committed transaction loss from a database crash (as well as
 an OS crash).

Applied.
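For reference, the existing knobs discussed upthread that trade durability for speed can be sketched in postgresql.conf like this (illustrative; fsync/full_page_writes off can corrupt data after a crash, and synchronous_commit off can silently lose recently committed transactions):

```
# postgresql.conf -- throughput over durability; only for disposable data
fsync = off                 # never force WAL to disk
synchronous_commit = off    # commit before WAL reaches disk
full_page_writes = off      # unsafe unless torn pages are impossible
```

This is the configuration-level approximation of proposals (a)-(d) until anything like them exists as a feature.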

