[PERFORM] Should changing offset in LIMIT change query plan (at all/so early)?

2010-11-21 Thread goran
Trying to understand why the query planner changes the plan from an effective
one to an ineffective one when I change the offset in the LIMIT. Also,
thankfully accepting RTFM pointers to the actual FMs.

Setup: 3 tables with 0.5M to 1.5M records.
While tuning indexes for the following query:

SELECT c.id, c.name, c.owner
FROM catalog c, catalog_securitygroup cs, catalog_university cu
WHERE c.root < 50
  AND cs.catalog = c.id
  AND cu.catalog = c.id
  AND cs.securitygroup < 200
  AND cu.university < 200
ORDER BY c.name
LIMIT 50 OFFSET 100

I managed to bring it to ~3ms with the following plan:

                                  QUERY PLAN
--------------------------------------------------------------------------
 Limit  (cost=15141.07..22711.60 rows=50 width=59)
   ->  Nested Loop  (cost=0.00..30052749.38 rows=198485 width=59)
         ->  Nested Loop  (cost=0.00..705519.23 rows=147500 width=63)
               ->  Index Scan using test2 on catalog c  (cost=0.00..241088.93 rows=147500 width=59)
                     Index Cond: (root < 50)
               ->  Index Scan using catalog_university_pkey on catalog_university cu  (cost=0.00..3.14 rows=1 width=4)
                     Index Cond: ((cu.catalog = c.id) AND (cu.university < 200))
         ->  Index Scan using catalog_securitygroup_pkey on catalog_securitygroup cs  (cost=0.00..196.48 rows=199 width=4)
               Index Cond: ((cs.catalog = c.id) AND (cs.securitygroup < 200))
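
If I read the Limit costs right, they are simply the child node's total
cost scaled linearly: 30052749.38 * 100/198485 = 15141.07 for the startup
cost and 30052749.38 * 150/198485 = 22711.60 for the total, i.e. the
planner expects to stop after reading only offset + limit = 150 of the
estimated 198485 rows.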


But when I change the OFFSET in the LIMIT to 500 it goes to ~500ms
with the following plan:

                                  QUERY PLAN
--------------------------------------------------------------------------
 Limit  (cost=61421.34..61421.46 rows=50 width=59)
   ->  Sort  (cost=61420.09..61916.30 rows=198485 width=59)
         Sort Key: c.name
         ->  Merge Join  (cost=45637.87..51393.33 rows=198485 width=59)
               Merge Cond: (c.id = cs.catalog)
               ->  Merge Join  (cost=48.95..440699.65 rows=147500 width=63)
                     Merge Cond: (c.id = cu.catalog)
                     ->  Index Scan using catalog_pkey on catalog c  (cost=0.00..78947.35 rows=147500 width=59)
                           Filter: (root < 50)
                     ->  Index Scan using catalog_university_pkey on catalog_university cu  (cost=0.00..358658.68 rows=499950 width=4)
                           Index Cond: (cu.university < 200)
               ->  Materialize  (cost=45527.12..48008.19 rows=198485 width=4)
                     ->  Sort  (cost=45527.12..46023.34 rows=198485 width=4)
                           Sort Key: cs.catalog
                           ->  Seq Scan on catalog_securitygroup cs  (cost=0.00..25345.76 rows=198485 width=4)
                                 Filter: (securitygroup < 200)

Thanks for your time



Re: [PERFORM] Query Performance SQL Server vs. Postgresql

2010-11-21 Thread Pavel Stehule
2010/11/21 Humair Mohammed huma...@hotmail.com:

 1) OS/Configuration
 64-bit Windows 7 Enterprise with 8G RAM and a Dual Core 2.67 Ghz Intel CPU
 postgresql-x64-9.0 (PostgreSQL 9.0.1, compiled by Visual C++ build 1500,
 64-bit)
 work_mem = 2GB
 shared_buffers = 2

shared_buffers = 2 ???

Regards

Pavel Stehule


 2) Dataset
 name,pages,tuples,pg_size_pretty
 pivotbad;1870;93496;15 MB
 pivotgood;5025;251212;39 MB
 3) EXPLAIN (ANALYZE ON, BUFFERS ON)
 Hash Join  (cost=16212.30..52586.43 rows=92869 width=17) (actual time=25814.222..32296.765 rows=3163 loops=1)
   Hash Cond: (((pb.id)::text = (pg.id)::text) AND ((pb.question)::text = (pg.question)::text))
   Join Filter: ((COALESCE(pb.response, 'MISSING'::character varying))::text <> (COALESCE(pg.response, 'MISSING'::character varying))::text)
   Buffers: shared hit=384 read=6511, temp read=6444 written=6318
   ->  Seq Scan on pivotbad pb  (cost=0.00..2804.96 rows=93496 width=134) (actual time=0.069..37.143 rows=93496 loops=1)
         Buffers: shared hit=192 read=1678
   ->  Hash  (cost=7537.12..7537.12 rows=251212 width=134) (actual time=24621.752..24621.752 rows=251212 loops=1)
         Buckets: 1024  Batches: 64  Memory Usage: 650kB
         Buffers: shared hit=192 read=4833, temp written=4524
         ->  Seq Scan on pivotgood pg  (cost=0.00..7537.12 rows=251212 width=134) (actual time=0.038..117.780 rows=251212 loops=1)
               Buffers: shared hit=192 read=4833
 Total runtime: 32297.305 ms
 4) INDEXES
 I can certainly add an index but given the table sizes I am not sure if that
 is a factor. This is by no means a large dataset: less than 350,000 rows in
 total and 3 columns. Also, this was just a quick dump of data for comparison
 purposes. When I saw the poor performance on the COALESCE, I pointed the data
 load at SQL Server and ran the same query, except with the T-SQL-specific
 ISNULL function.




Re: [PERFORM] Query Performance SQL Server vs. Postgresql

2010-11-21 Thread tv
 4) INDEXES
 I can certainly add an index but given the table sizes I am not sure if
 that is a factor. This is by no means a large dataset: less than 350,000
 rows in total and 3 columns. Also, this was just a quick dump of data for
 comparison purposes. When I saw the poor performance on the COALESCE, I
 pointed the data load at SQL Server and ran the same query, except with
 the T-SQL-specific ISNULL function.

350,000 rows definitely is a lot of rows, although with 3 INT columns it's
just about 13MB of data (including overhead). But indexes can be quite
handy when doing joins, as in this case.

Tomas




Re: [PERFORM] Query Performance SQL Server vs. Postgresql

2010-11-21 Thread Kevin Grittner
Pavel Stehule  wrote:
 2010/11/21 Humair Mohammed :
 
 shared_buffers = 2

 shared_buffers = 2 ???
 
Yeah, if that's not a typo, that's a very serious misconfiguration.
 
With 8 GB of RAM in the machine, this should probably be set to
somewhere between 200 MB and 2 GB, depending on your workload and
what else is running on the machine.
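
As a starting point, something like this in postgresql.conf might be
reasonable (the values are purely illustrative; the right numbers depend
on your workload):

  shared_buffers = 512MB
  effective_cache_size = 6GB

Note that changing shared_buffers requires a restart, not just a reload.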
 
Please read through this page and make use of the information:
 
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
 
-Kevin



Re: [PERFORM] Query Performance SQL Server vs. Postgresql

2010-11-21 Thread tv
 First, I modified the work_mem setting to 1GB (reloaded config) from the
 default 1MB and I see a response time of 33 seconds. Results below from
 EXPLAIN ANALYZE:

...

 Second, I modified the work_mem setting to 2GB (reloaded config) and I see
 a response time of 38 seconds. Results below from EXPLAIN ANALYZE:

...

How did you reload the config? Using 'kill -HUP pid'? That should work
fine. Have you checked 'work_mem' after the reload?

Because the explain plans are exactly the same (structure, estimated
costs). The really interesting bit is this, and it did not change at all:

   Buckets: 1024 Batches: 64  Memory Usage: 650kB

As Tom Lane already mentioned, splitting a hash join into batches (due to
insufficient work_mem) adds overhead; the optimal number of batches is 1.
But I guess 1GB of work_mem is overkill - something like 64MB should be fine.
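
For example, something along these lines in one session (query
abbreviated; the point is just to verify the setting and re-check the
number of batches):

  SET work_mem = '64MB';
  SHOW work_mem;   -- verify the value actually took effect
  EXPLAIN (ANALYZE ON, BUFFERS ON) SELECT ...;

With enough work_mem the Hash node should report "Batches: 1".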

The suspicious thing is the query plans have not changed at all
(especially the number of batches). I think you're not telling us
something important (unintentionally of course).

 By no means am I trying to compare the 2 products. When I noticed the slow
 behavior of COALESCE I tried it on SQL Server. They are running on the
 same machine, hence my comment regarding apples to apples. It is possible
 that this is not an apples-to-apples comparison, other than the fact that
 it is running on the same machine.

OK. The point of my post was that you've provided very little info about
the settings etc. so it was difficult to identify why PostgreSQL is so
slow.

Tomas




Re: [PERFORM] Query Performance SQL Server vs. Postgresql

2010-11-21 Thread tv
 4) INDEXES
 I can certainly add an index but given the table sizes I am not sure if
 that is a factor. This is by no means a large dataset: less than 350,000
 rows in total and 3 columns. Also, this was just a quick dump of data for
 comparison purposes. When I saw the poor performance on the COALESCE, I
 pointed the data load at SQL Server and ran the same query, except with
 the T-SQL-specific ISNULL function.

 350,000 rows definitely is a lot of rows, although with 3 INT columns it's
 just about 13MB of data (including overhead). But indexes can be quite
 handy when doing joins, as in this case.

OK, I've just realized the tables have 3 character columns, not integers.
In that case the tables are probably much bigger (and there are things
like TOAST), and indexes may be even more important.
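
Something like this might be worth trying (an untested sketch - the index
names are made up, the columns are taken from the plans above):

  CREATE INDEX pivotbad_id_question_idx ON pivotbad (id, question);
  CREATE INDEX pivotgood_id_question_idx ON pivotgood (id, question);
  ANALYZE pivotbad;
  ANALYZE pivotgood;

That would at least give the planner an alternative to hashing all of
pivotgood.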

Tomas




Re: [PERFORM] Should changing offset in LIMIT change query plan (at all/so early)?

2010-11-21 Thread Kevin Grittner
goran  wrote:
 
 Trying to understand why the query planner changes the plan from an
 effective one to an ineffective one when I change the offset in the
 LIMIT. Also, thankfully accepting RTFM pointers to the actual FMs.
 
The query planner will consider offset and limit clauses when
estimating the cost of each plan.  The optimal plan will shift as
more tuples need to be read.  If the plan is not shifting at the
right point, it probably means that you need to tune the costing
factors used by the planner.
 
You didn't report enough information for me to suggest any particular
change; if you follow up, please review the suggested information to
post:
 
http://wiki.postgresql.org/wiki/SlowQueryQuestions
 
In particular, information about the hardware and your
postgresql.conf settings would help.
 
You might also want to review this page and see if you can tune
things.
 
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
 
In particular, effective_cache_size, random_page_cost, and
seq_page_cost would be likely to need adjustment based on what you've
told us; however, it might pay to review the whole configuration.
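
For example, you can experiment in a single session before touching
postgresql.conf (the values here are only illustrative starting points):

  SET effective_cache_size = '6GB';
  SET random_page_cost = 2.0;
  EXPLAIN ANALYZE SELECT ... LIMIT 50 OFFSET 500;  -- your original query

If the fast nested loop plan comes back at the higher offset, you know
which knobs to move in the config file.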
 
-Kevin



Re: [PERFORM] Query Performance SQL Server vs. Postgresql

2010-11-21 Thread tv
 Pavel Stehule  wrote:
 2010/11/21 Humair Mohammed :

 shared_buffers = 2

 shared_buffers = 2 ???

 Yeah, if that's not a typo, that's a very serious misconfiguration.

I guess that's a typo, as the explain plan in one of the previous posts
contains

   Buffers: shared hit=192 read=4833

for a sequential scan. But I still don't know why the query is so slow :-(

regards
Tomas




Re: [PERFORM] Query Performance SQL Server vs. Postgresql

2010-11-21 Thread Tom Lane
t...@fuzzy.cz writes:
 Second, I modified the work_mem setting to 2GB (reloaded config) and I see
 a response time of 38 seconds. Results below from EXPLAIN ANALYZE:

 How did you reload the config? Using 'kill -HUP pid'? That should work
 fine. Have you checked 'work_mem' after the reload?

 Because the explain plans are exactly the same (structure, estimated
 costs). The really interesting bit is this and it did not change at all

Buckets: 1024 Batches: 64  Memory Usage: 650kB

If that didn't change, I'm prepared to bet that the OP didn't actually
manage to change the active value of work_mem.

regards, tom lane



Re: [PERFORM] Query Performance SQL Server vs. Postgresql

2010-11-21 Thread Humair Mohammed


1) OS/Configuration
64-bit Windows 7 Enterprise with 8G RAM and a Dual Core 2.67 Ghz Intel CPU
postgresql-x64-9.0 (PostgreSQL 9.0.1, compiled by Visual C++ build 1500, 64-bit)
work_mem = 2GB
shared_buffers = 2

2) Dataset
name,pages,tuples,pg_size_pretty
pivotbad;1870;93496;15 MB
pivotgood;5025;251212;39 MB

3) EXPLAIN (ANALYZE ON, BUFFERS ON)
Hash Join  (cost=16212.30..52586.43 rows=92869 width=17) (actual time=25814.222..32296.765 rows=3163 loops=1)
  Hash Cond: (((pb.id)::text = (pg.id)::text) AND ((pb.question)::text = (pg.question)::text))
  Join Filter: ((COALESCE(pb.response, 'MISSING'::character varying))::text <> (COALESCE(pg.response, 'MISSING'::character varying))::text)
  Buffers: shared hit=384 read=6511, temp read=6444 written=6318
  ->  Seq Scan on pivotbad pb  (cost=0.00..2804.96 rows=93496 width=134) (actual time=0.069..37.143 rows=93496 loops=1)
        Buffers: shared hit=192 read=1678
  ->  Hash  (cost=7537.12..7537.12 rows=251212 width=134) (actual time=24621.752..24621.752 rows=251212 loops=1)
        Buckets: 1024  Batches: 64  Memory Usage: 650kB
        Buffers: shared hit=192 read=4833, temp written=4524
        ->  Seq Scan on pivotgood pg  (cost=0.00..7537.12 rows=251212 width=134) (actual time=0.038..117.780 rows=251212 loops=1)
              Buffers: shared hit=192 read=4833
Total runtime: 32297.305 ms

4) INDEXES
I can certainly add an index but given the table sizes I am not sure if that
is a factor. This is by no means a large dataset: less than 350,000 rows in
total and 3 columns. Also, this was just a quick dump of data for comparison
purposes. When I saw the poor performance on the COALESCE, I pointed the data
load at SQL Server and ran the same query, except with the T-SQL-specific
ISNULL function.

Re: [PERFORM] Query Performance SQL Server vs. Postgresql

2010-11-21 Thread Humair Mohammed

I am running 64-bit Windows 7 Enterprise with 8G RAM and a Dual Core 2.67 Ghz
Intel CPU. Both SQL 2008 R2 and PostgreSQL are installed on the same machine.
Using IS DISTINCT FROM instead of the COALESCE does not help much. I ran 2
further tests with work_mem modifications (please note memory usage is quite
low, 650kB, so I am not sure if work_mem is a factor):

First, I modified the work_mem setting to 1GB (reloaded config) from the
default 1MB and I see a response time of 33 seconds. Results below from
EXPLAIN ANALYZE:

Hash Join  (cost=16212.30..52586.43 rows=92869 width=17) (actual time=26742.343..33274.317 rows=3163 loops=1)
  Hash Cond: (((pb.id)::text = (pg.id)::text) AND ((pb.question)::text = (pg.question)::text))
  Join Filter: ((COALESCE(pb.response, 'MISSING'::character varying))::text <> (COALESCE(pg.response, 'MISSING'::character varying))::text)
  ->  Seq Scan on pivotbad pb  (cost=0.00..2804.96 rows=93496 width=134) (actual time=0.055..40.710 rows=93496 loops=1)
  ->  Hash  (cost=7537.12..7537.12 rows=251212 width=134) (actual time=25603.460..25603.460 rows=251212 loops=1)
        Buckets: 1024  Batches: 64  Memory Usage: 650kB
        ->  Seq Scan on pivotgood pg  (cost=0.00..7537.12 rows=251212 width=134) (actual time=0.050..120.269 rows=251212 loops=1)
Total runtime: 33275.028 ms

Second, I modified the work_mem setting to 2GB (reloaded config) and I see a
response time of 38 seconds. Results below from EXPLAIN ANALYZE:

Hash Join  (cost=16212.30..52586.43 rows=92869 width=17) (actual time=26574.459..38406.422 rows=3163 loops=1)
  Hash Cond: (((pb.id)::text = (pg.id)::text) AND ((pb.question)::text = (pg.question)::text))
  Join Filter: ((COALESCE(pb.response, 'MISSING'::character varying))::text <> (COALESCE(pg.response, 'MISSING'::character varying))::text)
  ->  Seq Scan on pivotbad pb  (cost=0.00..2804.96 rows=93496 width=134) (actual time=0.067..37.938 rows=93496 loops=1)
  ->  Hash  (cost=7537.12..7537.12 rows=251212 width=134) (actual time=26426.127..26426.127 rows=251212 loops=1)
        Buckets: 1024  Batches: 64  Memory Usage: 650kB
        ->  Seq Scan on pivotgood pg  (cost=0.00..7537.12 rows=251212 width=134) (actual time=0.038..115.319 rows=251212 loops=1)
Total runtime: 38406.927 ms

By no means am I trying to compare the 2 products. When I noticed the slow
behavior of COALESCE I tried it on SQL Server. They are running on the same
machine, hence my comment regarding apples to apples. It is possible that
this is not an apples-to-apples comparison, other than the fact that it is
running on the same machine.

 From: pavel.steh...@gmail.com
 Date: Thu, 18 Nov 2010 07:14:24 +0100
 Subject: Re: Query Performance SQL Server vs. Postgresql
 To: huma...@hotmail.com
 CC: pgsql-performance@postgresql.org
 
 Hello,
 
 there is probably a problem with the statistics - they are out of touch
 with reality. Please try the IS DISTINCT FROM operator now - maybe the
 statistics will be better. Next, try to increase work_mem. The hash join
 is untypically slow on your machine.
 
 Regards
 
 Pavel Stehule
 
 2010/11/17 Humair Mohammed huma...@hotmail.com:
 Yes strange indeed, I did rerun ANALYZE and VACUUM. Took 70 seconds to rerun
 the query. Results from EXPLAIN ANALYZE below:
  Hash Join  (cost=16212.30..52586.43 rows=92869 width=17) (actual time=43200.223..49502.874 rows=3163 loops=1)
    Hash Cond: (((pb.id)::text = (pg.id)::text) AND ((pb.question)::text = (pg.question)::text))
    Join Filter: ((COALESCE(pb.response, 'MISSING'::character varying))::text <> (COALESCE(pg.response, 'MISSING'::character varying))::text)
    ->  Seq Scan on pivotbad pb  (cost=0.00..2804.96 rows=93496 width=134) (actual time=0.009..48.200 rows=93496 loops=1)
    ->  Hash  (cost=7537.12..7537.12 rows=251212 width=134) (actual time=42919.453..42919.453 rows=251212 loops=1)
          Buckets: 1024  Batches: 64  Memory Usage: 650kB
          ->  Seq Scan on pivotgood pg  (cost=0.00..7537.12 rows=251212 width=134) (actual time=0.119..173.019 rows=251212 loops=1)
    Total runtime: 49503.450 ms
 
  From: pavel.steh...@gmail.com
  Date: Wed, 17 Nov 2010 05:47:51 +0100
  Subject: Re: Query Performance SQL Server vs. Postgresql
  To: huma...@hotmail.com
  CC: pgsql-performance@postgresql.org
 
  2010/11/17 Humair Mohammed huma...@hotmail.com:
  
   There are no indexes on the tables either in SQL Server or Postgresql -
   I am comparing apples to apples here. I ran ANALYZE on the postgresql
   tables; after that, query performance times are still high: 42 seconds
   with COALESCE and 35 seconds with IS DISTINCT FROM.
   Here is the execution plan from Postgresql for the query - select pb.id from
   pivotbad pb inner join pivotgood pg on pb.id = pg.id and pb.question =
   pg.question and coalesce(pb.response,'MISSING') <>
   coalesce(pg.response,'MISSING')
   Execution Time: 42 seconds
   Hash Join  (cost=16212.30..48854.24 rows=93477 width=17)
 Hash Cond: (((pb.id)::text = (pg.id)::text) AND ((pb.question)::text
   =
   

Re: [PERFORM] Query Performance SQL Server vs. Postgresql

2010-11-21 Thread Robert Haas
On Nov 21, 2010, at 12:16 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 t...@fuzzy.cz writes:
 Second, I modified the work_mem setting to 2GB (reloaded config) and I see
 a response time of 38 seconds. Results below from EXPLAIN ANALYZE:
 
 How did you reload the config? Using 'kill -HUP pid'? That should work
 fine. Have you checked 'work_mem' after the reload?
 
 Because the explain plans are exactly the same (structure, estimated
 costs). The really interesting bit is this and it did not change at all
 
   Buckets: 1024 Batches: 64  Memory Usage: 650kB
 
 If that didn't change, I'm prepared to bet that the OP didn't actually
 manage to change the active value of work_mem.

Yep.  All this speculation about slow disks and/or COALESCE strikes me as
likely totally off-base. I think the original poster needs to run SHOW
work_mem right before the EXPLAIN ANALYZE to make sure the new value they
set actually stuck. There's no reason for the planner to have used only
650kB if work_mem is set to anything >= 2MB.
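
Something along these lines, all in one psql session (query abbreviated):

  SHOW work_mem;              -- confirm what the planner will actually use
  SET work_mem = '64MB';      -- per-session, no config reload needed
  EXPLAIN ANALYZE SELECT ...; -- Batches should drop to 1 if the setting took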

...Robert


Re: [PERFORM] autovacuum blocks the operations of other manual vacuum

2010-11-21 Thread Alvaro Herrera
Excerpts from kuopo's message of Sun Nov 21 11:15:52 -0300 2010:
 In my experiment, I need about 1~3 min to finish the analyze operation
 on the big table (which depends on the value of vacuum_cost_delay). I
 am not surprised because this table is a really big one (now, it has
 over 200M records).

Okay.  You may want to consider lowering the statistics size for all the
columns in that table; that would reduce analyze time, at the cost of
possibly worsening the plans for that table, depending on how irregular
the distribution is.  See ALTER TABLE / SET STATISTICS in the
documentation, and the default_statistics_target parameter in
postgresql.conf.
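
For example (the table and column names here are hypothetical, and 10 is
just an illustration; the default comes from default_statistics_target,
which is 100 on 9.0):

  ALTER TABLE big_table ALTER COLUMN some_column SET STATISTICS 10;
  ANALYZE big_table;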

 However, most of my concern is about the behavior of analyze/vacuum.
 You mentioned that the analyze-only operation cannot be optimized in
 the same way as vacuum. Does that mean an analyze operation on one
 table would unavoidably affect a vacuum proceeding on another one?

That's correct.  I think you can run VACUUM ANALYZE, and it would do
both things at once; AFAIK this is also optimized like VACUUM is, but I
admit I'm not 100% sure (and I can't check right now).

 If this is a normal reaction for an analyze operation,
 maybe I should try to lower vacuum_cost_delay or use more powerful
 hardware to minimize the period of interference, so that the pages for
 the small table would not grow so quickly.

I think it would make sense to have as low a cost_delay as possible for
this ANALYZE.  (Note you can change it locally with a SET command; no
need to touch postgresql.conf.  So you can change it when you analyze
just this large table).
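
As a sketch (the table name is hypothetical):

  SET vacuum_cost_delay = 0;  -- disable throttling for this session only
  ANALYZE big_table;
  RESET vacuum_cost_delay;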

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



[PERFORM] Performance under contention

2010-11-21 Thread Ivan Voras
This is not a request for help but a report, in case it helps developers 
or someone in the future. The setup is:


AMD64 machine, 24 GB RAM, 2x6-core Xeon CPU + HTT (24 logical CPUs)
FreeBSD 8.1-stable, AMD64
PostgreSQL 9.0.1, 10 GB shared buffers, using pgbench with a scale 
factor of 500 (7.5 GB database)


with pgbench -S (SELECT-queries only) the performance curve is:

-c# result
4   33549
8   64864
12  79491
16  79887
20  66957
24  52576
28  50406
32  49491
40  45535
50  39499
75  29415
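
Each data point is a read-only pgbench run; an invocation along these
lines reproduces one point on the curve (the duration and database name
here are illustrative, -S and the -c values are what I actually varied):

  pgbench -S -c 16 -T 60 pgbench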

After 16 clients (which is still good since there are only 12 real
cores in the system), the performance drops sharply, and looking at the
processes' state, most of them seem to spend their time in system calls
(i.e. executing in the kernel), in the states "semwait" and "sbwait",
i.e. semaphore wait and socket buffer wait, for example:


 3047 pgsql   1  600 10533M   283M sbwait 12   0:01  6.79% postgres
 3055 pgsql   1  640 10533M   279M sbwait 15   0:01  6.79% postgres
 3033 pgsql   1  640 10533M   279M semwai  6   0:01  6.69% postgres
 3038 pgsql   1  640 10533M   283M CPU5   13   0:01  6.69% postgres
 3037 pgsql   1  620 10533M   279M sbwait 23   0:01  6.69% postgres
 3048 pgsql   1  650 10533M   280M semwai  4   0:01  6.69% postgres
 3056 pgsql   1  650 10533M   277M semwai  1   0:01  6.69% postgres
 3002 pgsql   1  620 10533M   284M CPU19   0   0:01  6.59% postgres
 3042 pgsql   1  630 10533M   279M semwai 21   0:01  6.59% postgres
 3029 pgsql   1  630 10533M   277M semwai 23   0:01  6.59% postgres
 3046 pgsql   1  630 10533M   278M RUN 5   0:01  6.59% postgres
 3036 pgsql   1  630 10533M   278M CPU1   12   0:01  6.59% postgres
 3051 pgsql   1  630 10533M   277M semwai  1   0:01  6.59% postgres
 3030 pgsql   1  630 10533M   281M semwai  1   0:01  6.49% postgres
 3050 pgsql   1  600 10533M   276M semwai  1   0:01  6.49% postgres

The "sbwait" part is from FreeBSD - IPC sockets - but so much blocking on
"semwait" indicates large contention in PostgreSQL.





Re: [PERFORM] Performance under contention

2010-11-21 Thread Kevin Grittner
Ivan Voras  wrote:
 
 After 16 clients (which is still good since there are only 12
 real cores in the system), the performance drops sharply
 
Yet another data point to confirm the importance of connection
pooling.  :-)
 
-Kevin



Re: [PERFORM] Performance under contention

2010-11-21 Thread Ivan Voras

On 11/22/10 02:47, Kevin Grittner wrote:

Ivan Voras  wrote:


After 16 clients (which is still good since there are only 12
real cores in the system), the performance drops sharply


Yet another data point to confirm the importance of connection
pooling.  :-)


I agree, connection pooling will get rid of the symptom, but not the
underlying problem. I'm not saying that having 1000s of connections to
the database is a particularly good design, only that there shouldn't be
a sharp decline in performance when it does happen. Ideally, the
performance should remain the same as it was at its peak.


I've been monitoring the server some more and it looks like there are
periods where almost all server processes are in the "semwait" state,
followed by periods of intensive work - approximately similar to the
"thundering herd" problem, or maybe to what Josh Berkus has posted a few
days ago.






Re: [PERFORM] Performance under contention

2010-11-21 Thread Jignesh Shah
On Sun, Nov 21, 2010 at 9:18 PM, Ivan Voras ivo...@freebsd.org wrote:
 On 11/22/10 02:47, Kevin Grittner wrote:

 Ivan Voras  wrote:

 After 16 clients (which is still good since there are only 12
 real cores in the system), the performance drops sharply

 Yet another data point to confirm the importance of connection
 pooling.  :-)

 I agree, connection pooling will get rid of the symptom, but not the
 underlying problem. I'm not saying that having 1000s of connections to the
 database is a particularly good design, only that there shouldn't be a sharp
 decline in performance when it does happen. Ideally, the performance should
 remain the same as it was at its peak.

 I've been monitoring the server some more and it looks like there are
 periods where almost all server processes are in the "semwait" state,
 followed by periods of intensive work - approximately similar to the
 "thundering herd" problem, or maybe to what Josh Berkus has posted a few
 days ago.





Try it with systemtap or dtrace and see if you find the same
bottlenecks as I do in
http://jkshah.blogspot.com/2010/11/postgresql-90-simple-select-scaling.html

I will probably retry it with pgbench and see what I find...

Regards,
Jignesh
