Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-12 Thread Arjen van der Meijden


On 11-4-2011 22:04 da...@lang.hm wrote:

in your case, try your new servers without hyperthreading. you will end
up with a 4x4 core system, which should handily outperform the 2x4 core
system you are replacing.

the limit isn't 8 cores, it's that the hyperthreaded cores don't work
well with the postgres access patterns.


It would be really weird if disabling HT turned these 8-core CPUs into 
4-core CPUs ;) They have 8 physical cores and 16 threads each. So he 
basically has a 32-core machine with 64 threads in total (if HT were 
enabled). Still, HT may or may not improve things. Back when we had time 
to benchmark new systems we had one of the first HT Xeons (a dual 5080, 
with two cores + HT each) available:

http://ic.tweakimg.net/ext/i/1155958729.png

The blue lines are all slightly above the orange/red lines. So back then 
HT slightly improved our read-mostly Postgresql benchmark score.


We also did benchmarks with Sun's UltraSparc T2 back then:
http://ic.tweakimg.net/ext/i/1214930814.png

Adding full cores (including threads) made things much better, but we 
also tested full cores with more threads each:

http://ic.tweakimg.net/ext/i/1214930816.png

As you can see, with that benchmark, it was better to have 4 cores with 
8 threads each, than 8 cores with 2 threads each.


The T2 threads were much heavier duty than the HT threads back then, 
but afaik Intel has improved its technology quite a bit with this 
re-introduction of them.


So I wouldn't dismiss hyper-threading for a read-mostly Postgresql 
workload too easily.


Then again, keeping 32 cores busy without them contending for every 
resource will already be quite hard. So adding 32 additional threads 
may indeed make matters much worse.


Best regards,

Arjen

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow query postgres 8.3

2011-04-12 Thread Claudio Freire
I actually implemented a statistical system for measuring these kinds
of correlations.

It's complex, but it might be adaptable to pgsql. Furthermore, one of
the latest projects of mine was to replace the purely statistical
approach with SVCs.
Too bad I won't be able to devote any time to that project before September.

On Mon, Apr 11, 2011 at 6:59 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 There has been much talk lately about how to make
 it smarter about that, but right now there's no general solution,
 and workarounds can be tricky.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] performance problem with LIMIT (order BY in DESC order). Wrong index used?

2011-04-12 Thread Claudio Freire
On Tue, Apr 12, 2011 at 7:20 AM, Dieter Rehbein
dieter.rehb...@skiline.cc wrote:
 Hi everybody,

 I have a performance problem with a query using a LIMIT. There are other 
 threads regarding performance issues with LIMIT, but I didn't find useful 
 hints for our problem, and it might
 be interesting for other postgres users.

Did you perform an ANALYZE or VACUUM ANALYZE?
Did you try increasing the statistic targets?

AFAIK, it looks a lot like the planner is missing stats, since it
estimates the index query on idx_nfi_newsfeed will fetch 10k rows -
instead of 25.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] DBT-5 Postgres 9.0.3

2011-04-12 Thread Sethu Prasad
Hi,

Has anyone been lucky enough to get dbt5 running against PostgreSQL 9.0.3?!

I am trying it on Novell SuSE Linux Enterprise Server 11 SP1 x86_64 in a
virtual machine and am having a hard time, with no successful run yet. If you
can help me with any docs, that would be a great support.

Regards,

Sethu Prasad


Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-12 Thread Glyn Astill
--- On Tue, 12/4/11, Merlin Moncure mmonc...@gmail.com wrote:

  The issue I'm seeing is that 8 real cores
 outperform 16 real
  cores, which outperform 32 real cores under high
 concurrency.
 
  With every benchmark I've done of PostgreSQL, the
 knee in the
  performance graph comes right around ((2 * cores) +
  effective_spindle_count).  With the database fully
 cached (as I
  believe you mentioned), effective_spindle_count is
 zero.  If you
  don't use a connection pool to limit active
 transactions to the
  number from that formula, performance drops off.  The
 more CPUs you
  have, the sharper the drop after the knee.
 
 I was about to say something similar with some canned
 advice to use a
 connection pooler to control this.  However, OP
 scaling is more or
 less topping out at cores / 4...yikes!.  Here are my
 suspicions in
 rough order:
 
 1. There is scaling problem in client/network/etc. 
 Trivially
 disproved, convert the test to pgbench -f and post results
 2. The test is in fact i/o bound. Scaling is going to be
 hardware/kernel determined.  Can we see
 iostat/vmstat/top snipped
 during test run?  Maybe no-op is burning you?

This is during my 80-client test; this is a point at which the performance is 
well below that of the same machine limited to 8 cores.

http://www.privatepaste.com/dc131ff26e

 3. Locking/concurrency issue in heavy_seat_function()
 (source for
 that?)  how much writing does it do?
 

No writing afaik - it's a select with a few joins and subqueries - I'm pretty 
sure it's not writing out temp data either, but all clients are after the same 
data in the test - maybe there's some locks there?

 Can we see some iobound and cpubound pgbench runs on both
 servers?
 

Of course, I'll post when I've gotten to that.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-12 Thread Glyn Astill
--- On Tue, 12/4/11, Scott Marlowe scott.marl...@gmail.com wrote:

 From: Scott Marlowe scott.marl...@gmail.com
 Subject: Re: [PERFORM] Linux: more cores = less concurrency.
 To: Glyn Astill glynast...@yahoo.co.uk
 Cc: pgsql-performance@postgresql.org
 Date: Tuesday, 12 April, 2011, 6:55
 On Mon, Apr 11, 2011 at 7:04 AM, Glyn
 Astill glynast...@yahoo.co.uk
 wrote:
  Hi Guys,
 
  I'm just doing some tests on a new server running one
 of our heavy select functions (the select part of a plpgsql
 function to allocate seats) concurrently.  We do use
 connection pooling and split out some selects to slony
 slaves, but the tests here are primarily to test what an
 individual server is capable of.
 
  The new server uses 4 x 8 core Xeon X7550 CPUs at
 2Ghz, our current servers are 2 x 4 core Xeon E5320 CPUs at
 2Ghz.
 
  What I'm seeing is when the number of clients is
 greater than the number of cores, the new servers perform
 better on fewer cores.
 
 O man, I completely forgot the issue I ran into in my
 machines, and
 that was that zone_reclaim completely screwed postgresql
 and file
 system performance.  On machines with more CPU nodes
 and higher
 internode cost it gets turned on automagically and
 destroys
 performance for machines that use a lot of kernel cache /
 shared
 memory.
 
 Be sure and use sysctl.conf to turn it off:
 
 vm.zone_reclaim_mode = 0
 

I've made this change; I've not seen any immediate difference, however it's good 
to know. Thanks Scott.
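
For reference, checking the current value and applying the change on a running 
box is standard Linux sysctl, nothing PostgreSQL-specific; roughly:

# see what the kernel picked automatically (non-zero means reclaim is on)
cat /proc/sys/vm/zone_reclaim_mode

# turn it off immediately
sysctl -w vm.zone_reclaim_mode=0

# and persist it across reboots
echo "vm.zone_reclaim_mode = 0" >> /etc/sysctl.conf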

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] performance problem with LIMIT (order BY in DESC order). Wrong index used?

2011-04-12 Thread Dieter Rehbein
What I did was an ANALYZE, which did not change anything. 

I just executed a VACUUM ANALYZE and now everything performs well. Hm, strange.

thanks
Dieter



On 12.04.2011, at 09:42, Claudio Freire wrote:

On Tue, Apr 12, 2011 at 7:20 AM, Dieter Rehbein
dieter.rehb...@skiline.cc wrote:
 Hi everybody,
 
 I have a performance-problem with a query using a LIMIT. There are other 
 threads regarding performance issues with LIMIT, but I didn't find useful 
 hints for our problem and it might
 be interesting for other postgres-users.

Did you perform an ANALYZE or VACUUM ANALYZE?
Did you try increasing the statistic targets?

AFAIK, it looks a lot like the planner is missing stats, since it
estimates the index query on idx_nfi_newsfeed will fetch 10k rows -
instead of 25.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] performance problem with LIMIT (order BY in DESC order). Wrong index used?

2011-04-12 Thread Claudio Freire
On Tue, Apr 12, 2011 at 10:59 AM, Dieter Rehbein
dieter.rehb...@skiline.cc wrote:
 I just executed a VACUUM ANALYZE and now everything performs well. hm, 
 strange.

That probably means you need more statistics - try increasing the
newsfeed's statistics target count.

ALTER TABLE newsfeed_item ALTER COLUMN newsfeed SET STATISTICS n;

Try different n numbers, you can crank it up to 4000 or perhaps more
in 9.0, but you should start lower I guess.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] performance problem with LIMIT (order BY in DESC order). Wrong index used?

2011-04-12 Thread tv
 On Tue, Apr 12, 2011 at 10:59 AM, Dieter Rehbein
 dieter.rehb...@skiline.cc wrote:
 I just executed a VACUUM ANALYZE and now everything performs well. hm,
 strange.

 That probably means you need more statistics - try increasing the
 newsfeed's statistics target count.

 ALTER TABLE newsfeed_item ALTER COLUMN newsfeed SET STATISTICS n;

 Try different n numbers, you can crank it up to 4000 or perhaps more
 in 9.0, but you should start lower I guess.

AFAIK the max value is 10000 and the default is 100. Higher numbers mean
higher overhead, so do not jump to 10000 directly. Set it to 1000 and see
if that helps, etc.
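
In concrete terms, something along these lines (table and column names taken
from Claudio's earlier suggestion; the new target only takes effect at the
next ANALYZE):

ALTER TABLE newsfeed_item ALTER COLUMN newsfeed SET STATISTICS 1000;
ANALYZE newsfeed_item;
-- then re-run EXPLAIN on the slow query and see if the row estimate improves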

regards
Tomas


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] performance problem with LIMIT (order BY in DESC order). Wrong index used?

2011-04-12 Thread Dieter Rehbein
Thanks a lot guys, I will try that out.

regards 
Dieter



On 12.04.2011, at 11:07, Claudio Freire wrote:

On Tue, Apr 12, 2011 at 10:59 AM, Dieter Rehbein
dieter.rehb...@skiline.cc wrote:
 I just executed a VACUUM ANALYZE and now everything performs well. hm, 
 strange.

That probably means you need more statistics - try increasing the
newsfeed's statistics target count.

ALTER TABLE newsfeed_item ALTER COLUMN newsfeed SET STATISTICS n;

Try different n numbers, you can crank it up to 4000 or perhaps more
in 9.0, but you should start lower I guess.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-12 Thread Glyn Astill
--- On Mon, 11/4/11, Kevin Grittner kevin.gritt...@wicourts.gov wrote:

 From: Kevin Grittner kevin.gritt...@wicourts.gov
 Subject: Re: [PERFORM] Linux: more cores = less concurrency.
 To: da...@lang.hm, Steve Clark scl...@netwolves.com, Kevin Grittner 
 kevin.gritt...@wicourts.gov, Glyn Astill glynast...@yahoo.co.uk
 Cc: Joshua D. Drake j...@commandprompt.com, Scott Marlowe 
 scott.marl...@gmail.com, pgsql-performance@postgresql.org
 Date: Monday, 11 April, 2011, 22:35
 Kevin Grittner kevin.gritt...@wicourts.gov
 wrote:
  
  I don't know why you were hitting the knee sooner than
 I've seen
  in my benchmarks
  
 If you're compiling your own executable, you might try
 boosting
 LOG2_NUM_LOCK_PARTITIONS (defined in lwlocks.h) to 5 or
 6.  The
 current value of 4 means that there are 16 partitions to
 spread
 contention for the lightweight locks which protect the
 heavyweight
 locking, and this corresponds to your best throughput
 point.  It
 might be instructive to see what happens when you tweak the
 number
 of partitions.
  

Tried tweaking LOG2_NUM_LOCK_PARTITIONS between 5 and 7. My results took a dive 
when I changed to 32 partitions, and improved as I increased to 128, but 
appeared to be happiest at the default of 16.

 Also, if you can profile PostgreSQL at the sweet spot and
 again at a
 pessimal load, comparing the profiles should give good
 clues about
 the points of contention.
  

Results for the same machine on 8 and 32 cores are here:

http://www.8kb.co.uk/server_benchmarks/dblt_results.csv

Here's the sweet spot for 32 cores, and the 8 core equivalent:

http://www.8kb.co.uk/server_benchmarks/iostat-32cores_32Clients.txt
http://www.8kb.co.uk/server_benchmarks/vmstat-32cores_32Clients.txt

http://www.8kb.co.uk/server_benchmarks/iostat-8cores_32Clients.txt
http://www.8kb.co.uk/server_benchmarks/vmstat-8cores_32Clients.txt

... and at the pessimal load for 32 cores, and the 8 core equivalent:

http://www.8kb.co.uk/server_benchmarks/iostat-32cores_100Clients.txt
http://www.8kb.co.uk/server_benchmarks/vmstat-32cores_100Clients.txt

http://www.8kb.co.uk/server_benchmarks/iostat-8cores_100Clients.txt
http://www.8kb.co.uk/server_benchmarks/vmstat-8cores_100Clients.txt
   
vmstat shows double the context switches on 32 cores - could this be a factor? 
Is there anything else I'm missing there?

Cheers
Glyn

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance when joining against inherited tables

2011-04-12 Thread Shaun Thomas

On 04/11/2011 03:11 PM, Lucas Madar wrote:


EXPLAIN ANALYZE SELECT * FROM objects INNER JOIN item f USING ( id );

This scans everything over everything, and obviously takes forever
(there are millions of rows in the objects table, and tens of thousands
in each itemXX table).


What is your constraint_exclusion setting? This needs to be 'ON' for the 
check constraints you use to enforce your inheritance rules to work right.


You *do* have check constraints on all your child tables, right? Just in 
case, please refer to the doc on table partitioning:


http://www.postgresql.org/docs/current/static/ddl-partitioning.html

Also, your example has no where clause. Without a where clause, 
constraint exclusion won't even function. How is the database supposed 
to know that matching a 4M row table against several partitioned tables 
will result in few matches? All it really has are stats on your joined 
id for this particular query, and you're basically telling it to join all 
of them. That usually calls for a sequence scan, because millions of 
index seeks will almost always be slower than a few sequence scans.
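
As a rough sketch of what the partitioning doc describes, using the table 
names from this thread (the CHECK ranges are made up for illustration):

CREATE TABLE item01 ( CHECK ( id >= 1      AND id < 100000 ) ) INHERITS (item);
CREATE TABLE item02 ( CHECK ( id >= 100000 AND id < 200000 ) ) INHERITS (item);

SET constraint_exclusion = on;   -- 8.4+ also accepts 'partition'

-- With a WHERE clause on the partitioning column the planner can prove,
-- from the CHECK constraints, that item02 cannot contain matches and skip it:
EXPLAIN SELECT *
FROM objects o
JOIN item f USING (id)
WHERE f.id < 50000;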


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
stho...@peak6.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-12 Thread Merlin Moncure
On Tue, Apr 12, 2011 at 3:54 AM, Glyn Astill glynast...@yahoo.co.uk wrote:
 --- On Tue, 12/4/11, Merlin Moncure mmonc...@gmail.com wrote:

  The issue I'm seeing is that 8 real cores
 outperform 16 real
  cores, which outperform 32 real cores under high
 concurrency.
 
  With every benchmark I've done of PostgreSQL, the
 knee in the
  performance graph comes right around ((2 * cores) +
  effective_spindle_count).  With the database fully
 cached (as I
  believe you mentioned), effective_spindle_count is
 zero.  If you
  don't use a connection pool to limit active
 transactions to the
  number from that formula, performance drops off.  The
 more CPUs you
  have, the sharper the drop after the knee.

 I was about to say something similar with some canned
 advice to use a
 connection pooler to control this.  However, OP
 scaling is more or
 less topping out at cores / 4...yikes!.  Here are my
 suspicions in
 rough order:

 1. There is scaling problem in client/network/etc.
 Trivially
 disproved, convert the test to pgbench -f and post results
 2. The test is in fact i/o bound. Scaling is going to be
 hardware/kernel determined.  Can we see
 iostat/vmstat/top snipped
 during test run?  Maybe no-op is burning you?

 This is during my 80 clients test, this is a point at which the performance 
 is well below that of the same machine limited to 8 cores.

 http://www.privatepaste.com/dc131ff26e

 3. Locking/concurrency issue in heavy_seat_function()
 (source for
 that?)  how much writing does it do?


 No writing afaik - its a select with a few joins and subqueries - I'm pretty 
 sure it's not writing out temp data either, but all clients are after the 
 same data in the test - maybe theres some locks there?

 Can we see some iobound and cpubound pgbench runs on both
 servers?


 Of course, I'll post when I've gotten to that.

Ok, there's no writing going on -- so the i/o tests aren't necessary.
Context switches are also not too high -- the problem is likely in
postgres or on your end.

However, I would still like to see:
pgbench select only tests:
pgbench -i -s 1
pgbench -S -c 8 -t 500
pgbench -S -c 32 -t 500
pgbench -S -c 80 -t 500

pgbench -i -s 500
pgbench -S -c 8 -t 500
pgbench -S -c 32 -t 500
pgbench -S -c 80 -t 500

write out bench.sql with:
begin;
select * from heavy_seat_function();
select * from heavy_seat_function();
commit;

pgbench -n -f bench.sql -c 8 -t 500
pgbench -n -f bench.sql -c 8 -t 500
pgbench -n -f bench.sql -c 8 -t 500

I'm still suspecting an obvious problem here.  One thing we may have
overlooked is that you are connecting and disconnecting once per
benchmarking step (two query executions).  If you have heavy RSA
encryption enabled on connection establishment, this could eat you.
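
If you want pgbench itself to model that per-connection overhead, it has a -C
option (reconnect for every transaction, assuming your pgbench build has it)
that you can compare against the normal runs:

pgbench -n -f bench.sql -C -c 8 -t 500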

If the pgbench results confirm your scaling problems and the issue is not
in the general area of connection establishment, it's time to break
out the profiler :/.

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-12 Thread Merlin Moncure
On Tue, Apr 12, 2011 at 8:23 AM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Apr 12, 2011 at 3:54 AM, Glyn Astill glynast...@yahoo.co.uk wrote:
 --- On Tue, 12/4/11, Merlin Moncure mmonc...@gmail.com wrote:

  The issue I'm seeing is that 8 real cores
 outperform 16 real
  cores, which outperform 32 real cores under high
 concurrency.
 
  With every benchmark I've done of PostgreSQL, the
 knee in the
  performance graph comes right around ((2 * cores) +
  effective_spindle_count).  With the database fully
 cached (as I
  believe you mentioned), effective_spindle_count is
 zero.  If you
  don't use a connection pool to limit active
 transactions to the
  number from that formula, performance drops off.  The
 more CPUs you
  have, the sharper the drop after the knee.

 I was about to say something similar with some canned
 advice to use a
 connection pooler to control this.  However, OP
 scaling is more or
 less topping out at cores / 4...yikes!.  Here are my
 suspicions in
 rough order:

 1. There is scaling problem in client/network/etc.
 Trivially
 disproved, convert the test to pgbench -f and post results
 2. The test is in fact i/o bound. Scaling is going to be
 hardware/kernel determined.  Can we see
 iostat/vmstat/top snipped
 during test run?  Maybe no-op is burning you?

 This is during my 80 clients test, this is a point at which the performance 
 is well below that of the same machine limited to 8 cores.

 http://www.privatepaste.com/dc131ff26e

 3. Locking/concurrency issue in heavy_seat_function()
 (source for
 that?)  how much writing does it do?


 No writing afaik - its a select with a few joins and subqueries - I'm pretty 
 sure it's not writing out temp data either, but all clients are after the 
 same data in the test - maybe theres some locks there?

 Can we see some iobound and cpubound pgbench runs on both
 servers?


 Of course, I'll post when I've gotten to that.

 Ok, there's no writing going on -- so the i/o tets aren't necessary.
 Context switches are also not too high -- the problem is likely in
 postgres or on your end.

 However, I Would still like to see:
 pgbench select only tests:
 pgbench -i -s 1
 pgbench -S -c 8 -t 500
 pgbench -S -c 32 -t 500
 pgbench -S -c 80 -t 500

 pgbench -i -s 500
 pgbench -S -c 8 -t 500
 pgbench -S -c 32 -t 500
 pgbench -S -c 80 -t 500

 write out bench.sql with:
 begin;
 select * from heavy_seat_function();
 select * from heavy_seat_function();
 commit;

 pgbench -n -f bench.sql -c 8 -t 500
 pgbench -n -f bench.sql -c 8 -t 500
 pgbench -n -f bench.sql -c 8 -t 500

whoops:
pgbench -n -f bench.sql -c 8 -t 500
pgbench -n -f bench.sql -c 32 -t 500
pgbench -n -f bench.sql -c 80 -t 500

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Two servers - One Replicated - Same query

2011-04-12 Thread Kevin Grittner
Ozer, Pam po...@automotive.com wrote:
 
 I have two servers one has replication the other does not. The
 same query on both servers.  One takes 225seconds on the
 replicated server the first time it runs and only 125ms on the
 other server the first time it runs.  The second time you execute
 the query it drops to the 125ms. They are using the same query
 plan.  What kind of things should I be looking at?
 
Caching.
 
Apparently the usage pattern on one server tends to keep the
necessary data in cache, while the usage pattern on the other is
flushing it out occasionally to make room for other data.  Adding
RAM to the server might help.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] performance problem with LIMIT (order BY in DESC order). Wrong index used?

2011-04-12 Thread Tom Lane
Claudio Freire klaussfre...@gmail.com writes:
 Did you try increasing the statistic targets?

 AFAIK, it looks a lot like the planner is missing stats, since it
 estimates the index query on idx_nfi_newsfeed will fetch 10k rows -
 instead of 25.

BTW, this is the right suggestion, but for the wrong reason.  You seem
to be looking at

Limit  (cost=0.00..980.09 rows=25 width=963) (actual time=48.592..4060.779 rows=25 loops=1)
  ->  Index Scan Backward using IDX_NFI_DATETIME on newsfeed_item  (cost=0.00..409365.16 rows=10442 width=963) (actual time=48.581..4060.542 rows=25 loops=1)

Here, the actual row count is constrained to 25 because the LIMIT node
stops calling the indexscan node once it's got 25.  So this case proves
little about whether the planner's estimates are any good.  You need to
check the estimates in the unconstrained plan:

  ->  Bitmap Heap Scan on newsfeed_item  (cost=421.41..34450.72 rows=10442 width=963) (actual time=0.644..12.601 rows=477 loops=1)

Here we can see that there really are only 477 rows in the table that
satisfy the WHERE clause, versus an estimate of 10K.  So sure enough,
the statistics are bad, and an increase in stats target might help.
But you can't conclude that from an explain that involves LIMIT.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-12 Thread Kevin Grittner
Glyn Astill glynast...@yahoo.co.uk wrote:
 
 Tried tweeking LOG2_NUM_LOCK_PARTITIONS between 5 and 7. My
 results took a dive when I changed to 32 partitions, and improved
 as I increaced to 128, but appeared to be happiest at the default
 of 16.
 
Good to know.
 
 Also, if you can profile PostgreSQL at the sweet spot and again
 at a pessimal load, comparing the profiles should give good clues
 about the points of contention.
 
 [iostat and vmstat output]
 
Wow, zero idle and zero wait, and single digit for system.  Did you
ever run those RAM speed tests?  (I don't remember seeing results
for that -- or failed to recognize them.)  At this point, my best
guess is that you don't have the bandwidth to RAM to
support the CPU power.  Databases tend to push data around in RAM a
lot.
 
When I mentioned profiling, I was thinking more of oprofile or
something like it.  If it were me, I'd be going there by now.
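
With oprofile's opcontrol front end, the recipe looks roughly like this (paths
and kernel-symbol options vary by distro, and the postgres binary path below is
just an example):

opcontrol --init
opcontrol --start --no-vmlinux     # or --vmlinux=/path/to/vmlinux for kernel symbols
# ... run the benchmark at the sweet spot, then again at the pessimal load ...
opcontrol --dump
opreport -l /usr/local/pgsql/bin/postgres | head -30
opcontrol --shutdown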
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-12 Thread Glyn Astill
--- On Tue, 12/4/11, Merlin Moncure mmonc...@gmail.com wrote:

  Can we see some iobound and cpubound pgbench
 runs on both
  servers?
 
 
  Of course, I'll post when I've gotten to that.
 
  Ok, there's no writing going on -- so the i/o tets
 aren't necessary.
  Context switches are also not too high -- the problem
 is likely in
  postgres or on your end.
 
  However, I Would still like to see:
  pgbench select only tests:
  pgbench -i -s 1
  pgbench -S -c 8 -t 500
  pgbench -S -c 32 -t 500
  pgbench -S -c 80 -t 500
 
  pgbench -i -s 500
  pgbench -S -c 8 -t 500
  pgbench -S -c 32 -t 500
  pgbench -S -c 80 -t 500
 
  write out bench.sql with:
  begin;
  select * from heavy_seat_function();
  select * from heavy_seat_function();
  commit;
 
  pgbench -n -f bench.sql -c 8 -t 500
  pgbench -n -f bench.sql -c 8 -t 500
  pgbench -n -f bench.sql -c 8 -t 500
 
 whoops:
 pgbench -n -f bench.sql -c 8 -t 500
 pgbench -n -f bench.sql -c 32 -t 500
 pgbench -n -f bench.sql -c 80 -t 500
 
 merlin
 

Right, here they are:

http://www.privatepaste.com/3dd777f4db



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-12 Thread Glyn Astill
--- On Tue, 12/4/11, Kevin Grittner kevin.gritt...@wicourts.gov wrote:

 Wow, zero idle and zero wait, and single digit for
 system.  Did you
 ever run those RAM speed tests?  (I don't remember
 seeing results
 for that -- or failed to recognize them.)  At this
 point, my best
 guess at this point is that you don't have the bandwidth to
 RAM to
 support the CPU power.  Databases tend to push data
 around in RAM a
 lot.

I mentioned sysbench was giving me something like 3000 MB/sec on memory write 
tests, but nothing more.

Results from Greg Smith's stream_scaling test are here:

http://www.privatepaste.com/4338aa1196

  
 When I mentioned profiling, I was thinking more of oprofile
 or
 something like it.  If it were me, I'd be going there
 by now.
  

Advice taken, it'll be my next step.

Glyn

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-12 Thread Merlin Moncure
On Tue, Apr 12, 2011 at 11:01 AM, Glyn Astill glynast...@yahoo.co.uk wrote:
 --- On Tue, 12/4/11, Merlin Moncure mmonc...@gmail.com wrote:

  Can we see some iobound and cpubound pgbench
 runs on both
  servers?
 
 
  Of course, I'll post when I've gotten to that.
 
  Ok, there's no writing going on -- so the i/o tets
 aren't necessary.
  Context switches are also not too high -- the problem
 is likely in
  postgres or on your end.
 
  However, I Would still like to see:
  pgbench select only tests:
  pgbench -i -s 1
  pgbench -S -c 8 -t 500
  pgbench -S -c 32 -t 500
  pgbench -S -c 80 -t 500
 
  pgbench -i -s 500
  pgbench -S -c 8 -t 500
  pgbench -S -c 32 -t 500
  pgbench -S -c 80 -t 500
 
  write out bench.sql with:
  begin;
  select * from heavy_seat_function();
  select * from heavy_seat_function();
  commit;
 
  pgbench -n -f bench.sql -c 8 -t 500
  pgbench -n -f bench.sql -c 8 -t 500
  pgbench -n -f bench.sql -c 8 -t 500

 whoops:
 pgbench -n -f bench.sql -c 8 -t 500
 pgbench -n -f bench.sql -c 32 -t 500
 pgbench -n -f bench.sql -c 80 -t 500

 merlin


 Right, here they are:

 http://www.privatepaste.com/3dd777f4db

Your results unfortunately confirmed the worst -- no easy answers on
this one :(.  Before breaking out the profiler, can you take some
random samples of:

select count(*) from pg_stat_activity where waiting;

to see if you have any locking issues?
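
If that comes back non-zero, a slightly finer-grained view of what the waiters
are blocked on (using only long-standing pg_locks columns) would be something
like:

select locktype, mode, relation::regclass, count(*)
  from pg_locks
 where not granted
 group by 1, 2, 3;
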
Also, are you sure your function executions are relatively free of
side effects?
I can take a look at the code off list if you'd prefer to keep it discreet.

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Performance

2011-04-12 Thread Ogden
I have been wrestling with the configuration of the dedicated Postgres 9.0.3 
server at work and, granted, there's more activity on the production server, but 
the same queries take twice as long on the beefier server as on my mac at home. 
I have pasted what I have changed in postgresql.conf - I am wondering if 
there's any way one can help me change things around to be more efficient.

Dedicated PostgreSQL 9.0.3 Server with 16GB Ram

Heavy write and read (for reporting and calculations) server. 

max_connections = 350 
shared_buffers = 4096MB  
work_mem = 32MB
maintenance_work_mem = 512MB


seq_page_cost = 0.02# measured on an arbitrary scale
random_page_cost = 0.03 
cpu_tuple_cost = 0.02  
effective_cache_size = 8192MB



The planner costs seem a bit low but this was from suggestions from this very 
list a while ago. 


Thank you

Ogden

Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-12 Thread Kevin Grittner
Glyn Astill glynast...@yahoo.co.uk wrote:
 
 Results from Greg Smiths stream_scaling test are here:
 
 http://www.privatepaste.com/4338aa1196
 
Well, that pretty much clinches it.  Your RAM access tops out at 16
processors.  It appears that your processors are spending most of
their time waiting for and contending for the RAM bus.
 
I have gotten machines in where moving a jumper, flipping a DIP
switch, or changing BIOS options from the default made a big
difference.  I'd be looking at the manuals for my motherboard and
BIOS right now to see what options there might be to improve that.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-12 Thread Claudio Freire
On Tue, Apr 12, 2011 at 6:40 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:

 Well, that pretty much clinches it.  Your RAM access tops out at 16
 processors.  It appears that your processors are spending most of
 their time waiting for and contending for the RAM bus.

It tops, but it doesn't drop.

I'd propose that the perceived drop in TPS is due to cache contention
- ie, more processes fighting for the scarce cache means less
efficient use of the bandwidth (which is constant upwards of 16 processes).

So... the solution would be to add more servers, rather than just sockets.
(or a server with more sockets *and* more bandwidth)

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-12 Thread F. BROUARD / SQLpro

Hi,

I think that a NUMA architecture machine can solve the problem.

A +
On 11/04/2011 15:04, Glyn Astill wrote:


Hi Guys,

I'm just doing some tests on a new server running one of our heavy select 
functions (the select part of a plpgsql function to allocate seats) 
concurrently.  We do use connection pooling and split out some selects to slony 
slaves, but the tests here are primarily to test what an individual server is 
capable of.

The new server uses 4 x 8 core Xeon X7550 CPUs at 2Ghz, our current servers are 
2 x 4 core Xeon E5320 CPUs at 2Ghz.

What I'm seeing is when the number of clients is greater than the number of 
cores, the new servers perform better on fewer cores.

Has anyone else seen this behaviour?  I'm guessing this is either a hardware 
limitation or something to do with linux process management / scheduling? Any 
idea what to look into?

My benchmark utility is just using a little .net/npgsql app that runs 
increasing numbers of clients concurrently; each client runs a specified number 
of iterations of any sql I specify.

I've posted some results and the test program here:

http://www.8kb.co.uk/server_benchmarks/





--
Frédéric BROUARD - RDBMS and SQL expert - MVP SQL Server - 06 11 86 40 66
The site about the SQL language and RDBMSs  :  http://sqlpro.developpez.com
Teacher at Arts & Métiers PACA, ISEN Toulon and CESI/EXIA Aix en Provence
Audit, consulting, expertise, training, modeling, tuning, optimization
*** http://www.sqlspot.com *


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-12 Thread Greg Smith

Kevin Grittner wrote:

Glyn Astill glynast...@yahoo.co.uk wrote:
 
  

Results from Greg Smiths stream_scaling test are here:

http://www.privatepaste.com/4338aa1196

 
Well, that pretty much clinches it.  Your RAM access tops out at 16

processors.  It appears that your processors are spending most of
their time waiting for and contending for the RAM bus.
  


I've pulled Glyn's results into 
https://github.com/gregs1104/stream-scaling so they're easy to compare 
against similar processors; his system is the one labeled 4 X X7550.  I'm 
hearing this same story from multiple people lately:  these 32+ core 
servers bottleneck on aggregate memory speed when running PostgreSQL 
long before the CPUs are fully utilized.  This server is close to 
maximum memory utilization at 8 cores, and the small increase in gross 
throughput above that doesn't seem to be making up for the loss in L1 
and L2 thrashing from trying to run more.  These systems with many cores 
can only be used fully if you have a program that can work efficiently 
some of the time with just local CPU resources.  That's very rarely the 
case for a database that's moving 8K pages, tuple caches, and other 
forms of working memory around all the time.




I have gotten machines in where moving a jumper, flipping a DIP
switch, or changing BIOS options from the default made a big
difference.  I'd be looking at the manuals for my motherboard and
BIOS right now to see what options there might be to improve that


I already forwarded Glyn a good article about tuning these Dell BIOSs in 
particular from an interesting blog series others here might like too:


http://bleything.net/articles/postgresql-benchmarking-memory.html

Ben Bleything is doing a very thorough walk-through of server hardware 
validation, and as is often the case he's already found one major 
problem with the vendor config he had to fix to get expected results.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-12 Thread Greg Smith

Scott Marlowe wrote:

Have you tried running the memory stream benchmark Greg Smith had
posted here a while back?  It'll let you know if you're memory is
bottlenecking.  Right now my 48 core machines are the king of that
benchmark with something like 70+Gig a second.
  


The big Opterons are still the front-runners here, but not with 70GB/s 
anymore.  Earlier versions of stream-scaling didn't use nearly enough 
data to avoid the L3 cache in the processors interfering with results.  More 
recent tests I've gotten in, done after I expanded the default test size 
for them, show the Opterons normally hitting the same ~35GB/s maximum 
throughput that the Intel processors get out of similar DDR3/1333 sets.  
There are some outliers where 50GB/s still shows up.  I'm not sure if I 
really believe them though; attempts to increase the test size now hit a 
32-bit limit inside stream.c, and I think that's not really big enough 
to avoid L3 cache effects here.


In the table at https://github.com/gregs1104/stream-scaling the 4 X 6172 
server is similar to Scott's system.  I believe the results for 8 
(37613) and 48 cores (32301) there.  I remain somewhat suspicious that 
the higher results of 40 - 51GB/s shown between 16 and 32 cores may be 
inflated by caching.  At this point I'll probably need direct access to 
one of them to resolve this for sure.  I've made a lot of progress with 
other people's servers, but complete trust in those particular results 
still isn't there yet.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-12 Thread Merlin Moncure
On Tue, Apr 12, 2011 at 12:00 PM, Greg Smith g...@2ndquadrant.com wrote:
 Kevin Grittner wrote:

 Glyn Astill glynast...@yahoo.co.uk wrote:


 Results from Greg Smiths stream_scaling test are here:

 http://www.privatepaste.com/4338aa1196


  Well, that pretty much clinches it.  Your RAM access tops out at 16
 processors.  It appears that your processors are spending most of
 their time waiting for and contending for the RAM bus.


 I've pulled Glyn's results into https://github.com/gregs1104/stream-scaling
 so they're easy to compare against similar processors, his system is the one
 labled 4 X X7550.  I'm hearing this same story from multiple people lately:
  these 32+ core servers bottleneck on aggregate memory speed with running
 PostgreSQL long before the CPUs are fully utilized.  This server is close to
 maximum memory utilization at 8 cores, and the small increase in gross
 throughput above that doesn't seem to be making up for the loss in L1 and L2
 thrashing from trying to run more.  These systems with many cores can only
 be used fully if you have a program that can work efficiency some of the
 time with just local CPU resources.  That's very rarely the case for a
 database that's moving 8K pages, tuple caches, and other forms of working
 memory around all the time.


 I have gotten machines in where moving a jumper, flipping a DIP
 switch, or changing BIOS options from the default made a big
 difference.  I'd be looking at the manuals for my motherboard and
 BIOS right now to see what options there might be to improve that

 I already forwarded Glyn a good article about tuning these Dell BIOSs in
 particular from an interesting blog series others here might like too:

 http://bleything.net/articles/postgresql-benchmarking-memory.html

 Ben Bleything is doing a very thorough walk-through of server hardware
 validation, and as is often the case he's already found one major problem
 with the vendor config he had to fix to get expected results.

For posterity, since it looks like you guys have nailed this one, I
took a look at some of the code off list and I can confirm there is no
obvious bottleneck coming from locking type issues.  The functions are
'stable' as implemented with no fancy tricks.

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Performance

2011-04-12 Thread Andreas Kretschmer
Ogden li...@darkstatic.com wrote:

 I have been wrestling with the configuration of the dedicated Postges 9.0.3
 server at work and granted, there's more activity on the production server, 
 but
 the same queries take twice as long on the beefier server than my mac at home.
 I have pasted what I have changed in postgresql.conf - I am wondering if
 there's any way one can help me change things around to be more efficient.
 
 Dedicated PostgreSQL 9.0.3 Server with 16GB Ram
 
 Heavy write and read (for reporting and calculations) server. 
 
 max_connections = 350 
 shared_buffers = 4096MB  
 work_mem = 32MB
 maintenance_work_mem = 512MB

That's okay.


 
 
 seq_page_cost = 0.02# measured on an arbitrary scale
 random_page_cost = 0.03 

Do you have super, Super, SUPER fast disks? I think these (seq_page_cost
and random_page_cost) are completely wrong.



 cpu_tuple_cost = 0.02  
 effective_cache_size = 8192MB
 
 
 
 The planner costs seem a bit low but this was from suggestions from this very
 list a while ago. 

Are you sure? Can you give us a link to the archive?


Andreas
-- 
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.  (Linus Torvalds)
If I was god, I would recompile penguin with --enable-fly.   (unknown)
Kaufbach, Saxony, Germany, Europe.  N 51.05082°, E 13.56889°

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Performance

2011-04-12 Thread Ogden

On Apr 12, 2011, at 12:18 PM, Andreas Kretschmer wrote:

 Ogden li...@darkstatic.com wrote:
 
 I have been wrestling with the configuration of the dedicated Postges 9.0.3
 server at work and granted, there's more activity on the production server, 
 but
 the same queries take twice as long on the beefier server than my mac at 
 home.
 I have pasted what I have changed in postgresql.conf - I am wondering if
 there's any way one can help me change things around to be more efficient.
 
 Dedicated PostgreSQL 9.0.3 Server with 16GB Ram
 
 Heavy write and read (for reporting and calculations) server. 
 
 max_connections = 350 
 shared_buffers = 4096MB  
 work_mem = 32MB
 maintenance_work_mem = 512MB
 
 That's okay.
 
 
 
 
 seq_page_cost = 0.02# measured on an arbitrary scale
 random_page_cost = 0.03 
 
 Do you have super, Super, SUPER fast disks? I think, this (seq_page_cost
 and random_page_cost) are completly wrong.
 

No, I don't have super fast disks, just 15K SCSI over RAID. I find that by 
raising them to:

seq_page_cost = 1.0
random_page_cost = 3.0
cpu_tuple_cost = 0.3
#cpu_index_tuple_cost = 0.005   # same scale as above - 0.005
#cpu_operator_cost = 0.0025 # same scale as above
effective_cache_size = 8192MB 

things are better - some queries run much faster. Is this better?

I will find the archive and post. 

Thank you

Ogden



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Performance

2011-04-12 Thread Tomas Vondra
On 12.4.2011 19:23, Ogden wrote:
 
 On Apr 12, 2011, at 12:18 PM, Andreas Kretschmer wrote:
 
 Ogden li...@darkstatic.com wrote:

 I have been wrestling with the configuration of the dedicated Postges 9.0.3
 server at work and granted, there's more activity on the production server, 
 but
 the same queries take twice as long on the beefier server than my mac at 
 home.
 I have pasted what I have changed in postgresql.conf - I am wondering if
 there's any way one can help me change things around to be more efficient.

 Dedicated PostgreSQL 9.0.3 Server with 16GB Ram

 Heavy write and read (for reporting and calculations) server. 

 max_connections = 350 
 shared_buffers = 4096MB  
 work_mem = 32MB
 maintenance_work_mem = 512MB

 That's okay.




 seq_page_cost = 0.02# measured on an arbitrary scale
 random_page_cost = 0.03 

 Do you have super, Super, SUPER fast disks? I think, this (seq_page_cost
 and random_page_cost) are completly wrong.

 
 No, I don't have super fast disks. Just the 15K SCSI over RAID. I
 find by raising them to:
 
 seq_page_cost = 1.0
 random_page_cost = 3.0
 cpu_tuple_cost = 0.3
 #cpu_index_tuple_cost = 0.005   # same scale as above - 0.005
 #cpu_operator_cost = 0.0025 # same scale as above
 effective_cache_size = 8192MB 
 
 That this is better, some queries run much faster. Is this better?

I guess it is. What really matters with those cost variables is the
relative scale - the original values

seq_page_cost = 0.02
random_page_cost = 0.03
cpu_tuple_cost = 0.02

suggest that the random reads are almost as expensive as sequential
reads (which usually is not true - the random reads are significantly
more expensive), and that processing each row is about as expensive as
reading the page from disk (again, reading data from disk is much more
expensive than processing them).
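
To put numbers on that relative scale (the stock defaults being seq_page_cost =
1.0, random_page_cost = 4.0, cpu_tuple_cost = 0.01):

original settings:    random/seq = 0.03 / 0.02 = 1.5    seq/cpu_tuple = 0.02 / 0.02 = 1
Ogden's new settings: random/seq = 3.0  / 1.0  = 3.0    seq/cpu_tuple = 1.0  / 0.3  = 3.3
stock defaults:       random/seq = 4.0  / 1.0  = 4.0    seq/cpu_tuple = 1.0  / 0.01 = 100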

So yes, the current values are much more likely to give good results.

You've mentioned those values were recommended on this list - can you
point out the actual discussion?

regards
Tomas

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Performance

2011-04-12 Thread Ogden

On Apr 12, 2011, at 1:16 PM, Tomas Vondra wrote:

 On 12.4.2011 19:23, Ogden wrote:
 
 On Apr 12, 2011, at 12:18 PM, Andreas Kretschmer wrote:
 
 Ogden li...@darkstatic.com wrote:
 
 I have been wrestling with the configuration of the dedicated Postges 9.0.3
 server at work and granted, there's more activity on the production 
 server, but
 the same queries take twice as long on the beefier server than my mac at 
 home.
 I have pasted what I have changed in postgresql.conf - I am wondering if
 there's any way one can help me change things around to be more efficient.
 
 Dedicated PostgreSQL 9.0.3 Server with 16GB Ram
 
 Heavy write and read (for reporting and calculations) server. 
 
 max_connections = 350 
 shared_buffers = 4096MB  
 work_mem = 32MB
 maintenance_work_mem = 512MB
 
 That's okay.
 
 
 
 
 seq_page_cost = 0.02# measured on an arbitrary scale
 random_page_cost = 0.03 
 
 Do you have super, Super, SUPER fast disks? I think, this (seq_page_cost
 and random_page_cost) are completly wrong.
 
 
 No, I don't have super fast disks. Just the 15K SCSI over RAID. I
 find by raising them to:
 
 seq_page_cost = 1.0
 random_page_cost = 3.0
 cpu_tuple_cost = 0.3
 #cpu_index_tuple_cost = 0.005   # same scale as above - 0.005
 #cpu_operator_cost = 0.0025 # same scale as above
 effective_cache_size = 8192MB 
 
 That this is better, some queries run much faster. Is this better?
 
 I guess it is. What really matters with those cost variables is the
 relative scale - the original values
 
 seq_page_cost = 0.02
 random_page_cost = 0.03
 cpu_tuple_cost = 0.02
 
 suggest that the random reads are almost as expensive as sequential
 reads (which usually is not true - the random reads are significantly
 more expensive), and that processing each row is about as expensive as
 reading the page from disk (again, reading data from disk is much more
 expensive than processing them).
 
 So yes, the current values are much more likely to give good results.
 
 You've mentioned those values were recommended on this list - can you
 point out the actual discussion?
 
 

Thank you for your reply. 

http://archives.postgresql.org/pgsql-performance/2010-09/msg00169.php is how I 
first played with those values...

Ogden
-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-12 Thread Strange, John W
When purchasing the Intel 7500 series, please make sure to check the hemisphere 
mode of your memory configuration.  There is a HUGE difference (around 50% in 
memory speed) if you don't populate all the memory slots on 
the controllers properly.

https://globalsp.ts.fujitsu.com/dmsp/docs/wp-nehalem-ex-memory-performance-ww-en.pdf

- John

-Original Message-
From: pgsql-performance-ow...@postgresql.org 
[mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Merlin Moncure
Sent: Tuesday, April 12, 2011 12:14 PM
To: Greg Smith
Cc: Kevin Grittner; da...@lang.hm; Steve Clark; Glyn Astill; Joshua D. Drake; 
Scott Marlowe; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Linux: more cores = less concurrency.

On Tue, Apr 12, 2011 at 12:00 PM, Greg Smith g...@2ndquadrant.com wrote:
 Kevin Grittner wrote:

 Glyn Astill glynast...@yahoo.co.uk wrote:


 Results from Greg Smiths stream_scaling test are here:

 http://www.privatepaste.com/4338aa1196


  Well, that pretty much clinches it.  Your RAM access tops out at 16 
 processors.  It appears that your processors are spending most of 
 their time waiting for and contending for the RAM bus.


 I've pulled Glyn's results into 
 https://github.com/gregs1104/stream-scaling
 so they're easy to compare against similar processors, his system is 
 the one labled 4 X X7550.  I'm hearing this same story from multiple people 
 lately:
  these 32+ core servers bottleneck on aggregate memory speed with 
 running PostgreSQL long before the CPUs are fully utilized.  This 
 server is close to maximum memory utilization at 8 cores, and the 
 small increase in gross throughput above that doesn't seem to be 
 making up for the loss in L1 and L2 thrashing from trying to run more.  
 These systems with many cores can only be used fully if you have a 
 program that can work efficiency some of the time with just local CPU 
 resources.  That's very rarely the case for a database that's moving 
 8K pages, tuple caches, and other forms of working memory around all the time.


 I have gotten machines in where moving a jumper, flipping a DIP 
 switch, or changing BIOS options from the default made a big 
 difference.  I'd be looking at the manuals for my motherboard and 
 BIOS right now to see what options there might be to improve that

 I already forwarded Glyn a good article about tuning these Dell BIOSs 
 in particular from an interesting blog series others here might like too:

 http://bleything.net/articles/postgresql-benchmarking-memory.html

 Ben Bleything is doing a very thorough walk-through of server hardware 
 validation, and as is often the case he's already found one major 
 problem with the vendor config he had to fix to get expected results.

For posterity, since it looks like you guys have nailed this one, I took a look 
at some of the code off list and I can confirm there is no obvious bottleneck 
coming from locking type issues.  The functions are 'stable' as implemented 
with no fancy tricks.


merlin

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Performance

2011-04-12 Thread Tomas Vondra
On 12.4.2011 20:28, Ogden wrote:
 
 On Apr 12, 2011, at 1:16 PM, Tomas Vondra wrote:
 
 On 12.4.2011 19:23, Ogden wrote:

 On Apr 12, 2011, at 12:18 PM, Andreas Kretschmer wrote:

 Ogden li...@darkstatic.com wrote:

 I have been wrestling with the configuration of the dedicated Postges 
 9.0.3
 server at work and granted, there's more activity on the production 
 server, but
 the same queries take twice as long on the beefier server than my mac at 
 home.
 I have pasted what I have changed in postgresql.conf - I am wondering if
 there's any way one can help me change things around to be more efficient.

 Dedicated PostgreSQL 9.0.3 Server with 16GB Ram

 Heavy write and read (for reporting and calculations) server. 

 max_connections = 350 
 shared_buffers = 4096MB  
 work_mem = 32MB
 maintenance_work_mem = 512MB

 That's okay.




 seq_page_cost = 0.02# measured on an arbitrary scale
 random_page_cost = 0.03 

 Do you have super, Super, SUPER fast disks? I think, this (seq_page_cost
 and random_page_cost) are completly wrong.


 No, I don't have super fast disks. Just the 15K SCSI over RAID. I
 find by raising them to:

 seq_page_cost = 1.0
 random_page_cost = 3.0
 cpu_tuple_cost = 0.3
 #cpu_index_tuple_cost = 0.005   # same scale as above - 0.005
 #cpu_operator_cost = 0.0025 # same scale as above
 effective_cache_size = 8192MB 

 That this is better, some queries run much faster. Is this better?

 I guess it is. What really matters with those cost variables is the
 relative scale - the original values

 seq_page_cost = 0.02
 random_page_cost = 0.03
 cpu_tuple_cost = 0.02

 suggest that the random reads are almost as expensive as sequential
 reads (which usually is not true - the random reads are significantly
 more expensive), and that processing each row is about as expensive as
 reading the page from disk (again, reading data from disk is much more
 expensive than processing them).

 So yes, the current values are much more likely to give good results.

 You've mentioned those values were recommended on this list - can you
 point out the actual discussion?


 
 Thank you for your reply. 
 
 http://archives.postgresql.org/pgsql-performance/2010-09/msg00169.php is how 
 I first played with those values...
 

OK, what JD said there generally makes sense, although those values are
a bit extreme - in most cases it's recommended to leave seq_page_cost=1
and decrease the random_page_cost (to 2; the default value is 4). That
usually pushes the planner towards index scans.

I'm not saying those small values (0.02 etc.) are bad, but I guess the
effect is about the same and it changes the impact of the other cost
variables (cpu_tuple_cost, etc.)

I see there is 16GB of RAM but shared_buffers are just 4GB. So there's
nothing else running and the rest of the RAM is used for pagecache? I've
noticed the previous discussion mentions there are 8GB of RAM and the DB
size is 7GB (so it might fit into memory). Is this still the case?

regards
Tomas

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Performance

2011-04-12 Thread Ogden

On Apr 12, 2011, at 4:09 PM, Tomas Vondra wrote:

 [...]
 
 OK, what JD said there generally makes sense, although those values are
 a bit extreme - in most cases it's recommended to leave seq_page_cost=1
 and decrease the random_page_cost (to 2; the default value is 4). That
 usually pushes the planner towards index scans.
 
 I'm not saying those small values (0.02 etc.) are bad, but I guess the
 effect is about the same and it changes the impact of the other cost
 variables (cpu_tuple_cost, etc.)
 
 I see there is 16GB of RAM but shared_buffers are just 4GB. So there's
 nothing else running and the rest of the RAM is used for pagecache? I've
 noticed the previous discussion mentions there are 8GB of RAM and the DB
 size is 7GB (so it might fit into memory). Is this still the case?
 
 regards
 Tomas


Tomas,

Decreasing random_page_cost to 2 (instead of 4) gives a slight performance
decrease compared to leaving it at 4. For example, if I set it to 3 (or 4), a
query may take 0.057 seconds. The same query takes 0.144s when I set
random_page_cost to 2. Should I keep it at 3 (or 4) as I have done now?

Yes there is 16GB of RAM but the database is much bigger than that. Should I 
increase shared_buffers?

Thank you so very much

Ogden


Re: [PERFORM] Performance

2011-04-12 Thread Tomas Vondra
On 12.4.2011 23:19, Ogden wrote:
 
 On Apr 12, 2011, at 4:09 PM, Tomas Vondra wrote:
 
  [...]
 
 
  Tomas,
  
  Decreasing random_page_cost to 2 (instead of 4) gives a slight performance
  decrease compared to leaving it at 4. For example, if I set it to 3 (or 4), a
  query may take 0.057 seconds. The same query takes 0.144s when I set
  random_page_cost to 2. Should I keep it at 3 (or 4) as I have done now?
 
 Yes there is 16GB of RAM but the database is much bigger than that. Should I 
 increase shared_buffers?

OK, that's very important information and it kinda explains all the
problems you had. When the planner decides which execution plan to use,
it computes a 'virtual cost' for different plans and then chooses the
cheapest one.

Decreasing 'random_page_cost' decreases the expected cost of plans
involving index scans, so that at a certain point it seems cheaper than
a plan using sequential scans etc.

You can see this with EXPLAIN - run it with the original cost
values, then change the values (for that session only) and run
EXPLAIN again. You'll see how the execution plan suddenly changes and
starts using index scans.

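For example, in a psql session (the table and query here are just placeholders,
not taken from this thread):

  -- plan with the current settings
  EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

  -- lower the cost of random page fetches for this session only
  SET random_page_cost = 2;
  EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

  -- back to the server-wide default
  RESET random_page_cost;
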
The problem with random I/O is that it's usually much more expensive
than sequential I/O as the drives need to seek etc. The only case when
random I/O is just as cheap as sequential I/O is when all the data is
cached in memory, because within RAM there's no difference between
random and sequential access (right, that's why it's called Random
Access Memory).

So in the previous post setting both random_page_cost and seq_page_cost
to (nearly) the same value makes sense, because when the whole database fits
into memory, there's no difference and index scans are favorable.

In this case (the database is much bigger than the available RAM) this
no longer holds - index scans hit the drives, resulting in a lot of
seeks etc. So it's a serious performance killer ...

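A quick way to check that assumption, as a sketch (both statements are
standard built-ins):

  SHOW shared_buffers;
  SELECT pg_size_pretty(pg_database_size(current_database()));
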
Not sure about increasing 

[PERFORM] poor execution plan because column dependence

2011-04-12 Thread Václav Ovsík
Hi,
I have migrated Request Tracker 3.8.9
(http://requesttracker.wikia.com/wiki/HomePage) from MySQL to
PostgreSQL in a testing environment.
The RT schema used can be viewed at
https://github.com/bestpractical/rt/blob/3.8-trunk/etc/schema.Pg.
I have added full text search on the Attachments table based on trigrams
(and I am still experimenting with it), but it is not relevant to the
problem here (the problem is not caused by it directly).
The full text search alone works quite well. A user testing the new RT instance
reported poor performance with a somewhat more complex query (more
conditions, resulting in more table joins).
Queries are constructed by the module DBIx::SearchBuilder.
The problematic query, as logged:

rt=# EXPLAIN ANALYZE SELECT DISTINCT  main.* FROM Tickets main JOIN 
Transactions Transactions_1  ON ( Transactions_1.ObjectId = main.id ) JOIN 
Attachments Attachments_2  ON ( Attachments_2.TransactionId = Transactions_1.id 
)  WHERE (Transactions_1.ObjectType = 'RT::Ticket') AND (main.Status != 
'deleted') AND (main.Status = 'resolved' AND main.LastUpdated  '2008-12-31 
23:00:00' AND main.Created  '2005-12-31 23:00:00' AND main.Queue = '15' AND  ( 
Attachments_2.trigrams @@ text_to_trgm_tsquery('uir') AND Attachments_2.Content 
ILIKE '%uir%' ) ) AND (main.Type = 'ticket') AND (main.EffectiveId = main.id)  
ORDER BY main.id ASC;


                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=23928.60..23928.67 rows=1 width=162) (actual time=5201.139..5207.965 rows=649 loops=1)
   ->  Sort  (cost=23928.60..23928.61 rows=1 width=162) (actual time=5201.137..5201.983 rows=5280 loops=1)
         Sort Key: main.effectiveid, main.issuestatement, main.resolution, main.owner, main.subject, main.initialpriority, main.finalpriority, main.priority, main.timeestimated, main.timeworked, main.timeleft, main.told, main.starts, main.started, main.due, main.resolved, main.lastupdatedby, main.lastupdated, main.creator, main.created, main.disabled
         Sort Method:  quicksort  Memory: 1598kB
         ->  Nested Loop  (cost=0.00..23928.59 rows=1 width=162) (actual time=10.060..5120.834 rows=5280 loops=1)
               ->  Nested Loop  (cost=0.00..10222.38 rows=1734 width=166) (actual time=8.702..1328.970 rows=417711 loops=1)
                     ->  Seq Scan on tickets main  (cost=0.00..5687.88 rows=85 width=162) (actual time=8.258..94.012 rows=25410 loops=1)
                           Filter: (((status)::text <> 'deleted'::text) AND (lastupdated  '2008-12-31 23:00:00'::timestamp without time zone) AND (created  '2005-12-31 23:00:00'::timestamp without time zone) AND (effectiveid = id) AND (queue = 15) AND ((type)::text = 'ticket'::text) AND ((status)::text = 'resolved'::text))
                     ->  Index Scan using transactions1 on transactions transactions_1  (cost=0.00..53.01 rows=27 width=8) (actual time=0.030..0.039 rows=16 loops=25410)
                           Index Cond: (((transactions_1.objecttype)::text = 'RT::Ticket'::text) AND (transactions_1.objectid = main.effectiveid))
               ->  Index Scan using attachments2 on attachments attachments_2  (cost=0.00..7.89 rows=1 width=4) (actual time=0.008..0.009 rows=0 loops=417711)
                     Index Cond: (attachments_2.transactionid = transactions_1.id)
                     Filter: ((attachments_2.trigrams @@ '''uir'''::tsquery) AND (attachments_2.content ~~* '%uir%'::text))
 Total runtime: 5208.149 ms
(14 rows)

The above times are for already cached data (a repeated query).
I think the execution plan is poor. It would be better to filter the attachments
table first and then join the rest. The reason is a bad estimate of the number of
rows returned from the tickets table (85 estimated vs. 25410 in reality).
Isolating one of the sub-conditions...


rt=# explain analyze select * from tickets where effectiveid = id;
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Seq Scan on tickets  (cost=0.00..4097.40 rows=530 width=162) (actual time=0.019..38.130 rows=101869 loops=1)
   Filter: (effectiveid = id)
 Total runtime: 54.318 ms
(3 rows)

Estimated 530 rows, but reality is 101869 rows.

The problem is the strong dependence between id and effectiveid.

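To get a feel for how strong that dependence is, one can compare the actual
matching fraction with the planner's assumption - a small sketch against the
tickets table used above:

  -- how many rows actually satisfy effectiveid = id, vs. the total row count
  SELECT count(*) AS total,
         sum(CASE WHEN effectiveid = id THEN 1 ELSE 0 END) AS matching
  FROM tickets;
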
Re: [PERFORM] poor execution plan because column dependence

2011-04-12 Thread Bob Lunney
Zito,

Using psql, log in as the database owner and run ANALYZE VERBOSE. Happiness
will ensue.

Also, when requesting help with a query it's important to state the database
version (select version();) and what, if any, configuration changes you have
made in postgresql.conf. Listing only the ones that have changed is sufficient.

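For example, in psql (the pg_settings query is one common way to list only the
settings that come from postgresql.conf):

  ANALYZE VERBOSE;
  SELECT version();
  SELECT name, setting
    FROM pg_settings
   WHERE source = 'configuration file';
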
Finally, the wiki has some good information on the care and feeding of a 
PostgreSQL database:

http://wiki.postgresql.org/wiki/Introduction_to_VACUUM,_ANALYZE,_EXPLAIN,_and_COUNT



Bob Lunney

--- On Tue, 4/12/11, Václav Ovsík vaclav.ov...@i.cz wrote:

 From: Václav Ovsík vaclav.ov...@i.cz
 Subject: [PERFORM] poor execution plan because column dependence
 To: pgsql-performance@postgresql.org
 Date: Tuesday, April 12, 2011, 7:23 PM
 [...]

Re: [PERFORM] poor execution plan because column dependence

2011-04-12 Thread Tom Lane
Václav Ovsík vaclav.ov...@i.cz writes:
 I think the execution plan is poor. It would be better to filter the
 attachments table first and then join the rest. The reason is a bad
 estimate of the number of rows returned from the tickets table (85
 estimated vs. 25410 in reality).
 ...
 The problem is the strong dependence between id and effectiveid. The RT
 documentation says:

 EffectiveId:
 By default, a ticket's EffectiveId is the same as its ID. RT supports the
 ability to merge tickets together. When you merge a ticket into
 another one, RT sets the first ticket's EffectiveId to the second
 ticket's ID. RT uses this data to quickly look up which ticket
 you're really talking about when you reference a merged ticket.

 I googled the page http://wiki.postgresql.org/wiki/Cross_Columns_Stats

 Maybe I have identified the already documented problem. What can I do in this
 situation? Is there some workaround?

Yeah, that main.EffectiveId = main.id clause is going to be
underestimated by a factor of about 200, which is most though not all of
your rowcount error for that table.  Not sure whether you can do much
about it, if the query is coming from a query generator that you can't
change.  If you can change it, try replacing main.EffectiveId = main.id
with the underlying function, eg if they're integers use
int4eq(main.EffectiveId, main.id).  This will bypass the overoptimistic
estimator for the = operator and get you a default selectivity
estimate of (IIRC) 0.3333.  Which is still off, but only by 3x not 200x,
and that should be close enough to get a decent plan.

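A sketch of that rewrite - only the one clause changes, the rest of the
generated query stays as it is:

  -- original clause: the = operator gets an overly optimistic selectivity estimate
  ... AND (main.EffectiveId = main.id) ...

  -- rewritten as suggested above (for integer columns): the plain function call
  -- bypasses that estimator and falls back to the default selectivity
  ... AND int4eq(main.EffectiveId, main.id) ...
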
regards, tom lane
