Re: [PERFORM] turn off caching for performance test

2010-10-01 Thread Willy-Bas Loos
 I found one query that did a seqscan anyway(with enable_seqscan off),
 because doing an index scan would be more than 1M points more
 expensive (to the planner).

Hmm, i guess that says it all :)
-- 
Patriotism is the conviction that your country is superior to all
others because you were born in it. -- George Bernard Shaw

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] turn off caching for performance test

2010-09-30 Thread Willy-Bas Loos
Hi,

Sorry for the late answer.
I found the query i was looking for in the log (duration) and could
prove that the seqscan is faster if the data were not cached.
This particular one was 22% faster.
It is a query which will get turned into a nested loop index scan for
a lot of rows, on a huge table, but it's only 22% slower without a
seqscan.
(there's no advantage with seqscans off, as long as the cache is empty)

I found few queries that did sequential scans in the normal mode on
tables that matter.
I found one query that did a seqscan anyway(with enable_seqscan off),
because doing an index scan would be more than 1M points more
expensive (to the planner).

$ grep ^[^#] /etc/postgresql/8.3/main/postgresql.conf|grep -e ^[^[:space:]]
data_directory =blah  # use data in another directory
hba_file = blah   # host-based authentication file
ident_file = blah # ident configuration file
external_pid_file = blah  # write an extra PID file
listen_addresses = '*'  # what IP address(es) to listen on;
port = 5432 # (change requires restart)
max_connections = 200   # (change requires restart)
unix_socket_directory = '/var/run/postgresql'   # (change requires 
restart)
tcp_keepalives_idle = 120   # TCP_KEEPIDLE, in seconds;
tcp_keepalives_interval = 0 # TCP_KEEPINTVL, in seconds;
tcp_keepalives_count = 0# TCP_KEEPCNT;
shared_buffers = 2GB# min 128kB or max_connections*16kB
temp_buffers = 24MB # min 800kB
work_mem = 100MB# min 64kB
maintenance_work_mem = 256MB# min 1MB
max_fsm_pages = 60  # min max_fsm_relations*16, 6 bytes each
synchronous_commit = off# immediate fsync at commit
checkpoint_segments = 16# in logfile segments, min 1, 16MB each
effective_cache_size = 4GB
log_min_duration_statement = 2000   # -1 is disabled, 0 logs all
statements -- milliseconds
log_line_prefix = '%t ' # special values:
autovacuum = on # Enable autovacuum subprocess?  'on'
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8' # locale for system error 
message
lc_monetary = 'en_US.UTF-8' # locale for monetary formatting
lc_numeric = 'en_US.UTF-8'  # locale for number formatting
lc_time = 'en_US.UTF-8' # locale for time formatting
default_text_search_config = 'pg_catalog.english'
max_locks_per_transaction = 128 # min 10

We have 15K rpm SAS disks in RAID10.
We have 16 GB of RAM and 4 modern processor cores (i think xeons,
might also be opteron)
We run Debian Lenny.
It's a dedicated DB server, there is one other cluster on it without
very much data and with few connections to it daily.
df -h on the data dir gives me 143G
we're growing
there are many queries that should be optimized
the seqscan option is in the connection string, not in the postgresql.conf

Cheers,

On Fri, Aug 27, 2010 at 7:57 PM, Scott Marlowe scott.marl...@gmail.com wrote:
 On Thu, Aug 26, 2010 at 4:32 AM, Willy-Bas Loos willy...@gmail.com wrote:
 Hi,

 I have a colleague that is convinced that the website is faster if
 enable_seqscan is turned OFF.
 I'm convinced of the opposite (better to leave it ON), but i would like to
 show it, prove it to him.

 Stop, you're both doing it wrong.  The issue isn't whether or not
 turning off seq scans will make a few things faster here and there,
 it's why is the query planner choosing sequential scans when it should
 be choosing index scans.

 So, what are your non-default settings in postgresql.conf?
 Have you increased effective_cache_size yet?
 Lowered random_page_cost?
 Raised default stats target and re-analyzed?

 Have you been looking at the problem queries with explain analyze?
 What does it have to say about the planners choices?




-- 
Patriotism is the conviction that your country is superior to all
others because you were born in it. -- George Bernard Shaw

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] turn off caching for performance test

2010-09-25 Thread Robert Haas
On Fri, Aug 27, 2010 at 1:57 PM, Scott Marlowe scott.marl...@gmail.com wrote:
 On Thu, Aug 26, 2010 at 4:32 AM, Willy-Bas Loos willy...@gmail.com wrote:
 Hi,

 I have a colleague that is convinced that the website is faster if
 enable_seqscan is turned OFF.
 I'm convinced of the opposite (better to leave it ON), but i would like to
 show it, prove it to him.

 Stop, you're both doing it wrong.  The issue isn't whether or not
 turning off seq scans will make a few things faster here and there,
 it's why is the query planner choosing sequential scans when it should
 be choosing index scans.

 So, what are your non-default settings in postgresql.conf?
 Have you increased effective_cache_size yet?
 Lowered random_page_cost?
 Raised default stats target and re-analyzed?

 Have you been looking at the problem queries with explain analyze?
 What does it have to say about the planners choices?

[a bit behind on my email]

This was exactly my thought on first reading this post.  If the
indexes are faster and PG thinks they are slower, it's a good bet that
there are some parameters that need tuning.  Specifically,
effective_cache_size may be too low, and random_page_cost and
seq_page_cost are almost certainly too high.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] turn off caching for performance test

2010-09-15 Thread Merlin Moncure
On Thu, Aug 26, 2010 at 6:32 AM, Willy-Bas Loos willy...@gmail.com wrote:
 I have a colleague that is convinced that the website is faster if
 enable_seqscan is turned OFF.
 I'm convinced of the opposite (better to leave it ON), but i would like to
 show it, prove it to him.
 Now the first query we tried, would do a bitmap heap scan instead of a
 seqscan when the latter were disabled, to exclude about 50% of the records
 (18K of 37K records).
 The bitmap heap scan is 3% faster, so that didn't really plea my case.
 The thing is that by the time we tried it, the data had been cached, so
 there is no penalty for the use of the index (HDD retention on random
 access). So it's logical that the index lookup is faster, it looks up less
 records.

 Now i'm looking for a way to turn off the caching, so that we'll have a fair
 test.

 It makes no sense to me to set shared_buffers really low. Any tips?

setting shared_buffers low or high is not going to flush the cache. it
only controls whether the o/s cache or the pg buffer cache is used.

Disabling sequential scans is going to un-optimize a large class of
operations where a sequential scan is really the best choice of
action.   In the old days of postgres, where the planner wasn't as
smart as it is today and some of the plan invalidation mechanics
weren't there, it wasn't that uncommon to disable them.  Today, it's
really not a good idea unless you have a very specific reason to, and
even then I'd advise temporarily setting it and then setting it back
when your operation is done.

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] turn off caching for performance test

2010-08-27 Thread Willy-Bas Loos
@Pierre: i know.. but first i'd have to find such a query from real-life.
And also, i'm convinced that this query would be faster with a seqscan if
the data wenen't cached.


@Arjen: thanks, that helps.
But that's only the OS cache. There's also the shared_buffers, which are a
postgres specific thing.
I've found DISCARD in the
manualhttp://www.postgresql.org/docs/8.3/interactive/sql-discard.html,
but that only influences a single session, not the shared buffers.

I reckon restarting the cluster should help, would it wipe out the cache?
(pg_ctlcluster 8.3 main restart)
Or is there a more graceful way?

Cheers,

WBL


On Thu, Aug 26, 2010 at 7:37 PM, Pierre C li...@peufeu.com wrote:


  The bitmap heap scan is 3% faster,


 3% isn't really significant. Especially if the new setting makes one query
 100 times slower... Like a query which will, by bad luck, get turned into a
 nested loop index scan for a lot of rows, on a huge table which isn't in
 cache...




-- 
Patriotism is the conviction that your country is superior to all others
because you were born in it. -- George Bernard Shaw


Re: [PERFORM] turn off caching for performance test

2010-08-27 Thread Greg Smith

Willy-Bas Loos wrote:
But that's only the OS cache. There's also the shared_buffers, which 
are a postgres specific thing.
I've found DISCARD in the manual 
http://www.postgresql.org/docs/8.3/interactive/sql-discard.html, but 
that only influences a single session, not the shared buffers.


I reckon restarting the cluster should help, would it wipe out the 
cache? (pg_ctlcluster 8.3 main restart)

Or is there a more graceful way?


Stop the cluster; flush the OS cache; start the cluster again.  Now you 
have a clean cache to retest again.  No easier way that's reliable.  If 
you try to clear out the database by doing things like scanning large 
tables not involved in the query, you'll discover features in PostgreSQL 
will specifically defeat that from using more than a small portion of 
the cache.  Better to just do a full shutdown.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us



Re: [PERFORM] turn off caching for performance test

2010-08-27 Thread Scott Marlowe
On Thu, Aug 26, 2010 at 4:32 AM, Willy-Bas Loos willy...@gmail.com wrote:
 Hi,

 I have a colleague that is convinced that the website is faster if
 enable_seqscan is turned OFF.
 I'm convinced of the opposite (better to leave it ON), but i would like to
 show it, prove it to him.

Stop, you're both doing it wrong.  The issue isn't whether or not
turning off seq scans will make a few things faster here and there,
it's why is the query planner choosing sequential scans when it should
be choosing index scans.

So, what are your non-default settings in postgresql.conf?
Have you increased effective_cache_size yet?
Lowered random_page_cost?
Raised default stats target and re-analyzed?

Have you been looking at the problem queries with explain analyze?
What does it have to say about the planners choices?

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] turn off caching for performance test

2010-08-26 Thread Willy-Bas Loos
Hi,

I have a colleague that is convinced that the website is faster if
enable_seqscan is turned OFF.
I'm convinced of the opposite (better to leave it ON), but i would like to
show it, prove it to him.
Now the first query we tried, would do a bitmap heap scan instead of a
seqscan when the latter were disabled, to exclude about 50% of the records
(18K of 37K records).
The bitmap heap scan is 3% faster, so that didn't really plea my case.
The thing is that by the time we tried it, the data had been cached, so
there is no penalty for the use of the index (HDD retention on random
access). So it's logical that the index lookup is faster, it looks up less
records.

Now i'm looking for a way to turn off the caching, so that we'll have a fair
test.

It makes no sense to me to set shared_buffers really low. Any tips?

Cheers,

WBL


-- 
Patriotism is the conviction that your country is superior to all others
because you were born in it. -- George Bernard Shaw


Re: [PERFORM] turn off caching for performance test

2010-08-26 Thread Arjen van der Meijden
Isn't it more fair to just flush the cache before doing each of the 
queries? In real-life, you'll also have disk caching... Flushing the 
buffer pool is easy, just restart PostgreSQL (or perhaps there is a 
admin command for it too?). Flushing the OS-disk cache is obviously 
OS-dependent, for linux its trivial: http://linux-mm.org/Drop_Caches


Best regards,

Arjen

On 26-8-2010 12:32 Willy-Bas Loos wrote:

Hi,

I have a colleague that is convinced that the website is faster if
enable_seqscan is turned OFF.
I'm convinced of the opposite (better to leave it ON), but i would like
to show it, prove it to him.
Now the first query we tried, would do a bitmap heap scan instead of a
seqscan when the latter were disabled, to exclude about 50% of the
records (18K of 37K records).
The bitmap heap scan is 3% faster, so that didn't really plea my case.
The thing is that by the time we tried it, the data had been cached, so
there is no penalty for the use of the index (HDD retention on random
access). So it's logical that the index lookup is faster, it looks up
less records.

Now i'm looking for a way to turn off the caching, so that we'll have a
fair test.

It makes no sense to me to set shared_buffers really low. Any tips?

Cheers,

WBL


--
Patriotism is the conviction that your country is superior to all
others because you were born in it. -- George Bernard Shaw


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] turn off caching for performance test

2010-08-26 Thread Pierre C



The bitmap heap scan is 3% faster,


3% isn't really significant. Especially if the new setting makes one query  
100 times slower... Like a query which will, by bad luck, get turned into  
a nested loop index scan for a lot of rows, on a huge table which isn't in  
cache...


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance