Re: [PERFORM] high user cpu, massive SELECTs, no io waiting problem

2011-02-16 Thread Marti Raudsepp
On Tue, Feb 15, 2011 at 20:01, Scott Marlowe scott.marl...@gmail.com wrote:
 run htop and look for red.  if youi've got lots of red bar on each CPU
 but no io wait then it's waiting for memory access.

I don't think this is true. AFAICT the red bar refers to system
time, time that's spent in the kernel -- either in syscalls or kernel
background threads.

Operating systems don't generally account memory accesses (cache
misses) for processes, if you don't specially ask for it. The closest
thing I know of is using Linux perf tools, e.g. perf top -e
cache-misses. OProfile, DTrace and SystemTap can probably do
something similar.

Regards,
Marti

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] high user cpu, massive SELECTs, no io waiting problem

2011-02-16 Thread Greg Smith

Kevin Grittner wrote:

In fact, I wonder whether we shouldn't leave a couple items you've
excluded, since they are sometimes germane to problems posted, like
lc_collate and TimeZone.


I pulled some of them out only because they're not really 
postgresql.conf settings; lc_collate and lc_ctype for example are set at 
initdb time.  Feel free to hack on that example if you feel it could be 
improved, just be aware which of those things are not really in the main 
config file when pondering if they should be included.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] high user cpu, massive SELECTs, no io waiting problem

2011-02-16 Thread Scott Marlowe
On Wed, Feb 16, 2011 at 6:44 AM, Marti Raudsepp ma...@juffo.org wrote:
 On Tue, Feb 15, 2011 at 20:01, Scott Marlowe scott.marl...@gmail.com wrote:
 run htop and look for red.  if youi've got lots of red bar on each CPU
 but no io wait then it's waiting for memory access.

 I don't think this is true. AFAICT the red bar refers to system
 time, time that's spent in the kernel -- either in syscalls or kernel
 background threads.

My point being that if you've got a lot of RED it'll be the OS waiting
for memory access.  Trust me, when we start to hit our memory
bandwidth (in the 70 to 80 GB/s range) we start to get more and more
red and more and more kernel wait time.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] high user cpu, massive SELECTs, no io waiting problem

2011-02-16 Thread Scott Marlowe
Yeah, at max load we are.  We're running quad 12 core AMD Magny Cours.
 Under max load all of our cores go about 20 to 30% red (i.e. kernel)
and we wind up waiting on the kernel much more.  Could be a mix of
context switching and waiting on memory, so it's just a guesstimate
I'm making based on previous testing with Greg Smith's memory
streaming test and familiarity with this system.

On Wed, Feb 16, 2011 at 8:53 AM, Strange, John W
john.w.stra...@jpmchase.com wrote:
 Scott, are you really moving that much data through memory, 70-80GB/sec is 
 the limit of the new intel 7500 series in a best case scenario.

 - John

 -Original Message-
 From: pgsql-performance-ow...@postgresql.org 
 [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Scott Marlowe
 Sent: 16 February 2011 15:43
 To: Marti Raudsepp
 Cc: Thomas Pöhler; pgsql-performance@postgresql.org; Felix Feinhals; 
 Verteiler_A-Team; Björn Metzdorf
 Subject: Re: [PERFORM] high user cpu, massive SELECTs, no io waiting problem

 On Wed, Feb 16, 2011 at 6:44 AM, Marti Raudsepp ma...@juffo.org wrote:
 On Tue, Feb 15, 2011 at 20:01, Scott Marlowe scott.marl...@gmail.com wrote:
 run htop and look for red.  if youi've got lots of red bar on each CPU
 but no io wait then it's waiting for memory access.

 I don't think this is true. AFAICT the red bar refers to system
 time, time that's spent in the kernel -- either in syscalls or kernel
 background threads.

 My point being that if you've got a lot of RED it'll be the OS waiting
 for memory access.  Trust me, when we start to hit our memory
 bandwidth (in the 70 to 80 GB/s range) we start to get more and more
 red and more and more kernel wait time.

 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance
 This communication is for informational purposes only. It is not
 intended as an offer or solicitation for the purchase or sale of
 any financial instrument or as an official confirmation of any
 transaction. All market prices, data and other information are not
 warranted as to completeness or accuracy and are subject to change
 without notice. Any comments or statements made herein do not
 necessarily reflect those of JPMorgan Chase  Co., its subsidiaries
 and affiliates.

 This transmission may contain information that is privileged,
 confidential, legally privileged, and/or exempt from disclosure
 under applicable law. If you are not the intended recipient, you
 are hereby notified that any disclosure, copying, distribution, or
 use of the information contained herein (including any reliance
 thereon) is STRICTLY PROHIBITED. Although this transmission and any
 attachments are believed to be free of any virus or other defect
 that might affect any computer system into which it is received and
 opened, it is the responsibility of the recipient to ensure that it
 is virus free and no responsibility is accepted by JPMorgan Chase 
 Co., its subsidiaries and affiliates, as applicable, for any loss
 or damage arising in any way from its use. If you received this
 transmission in error, please immediately contact the sender and
 destroy the material in its entirety, whether in electronic or hard
 copy format. Thank you.

 Please refer to http://www.jpmorgan.com/pages/disclosures for
 disclosures relating to European legal entities.




-- 
To understand recursion, one must first understand recursion.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] high user cpu, massive SELECTs, no io waiting problem

2011-02-16 Thread Kevin Grittner
Justin Pitts justinpi...@gmail.com wrote: 
 I think adding
 
 UNION ALL SELECT 'postgres version', version();
 
 might be a good thing.
 
Good point.  Added.
 
 Greg Smith g...@2ndquadrant.com wrote:
 Kevin Grittner wrote:

 In fact, I wonder whether we shouldn't leave a couple items
 you've excluded, since they are sometimes germane to problems
 posted, like lc_collate and TimeZone.

 I pulled some of them out only because they're not really
 postgresql.conf settings; lc_collate and lc_ctype for example are
 set at initdb time.  Feel free to hack on that example if you
 feel it could be improved, just be aware which of those things
 are not really in the main config file when pondering if they
 should be included.
 
Basically, the ones I could remember us needing to ask about on
multiple occasions, I put back -- provisionally.  If someone thinks
they're pointless, I won't worry about them being dropped again:
time zone, character encoding scheme, character set, and collation. 
I'm pretty sure I've seen us ask about all of those in trying to
sort out a problem.
 
I also tried the query on a newly installed HEAD build which had no
manual changes to the postgresql.conf file and found a few others
which seemed to me to be worth suppressing.
 
I took my shot -- anyone else is welcome to do so  :-)
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Really really slow select count(*)

2011-02-16 Thread Ross J. Reedstrom
On Tue, Feb 08, 2011 at 03:52:31PM -0600, Kevin Grittner wrote:
 Scott Marlowe scott.marl...@gmail.com wrote:
  Greg Smith g...@2ndquadrant.com wrote:
  
  Kevin and I both suggested a fast plus timeout then immediate
  behavior is what many users seem to want.
  
  Are there any settings in postgresql.conf that would make it
  unsafe to use -m immediate?
  
 I don't think so.  There could definitely be problems if someone
 cuts power before your shutdown completes, though.  (I hear that
 those firefighters like to cut power to a building before they grab
 those big brass nozzles to spray a stream of water into a building. 
 Go figure...)

Following you off topic, I know of one admin type who has stated I don't
care what sort of fine the power company wants to give me, if my
property's on fire, I'm going to pull the meter, in order to hand it to
the first responder, rather than have them sit there waiting for the
power tech to arrive while my house burns.

Back on topic, I like the the idea of a timed escalation. That means
there's two things to configure though, timeout(s?) and the set of
states to escalate through. I can see different use cases for different
sets. Hmmm:

pg_ctl -m s:10:f:5:i restart

for smart, 5 sec. timeout, escalate to fast, 5 sec., then immediate?
Not sure how rhat would interact w/ -t.

Perhaps:

pg_ctl -t 10 -m s -t 5 -m f -m i restart

Some video-processing tools do things like that: the order of options
impacts their interaction.

Ross
-- 
Ross Reedstrom, Ph.D. reeds...@rice.edu
Systems Engineer  Admin, Research Scientistphone: 713-348-6166
Connexions  http://cnx.orgfax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E  F888 D3AE 810E 88F0 BEDE




-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] high user cpu, massive SELECTs, no io waiting problem

2011-02-16 Thread Thomas Pöhler
Hi,

we are using two instances of pgbouncer v1.4 for connection pooling.
One for prepared statements (pool_mode session) and one without (pool_mode 
transaction). 

Pgbouncer.ini:
[pgbouncer]
pool_mode = transaction/session
server_reset_query = DISCARD ALL;
server_check_query = select 1
server_check_delay = 10
max_client_conn = 1
default_pool_size = 450
log_connections = 0
log_disconnections = 0
log_pooler_errors = 1
client_login_timeout = 0


I will examine htop next time during a peak. 

If I remember correctly vmstat showed lots of context switches during a peak 
above 50k. 

We are running a biweekly downtime where we do a complete reindex and vaccum 
full. We cannot identify certain queries causing this. 

The last graph in ganglia (http://dl.dropbox.com/u/183323/CPUloadprobsdb1.jpg) 
shows the avg_queries from pgbouncers stats. I think this is a symptom of many 
waiting queries which accumulate.

Our iscsi is connected with 3Gibt/s. But that's more than enough. We don't have 
high traffic throughput.

This is the result of the query you gave me:

version PostgreSQL 8.4.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc 
(Debian 4.3.2-1.1) 4.3.2, 64-bit
checkpoint_segments 40
custom_variable_classes pg_stat_statements
effective_cache_size48335MB
escape_string_warning   off
fsync   on
lc_collate  C
lc_ctypeC
listen_addresses*
log_destination stderr
log_line_prefix %t %p %d %u %r
log_lock_waits  on
log_min_duration_statement  1s
log_min_messagesnotice
log_rotation_size   10MB
log_temp_files  50MB
logging_collector   on
maintenance_work_mem1GB
max_connections 1000
max_prepared_transactions   5
max_stack_depth 2MB
pg_stat_statements.max  1
pg_stat_statements.trackall
port5433
server_encoding UTF8
shared_buffers  16GB
TimeZoneEurope/Berlin
update_process_titleon
wal_buffers 1MB
work_mem32MB


Seems like connection limit 1 is way too much on pgbouncer? Our queries 
overall are not that CPU intensive. If they are slow, they are mostly waiting 
for disk io. When having a look at the traffic of this database server we see 
2/3 of the traffic is going to san/disk and only 1/3 going to the server. In 
other words from the traffic view, 2/3 of our operations are writes and 1/3 are 
reads. The database is fitting completely into ram, so reads should not be a 
problem.

Appreciate your help!
Thomas

-Ursprüngliche Nachricht-
Von: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov] 
Gesendet: Mittwoch, 16. Februar 2011 17:09
An: Greg Smith; Justin Pitts
Cc: pgsql-performance@postgresql.org; Verteiler_A-Team; Björn Metzdorf; Felix 
Feinhals; Thomas Pöhler
Betreff: Re: [PERFORM] high user cpu, massive SELECTs, no io waiting problem

Justin Pitts justinpi...@gmail.com wrote: 
 I think adding
 
 UNION ALL SELECT 'postgres version', version();
 
 might be a good thing.
 
Good point.  Added.
 
 Greg Smith g...@2ndquadrant.com wrote:
 Kevin Grittner wrote:

 In fact, I wonder whether we shouldn't leave a couple items
 you've excluded, since they are sometimes germane to problems
 posted, like lc_collate and TimeZone.

 I pulled some of them out only because they're not really
 postgresql.conf settings; lc_collate and lc_ctype for example are
 set at initdb time.  Feel free to hack on that example if you
 feel it could be improved, just be aware which of those things
 are not really in the main config file when pondering if they
 should be included.
 
Basically, the ones I could remember us needing to ask about on
multiple occasions, I put back -- provisionally.  If someone thinks
they're pointless, I won't worry about them being dropped again:
time zone, character encoding scheme, character set, and collation. 
I'm pretty sure I've seen us ask about all of those in trying to
sort out a problem.
 
I also tried the query on a newly installed HEAD build which had no
manual changes to the postgresql.conf file and found a few others
which seemed to me to be worth suppressing.
 
I took my shot -- anyone else is welcome to do so  :-)
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] high user cpu, massive SELECTs, no io waiting problem

2011-02-16 Thread Cédric Villemain
2011/2/16 Thomas Pöhler t...@turtle-entertainment.de:
 Hi,

 we are using two instances of pgbouncer v1.4 for connection pooling.
 One for prepared statements (pool_mode session) and one without (pool_mode 
 transaction).

 Pgbouncer.ini:
 [pgbouncer]
 pool_mode = transaction/session
 server_reset_query = DISCARD ALL;
 server_check_query = select 1
 server_check_delay = 10
 max_client_conn = 1
 default_pool_size = 450
 log_connections = 0
 log_disconnections = 0
 log_pooler_errors = 1
 client_login_timeout = 0


 I will examine htop next time during a peak.

 If I remember correctly vmstat showed lots of context switches during a peak 
 above 50k.

 We are running a biweekly downtime where we do a complete reindex and vaccum 
 full. We cannot identify certain queries causing this.

 The last graph in ganglia 
 (http://dl.dropbox.com/u/183323/CPUloadprobsdb1.jpg) shows the avg_queries 
 from pgbouncers stats. I think this is a symptom of many waiting queries 
 which accumulate.

 Our iscsi is connected with 3Gibt/s. But that's more than enough. We don't 
 have high traffic throughput.

 This is the result of the query you gave me:

 version PostgreSQL 8.4.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc 
 (Debian 4.3.2-1.1) 4.3.2, 64-bit
 checkpoint_segments     40
 custom_variable_classes pg_stat_statements
 effective_cache_size    48335MB
 escape_string_warning   off
 fsync   on
 lc_collate      C
 lc_ctype        C
 listen_addresses        *
 log_destination stderr
 log_line_prefix %t %p %d %u %r
 log_lock_waits  on
 log_min_duration_statement      1s
 log_min_messages        notice
 log_rotation_size       10MB
 log_temp_files  50MB
 logging_collector       on
 maintenance_work_mem    1GB
 max_connections 1000
 max_prepared_transactions       5
 max_stack_depth 2MB
 pg_stat_statements.max  1
 pg_stat_statements.track        all
 port    5433
 server_encoding UTF8
 shared_buffers  16GB
 TimeZone        Europe/Berlin
 update_process_title    on
 wal_buffers     1MB
 work_mem        32MB


 Seems like connection limit 1 is way too much on pgbouncer? Our queries 
 overall are not that CPU intensive. If they are slow, they are mostly waiting 
 for disk io. When having a look at the traffic of this database server we see 
 2/3 of the traffic is going to san/disk and only 1/3 going to the server. In 
 other words from the traffic view, 2/3 of our operations are writes and 1/3 
 are reads. The database is fitting completely into ram, so reads should not 
 be a problem.

I used pgbouncer with way more than that, not an issue on its own
*but* can you export the pgbouncers in another box ?
I get issues in very high-mem usage (more than IO) and ton's of
connection via pgbouncer, then moving the bouncer in a 3rd box salve
the situation.


 Appreciate your help!
 Thomas

 -Ursprüngliche Nachricht-
 Von: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov]
 Gesendet: Mittwoch, 16. Februar 2011 17:09
 An: Greg Smith; Justin Pitts
 Cc: pgsql-performance@postgresql.org; Verteiler_A-Team; Björn Metzdorf; Felix 
 Feinhals; Thomas Pöhler
 Betreff: Re: [PERFORM] high user cpu, massive SELECTs, no io waiting problem

 Justin Pitts justinpi...@gmail.com wrote:
 I think adding

 UNION ALL SELECT 'postgres version', version();

 might be a good thing.

 Good point.  Added.

 Greg Smith g...@2ndquadrant.com wrote:
 Kevin Grittner wrote:

 In fact, I wonder whether we shouldn't leave a couple items
 you've excluded, since they are sometimes germane to problems
 posted, like lc_collate and TimeZone.

 I pulled some of them out only because they're not really
 postgresql.conf settings; lc_collate and lc_ctype for example are
 set at initdb time.  Feel free to hack on that example if you
 feel it could be improved, just be aware which of those things
 are not really in the main config file when pondering if they
 should be included.

 Basically, the ones I could remember us needing to ask about on
 multiple occasions, I put back -- provisionally.  If someone thinks
 they're pointless, I won't worry about them being dropped again:
 time zone, character encoding scheme, character set, and collation.
 I'm pretty sure I've seen us ask about all of those in trying to
 sort out a problem.

 I also tried the query on a newly installed HEAD build which had no
 manual changes to the postgresql.conf file and found a few others
 which seemed to me to be worth suppressing.

 I took my shot -- anyone else is welcome to do so  :-)

 -Kevin

 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance




-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Really really slow select count(*)

2011-02-16 Thread Bob Lunney
Ross,

Way off topic now, but from my time programming electrical meters I can tell 
you pulling the meter from its socket is potentially an extremely dangerous 
thing to do.  If there is a load across the meter's poles the spark that 
results on disconnect could kill the puller instantly.  (You don't want to know 
what happens if the person isn't killed.)  

I don't know what property your admin type is trying to protect, but I'm 
inclined to let it burn and live to work through the insurance collection 
process.

Oh, and +1 for timed escalation of a shutdown.

Bob Lunney

--- On Wed, 2/16/11, Ross J. Reedstrom reeds...@rice.edu wrote:

 From: Ross J. Reedstrom reeds...@rice.edu
 Subject: Re: [PERFORM] Really really slow select count(*)

big snip

 
 Following you off topic, I know of one admin type who has
 stated I don't
 care what sort of fine the power company wants to give me,
 if my
 property's on fire, I'm going to pull the meter, in order
 to hand it to
 the first responder, rather than have them sit there
 waiting for the
 power tech to arrive while my house burns.


  

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] high user cpu, massive SELECTs, no io waiting problem

2011-02-16 Thread Kevin Grittner
Thomas Pöhlert...@turtle-entertainment.de wrote:
 
 we are using two instances of pgbouncer v1.4 for connection
 pooling.  One for prepared statements (pool_mode session) and one
 without (pool_mode transaction).
 
 max_client_conn = 1
 default_pool_size = 450
 
Your best defense against the thundering herd issues you describe
would be to eliminate the session pool (if you can), and drop the
default_pool_size for the transaction pool to where at peak the
number of backends actually busy is about twice your number of
*actual* cores.  (Don't count hyperthreading logical cores for
this purpose.)  max_client_conn can be as high as you need; the
point is for the connection pool to funnel the requests through a
much smaller pool of database connections.
 
 If I remember correctly vmstat showed lots of context switches
 during a peak above 50k.
 
Yeah, that's part of the reason throughput tanks when your active
connection count gets too high.
 
 We are running a biweekly downtime where we do a complete reindex
 and vacuum full. We cannot identify certain queries causing this.
 
If you really get bloat which requires VACUUM FULL, tracking down
the reason should be a high priority.  You normally shouldn't need
to run that.
 
Also, I hope when you run that it is VACUUM FULL followed by
REINDEX, not the other way around.  In fact, it would probably be
faster to CLUSTER (if you have room) or drop the indexes, VACUUM
FULL, and then create the indexes again.
 
 The last graph in ganglia 
 (http://dl.dropbox.com/u/183323/CPUloadprobsdb1.jpg) shows the
 avg_queries from pgbouncers stats. I think this is a symptom of
 many waiting queries which accumulate.
 
While it seems counter-intuitive, you're likely to have fewer
queries waiting a long time there if you reduce
default_pool_size so that contention doesn't kill performance when
the queries *do* get to run.
 
 max_connections   1000
 
This is what you need to try to reduce.
 
 max_prepared_transactions 5
 
If you're actually using prepared transactions, make sure none are
lingering about for a long time during these incidents.  Well,
*ever*, really -- but I would definitely check during problem
periods.
 
 wal_buffers   1MB
 
You should bump this to 16MB.
 
 The database is fitting completely into ram
 
Then you should probably be adjusting sequential_page_cost and
rand_page_cost.  You'll probably get plans which run faster, which
should help overall.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] high user cpu, massive SELECTs, no io waiting problem

2011-02-16 Thread Greg Smith

Thomas Pöhler wrote:

We are running a biweekly downtime where we do a complete reindex and vaccum 
full. We cannot identify certain queries causing this.


If you feel that you need VACUUM FULL, either something terribly wrong 
has happened, or someone has gotten confused.  In both cases it's 
unlikely you want to keep doing that.  See 
http://wiki.postgresql.org/wiki/VACUUM_FULL for a nice document leading 
through figuring what to do instead.


Note that if you have a database that fits in RAM, but is filled with 
the sort of index bloat garbage that using VACUUM FULL will leave 
behind, it will cause excessive CPU use when running queries.  If you 
already have planned downtime, you really should try to use use CLUSTER 
instead, to remove that from the list of possible causes for your issue.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Estimating hot data size

2011-02-16 Thread Tomas Vondra
Dne 16.2.2011 21:51, Chris Hoover napsal(a):
 All,
 
 I'm trying to estimate the size of my hot data set, and wanted to get
 some validation that I'm doing this correctly.
 
 Basically, I'm using the sum(heap_blks_read + idx_blks_read) from
 pg_statio_all_tables, and diffing the numbers over a period of time (1
 hour at least).  Is this a fair estimate?  The reason for doing this is
 we are looking at new server hardware, and I want to try and get enough
 ram on the machine to keep the hot data in memory plus provide room for
 growth.
 
 Thanks,
 
 Chris
 
 Example:
 
 
 
 *Time*
 
   
 
 *Total Blocks*
 
 2011-02-16 11:25:34.621874-05
 
   
 
 123,260,464,427.00
 
 2011-02-16 12:25:46.486719-05
 
   
 
 123,325,880,943.00
 
 
 
 To get the hot data for this hour (in KB), I'm taking:
 
 
  (123,325,880,943.00 - 123,260,464,427.00)* 8 = 523,332,128KB
 
 
 Correct?

I doubt that, although I'm not sure what exactly you mean by hot data
set. I guess it's the data set you're working with frequently, right?

The first gotcha is that heap_blks_read counts only blocks not found in
shared buffers, so those 500MB is actually the amount of data read from
the disk (or filesystem cache). It does not say anything about how
frequently the data are used.

The second gotcha is that the same block may be counted repeatedly,
especially if it is not frequently used. It's counted for query A, then
it's removed from the cache (to be replaced by another block), and then
for another query B. So the number heap_blks_read does not mean there
were that many different blocks read from the disk.

What I'd recommend is to measure the cache hit ratio, i.e. this

   heap_blks_hit / (heap_blks_read + heap_blks_hit)

which means how efficient the cache is. Increase shared buffers until it
stops to increase - that's the hot data set size.

regards
Tomas

PS: The value heap_blks_hit does not actually mean the blocks were read
from the disk - it might be read from filesystem cache (and there's
not easy way to find out this AFAIK).

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Why we don't want hints Was: Slow count(*) again...

2011-02-16 Thread Bruce Momjian
Kevin Grittner wrote:
 Shaun Thomas stho...@peak6.com wrote:
  
  how difficult would it be to add that syntax to the JOIN
  statement, for example?
  
 Something like this syntax?:
  
 JOIN WITH (correlation_factor=0.3)
  
 Where 1.0 might mean that for each value on the left there was only
 one distinct value on the right, and 0.0 would mean that they were
 entirely independent?  (Just as an off-the-cuff example -- I'm not
 at all sure that this makes sense, let alone is the best thing to
 specify.  I'm trying to get at *syntax* here, not particular knobs.)

I am not excited about the idea of putting these correlations in
queries.  What would be more intesting would be for analyze to build a
correlation coeffficent matrix showing how columns are correlated:

a   b   c
a   1   .4  0
b   .1  1   -.3
c   .2  .3  1

and those correlations could be used to weigh how the single-column
statistics should be combined.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Does exclusive locking improve performance?

2011-02-16 Thread Jeremy Palmer
In normal circumstances does locking a table in access exclusive mode improve 
insert, update and delete operation performance on that table.

Is MVCC disabled or somehow has less work to do?

Cheers
Jeremy

__

This message contains information, which is confidential and may be subject to 
legal privilege. 
If you are not the intended recipient, you must not peruse, use, disseminate, 
distribute or copy this message.
If you have received this message in error, please notify us immediately (Phone 
0800 665 463 or i...@linz.govt.nz) and destroy the original message.
LINZ accepts no responsibility for changes to this email, or for any 
attachments, after its transmission from LINZ.

Thank you.
__