Re: [PERFORM] query using incorrect index
On Thu, Aug 2, 2012 at 4:54 PM, Russell Keane russell.ke...@inps.co.uk wrote:

> Using PG 9.0 and given 2 queries (messageq_current is a view on the
> messageq_table):
>
> select entity_id from messageq_current where entity_id = 123456;
>
> select entity_id from messageq_current where incoming = true and
> inactive = false and staff_ty = 2 and staff_id = 2
> order by entity_id desc limit 1;
>
> and 2 indexes (there are 15 indexes in total but they are left out here
> for brevity):
>
> messageq1:
> CREATE INDEX messageq1 ON messageq_table USING btree (entity_id);
>
> and messageq4:
> CREATE INDEX messageq4 ON messageq_table
>   USING btree (inactive, staff_ty, staff_id, incoming, tran_dt);

Of course *a lot* of detail is missing (the full schema of the table, all the other indexes), but with inactive being a boolean column I suspect its selectivity might not be too good, so having it as the first column of a covering index is at least questionable. If query 2 is frequent, you might also want to consider creating a partial index on (staff_ty, staff_id) only, with the filter criteria on incoming and inactive from query 2.

Btw, why don't you formulate query 2 as a max() query?

select max(entity_id) as entity_id
from messageq_current
where incoming = true
  and inactive = false
  and staff_ty = 2
  and staff_id = 2;

> With the messageq1 index present, query 1 is very quick (0.094 ms) and
> query 2 is very slow (241.515 ms). If I remove messageq1 then query 2
> uses messageq4 and is very quick (0.098 ms), but then query 1 must use a
> different index and is therefore slower (> 5 ms).
>
> So, to the query plans:

Of which query? Shouldn't there be four plans in total?
I'd post plans here: http://explain.depesz.com/

> With messageq1:
>
> Limit  (cost=0.00..2670.50 rows=1 width=4) (actual time=241.481..241.481 rows=0 loops=1)
>   Output: messageq_table.entity_id
>   Buffers: shared hit=32 read=18870 written=12
>   ->  Index Scan Backward using messageq1 on prac_live_10112.messageq_table  (cost=0.00..66762.53 rows=25 width=4) (actual time=241.479..241.479 rows=0 loops=1)
>         Output: messageq_table.entity_id
>         Filter: (messageq_table.incoming AND (NOT messageq_table.inactive) AND (messageq_table.staff_ty = 2) AND (messageq_table.staff_id = 2) AND (aud_status_to_flag(messageq_table.aud_status) = 1))
>         Buffers: shared hit=32 read=18870 written=12
> Total runtime: 241.515 ms
>
> Without messageq1:
>
> Limit  (cost=12534.45..12534.45 rows=1 width=4) (actual time=0.055..0.055 rows=0 loops=1)
>   Output: messageq_table.entity_id
>   Buffers: shared read=3
>   ->  Sort  (cost=12534.45..12534.51 rows=25 width=4) (actual time=0.054..0.054 rows=0 loops=1)
>         Output: messageq_table.entity_id
>         Sort Key: messageq_table.entity_id
>         Sort Method: quicksort  Memory: 17kB
>         ->  Bitmap Heap Scan on prac_live_10112.messageq_table  (cost=174.09..12534.32 rows=25 width=4) (actual time=0.043..0.043 rows=0 loops=1)
>               Output: messageq_table.entity_id
>               Recheck Cond: ((messageq_table.staff_ty = 2) AND (messageq_table.staff_id = 2))
>               Filter: (messageq_table.incoming AND (NOT messageq_table.inactive) AND (aud_status_to_flag(messageq_table.aud_status) = 1))
>               Buffers: shared read=3
>               ->  Bitmap Index Scan on messageq4  (cost=0.00..174.08 rows=4920 width=0) (actual time=0.040..0.040 rows=0 loops=1)
>                     Index Cond: ((messageq_table.inactive = false) AND (messageq_table.staff_ty = 2) AND (messageq_table.staff_id = 2) AND (messageq_table.incoming = true))
>                     Buffers: shared read=3
> Total runtime: 0.098 ms
>
> Clearly the statistics are off somehow but I really don't know where
> to start.
>
> Any help you can give me would be very much appreciated.
Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
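The partial index Robert suggests above could be sketched like this (the index name is hypothetical and the column names are taken from the thread, so treat it as an untested sketch):

```sql
-- Hypothetical partial index: only the rows query 2 can actually match
-- are indexed, so it stays small even if most rows are inactive.
CREATE INDEX messageq_staff_pending
    ON messageq_table USING btree (staff_ty, staff_id)
    WHERE incoming = true AND inactive = false;
```

The planner can only use it for queries whose WHERE clause implies `incoming = true AND inactive = false`, which both query 2 and the max() rewrite do.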
Re: [PERFORM] query using incorrect index
You're right, a lot of information is missing, but I'm unsure that the other information will make much difference. I could drop all the other indexes on the table which aren't used here, and the queries would still use the indexes they are currently using.

I appreciate the point that a boolean column's selectivity might not be great. I've just tried creating indexes as follows:

CREATE INDEX messageq17 ON messageq_table USING btree (staff_ty, staff_id, incoming, inactive, entity_id);
CREATE INDEX messageq18 ON messageq_table USING btree (staff_ty, staff_id);

When running query 2 as it stands, the same thing happens: it still uses the messageq1 index.

You are correct that the query is logically the same as using max(), but it's generated on the fly, so the limit or the queried column may change.

The query plans were for the second query, as I'm unsure that the first query is really relevant; it was simply there to justify the messageq1 index.

Thanks,

From: Robert Klemme [mailto:shortcut...@googlemail.com]
Sent: 03 August 2012 10:18
To: Russell Keane; pgsql-performance
Subject: Re: [PERFORM] query using incorrect index

[quoted message trimmed]
Re: [PERFORM] query using incorrect index
Settings query:

version;PostgreSQL 9.0.4, compiled by Visual C++ build 1500, 32-bit
bytea_output;escape
client_encoding;UNICODE
lc_collate;English_United Kingdom.1252
lc_ctype;English_United Kingdom.1252
listen_addresses;*
log_destination;stderr
log_duration;off
log_line_prefix;%t
log_min_duration_statement;1ms
log_statement;none
logging_collector;on
max_connections;100
max_stack_depth;2MB
port;5433
search_path;prac_live_10112, prac_shared_10112, global
server_encoding;UTF8
shared_buffers;32MB
TimeZone;Europe/London
work_mem;1MB

Hardware:

It's important to note that this is a (purposely) low-spec development machine, but the performance story is similar on our test setup, which is a lot closer to our live environment (I'm in the process of getting figures on this).

E8400 Core 2 Duo (2.99 GHz)
4GB RAM
XP (latest SP and all updates)
1 x 300GB SATA2 drive with 170GB free space

Explain analyse with both indexes present but without the limit (uses the correct index):

Sort  (cost=12534.90..12534.97 rows=25 width=4) (actual time=0.055..0.055 rows=0 loops=1)
  Output: messageq_table.entity_id
  Sort Key: messageq_table.entity_id
  Sort Method: quicksort  Memory: 17kB
  Buffers: shared read=3
  ->  Bitmap Heap Scan on prac_live_10112.messageq_table  (cost=174.09..12534.32 rows=25 width=4) (actual time=0.040..0.040 rows=0 loops=1)
        Output: messageq_table.entity_id
        Recheck Cond: ((messageq_table.staff_ty = 2) AND (messageq_table.staff_id = 2))
        Filter: (messageq_table.incoming AND (NOT messageq_table.inactive) AND (aud_status_to_flag(messageq_table.aud_status) = 1))
        Buffers: shared read=3
        ->  Bitmap Index Scan on messageq4  (cost=0.00..174.08 rows=4920 width=0) (actual time=0.037..0.037 rows=0 loops=1)
              Index Cond: ((messageq_table.inactive = false) AND (messageq_table.staff_ty = 2) AND (messageq_table.staff_id = 2) AND (messageq_table.incoming = true))
              Buffers: shared read=3
Total runtime: 0.092 ms

-----Original Message-----
From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov]
Sent: 02 August 2012 21:13
To: Russell Keane; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] query using incorrect index

Russell Keane russell.ke...@inps.co.uk wrote:

> Clearly the statistics are off somehow but I really don't know where
> to start.
>
> Any help you can give me would be very much appreciated.

It would help to know more about your hardware and PostgreSQL configuration. The latter can probably best be communicated by a copy/paste of the results of the query on this page:

http://wiki.postgresql.org/wiki/Server_Configuration

Can you also post the EXPLAIN ANALYZE output for the slow query with both indexes present but without the LIMIT clause?

-Kevin

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] query using incorrect index
Russell Keane russell.ke...@inps.co.uk wrote:

> log_min_duration_statement;1ms
> shared_buffers;32MB
> work_mem;1MB

Those are pretty low values even for a 4GB machine. I suggest the following changes and additions, based on the fact that you seem to have the active portion of the database fully cached:

shared_buffers = '160MB'
work_mem = '8MB'
seq_page_cost = 0.1
random_page_cost = 0.1
cpu_tuple_cost = 0.03
effective_cache_size = '2GB'

> Explain analyse with both indexes present but without the limit
> (uses the correct index):
>
> Total runtime: 0.092 ms

Part of the problem is that it thinks it will find a matching row fairly quickly, and that having done so using the index it chose will mean it is the *right* row. The problem is that there are no matching rows, so it has to scan the entire index. More fine-grained statistics *might* help. If other techniques don't help, you can rewrite the query slightly to create an optimization fence, but that should be a last resort.

I agree with Robert that if you have a lot of queries that select on incoming and/or inactive, a conditional index (with a WHERE clause in its definition) is likely to be very helpful.

-Kevin
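Kevin doesn't spell the fence out. One common form (an assumption on my part, not something stated in the thread) is to wrap the ORDER BY column in a no-op expression, so the planner can no longer satisfy the ordering from the entity_id index:

```sql
-- "entity_id + 0" returns the same values, but the expression no longer
-- matches the messageq1 index, so the planner cannot pick the backward
-- index scan that performs so badly here.
SELECT entity_id
FROM messageq_current
WHERE incoming = true AND inactive = false
  AND staff_ty = 2 AND staff_id = 2
ORDER BY entity_id + 0 DESC
LIMIT 1;
```

This is a hack that depends on planner internals, which is exactly why Kevin calls it a last resort.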
Re: [PERFORM] query using incorrect index
I tried creating the following index:

CREATE INDEX messageq17 ON messageq_table USING btree (staff_ty, staff_id, entity_id) WHERE inactive = false;

'inactive = false' records (an "active" flag would be much easier, but this is legacy) should make up a smaller proportion of the overall dataset (and many more of the queries will specify this clause), and the results are very promising.

I will also try changing the settings and report back.

Thanks again guys,

-----Original Message-----
From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov]
Sent: 03 August 2012 15:34
To: Russell Keane; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] query using incorrect index

[quoted message trimmed]
Re: [PERFORM] [ADMIN] Messed up time zones
> All the above are the exact same point in time merely stated as relevant
> to each location. Note that given a timestamp with time zone and a zone,
> PostgreSQL returns a timestamp without time zone (you know the zone since
> you specified it).

Yes, I know the zone. But I don't know the offset from UTC. Example:

template1=> set timezone to 'UTC';
SET
template1=> select ('2011-10-30 01:00:00'::timestamptz) at time zone 'Europe/Budapest';
      timezone
---------------------
 2011-10-30 02:00:00    <-- Is it winter or summer time?
(1 row)

template1=> select ('2011-10-30 00:00:00'::timestamptz) at time zone 'Europe/Budapest';
      timezone
---------------------
 2011-10-30 02:00:00    <-- Is it winter or summer time? What is the offset
                            from UTC here? Can you tell me when it was in UTC?
(1 row)

What is more:

template1=> select (('2011-10-30 00:00:00'::timestamptz) at time zone 'Europe/Budapest')
            is distinct from (('2011-10-30 01:00:00'::timestamptz) at time zone 'Europe/Budapest');
 ?column?
----------
 f
(1 row)

Yeah, we know what time zone it is in, but we don't know when it was, thanks a lot. :-(

It would be unambiguous to store the UTC offset along with the value. But that is not how it was implemented.
Re: [PERFORM] [ADMIN] Messed up time zones
On 08/03/2012 10:21 AM, Laszlo Nagy wrote:

> Yes, I know the zone. But I don't know the offset from UTC. Example:
>
> template1=> select ('2011-10-30 00:00:00'::timestamptz) at time zone 'Europe/Budapest';
>       timezone
> ---------------------
>  2011-10-30 02:00:00    <-- Is it winter or summer time? What is the offset
>                             from UTC here? Can you tell me when it was in UTC?
> (1 row)

I cannot from the given information. Can you? The given information is ambiguous, as are all times during the hour of fall-back, everywhere. That leaves developers with a choice: choose an interpretation or throw an error. PostgreSQL chooses to use an interpretation.

It would be nice if there were a specification as to how such ambiguous data should be interpreted. Perhaps someone can point me to one, and to any relevant documentation detailing how PostgreSQL handles such data. As it is, you need to be aware of how each part of your system deals with it. For example (using my local time zone), the date command on Linux:

date -d '2012-11-04 0130'

returns

Sun Nov 4 01:30:00 PDT 2012 (still in Daylight Saving Time)

But given the same input, PostgreSQL interprets it as standard time (offset -08):

select '2012-11-04 0130'::timestamptz;
      timestamptz
------------------------
 2012-11-04 01:30:00-08

> What is more:
>
> template1=> select (('2011-10-30 00:00:00'::timestamptz) at time zone 'Europe/Budapest')
>             is distinct from (('2011-10-30 01:00:00'::timestamptz) at time zone 'Europe/Budapest');
>  ?column?
> ----------
>  f
> (1 row)
>
> Yeah, we know what time zone it is in, but we don't know when it was,
> thanks a lot. :-(
>
> It would be unambiguous to store the UTC offset along with the value.
> But that is not how it was implemented.

So you took two distinct points in time, threw away some critical information, and are surprised that they are now equal? Then don't do that. It's the equivalent of being surprised that www.microsoft.com is the same as www.apple.com when comparing them on the short hostname only. If you want to know if two points in time differ, just compare them.

Spending a couple of hours reading http://www.postgresql.org/docs/current/static/datatype-datetime.html will be time well spent.

Cheers,
Steve
Re: [PERFORM] [ADMIN] Messed up time zones
> So you took two distinct points in time, threw away some critical
> information, and are surprised that they are now equal?

Well, I did not want to throw away any information. The actual representation could be something like "2012-11-04 01:30:00-08 in Europe/Budapest, winter time" and "2012-11-04 01:30:00-08 in Europe/Budapest, summer time". That would be unambiguous: everybody would know the time zone, the UTC offset, and the time value, and conversion back to UTC would be unambiguous too. I presumed that the representation was like that. But I was wrong.

I have checked other programming languages. As it turns out, nobody wants to change the representation just because there can be an ambiguous hour in every year. Now I think that most systems treat ambiguous timestamps as if they were in standard time. And who am I to go against the main flow? I'm sorry; I admit that the problem was in my head.
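For comparison, one language that does keep the distinction: Python resolves the ambiguous fall-back hour explicitly via the PEP 495 `fold` attribute. This is a sketch of the same Budapest example, independent of PostgreSQL's behaviour (requires Python 3.9+ for `zoneinfo`):

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # stdlib since Python 3.9

tz = ZoneInfo("Europe/Budapest")

# 02:30 local time occurs twice on 2011-10-30: first in summer time
# (CEST, UTC+2), then again an hour later in winter time (CET, UTC+1).
summer = datetime(2011, 10, 30, 2, 30, tzinfo=tz)          # fold=0: first pass
winter = datetime(2011, 10, 30, 2, 30, fold=1, tzinfo=tz)  # fold=1: second pass

print(summer.utcoffset())  # 2:00:00
print(winter.utcoffset())  # 1:00:00
```

So the ambiguity Laszlo describes is real; systems just differ in whether they expose a disambiguation knob or silently pick one interpretation.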
Re: [PERFORM] Linux memory zone reclaim
On 7/30/12 10:09 AM, Scott Marlowe wrote:

> I think the zone_reclaim gets turned on with a high ratio. If the
> inter-node costs were the same, and the intra-node costs dropped in
> half, zone reclaim would likely get turned on at boot time.

We've been seeing a major problem with zone_reclaim and Linux, in that Linux won't use the FS cache on the distant RAM *at all* if it thinks that RAM is distant enough. Thus, you get instances of seeing only half of RAM used for FS cache, even though the database is 5X larger than RAM.

This is poor design on Linux's part, since even the distant RAM is faster than disk. For now, we've been disabling zone_reclaim entirely.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
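For reference, checking and disabling the setting looks like this (standard sysctl paths; the knob only exists on NUMA-aware kernels, so this is a sketch, not a thread-endorsed procedure):

```shell
# Show the current mode: 0 = off, non-zero = some form of zone reclaim.
# Non-NUMA kernels may not expose the file at all.
if [ -r /proc/sys/vm/zone_reclaim_mode ]; then
    cat /proc/sys/vm/zone_reclaim_mode
else
    echo "zone_reclaim_mode not exposed by this kernel"
fi

# To disable at runtime (as root):  sysctl -w vm.zone_reclaim_mode=0
# To persist across reboots, add the line
#     vm.zone_reclaim_mode = 0
# to /etc/sysctl.conf.
```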
Re: [PERFORM] Linux memory zone reclaim
On Fri, Aug 3, 2012 at 4:30 PM, Josh Berkus j...@agliodbs.com wrote:

> We've been seeing a major problem with zone_reclaim and Linux, in that
> Linux won't use the FS cache on the distant RAM *at all* if it thinks
> that RAM is distant enough. Thus, you get instances of seeing only half
> of RAM used for FS cache, even though the database is 5X larger than RAM.
>
> This is poor design on Linux's part, since even the distant RAM is
> faster than disk. For now, we've been disabling zone_reclaim entirely.

I haven't run into this, but we were running Ubuntu 10.04 LTS. What kernel were you running when this happened? I'd love to see a test case on this, as it seems like a major regression if it's on newer kernels, and we're looking at running 12.04 LTS soon on one of our bigger machines.
Re: [PERFORM] Linux memory zone reclaim
>> This is poor design on Linux's part, since even the distant RAM is
>> faster than disk. For now, we've been disabling zone_reclaim entirely.
>
> I haven't run into this, but we were running Ubuntu 10.04 LTS. What
> kernel were you running when this happened? I'd love to see a test case
> on this, as it seems like a major regression if it's on newer kernels.

Jeff Frost will have a blog up about it later; we're still collecting data.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
[PERFORM] slow query, different plans
I'm having a problem with a query on our production server, but not on a laptop running a similar Postgres version with a recent backup copy of the same table. I tried reindexing the table on the production server, but it didn't make any difference. Other queries on the same table are plenty fast. This query has been slow before, but never like this, particularly during a period when there are only a couple of connections in use.

Vacuum and analyze are run nightly (and show as such in pg_stat_user_tables) in addition to autovacuum during the day. Here are my autovacuum settings, but when I checked last_autovacuum and last_autoanalyze in pg_stat_user_tables those fields were blank.

autovacuum = on
log_autovacuum_min_duration = 10
autovacuum_max_workers = 3
autovacuum_naptime = 1min
autovacuum_vacuum_threshold = 50
autovacuum_analyze_threshold = 50
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.1
autovacuum_freeze_max_age = 2
autovacuum_vacuum_cost_delay = 10ms (changed earlier today from 1000ms)
autovacuum_vacuum_cost_limit = -1
wal_level = minimal
wal_buffers = 16MB

The only recent change was moving the 3 databases we have from multiple RAID 1 drives, with tablespaces spread all over, to one large RAID 10 with indexes and data in pg_default. WAL for this table was moved as well.

Does anyone have any suggestions on where to look for the problem?
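The check described above can be run directly; something like this (table name taken from the post) shows whether autovacuum has ever touched the table:

```sql
-- NULL in last_autovacuum / last_autoanalyze means autovacuum has never
-- processed the table since its statistics were last reset.
SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'clientlog';
```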
clientlog table info:

Size: 1.94G

  Column  |            Type             | Modifiers
----------+-----------------------------+-----------
 pid0     | integer                     | not null
 rid      | integer                     | not null
 verb     | character varying(32)      | not null
 noun     | character varying(32)      | not null
 detail   | text                        |
 path     | character varying(256)     | not null
 ts       | timestamp without time zone |
 applies2 | integer                     |
 toname   | character varying(128)     |
 byname   | character varying(128)     |

Indexes:
 clientlog_applies2 btree (applies2)
 clientlog_pid0_key btree (pid0)
 clientlog_rid_key btree (rid)
 clientlog_ts btree (ts)

The query, hardware info, and links to both plans:

explain analyze select max(ts) as ts from clientlog where applies2=256;

Production server:
- 4 dual-core AMD Opteron 2212 processors, 2010.485 MHz
- 64GB RAM
- 464GB RAID10 drive
- Linux 2.6.18-164.el5 #1 SMP Thu Sep 3 03:28:30 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
- PostgreSQL 9.0.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit

http://explain.depesz.com/s/8R4

From the laptop running Linux 2.6.34.9-69.fc13.868 with 3GB RAM against a copy of the same table:

PostgreSQL 9.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.4.4 20100630 (Red Hat 4.4.4-10), 32-bit

http://explain.depesz.com/s/NQl

Thank you,
Midge
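As an aside not raised in the thread: for a max(ts) filtered on a single applies2 value, a two-column index is the usual remedy, since the answer can then be read off the end of one index range (the index name here is hypothetical and this is untested against the schema above):

```sql
-- Lets "select max(ts) ... where applies2 = 256" be answered by walking
-- one step backward from the end of the applies2 = 256 range of the index,
-- instead of choosing between the single-column applies2 and ts indexes.
CREATE INDEX clientlog_applies2_ts ON clientlog USING btree (applies2, ts);
```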
Re: [PERFORM] slow query, different plans
Midge --

Sorry for top-quoting -- challenged mail.

Perhaps a difference in the stats estimates -- default_statistics_target?

Can you show us a diff between the postgres config files for each instance? Maybe something there ...

Greg Williamson

________________________________
From: Midge Brown midg...@sbcglobal.net
To: pgsql-performance@postgresql.org
Sent: Friday, August 3, 2012 5:38 PM
Subject: [PERFORM] slow query, different plans

[original message quoted in full, trimmed]