Re: [PERFORM] Are bitmap index scans slow to start?
Sorry this took so long to get back to you. Here is where we were:

I said: So, this query is not called often, but the fact is that if it takes over 30 seconds to load an item (because the audit report takes so long to prepare the bitmap index scan when passed new query parameters) then it severely restricts how much data we can resurrect at any one time.

Your reply: Is that a restriction you have observed, or are you extrapolating based on a single query? If you run a bunch of similar queries in close succession, it is likely that the first few queries will warm up the cache, and following queries will then run much faster. Also, if you restructure the series of queries into a large one that reconstructs many rows simultaneously, it might choose a more efficient path than if it is fed the queries one at a time.

Actual observation. The first run with a new parameter actually takes 90 seconds. Another run with the same parameter takes 15-30 seconds. Running the query immediately afterwards with different parameters starts a new 90-second query.

Unfortunately, since going to Linux, our sys ops hiss and snarl at anyone who comes anywhere near machine or DB server configs, so I am no longer well informed on how well optimized the machines are. Ultimately, the machines need to be optimized by an expert.

As I mentioned before, our ETL is entirely single-row reads-and-writes (I didn't go into the why of this because the nature of the data and the product dictates this). And this is an example of one of the few complex joins that return hundreds/thousands of rows. The problem is that a full index scan has to be done before we can start building the results. So, if clustering will help such that the index scan KNOWS that there's no point in scanning the rest of the index because we've gone beyond the maximum value in the list of possible values, then that would help, as each table being scanned has 50 - 100 million rows (there is one table for every month of production).

As always, thanks.

From: Jeff Janes [mailto:jeff.ja...@gmail.com]
Sent: March 5, 2013 4:21 PM
To: Carlo Stonebanks
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Are bitmap index scans slow to start?

On Wed, Feb 27, 2013 at 1:38 PM, Carlo Stonebanks stonec.regis...@sympatico.ca wrote:

Is the original query you posted part of the transform process, rather than being the production query you run after the ETL is over?

Neither, it is part of our auditing and maintenance processes. It is not called with any great frequency. The audit report generates rows defining how a particular item (an item being a particular table/row) was created: it returns the names of the import tables, the row ids, the write operations and any transformation messages that may have been generated - all in the order they occurred.

...

So, this query is not called often, but the fact is that if it takes over 30 seconds to load an item (because the audit report takes so long to prepare the bitmap index scan when passed new query parameters) then it severely restricts how much data we can resurrect at any one time.

Is that a restriction you have observed, or are you extrapolating based on a single query? If you run a bunch of similar queries in close succession, it is likely that the first few queries will warm up the cache, and following queries will then run much faster.
Also, if you restructure the series of queries into a large one that reconstructs many rows simultaneously, it might choose a more efficient path than if it is fed the queries one at a time. Cheers, Jeff
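To illustrate Jeff's batching suggestion, a minimal sketch against the audit table from the original post - the session_id values 27, 31 and 42 are hypothetical examples, and the column list is trimmed; the point is that one pass warms the cache once instead of once per parameter:

-- Hedged sketch: instead of running the audit query once per session_id,
-- fetch several in one scan. The id values here are invented examples.
SELECT l.session_id, l.log_id, l.input_resource_pkey_id, l.target_pkey_id
FROM mdx_audit.log_2013_01 AS l
WHERE l.session_id = ANY (ARRAY[27, 31, 42])
ORDER BY l.session_id, l.log_id;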
Re: [PERFORM] Are bitmap index scans slow to start?
Could you use CLUSTER on the table after it had been closed off? If appropriate, that should make the queries run much faster, as related entries will be in the same or nearby blocks on disk.

Technically, yes. That would really help, but the issue is scheduling. Although the logs are closed off for writes, they aren't closed off for reads, ref PG documentation: "When a table is being clustered, an ACCESS EXCLUSIVE lock is acquired on it. This prevents any other database operations (both reads and writes) from operating on the table until the CLUSTER is finished."

Not ideal, but a lot better than doing nothing at all!
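For reference, a minimal sketch of what that scheduled maintenance could look like, using the monthly table and session index named elsewhere in the thread; the assumption that this runs in a quiet window (because of the ACCESS EXCLUSIVE lock) is mine:

-- Rewrite the closed-off January log in session_id order, then refresh stats.
-- Table and index names come from the thread; the maintenance window is assumed.
CLUSTER mdx_audit.log_2013_01 USING log_2013_01_session_idx;
ANALYZE mdx_audit.log_2013_01;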
Re: [PERFORM] Are bitmap index scans slow to start?
pgbouncer is more for making connections line up single-file when the line is moving at a very fast clip, say 0.01 second per turn. If I were trying to make tasks that can each last for hours or days line up and take turns, I don't think pgbouncer would be the way to go.

The recommendation at the time was assuming that write contention was slowing things down and consuming resources, since I can't stop people from creating big multi-threaded imports. Each import consists of about 50 writes.

Is the original query you posted part of the transform process, rather than being the production query you run after the ETL is over?

Neither, it is part of our auditing and maintenance processes. It is not called with any great frequency. The audit report generates rows defining how a particular item (an item being a particular table/row) was created: it returns the names of the import tables, the row ids, the write operations and any transformation messages that may have been generated - all in the order they occurred. You can imagine how useful this is in creating a document describing what happened and why.

The same data generated by the report is used to resurrect an item. If - for example - our business logic has changed, but the change only affects a small sub-set of our core data, then we perform a rollback (a logical cascading delete) on the affected items. Then we create a rebuild, which is a script that is generated to re-import ONLY the import table rows defined in the audit report.

So, this query is not called often, but the fact is that if it takes over 30 seconds to load an item (because the audit report takes so long to prepare the bitmap index scan when passed new query parameters) then it severely restricts how much data we can resurrect at any one time.
Re: [PERFORM] Are bitmap index scans slow to start?
I had thought you were saying that any one ETL procedure into one database used 14 concurrent threads. But really, each ETL procedure is single-threaded, and there can be up to 5 (or theoretically up to 14) of them running at a time into different databases?

Sorry, just caught this. Your first interpretation was correct. Each DB runs an ETL that can have up to 14 concurrent threads. I don't think the number should be that high, but the engineering team insists the load time is better this way than with fewer threads running faster.
Re: [PERFORM] Are bitmap index scans slow to start?
Is each of these write operations just covering a single row? Does this description apply to just one of the many (how many?) databases, so that there are really 14*N concurrent sessions?

All writes are single row. All DB's have exactly the same structure, only the content is different. Currently the server is hosting five active DB's - although there are 14 DB's actually on the host, the balance are backups and/or testing environments. When a feed comes in, it can be anything from dozens to millions of rows, and may take minutes or days to run. I had asked that pgbouncer be installed in front of the host to act as a traffic cop. Try as I may to convince the engineering team that fewer sessions running faster is optimal, they say that the 14 concurrent sessions is based on real-world experience of what imports the fastest.

You really need to know whether those reads and writes are concentrated in a small region (relative to the amount of your RAM), or widely scattered. If you are reading and writing intensively (which you do seem to be doing) but only within a compact region, then it should not drive other data out of the cache. But, since you do seem to have IO problems from cache misses, and you do have a high level of activity, the easy conclusion is that you have too little RAM to hold the working size of your data.

It won't be a problem of physical RAM; I believe there is at least 32GB of RAM. What constitutes a compact region? The ETL process takes the feed and distributes it to 85 core tables. I have been through many PG configuration cycles with the generous help of people in this forum. I think the big problem when getting help has been this issue of those offering assistance understanding that the whopping majority of the time, the system is performing single-row reads and writes. The assumption tends to be that the end point of an ETL should just be a series of COPY statements, and it should all happen very quickly in classic SQL bulk queries.
Re: [PERFORM] Are bitmap index scans slow to start?
Hi Jeff, thanks for the insight.

And then the next question would be, once they are in the cache, why don't they stay there? For that you would have to know what other types of activities are going on that might be driving the data out of the cache.

To give you an idea of the activity level, each physical machine hosts multiple DB's with the same structure - one DB per client. We run automated ETL processes which digest client feeds (E), normalize them (T) and then store them in our DB (L). Looking at the stats from our audit log, the average feed load is 4 hours, divided up into 14 client sessions. Each session averages about 50 write (update, insert, no deletes) operations per second, representing 700 write operations per second in total. The ratio of reads per write is pretty high as the system goes through the transformation process.

Since I don't know how this compares to other PG installations, the question of using periodic REINDEX and CLUSTER brings up these questions:

1) Because we are hosting multiple DB's, what is the impact on OS and disk caches?
2) Is there an automated CLUSTER and REINDEX strategy that will not interfere with normal operations?
3) By PG standards, is this a busy DB - and does that explain why the general caches expire?

Thanks, Carlo
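One way to put a number on question 3 is to look at PostgreSQL's own buffer-cache statistics. This is only a sketch of the kind of check that could help; the 20-row limit is an arbitrary choice:

-- Which tables are being read from disk rather than from shared buffers?
-- A low hit_ratio on the hot tables suggests the working set doesn't fit in cache.
SELECT relname,
       heap_blks_read,
       heap_blks_hit,
       round(heap_blks_hit::numeric
             / nullif(heap_blks_hit + heap_blks_read, 0), 3) AS hit_ratio
FROM pg_statio_user_tables
ORDER BY heap_blks_read DESC
LIMIT 20;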
Re: [PERFORM] Are bitmap index scans slow to start?
Hi Jeff, thanks for the reply.

What is going on during the interregnum? Whatever it is, it seems to be driving the log_2013_01_session_idx index out of the cache, but not the log_2013_01 table. (Or perhaps the table visit is getting the benefit of effective_io_concurrency?) Rebuilding the index might help, as it would put all the leaf pages holding values for session_id=27 adjacent to each other, so they would read from disk faster. But with a name like session_id, I don't know how long such clustering would last though.

Technically, nothing should be happening. We used to keep one massive audit log, and it was impossible to manage due to its size. We then changed to a strategy where every month a new audit log would be spawned, and since log_2013_01 represents January, the log should be closed and nothing should have changed (it is technically possible that a long-running process would spill over into February, but not by this much). So, assuming that it's stable, it should be a very good candidate for reindexing, no?

Our effective_io_concurrency is 1, and last I heard the PG host was a Linux 4-drive RAID10, so I don't know if there is any benefit to raising this number - and if there was any benefit, it would be to the Bitmap Scan, and the problem is the data building before the fact.

The bitmap itself doesn't get cached. But the data needed to construct the bitmap does get cached. It gets cached by the generic caching methods of PG and the OS, not through something specific to bitmaps.

This has always been a problem for me. I spend hours trying different strategies and think I've solved the problem, when in fact it seems like a cache has spun up, and then something else expires it and the problem is back. Is there a way around this problem, can I force the expiration of a cache?

Carlo
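If the January log really is stable, one way to rebuild its index without locking the table against reads is the build-alongside-then-swap approach. This is only a sketch: it assumes the existing index is a plain btree on session_id, and the "_new" name is a hypothetical placeholder; a plain REINDEX would also work if the read lock is acceptable.

-- Build a replacement index without blocking reads or writes, then swap it in.
CREATE INDEX CONCURRENTLY log_2013_01_session_idx_new
    ON mdx_audit.log_2013_01 (session_id);

DROP INDEX mdx_audit.log_2013_01_session_idx;
ALTER INDEX mdx_audit.log_2013_01_session_idx_new
    RENAME TO log_2013_01_session_idx;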
Re: [PERFORM] Are bitmap index scans slow to start?
A cool idea, but if I understand it correctly very specific and fussy. New DB's are spawned on this model, and all the developers would have to be aware of this non-standard behaviour, and DBAs would have to create these indexes every month, for every DB (as the log tables are created every month). There are 89 session_id values in the January log (log_2013_01) so this would quickly get out of control. But - like I said - an interesting idea for more specific challenges.

From: Marc Mamin [mailto:m.ma...@intershop.de]
Sent: February 21, 2013 2:41 PM
To: Jeff Janes; Carlo Stonebanks
Cc: pgsql-performance@postgresql.org
Subject: AW: [PERFORM] Are bitmap index scans slow to start?

Rebuilding the index might help, as it would put all the leaf pages holding values for session_id=27 adjacent to each other, so they would read from disk faster. But with a name like session_id, I don't know how long such clustering would last though. If I'm right about the index disk-read time, then switching to a plain index scan rather than a bitmap index scan would make no difference - either way the data has to come off the disk.

I'd prefer a strategy that allowed fast performance the first time, rather than slow the first time and extremely fast subsequently.

Hello, if the index is only used to locate rows for a single session_id, you may consider splitting it into a set of partial indexes, e.g.:

create index i_0 on foo (session_id) where session_id%4 = 0;
create index i_1 on foo (session_id) where session_id%4 = 1;
create index i_2 on foo (session_id) where session_id%4 = 2;
create index i_3 on foo (session_id) where session_id%4 = 3;

(can be built in parallel using separate threads)

Then you will have to ensure that all your WHERE clauses also contain the index condition:

WHERE session_id = 27 AND session_id%4 = 27%4

regards, Marc Mamin
Re: [PERFORM] Are bitmap index scans slow to start?
Also, you might try clustering newly created tables on session_id and setting the fillfactor down so rows with the same session id will stick together on disk.

My understanding of PG's cluster is that this is a one-time command that creates a re-ordered table and doesn't maintain the clustered order until the command is issued again. During the CLUSTER, the table is read and write locked. So, in order for me to use this I would need to set up a timed event to CLUSTER occasionally.

I can't really help, but I can make it more clear why postgres is choosing a _bitmap_ index scan rather than a regular index scan.

The EXPLAIN ANALYZE is showing it is taking a long time to prepare the bitmap, i.e.:

-> Bitmap Index Scan on log_2013_01_session_idx (cost=0.00..63186.52 rows=2947664 width=0) (actual time=32611.918..32611.918 rows=2772042 loops=1)
     Index Cond: (session_id = 27)

The bitmap scan is actually very fast. Jeff says that the bitmap is not cached, so I will assume the PG general caches being created are of general use. I think what I need to do is figure out:

1) Why does it take 36 seconds to set up the general index caches?
2) What can I do about it (what stats do I need to look at)?
3) How can I force these caches to expire so I can tell if the strategy worked?

From: Nikolas Everett [mailto:nik9...@gmail.com]
Sent: February 22, 2013 2:05 PM
To: Carlo Stonebanks
Cc: Marc Mamin; Jeff Janes; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Are bitmap index scans slow to start?

I can't really help, but I can make it more clear why postgres is choosing a _bitmap_ index scan rather than a regular index scan. With a regular index scan it pumps the index for the locations of the rows that it points to and loads those rows as it finds them. This works great if the rows in the index are sorta sorted - that way it isn't jumping around the table randomly. Random IO is slow. In a bitmap index scan pg pumps the index and buffers the results by shoving them in a big bitmap. Then, it walks the bitmap in order to produce in-order IO. PG makes the choice based on a measure of the index's correlation.

The problem comes down to you inserting the sessions concurrently with one another. My instinct would be to lower the FILLFACTOR on newly created indices so they can keep their entries more in order. I'm not sure why I have that instinct but it feels right. Also, you might try clustering newly created tables on session_id and setting the fillfactor down so rows with the same session id will stick together on disk. Now that I look stuff up on the internet I'm not sure where I saw that pg tries to maintain a cluster using empty space from FILLFACTOR but I _think_ it does. I'm not sure what is going on with my google foo today.

Nik

On Fri, Feb 22, 2013 at 12:50 PM, Carlo Stonebanks stonec.regis...@sympatico.ca wrote:

A cool idea, but if I understand it correctly very specific and fussy. New DB's are spawned on this model, and all the developers would have to be aware of this non-standard behaviour, and DBAs would have to create these indexes every month, for every DB (as the log tables are created every month). There are 89 session_id values in the January log (log_2013_01) so this would quickly get out of control. But - like I said - an interesting idea for more specific challenges.

From: Marc Mamin [mailto:m.ma...@intershop.de]
Sent: February 21, 2013 2:41 PM
To: Jeff Janes; Carlo Stonebanks
Cc: pgsql-performance@postgresql.org
Subject: AW: [PERFORM] Are bitmap index scans slow to start?
Rebuilding the index might help, as it would put all the leaf pages holding values for session_id=27 adjacent to each other, so they would read from disk faster. But with a name like session_id, I don't know how long such clustering would last though. If I'm right about the index disk-read time, then switching to a plain index scan rather than a bitmap index scan would make no difference - either way the data has to come off the disk.

I'd prefer a strategy that allowed fast performance the first time, rather than slow the first time and extremely fast subsequently.

Hello, if the index is only used to locate rows for a single session_id, you may consider splitting it into a set of partial indexes, e.g.:

create index i_0 on foo (session_id) where session_id%4 = 0;
create index i_1 on foo (session_id) where session_id%4 = 1;
create index i_2 on foo (session_id) where session_id%4 = 2;
create index i_3 on foo (session_id) where session_id%4 = 3;

(can be built in parallel using separate threads)

Then you will have to ensure that all your WHERE clauses also contain the index condition:

WHERE session_id = 27 AND session_id%4 = 27%4

regards, Marc Mamin
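For what it's worth, a minimal sketch of Nik's fillfactor/CLUSTER suggestion as it might be applied when the next monthly log table is created. The table name log_2013_02 follows the monthly naming pattern but is hypothetical here, and the fillfactor value is an arbitrary example:

-- Hypothetical setup for a new monthly log: leave free space in the heap and
-- index so later rows for a session can land near each other, and seed the
-- physical order once with CLUSTER. Fillfactor values are examples only.
ALTER TABLE mdx_audit.log_2013_02 SET (fillfactor = 70);
CREATE INDEX log_2013_02_session_idx
    ON mdx_audit.log_2013_02 (session_id) WITH (fillfactor = 70);
CLUSTER mdx_audit.log_2013_02 USING log_2013_02_session_idx;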
[PERFORM] Are bitmap index scans slow to start?
(Sorry moderators for any double posts, I keep making subscription errors. Hopefully this one gets through.)

Hi speed freaks,

Can anyone tell me why the bitmap heap scan takes so long to start for this query? (SQL and EXPLAIN ANALYZE follow.) The big culprit in this appears to be:

-> Bitmap Index Scan on log_2013_01_session_idx (cost=0.00..63186.52 rows=2947664 width=0) (actual time=32611.918..32611.918 rows=2772042 loops=1)
     Index Cond: (session_id = 27)

I can't see anything that occurs between actual time 0.0..32611.918 that this could be waiting on. Is it building the bitmap? Running the query a second time yields this:

-> Bitmap Index Scan on log_2013_01_session_idx (cost=0.00..63186.52 rows=2947664 width=0) (actual time=2896.601..2896.601 rows=2772042 loops=1)
     Index Cond: (session_id = 27)

Does the bitmap then get cached? These queries are built dynamically and called rarely, so their first-time performance is important. I'd prefer a strategy that allowed fast performance the first time, rather than slow the first time and extremely fast subsequently.

Thanks, Carlo

SELECT
   l.session_id,
   l.log_id,
   s.session_type_code,
   coalesce(st.name, '?' || s.session_type_code || '?') AS session_type_name,
   l.input_resource_id,
   ir.impt_schema AS input_resource_table_schema,
   ir.impt_table AS input_resource_table_name,
   ir.resource AS input_resource_name,
   l.input_resource_pkey_id,
   tar_table.table_schema,
   tar_table.table_name,
   l.target_pkey_id AS table_pkey_id,
   tar_op.name AS operation,
   tar_note.name AS note
FROM mdx_audit.log_2013_01 AS l
JOIN mdx_audit.session AS s USING (session_id)
JOIN mdx_audit.target_table AS tar_table USING (target_table_id)
JOIN mdx_audit.target_operation_type AS tar_op USING (target_operation_type_code)
LEFT OUTER JOIN mdx_audit.target_note AS tar_note USING (target_note_id)
LEFT OUTER JOIN mdx_audit.session_type AS st USING (session_type_code)
LEFT OUTER JOIN mdx_core.input_resource AS ir USING (input_resource_id)
WHERE
   l.session_id = 27
   AND (
      input_resource_pkey_id IS NULL
      OR input_resource_pkey_id IN (
         494568472, 494568473, 494568474, 494568475, 494568476, 494568477,
         494568478, 494568479, 494568480, 494568481, 494568482, 494568483,
         494568484, 494568485, 494568486, 494568487, 494568488, 494568489,
         494568490
      )
   )

Hash Left Join (cost=63191.88..853169.29 rows=92 width=2199) (actual time=34185.045..44528.710 rows=603 loops=1)
  Hash Cond: (l.input_resource_id = ir.input_resource_id)
  -> Hash Left Join (cost=63190.22..853165.68 rows=92 width=1377) (actual time=34184.963..44528.391 rows=603 loops=1)
       Hash Cond: (l.target_note_id = tar_note.target_note_id)
       -> Hash Join (cost=63189.07..853164.06 rows=92 width=1161) (actual time=34184.872..44528.167 rows=603 loops=1)
            Hash Cond: (l.target_operation_type_code = tar_op.target_operation_type_code)
            -> Nested Loop (cost=63188.00..853161.72 rows=92 width=1125) (actual time=34184.809..44527.884 rows=603 loops=1)
                 -> Nested Loop Left Join (cost=0.00..9.34 rows=1 width=65) (actual time=12.057..12.068 rows=1 loops=1)
                      Join Filter: (s.session_type_code = st.session_type_code)
                      -> Index Scan using session_pkey on session s (cost=0.00..8.27 rows=1 width=7) (actual time=6.847..6.850 rows=1 loops=1)
                           Index Cond: (session_id = 27)
                      -> Seq Scan on session_type st (cost=0.00..1.03 rows=3 width=70) (actual time=5.204..5.207 rows=3 loops=1)
                 -> Hash Join (cost=63188.00..853151.47 rows=92 width=1064) (actual time=34172.746..44515.696 rows=603 loops=1)
                      Hash Cond: (l.target_table_id = tar_table.target_table_id)
                      -> Bitmap Heap Scan on log_2013_01 l (cost=63186.57..853148.39 rows=194 width=34) (actual time=34172.631..44515.318 rows=603 loops=1)
                           Recheck Cond: (session_id = 27)
                           Filter: ((input_resource_pkey_id IS NULL) OR (input_resource_pkey_id = ANY ('{494568472,494568473,494568474,494568475,494568476,494568477,494568478,494568479,494568480,494568481,494568482,494568483,494568484,494568485,494568486,494568487,494568488,494568489,494568490}'::bigint[])))
                           -> Bitmap Index Scan on log_2013_01_session_idx (cost=0.00..63186.52 rows=2947664 width=0) (actual time=32611.918..32611.918 rows=2772042 loops=1)
                                Index Cond: (session_id = 27)
                      -> Hash (cost=1.19..1.19 rows=19 width=1034) (actual time=0.059..0.059 rows=44 loops=1)
                           Buckets: 1024 Batches: 1 Memory Usage: 4kB
                           -> ...
Re: [PERFORM] pl/pgsql functions outperforming sql ones?
Update: The main stored function in question and all of its sub-functions were recoded to new pure SQL functions. I then stub-tested the sub-functions, SQL vs. plpgsql. Here were the results for new SQL vs. old plpgsql:

Individual sub-functions tested 20-30% faster, but the main function calling the new SQL sub-functions ran 100% slower.

So I tried this: I modified the old plpgsql function to call the new SQL sub-functions. THAT ran 20-30% faster than the unmodified version. That modified function is listed below. All the functions ending in 2 are the new SQL versions.

Any thoughts or insight would be much appreciated.

Carlo

CREATE OR REPLACE FUNCTION mdx_lib.lex_compare_candidate3(character varying, character varying)
  RETURNS numeric AS
$BODY$
/*
   Rate two strings' candidacy for lex_compare.
   param 1: first string to compare
   param 2: 2nd string to compare
   returns: numeric result like mdx_lib.lex_distance
            0 is a failure, 1 a perfect match
*/
declare
   str1 varchar = $1;
   str2 varchar = $2;
   acro1 varchar;
   acro2 varchar;
   str_dist numeric;
   acro_dist numeric;
   result numeric;
begin
   if str1 = str2 then
      result = 0;
   else
      str1 = lower(regexp_replace(str1, '[^[:alnum:]]', '', 'g'));
      str2 = lower(regexp_replace(str2, '[^[:alnum:]]', '', 'g'));
      if str1 = str2 then
         result = 0.1;
      else
         str_dist = mdx_lib.lex_distance2(str1, str2);
         acro1 = mdx_lib.lex_acronym2(str1);
         acro2 = mdx_lib.lex_acronym2(str2);
         acro_dist = mdx_lib.lex_distance2(acro1, acro2);
         result = (acro_dist + (str_dist * 2)) / 2;
      end if;
   end if;

   result = 1 - result;
   if result < 0 then
      result = 0;
   end if;
   return result;
end;
$BODY$
  LANGUAGE plpgsql IMMUTABLE
  COST 100;

-----Original Message-----
From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Pavel Stehule
Sent: January 28, 2012 1:38 AM
To: Carlo Stonebanks
Cc: Merlin Moncure; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] pl/pgsql functions outperforming sql ones?

2012/1/27 Carlo Stonebanks stonec.regis...@sympatico.ca:

Yes, I did test it - i.e. I ran the functions on their own as I had always noticed a minor difference between EXPLAIN ANALYZE results and direct query calls. Interesting, so SQL functions DON'T cache plans? Will plan-caching be of any benefit to SQL that makes no reference to any tables? The SQL is emulating the straight non-set-oriented procedural logic of the original plpgsql.

It is usually not necessary - simple SQL functions are merged into the outer query. There are a few cases where this optimization cannot be done, and then there is a performance loss. For example, this optimization is (sometimes) not possible when some parameter is volatile.

Regards, Pavel Stehule
Re: [PERFORM] pl/pgsql functions outperforming sql ones?
Pavel, thank you very much for your explanation. Is it possible to define under what conditions sql procs will outperform plpgsql ones, and vice-versa?

-----Original Message-----
From: Pavel Stehule [mailto:pavel.steh...@gmail.com]
Sent: January 30, 2012 2:57 AM
To: Carlo Stonebanks
Cc: Merlin Moncure; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] pl/pgsql functions outperforming sql ones?

Hello

2012/1/30 Carlo Stonebanks stonec.regis...@sympatico.ca:

Pavel, are you saying that the code of the stored function is actually being added to the SQL query, instead of a call to it? For example, I have seen this:

SELECT myVar FROM myTable WHERE myVar > 0 AND myFunc(myVar)

And seen the SQL body of myFunc appended to the outer query:

... Filter: SELECT CASE WHEN myVar > 10 THEN true ELSE false END

Is this what we are talking about? Two questions:

yes - it is SQL function inlining

1) Is this also done when the function is called as a SELECT column; e.g. would:

SELECT myFunc(myVar) AS result

become:

SELECT (SELECT CASE WHEN myVar > 10 THEN true ELSE false END) AS result?

yes

CREATE OR REPLACE FUNCTION public.fx(integer, integer)
 RETURNS integer
 LANGUAGE sql
AS $function$
select coalesce($1, $2)
$function$

postgres=# explain verbose select fx(random()::int, random()::int);
                          QUERY PLAN
--------------------------------------------------------------
 Result  (cost=0.00..0.02 rows=1 width=0)
   Output: COALESCE((random())::integer, (random())::integer)
(2 rows)

2) Does that not bypass the benefits of IMMUTABLE?

no - the optimizer works with the expanded query. The usually preferred style is writing SQL functions without flags, because the optimizer can work with the definition of the SQL function and can set the flags well itself. A SQL function is not a black box for the optimizer like plpgsql is, and the SQL optimizer chooses inlining or some other optimization. Sometimes explicit flags are necessary, but usually not for scalar SQL functions.

postgres=# create or replace function public.fxs(int)
postgres-# returns setof int as $$
postgres$#   select * from generate_series(1,$1)
postgres$# $$ language sql;
CREATE FUNCTION
postgres=# explain verbose select * from fxs(10);
                            QUERY PLAN
-------------------------------------------------------------------
 Function Scan on public.fxs  (cost=0.25..10.25 rows=1000 width=4)
   Output: fxs
   Function Call: fxs(10)
(3 rows)

postgres=# create or replace function public.fxs(int)
returns setof int as $$
  select * from generate_series(1,$1)
$$ language sql IMMUTABLE;
CREATE FUNCTION
postgres=# explain verbose select * from fxs(10);
                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Function Scan on pg_catalog.generate_series  (cost=0.00..10.00 rows=1000 width=4)
   Output: generate_series.generate_series
   Function Call: generate_series(1, 10)   -- inlined query
(3 rows)

Regards, Pavel Stehule

-----Original Message-----
From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Pavel Stehule
Sent: January 28, 2012 1:38 AM
To: Carlo Stonebanks
Cc: Merlin Moncure; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] pl/pgsql functions outperforming sql ones?

2012/1/27 Carlo Stonebanks stonec.regis...@sympatico.ca:

Yes, I did test it - i.e. I ran the functions on their own as I had always noticed a minor difference between EXPLAIN ANALYZE results and direct query calls. Interesting, so SQL functions DON'T cache plans? Will plan-caching be of any benefit to SQL that makes no reference to any tables? The SQL is emulating the straight non-set-oriented procedural logic of the original plpgsql.
It is usually not necessary - simple SQL functions are merged into the outer query. There are a few cases where this optimization cannot be done, and then there is a performance loss. For example, this optimization is (sometimes) not possible when some parameter is volatile.

Regards, Pavel Stehule
Re: [PERFORM] pl/pgsql functions outperforming sql ones?
Pavel, are you saying that the code of the stored function is actually being added to the SQL query, instead of a call to it? For example, I have seen this:

SELECT myVar FROM myTable WHERE myVar > 0 AND myFunc(myVar)

And seen the SQL body of myFunc appended to the outer query:

... Filter: SELECT CASE WHEN myVar > 10 THEN true ELSE false END

Is this what we are talking about? Two questions:

1) Is this also done when the function is called as a SELECT column; e.g. would:

SELECT myFunc(myVar) AS result

become:

SELECT (SELECT CASE WHEN myVar > 10 THEN true ELSE false END) AS result?

2) Does that not bypass the benefits of IMMUTABLE?

-----Original Message-----
From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Pavel Stehule
Sent: January 28, 2012 1:38 AM
To: Carlo Stonebanks
Cc: Merlin Moncure; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] pl/pgsql functions outperforming sql ones?

2012/1/27 Carlo Stonebanks stonec.regis...@sympatico.ca:

Yes, I did test it - i.e. I ran the functions on their own as I had always noticed a minor difference between EXPLAIN ANALYZE results and direct query calls. Interesting, so SQL functions DON'T cache plans? Will plan-caching be of any benefit to SQL that makes no reference to any tables? The SQL is emulating the straight non-set-oriented procedural logic of the original plpgsql.

It is usually not necessary - simple SQL functions are merged into the outer query. There are a few cases where this optimization cannot be done, and then there is a performance loss. For example, this optimization is (sometimes) not possible when some parameter is volatile.

Regards, Pavel Stehule
Re: [PERFORM] pl/pgsql functions outperforming sql ones?
Yes, I did test it - i.e. I ran the functions on their own as I had always noticed a minor difference between EXPLAIN ANALYZE results and direct query calls. Interesting, so SQL functions DON'T cache plans? Will plan-caching be of any benefit to SQL that makes no reference to any tables? The SQL is emulating the straight non-set-oriented procedural logic of the original plpgsql.

-----Original Message-----
From: Merlin Moncure [mailto:mmonc...@gmail.com]
Sent: January 27, 2012 10:47 AM
To: Carlo Stonebanks
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] pl/pgsql functions outperforming sql ones?

On Thu, Jan 26, 2012 at 6:09 PM, Carlo Stonebanks stonec.regis...@sympatico.ca wrote:

Assuming there was some sort of cost to pl/pgsql, I rewrote a bunch of stored functions in straight SQL. Each stored proc was calling the next, so to get the full effect I had to track down all the pl/pgsql stored functions and convert them to sql. However, I was surprised to find after all of the rewrites, the LANGUAGE sql procs caused the queries to run slower than the LANGUAGE plpgsql.

One reason that plpgsql can outperform sql functions is that plpgsql caches plans. That said, I don't think that's what's happening here. Did you confirm the performance difference outside of EXPLAIN ANALYZE? In particular cases EXPLAIN ANALYZE can skew times, either by injecting time calls or in how it discards results.

merlin
Re: [PERFORM] pl/pgsql functions outperforming sql ones?
Was I even right in thinking I would gain any performance by converting to SQL?

-----Original Message-----
From: Deron [mailto:fecas...@gmail.com]
Sent: January 27, 2012 2:29 PM
To: Carlo Stonebanks
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] pl/pgsql functions outperforming sql ones?

You can use PREPARE... EXECUTE to cache the plan (as well as parsing). However, I find it unlikely this would explain the loss in performance you experienced.

Deron

On Fri, Jan 27, 2012 at 11:36 AM, Carlo Stonebanks stonec.regis...@sympatico.ca wrote:

Yes, I did test it - i.e. I ran the functions on their own as I had always noticed a minor difference between EXPLAIN ANALYZE results and direct query calls. Interesting, so SQL functions DON'T cache plans? Will plan-caching be of any benefit to SQL that makes no reference to any tables? The SQL is emulating the straight non-set-oriented procedural logic of the original plpgsql.

-----Original Message-----
From: Merlin Moncure [mailto:mmonc...@gmail.com]
Sent: January 27, 2012 10:47 AM
To: Carlo Stonebanks
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] pl/pgsql functions outperforming sql ones?

On Thu, Jan 26, 2012 at 6:09 PM, Carlo Stonebanks stonec.regis...@sympatico.ca wrote:

Assuming there was some sort of cost to pl/pgsql, I rewrote a bunch of stored functions in straight SQL. Each stored proc was calling the next, so to get the full effect I had to track down all the pl/pgsql stored functions and convert them to sql. However, I was surprised to find after all of the rewrites, the LANGUAGE sql procs caused the queries to run slower than the LANGUAGE plpgsql.

One reason that plpgsql can outperform sql functions is that plpgsql caches plans. That said, I don't think that's what's happening here. Did you confirm the performance difference outside of EXPLAIN ANALYZE? In particular cases EXPLAIN ANALYZE can skew times, either by injecting time calls or in how it discards results.

merlin
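As a minimal sketch of the PREPARE/EXECUTE approach Deron mentions - the statement name is arbitrary, and the myTable/myVar/myFunc names are just the illustrative placeholders already used in this thread:

-- Parse and plan once per session, then reuse the plan on each call.
PREPARE my_query (integer) AS
   SELECT myVar FROM myTable WHERE myVar > $1 AND myFunc(myVar);

EXECUTE my_query(0);
EXECUTE my_query(10);

DEALLOCATE my_query;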
[PERFORM] pl/pgsql functions outperforming sql ones?
Assuming there was some sort of cost to pl/pgsql, I rewrote a bunch of stored functions in straight SQL. Each stored proc was calling the next, so to get the full effect I had to track down all the pl/pgsql stored functions and convert them to sql. However, I was surprised to find after all of the rewrites, the LANGUAGE sql procs caused the queries to run slower than the LANGUAGE plpgsql.

None of the stored functions selected from tables; they operated on and returned scalar values - it was all assign variables, if/then/else - not even any looping.

For those who need the dirty details, here they are. If you happen to think this behavior is expected, I needn't bore you - just let me know!

Thanks, Carlo

This was all triggered during the optimization of a query like this:

SELECT myVar FROM myTable WHERE myFunc(myVar);

Looking at EXPLAIN ANALYZE I saw something like this:

Filter: myFunc(myVar)

I rewrote the body of myFunc(myVar) something like this:

SELECT CASE WHEN myVar IS NULL THEN false ELSE myOtherFunc(myVar) END

When I reran EXPLAIN ANALYZE I got this:

Filter: SELECT CASE WHEN myVar IS NULL THEN false ELSE myOtherFunc(myVar) END

Nice. So, I did the same treatment to myOtherFunc() (converted to straight sql) but the EXPLAIN ANALYZE didn't change (reasonable, I guess - how deep would I expect it to go?) All of the procs were IMMUTABLE.

I was very surprised to find that the query now ran slower by a factor of 4.
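For readers following along, here is a hedged sketch of the kind of rewrite being described. The function names, argument type and bodies are invented stand-ins (the real functions are private domain logic); they only show the plpgsql original versus the straight-SQL form the planner can inline:

-- Hypothetical plpgsql original: pure scalar logic, no table access.
CREATE OR REPLACE FUNCTION myFunc(myVar integer) RETURNS boolean AS $$
begin
   if myVar is null then
      return false;
   end if;
   return myOtherFunc(myVar);
end;
$$ LANGUAGE plpgsql IMMUTABLE;

-- Hypothetical straight-SQL rewrite: the body can be folded directly
-- into the calling query's filter expression.
CREATE OR REPLACE FUNCTION myFunc(myVar integer) RETURNS boolean AS $$
   SELECT CASE WHEN $1 IS NULL THEN false ELSE myOtherFunc($1) END;
$$ LANGUAGE sql IMMUTABLE;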
[PERFORM] Performance costs of various PL languages
We are currently using pltclu as our PL of choice AFTER plpgsql. I'd like to know if anyone can comment on the performance costs of the various PL languages BESIDES C. For example, does pltclu instantiate faster than pltcl (presumably because it uses a shared interpreter)? Is Perl more lightweight?

I know that everything depends on context - what you are doing with it, e.g. choose Tcl for string handling vs. Perl for number crunching - but for those who know about this, is there a clear performance advantage for any of the various PL languages - and if so, is the difference big enough to be worth switching?

I ask this because I had expected to see pl/pgsql as a clear winner in terms of performance over pltclu, but my initial test showed the opposite. I know this may be an apples vs oranges problem and I will test further, but if anyone has any advice or insight, I would appreciate it so I can tailor my tests accordingly.

Thanks, Carlo
Re: [PERFORM] Performance costs of various PL languages
Thanks guys. Ah, PL/Java - of course. I would miss writing the code right in the SQL script, but that would have been true of C as well.

None of these procedures really qualify as stored procs that move data; rather they are scalar functions used for fuzzy string comparisons based on our own domain logic - imagine something like:

SELECT * FROM fathers AS f, sons AS s WHERE same_name(f.last_name, s.last_name)

... where same_name has business logic that corrects for O'reilly vs oreilly, Van De Lay vs Vandelay, etc. The point is that as we learn about the domain, we would add the rules into the function same_name() so that all apps would benefit from the new rules.

Some of the functions are data-driven, for example a table of common abbreviations with regex or LIKE expressions that would be run against both strings so that each string is reduced to common abbreviations (i.e. lowest common denominator) then compared, e.g.:

SELECT * FROM companies AS c WHERE same_business_name(c.name, 'ACME Business Supplies, Incorporated')

This would reduce both parameters down to the most common abbreviation and then compare again with fuzzy logic. Of course, even if this was written in C, the function would be data-bound as it reads from the abbreviation table - unless you guys tell me that there is a not inconsiderable cost involved in type conversion from PG to internal vars.

Carlo

-----Original Message-----
From: Merlin Moncure [mailto:mmonc...@gmail.com]
Sent: December 27, 2011 5:54 PM
To: Pavel Stehule
Cc: Carlo Stonebanks; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Performance costs of various PL languages

On Tue, Dec 27, 2011 at 4:20 PM, Pavel Stehule pavel.steh...@gmail.com wrote:

Hello

2011/12/27 Carlo Stonebanks stonec.regis...@sympatico.ca:

We are currently using pltclu as our PL of choice AFTER plpgsql. I'd like to know if anyone can comment on the performance costs of the various PL languages BESIDES C. For example, does pltclu instantiate faster than pltcl (presumably because it uses a shared interpreter)? Is Perl more lightweight? I know that everything depends on context - what you are doing with it, e.g. choose Tcl for string handling vs. Perl for number crunching - but for those who know about this, is there a clear performance advantage for any of the various PL languages - and if so, is the difference big enough to be worth switching? I ask this because I had expected to see pl/pgsql as a clear winner in terms of performance over pltclu, but my initial test showed the opposite. I know this may be an apples vs oranges problem and I will test further, but if anyone has any advice or insight, I would appreciate it so I can tailor my tests accordingly.

Performance strongly depends on the use case. PL/pgSQL has a fast start, but every expression is evaluated as a simple SQL expression - so some repeated operations can be very expensive (array update, string update). PL/pgSQL is best as SQL glue. A plus for performance is the type compatibility between plpgsql and Postgres. The plpgsql interpreter is very simple - there are more or less zero optimizations - so plpgsql code should be minimalistic, but when you don't do something really wrong, its speed is comparable with PHP.

http://www.pgsql.cz/index.php/PL/pgSQL_%28en%29#Inappropriate_use_of_the_PL.2FpgSQL_language

PL/Perl has a slower start - but string and array operations are very fast. Perl has its own expression evaluator - faster than expression evaluation in plpgsql.
On the other hand, any input must be transformed from Postgres format to Perl format and any result must be transformed back. Perl and other languages don't use data types compatible with Postgres.

One big advantage pl/pgsql has over scripting languages is that it understands postgresql types natively. It knows what a postgres array is, and can manipulate one directly. pl/perl would typically have to have the database convert it to a string, parse it into a perl structure, do the manipulation, then send it to the database to be parsed again. If your procedure code is mainly moving data between tables and doing minimal intermediate heavy processing, this adds up to a big advantage.

Which pl to go with really depends on what you need to do. pl/pgsql is always my first choice though. perl and tcl are not particularly fast languages in the general case -- you are largely at the mercy of how well the language's syntax or library features map to the particular problem you're solving. if you need a fast general purpose language in the backend and are (very understandably) skeptical about C, I'd look at pl/java.

merlin
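Purely as an illustration of the kind of scalar, data-driven comparison function being discussed - the abbreviation table, its columns and the matching rule below are invented placeholders, not the real domain logic:

-- Hypothetical sketch: normalize both names against an abbreviation table,
-- then compare the reduced forms. Table and column names are assumptions.
CREATE OR REPLACE FUNCTION same_business_name(name1 text, name2 text)
RETURNS boolean AS $$
declare
   n1 text := lower(regexp_replace($1, '[^[:alnum:] ]', '', 'g'));
   n2 text := lower(regexp_replace($2, '[^[:alnum:] ]', '', 'g'));
   r record;
begin
   for r in select pattern, abbrev from abbreviation_rules loop
      n1 := regexp_replace(n1, r.pattern, r.abbrev, 'g');
      n2 := regexp_replace(n2, r.pattern, r.abbrev, 'g');
   end loop;
   return n1 = n2;   -- the real function would apply fuzzy matching here
end;
$$ LANGUAGE plpgsql STABLE;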
[PERFORM] Composite keys
Excuse the noob question, I couldn't find any reading material on this topic.

Let's say my_table has two fields, pkey_id and another_id. The primary key is pkey_id and of course indexed. Then someone adds a composite index on btree(pkey_id, another_id).

Question 1) Is there any benefit to having pkey_id in the second index (assuming the index was created to satisfy some arbitrary WHERE clause)?

Question 2) Regardless of the answer to Question 1 - if another_id is not guaranteed to be unique, whereas pkey_id is - is there any value to changing the order of declaration (more generally, is there a performance impact for column ordering in btree composite keys?)

Thanks, Carlo
Re: [PERFORM] Composite keys
Thanks Dave and Claudio. Unfortunately, my specific example had a primary key in it (based on a real-world case) but this kind of distracted from the general point. So with PG I will stick to the general SQL rule that IF I use compound keys then I keep the most selective columns to the left, correct?

From: Dave Crooke [mailto:dcro...@gmail.com]
Sent: October 11, 2011 9:28 PM
To: Claudio Freire
Cc: Carlo Stonebanks; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Composite keys

Claudio is on point, I'll be even more pointed.

If pkey_id truly is a primary key in the database sense of the term, and thus unique, then IIUC there is no circumstance in which your composite index would ever even get used ... all it's doing is slowing down writes :-) If the query is sufficiently selective on pkey_id to merit using an index, then the planner will use the primary key index, because it's narrower; if not, then the only other option is to do a full table scan because there is no index of which another_id is a prefix.

There are only a few options which make sense:

1. No additional indexes, just the primary key
2. An additional index on (another_id)
3. An additional index on (another_id, pkey_id)
4. Both 2. and 3.

Choosing between these depends on a lot of variables of the query mix in practice ... you could set up both 2. and 3. and then see which indexes the planner actually uses in practice and then decide which to keep. The value in having pkey_id in the index in 3. is for queries whose primary selectivity is on another_id, but which also have some selectivity on pkey_id: the planner can use an index scan to filter candidate rows / blocks to look at. This is especially helpful if another_id is not very selective and / or the rows are quite wide. On gut feel, it seems unlikely that you'd have a real-world circumstance in which it makes sense to choose option 4, but it can't be ruled out without further context.

Cheers
Dave

On Tue, Oct 11, 2011 at 7:52 PM, Claudio Freire klaussfre...@gmail.com wrote:

On Tue, Oct 11, 2011 at 5:16 PM, Carlo Stonebanks stonec.regis...@sympatico.ca wrote:

Question 2) Regardless of the answer to Question 1 - if another_id is not guaranteed to be unique, whereas pkey_id is - is there any value to changing the order of declaration (more generally, is there a performance impact for column ordering in btree composite keys?)

Multicolumn indices on (c1, c2, ..., cn) can only be used on where clauses involving c1..ck with k <= n. So, an index on (a,b) does *not* help for querying on b. Furthermore, if a is unique, querying on a or querying on a and b is equally selective; b in there is just consuming space and cpu cycles. I'd say, although it obviously depends on the queries you issue, you only need an index on another_id.
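A minimal sketch of the options Dave lists, using the hypothetical my_table/pkey_id/another_id names from the original question (column types are assumed):

-- The table from the question (hypothetical names and types).
CREATE TABLE my_table (
    pkey_id    bigint PRIMARY KEY,   -- already gives a btree index on pkey_id
    another_id bigint
);

-- Option 2: index for queries that filter on another_id alone.
CREATE INDEX my_table_another_id_idx ON my_table (another_id);

-- Option 3: another_id first, pkey_id second, for queries that filter mainly
-- on another_id but also narrow further by pkey_id.
CREATE INDEX my_table_another_id_pkey_id_idx ON my_table (another_id, pkey_id);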
Re: [PERFORM] Migrated from 8.3 to 9.0 - need to update config (re-post)
Thanks guys.

So, would you say that transaction pooling has a load-balancing effect because of its granularity compared to session pooling? I'm concerned about the side-effects of transaction pooling, like the session-level features we would always have to look out for. Wouldn't this require a code review? Just reading "UDF Session State = No" on this page got my attention: http://wiki.postgresql.org/wiki/PgBouncer

If we go with transaction pooling, will we get any sort of warnings or exceptions when apps and stored pgUDF's are violating transaction pooling features, or will things just quietly go wrong, with one session getting a side-effect from another session's state?

Carlo

Date: Wed, 14 Sep 2011 09:52:07 +0800
From: ring...@ringerc.id.au
To: stonec.regis...@sympatico.ca
CC: kevin.gritt...@wicourts.gov; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Migrated from 8.3 to 9.0 - need to update config (re-post)

On 09/14/2011 02:56 AM, Carlo Stonebanks wrote:

Even for 300 stateful applications that can remain connected for up to a week, continuously distilling data (imports)?

If they're all doing active work all that time you can still benefit from a pooler. Say your server can service 50 connections at optimum speed, and any more result in reduced overall throughput. You have 300 apps with statements they want to run. Your pooler will basically queue them, so at any one time 50 are doing work and 250 are waiting for database access. This should _improve_ database throughput by reducing contention if 50 worker connections is your sweet spot. However, it will also increase latency for service for those workers because they may have to wait a while before their transaction runs, even though their transaction will complete much faster. You'd probably want to pool at the transaction level, so once a client gets a connection it keeps it for the lifetime of that transaction and the connection is handed back to the pool when the transaction commits or rolls back.

you want the controller configured for write-back (with automatic switch to write-through on low or failed battery, if possible).

For performance or safety reasons? Since the sys admin thinks there's no performance benefit from this, I would like to be clear on why we should do this.

fsync! If your workload is read-only, it won't help you much. If your workload is write-heavy or fairly balanced it'll make a HUGE difference, because fsync() on commit won't have to wait for disk I/O, only I/O to the RAID card's cache controller. You can also play with commit_delay and synchronous_commit to trade guarantees of data persistence off against performance. Don't mind losing up to 5 mins of commits if you lose power? These options are for you. Whatever you do, do NOT set fsync=off. It should be called "Eat my data if anything goes even slightly wrong=on"; it does have legitimate uses, but they're not yours.

Can our particular setup benefit from changing the bgwriter values?

Probably not. If you find that your interactive users have periods where queries seem to freeze for a few minutes at a time and then return to normal levels of performance, you might need to make this more aggressive.

We actually experience this. Once again, remember the overwhelming use of the system is long-running import threads with continuous connections. Every now and then the imports behave as if they are suddenly taking a deep breath, slowing down. Sometimes, so much that we cancel the import and restart (the imports pick up where they left off).
This could definitely be a checkpointing issue. Enable checkpoint logging.

What would the bg_writer settings be in this case?

You need to tune it for your workload, I'm afraid. See the manual and mailing list discussions.

--
Craig Ringer
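For reference, a hedged sketch of the postgresql.conf lines Craig's suggestions touch; the values shown are illustrative assumptions, not recommendations from this message (checkpoint_completion_target comes up again later in the thread with Kevin):

# Log every checkpoint so the "deep breath" slowdowns can be correlated
# with checkpoint activity (takes effect on a config reload).
log_checkpoints = on

# Spread checkpoint writes over more of the checkpoint interval.
checkpoint_completion_target = 0.9

# Optional durability trade-offs mentioned above -- NOT the same as fsync=off.
#synchronous_commit = off
#commit_delay = 0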
Re: [PERFORM] Migrated from 8.3 to 9.0 - need to update config (re-post)
Hi Kevin,

(Sorry for the late reply, PG forums seem to have problems with my e-mail client, now trying web mail.)

First, thanks for taking the time. I wish I could write back with quick, terse questions to your detailed reply - but I'm sorry, this is still going to be a wordy post.

max_connections = 300 Too high. Both throughput and latency should improve with correct use of a connection pooler.

Even for 300 stateful applications that can remain connected for up to a week, continuously distilling data (imports)? The 300 is overkill; a sys admin raised it from 100 when multiple large projects were loaded and the server refused the additional connections. We can take large imports and break them into multiple smaller ones, which the operators are doing to try and improve import performance. It does result in some improvement, but I think they have gone over the top and the answer is to improve DB and OS performance. Perhaps I don't understand how connection pooling will work with stateful apps that are continuously reading and writing (the apps are DB I/O bound).

you want the controller configured for write-back (with automatic switch to write-through on low or failed battery, if possible).

For performance or safety reasons? Since the sys admin thinks there's no performance benefit from this, I would like to be clear on why we should do this.

Can our particular setup benefit from changing the bgwriter values? Probably not. If you find that your interactive users have periods where queries seem to freeze for a few minutes at a time and then return to normal levels of performance, you might need to make this more aggressive.

We actually experience this. Once again, remember the overwhelming use of the system is long-running import threads with continuous connections. Every now and then the imports behave as if they are suddenly taking a deep breath, slowing down. Sometimes, so much that we cancel the import and restart (the imports pick up where they left off). What would the bg_writer settings be in this case?

Thanks again for your time,
Carlo

Date: Fri, 9 Sep 2011 13:16:28 -0500
From: kevin.gritt...@wicourts.gov
To: pgsql-performance@postgresql.org; stonec.regis...@sympatico.ca
Subject: Re: [PERFORM] Migrated from 8.3 to 9.0 - need to update config (re-post)

Carlo Stonebanks stonec.regis...@sympatico.ca wrote:

this is a full-time ETL system, with only a handful of actual *users* and automated processes over 300 connections running *import* programs 24/7. Intel Xeon Processor X5560 (8M Cache, 2.80 GHz, 6.40 GT/s Intel QPI) x 2, dual quad core, 48 GB RAM, RAID 10, 6 X 600 GB 15krpm SAS.

So, eight cores and six spindles. You are probably going to see *much* better throughput if you route those 300 workers through about 22 connections. Use a connection pooler which limits active transactions to that and queues up requests to start a transaction.

Sys admin says that battery-backup RAID controller and consequent write settings should have no impact on performance.

With only six drives, I assume your OS, WAL files, indexes, and heap files are all in the same RAID? If so, your sys admin is wrong -- you want the controller configured for write-back (with automatic switch to write-through on low or failed battery, if possible).

max_connections = 300

Too high. Both throughput and latency should improve with correct use of a connection pooler.

shared_buffers = 500MB    # At 48GB of RAM, could we go to 2GB?

You might benefit from as much as 8GB, but only testing with your actual load will show for sure.
effective_cache_size = 2457MB    # Sys admin says assume 25% of 48GB

Add together the shared_buffers setting and whatever the OS tells you is used for cache under your normal load. It's usually 75% of RAM or higher. (NOTE: This doesn't cause any allocation of RAM; it's a hint to the cost calculations.)

work_mem = 512MB    # Complex reads are called many times a second from each connection, so what should this be?

Maybe, if you use the connection pooler as described above. Each connection can allocate this multiple times. So with 300 connections you could very easily start using 150GB of RAM in addition to your shared buffers, causing a swap storm followed by OOM crashes. If you stay with 300 connections this *must* be reduced by at least an order of magnitude.

maintenance_work_mem = 256MB    # Should this be bigger - 1GB at least?

I'd go to 1 or 2 GB.

checkpoint_segments = 128    # There is lots of write activity; this is high but could it be higher?

OK. IMO, there's unlikely to be much benefit beyond that.

#checkpoint_completion_target not set    # Recommendation appears to be .9 for our 128 checkpoint segments

0.9 is probably a good idea.

default_statistics_target = 200
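Pulling Kevin's suggestions together, a hedged sketch of what the revised postgresql.conf fragment for this 48GB box might look like - the exact numbers (especially max_connections, work_mem and effective_cache_size) are assumptions that would still need testing under the real load:

max_connections = 100                 # pooler in front limits active transactions to ~22
shared_buffers = 8GB                  # "as much as 8GB"; verify under load
effective_cache_size = 36GB           # shared_buffers + OS cache, roughly 75% of 48GB
work_mem = 32MB                       # per sort/hash, per connection: keep modest
maintenance_work_mem = 1GB
checkpoint_segments = 128
checkpoint_completion_target = 0.9
default_statistics_target = 200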
Re: [PERFORM] Migrated from 8.3 to 9.0 - need to update config (re-post)
Ok, connection pooler it is. As I understand it, even if there are no idle connections available we'll get the benefit of putting a turnstile on the butcher's door.

I also ordered the book as soon as you mentioned it - the title alone was enough to sell me on it! The book won't be for the errant sys admin who increased the connections, it's for me - I'll use it to whack the sys admin on the head. Thanks for the tip, the author owes you a beer - as do I.

Will the book recommend any particular connection pooler product, or is it inappropriate to ask for a recommendation on the forum?

Carlo

Date: Tue, 13 Sep 2011 16:13:00 -0500
From: kevin.gritt...@wicourts.gov
To: pgsql-performance@postgresql.org; stonec.regis...@sympatico.ca
Subject: RE: [PERFORM] Migrated from 8.3 to 9.0 - need to update config (re-post)

Carlo Stonebanks wrote:

max_connections = 300 Too high. Both throughput and latency should improve with correct use of a connection pooler. Even for 300 stateful applications that can remain connected for up to a week, continuously distilling data (imports)?

Absolutely. A good connection pooler will be able to hold those 300 *client* connections, and maintain a much smaller set of connections to the database. It will notice when a client connection is requesting the start of a database transaction. If there is an idle database connection it will route the requests there; otherwise it will put that client connection in a queue. When a database transaction is committed, a waiting client connection (if any) will be assigned to its database connection. Every benchmark I've seen shows that this will improve both throughput and latency over the approach of releasing a thundering herd of requests against the server.

Picture a meat counter with four butchers behind it, and a few spinning devices to slice meat. If customers queue up, and the butchers call on people as they are ready, things go better than if each butcher tries to take on one-fourth of the customers at a time and constantly switch between one order and another to try to make incremental progress on all of them.

a sys admin raised it from 100 when multiple large projects were loaded and the server refused the additional connections.

Whoever is making these decisions needs more training. I suggest Greg Smith's book: http://www.postgresql.org/docs/books/ (Full disclosure, I was a technical reviewer of the book and got a free copy.)

you want the controller configured for write-back (with automatic switch to write-through on low or failed battery, if possible). For performance or safety reasons?

You get better performance with write-back. If you can't rely on the battery, then write-back is not safe and you need to use write-through.

Since the sys admin thinks there's no performance benefit from this, I would like to be clear on why we should do this.

If you can get him to change it back and forth for performance testing, it is easy enough to prove. Write a client application which inserts one row per database transaction. A nice, simple, short row -- like containing one integer column with no indexes. Have the external application create the table and do a million inserts. Try this with both cache settings. It's best not to issue a BEGIN and COMMIT at all. Don't loop in a function or a DO block, because that creates an implicit transaction.

Every now and then the imports behave as if they are suddenly taking a deep breath, slowing down. Sometimes, so much we cancel the import and restart (the imports pick up where they left off).
What would the bg_writer settings be in this case? I'm not sure what that is based on the information so far, so it's unclear whether background writer settings would help; but on the face of it my bet would be that it's a context switching storm or swapping, and the connection pool would be the better solution. Those poor butchers are just overwhelmed. -Kevin
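As a concrete illustration of the transaction-level pooling described above (this is not from the original thread, and the host, database name and pool size are assumptions to be tuned), one widely used pooler is pgbouncer; a minimal configuration sketch might look like this:

; pgbouncer.ini -- illustrative sketch only
[databases]
etl_db = host=127.0.0.1 port=5432 dbname=etl_db

[pgbouncer]
listen_addr = *
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction      ; a server connection is held only for the life of a transaction
max_client_conn = 300        ; all 300 import clients can stay connected
default_pool_size = 22       ; actual database connections, per the ~22 suggested below

The import programs then connect to port 6432 instead of 5432; since each one is given a real backend only while it has a transaction open, the 300 client connections queue up behind roughly two dozen database connections - the butcher-counter behaviour described above.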
Re: [PERFORM] Migrated from 8.3 to 9.0 - need to update config (re-post)
Hi Kevin, First, thanks for taking the time. I wish I could write back with quick, terse questions to your detailed reply - but I'm sorry, this is still going to be a wordy post. max_connections = 300 Too high. Both throughput and latency should improve with correct use of a connection pooler. Even for 300 stateful applications that can remain connected for up to a week, continuously distilling data (imports)? The 300 is overkill; a sys admin raised it from 100 when multiple large projects were loaded and the server refused the additional connections. We can take large imports and break them into multiple smaller ones, which the operators are doing to try and improve import performance. It does result in some improvement, but I think they have gone over the top and the answer is to improve DB and OS performance. Perhaps I don't understand how connection pooling will work with stateful apps that are continuously reading and writing (the apps are DB I/O bound). you want the controller configured for write-back (with automatic switch to write-through on low or failed battery, if possible). For performance or safety reasons? Since the sys admin thinks there's no performance benefit from this, I would like to be clear on why we should do this. Can our particular setup benefit from changing the bgwriter values? Probably not. If you find that your interactive users have periods where queries seem to freeze for a few minutes at a time and then return to normal levels of performance, you might need to make this more aggressive. We actually experience this. Once again, remember the overwhelming use of the system is long-running import threads with continuous connections. Every now and then the imports behave as if they are suddenly taking a deep breath, slowing down. Sometimes so much that we cancel the import and restart (the imports pick up where they left off). What would the bg_writer settings be in this case? Thanks again for your time, Carlo -Original Message- From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov] Sent: September 9, 2011 2:16 PM To: pgsql-performance@postgresql.org; Carlo Stonebanks Subject: Re: [PERFORM] Migrated from 8.3 to 9.0 - need to update config (re-post) Carlo Stonebanks stonec.regis...@sympatico.ca wrote: this is a full-time ETL system, with only a handful of actual *users* and automated processes over 300 connections running *import* programs 24/7 Intel* Xeon* Processor X5560 (8M Cache, 2.80 GHz, 6.40 GT/s Intel* QPI) x 2, dual quad core 48 GB RAM RAID 10, 6 X 600 GB 15krpm SAS) So, eight cores and six spindles. You are probably going to see *much* better throughput if you route those 300 workers through about 22 connections. Use a connection pooler which limits active transactions to that and queues up requests to start a transaction. Sys admin says that battery-backup RAID controller and consequent write settings should have no impact on performance. With only six drives, I assume your OS, WAL files, indexes, and heap files are all in the same RAID? If so, your sys admin is wrong -- you want the controller configured for write-back (with automatic switch to write-through on low or failed battery, if possible). max_connections = 300 Too high. Both throughput and latency should improve with correct use of a connection pooler. shared_buffers = 500MB # At 48GB of RAM, could we go to 2GB You might benefit from as much as 8GB, but only testing with your actual load will show for sure.
effective_cache_size = 2457MB # Sys admin says assume 25% of 48GB Add together the shared_buffers setting and whatever the OS tells you is used for cache under your normal load. It's usually 75% of RAM or higher. (NOTE: This doesn't cause any allocation of RAM; it's a hint to the cost calculations.) work_mem = 512MB # Complex reads are called many times a second Maybe, if you use the connection pooler as described above. Each connection can allocate this multiple times. So with 300 connections you could very easily start using 150GB of RAM in addition to your shared buffers, causing a swap storm followed by OOM crashes. If you stay with 300 connections this *must* be reduced by at least an order of magnitude. # from each connection, so what should this be? maintenance_work_mem = 256MB # Should this be bigger - 1GB at least? I'd go to 1 or 2 GB. checkpoint_segments = 128 # There is lots of write activity; this is high OK # but could it be higher? IMO, there's unlikely to be much benefit beyond that. #checkpoint_completion_target not set; # Recommendation appears to be .9 for our 128 checkpoint segments 0.9 is probably a good idea. default_statistics_target = 200 # Deprecated? Depends on your data. The default is 100. You might want to leave that in general and boost it for specific columns where you find it is needed. Higher values improve estimates, but they also make ANALYZE and query planning more expensive.
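Pulling the numbers above together into one place - an illustrative sketch only, not a tested recommendation, and assuming the 300 workers really are funnelled through a pooler of around 22 connections - the revised postgresql.conf for this 48 GB box would look roughly like this:

max_connections = 25                  # small headroom over the ~22-connection pool
shared_buffers = 8GB                  # "as much as 8GB" -- verify under the real load
effective_cache_size = 36GB           # shared_buffers + OS cache (roughly 75% of 48GB); planner hint only
work_mem = 32MB                       # per sort/hash, per connection -- far safer than 512MB
maintenance_work_mem = 1GB
checkpoint_segments = 128
checkpoint_completion_target = 0.9

The exact figures are guesses to be validated against the actual import load; the point is the shape: few real connections, modest per-connection work_mem, and an effective_cache_size that reflects how much of the 48 GB actually ends up caching the database.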
[PERFORM] Migrated from 8.3 to 9.0 - need to update config (re-post)
Hello performance wizards! (Sorry for the re-post if this appears twice - I see no evidence that e-mailing to pgsql-performance is working yet.) My client has migrated his 8.3 hosted DB to new machines running PG 9.0. It’s time to look at the config settings. Immediately below are the config settings. The specifics of the DB and how it is used are below that, but in general let me say that this is a full-time ETL system, with only a handful of actual “users”, and automated processes over 300 connections running “import” programs 24/7. I appreciate the help, Carlo The host system: Intel® Xeon® Processor X5560 (8M Cache, 2.80 GHz, 6.40 GT/s Intel® QPI) x 2, dual quad core 48 GB RAM RAID 10, 6 X 600 GB 15krpm SAS) LINUX Redhat/Centos 2.6.18-164.el5 Sys admin says that battery-backup RAID controller and consequent write settings should have no impact on performance. Is this true? Current config and my thoughts on what to do with it. If it isn’t mentioned here, the values are default values: # === max_connections = 300 shared_buffers = 500MB # At 48GB of RAM, could we go to 2GB # - what is the impact on LINUX config? effective_cache_size = 2457MB # Sys admin says assume 25% of 48GB # is used by OS and other apps work_mem = 512MB # Complex reads are called many times a second # from each connection, so what should this be? maintenance_work_mem = 256MB # Should this be bigger - 1GB at least? checkpoint_segments = 128 # There is lots of write activity; this is high # but could it be higher? #checkpoint_completion_target not set; # Recommendation appears to be .9 for our 128 checkpoint segments default_statistics_target = 200 # Deprecated? #autovacuum_freeze_max_age not set; # recommendation is 1,000,000 for non-activity. # What is the metric for the wal_buffers setting? wal_buffers = 4MB # Looks low, recommendation appears to be 16MB. # Is it really 'set it and forget it'? #synchronous_commit not set; # Recommendation is to turn this off and leave fsync on #fsync not set; # Recommendation is to leave this on #wal_level not set; # Is this only needed for replication? #max_wal_senders not set; # Is this only needed for replication? # The issue of vacuum/analyze is a tricky one. # Data imports are running 24/7. Once the DB is seeded, the vast majority # of write activity is updates, and not to indexed columns. # Deletions are very rare. vacuum_cost_delay = 20ms # The background writer has not been addressed at all. # Can our particular setup benefit from changing # the bgwriter values? bgwriter_lru_maxpages = 100 # This is the default; listen_addresses = '*' port = 5432 log_destination = 'stderr' logging_collector = on log_directory = 'pg_log' log_filename = 'postgresql-%a.log' log_truncate_on_rotation = on log_rotation_age = 1d log_rotation_size = 0 log_line_prefix = '%t' track_counts = on # === The DB is pretty large, and organized by schema. The most active are: 1) One “Core” schema a. 100 tables b. Typical row counts in the low millions. c. This represents the enterprise’s core data. d. Equal read/write activity 2) Multiple “Import” schemas a. Contain several thousand raw “flat file” tables b. Ragged column structure, up to hundreds of columns c. Erratic row counts, from dozens of rows to 1 million d. Each table sequentially read once, only status fields are written back 3) One “Audit” schema a. A new log table is created every month b. Typical row count is 200 million c. Log every write to the “Core” d.
Almost entirely write operations, but the few read operations that are done have to be fast owing to the size of the tables e. Links the “Core” data to the “Import” data There are next to no “users” on the system – each connection services a constantly running import process which takes the incoming “import” data, analyzes the “core” data and decides how to distil the import into the core. Analytical processes: not report-oriented; the “Core” reads are mostly single-row results; the “Import” reads are 1,000-row pages; there is next to no use of aggregate queries. Transactional processes: a steady stream of writes, not bursty or sporadic; overwhelmingly inserts and updates, next to no deletes; each transaction represents 10 – 50 writes to the “core” schema.
Re: [PERFORM] Very bad plan when using VIEW and IN (SELECT...*)
Unfortunately I had double-posted this - originally in General. Tom Lane pointed out (in PG-GENERAL) that the planner will take any IN (SELECT...) statement and do a JOIN, which is what is causing the planner problem - even though the SELECT was just returning a constant. Obviously, the real query this was testing was something more real-world. So, I took my original query and turned it into this: SELECT * FROM mdx_core.vw_provider AS p WHERE provider_id = ANY (array(SELECT provider_id FROM mdx_core.provider_alias)) BLISTERINGLY fast! PG version is 8.3 - as for configuration, I didn't want to throw too much info as my concern was actually whether views were as clunky as on other DB platforms. Carlo -Original Message- From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov] Sent: August 13, 2010 9:29 AM To: pgsql-performance@postgresql.org; Carlo Stonebanks Subject: Re: [PERFORM] Very bad plan when using VIEW and IN (SELECT...*) Carlo Stonebanks stonec.regis...@sympatico.ca wrote: SELECT * FROM mdx_core.vw_provider AS p WHERE provider_id IN (SELECT 13083101) I am using the simple IN (SELECT n) in QUERY 2 to simplify the problem. I noticed the oddity of the behaviour when I used a proper IN (SELECT myId FROM myTable) Did you try?: SELECT * FROM mdx_core.vw_provider AS p WHERE EXISTS (SELECT * FROM myTable WHERE myId = provider_id) For any follow-up you should probably mention what version of PostgreSQL this is and how it's configured. http://wiki.postgresql.org/wiki/SlowQueryQuestions -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
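For completeness (an illustration spelled out against the actual alias table, not something tested against the poster's schema), Kevin's EXISTS form would read as follows; whether it beats the ANY(array(...)) rewrite on 8.3 is something to verify with EXPLAIN ANALYZE:

SELECT p.*
FROM mdx_core.vw_provider AS p
WHERE EXISTS (SELECT 1
              FROM mdx_core.provider_alias AS pa
              WHERE pa.provider_id = p.provider_id);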
[PERFORM] Very bad plan when using VIEW and IN (SELECT...*)
Ref these two queries against a view: -- QUERY 1, executes 0.5 secs SELECT * FROM mdx_core.vw_provider AS p WHERE provider_id IN (13083101) -- QUERY 2, executes 13.5 secs SELECT * FROM mdx_core.vw_provider AS p WHERE provider_id IN (SELECT 13083101) I am using the simple IN (SELECT n) in QUERY 2 to simplify the problem. I noticed the oddity of the behaviour when I used a proper IN (SELECT myId FROM myTable) but the planner shows the same behaviour even if not selecting from a table - just the SELECT keyword is enough. Plans are below. The view has an internal UNION. Any explanation as to why this happens? The actualt view is listed at the very bottom, if relevant. Carlo QUERY 1 PLAN Unique (cost=25.48..25.69 rows=2 width=417) (actual time=0.180..0.190 rows=2 loops=1) - Sort (cost=25.48..25.48 rows=2 width=417) (actual time=0.179..0.180 rows=2 loops=1) Sort Key: *SELECT* 1.provider_id, (NULL::integer), *SELECT* 1.master_id, *SELECT* 1.client_ids, *SELECT* 1.upin, *SELECT* 1.medical_education_number, *SELECT* 1.abmsuid, *SELECT* 1.npi, *SELECT* 1.npi_status_code, *SELECT* 1.cc_id, *SELECT* 1.aoa_id, *SELECT* 1.last_name, *SELECT* 1.first_name, *SELECT* 1.middle_name, *SELECT* 1.suffix, *SELECT* 1.display_name, *SELECT* 1.display_title, *SELECT* 1.nickname, *SELECT* 1.familiar_name, *SELECT* 1.pubmed_name, *SELECT* 1.master_name, *SELECT* 1.display_name_orig, (NULL::text), *SELECT* 1.gender, *SELECT* 1.birth_year, *SELECT* 1.birth_month, *SELECT* 1.birth_day, *SELECT* 1.clinical_interest, *SELECT* 1.research_interest, *SELECT* 1.summary, *SELECT* 1.comments, *SELECT* 1.degree_types, *SELECT* 1.provider_type_ids, *SELECT* 1.provider_status_code, *SELECT* 1.provider_status_year, *SELECT* 1.created, *SELECT* 1.unique_flag, *SELECT* 1.is_locked, *SELECT* 1.provider_standing_code, *SELECT* 1.impt_source_date, *SELECT* 1.input_resource_id, *SELECT* 1.input_source_ids Sort Method: quicksort Memory: 27kB - Append (cost=0.00..25.47 rows=2 width=417) (actual time=0.078..0.143 rows=2 loops=1) - Subquery Scan *SELECT* 1 (cost=0.00..8.59 rows=1 width=408) (actual time=0.078..0.079 rows=1 loops=1) - Index Scan using provider_provider_id_idx on provider p (cost=0.00..8.58 rows=1 width=408) (actual time=0.076..0.077 rows=1 loops=1) Index Cond: (provider_id = 13083101) - Subquery Scan *SELECT* 2 (cost=0.00..16.87 rows=1 width=417) (actual time=0.061..0.062 rows=1 loops=1) - Nested Loop (cost=0.00..16.86 rows=1 width=417) (actual time=0.055..0.056 rows=1 loops=1) - Index Scan using provider_name_pid_rec_stat_idx on provider_alias pa (cost=0.00..8.27 rows=1 width=32) (actual time=0.047..0.047 rows=1 loops=1) Index Cond: (provider_id = 13083101) - Index Scan using provider_provider_id_idx on provider p (cost=0.00..8.58 rows=1 width=389) (actual time=0.005..0.006 rows=1 loops=1) Index Cond: (p.provider_id = 13083101) Total runtime: 0.371 ms QUERY 2 PLAN Merge IN Join (cost=2421241.80..3142039.99 rows=30011 width=2032) (actual time=13778.400..13778.411 rows=2 loops=1) Merge Cond: (*SELECT* 1.provider_id = (13083101)) - Unique (cost=2421241.77..3066486.33 rows=6002275 width=417) (actual time=13778.119..13778.372 rows=110 loops=1) - Sort (cost=2421241.77..2436247.46 rows=6002275 width=417) (actual time=13778.118..13778.163 rows=110 loops=1) Sort Key: *SELECT* 1.provider_id, (NULL::integer), *SELECT* 1.master_id, *SELECT* 1.client_ids, *SELECT* 1.upin, *SELECT* 1.medical_education_number, *SELECT* 1.abmsuid, *SELECT* 1.npi, *SELECT* 1.npi_status_code, *SELECT* 1.cc_id, *SELECT* 1.aoa_id, *SELECT* 1.last_name, 
*SELECT* 1.first_name, *SELECT* 1.middle_name, *SELECT* 1.suffix, *SELECT* 1.display_name, *SELECT* 1.display_title, *SELECT* 1.nickname, *SELECT* 1.familiar_name, *SELECT* 1.pubmed_name, *SELECT* 1.master_name, *SELECT* 1.display_name_orig, (NULL::text), *SELECT* 1.gender, *SELECT* 1.birth_year, *SELECT* 1.birth_month, *SELECT* 1.birth_day, *SELECT* 1.clinical_interest, *SELECT* 1.research_interest, *SELECT* 1.summary, *SELECT* 1.comments, *SELECT* 1.degree_types, *SELECT* 1.provider_type_ids, *SELECT* 1.provider_status_code, *SELECT* 1.provider_status_year, *SELECT* 1.created, *SELECT* 1.unique_flag, *SELECT* 1.is_locked, *SELECT* 1.provider_standing_code, *SELECT* 1.impt_source_date, *SELECT* 1.input_resource_id, *SELECT* 1.input_source_ids Sort Method: external merge Disk: 423352kB - Append (cost=0.00..596598.30 rows=6002275 width=417) (actual time=0.039..7879.715 rows=1312637 loops=1) - Subquery Scan *SELECT* 1 (cost=0.00..543238.96 rows=5994998 width=408) (actual time=0.039..7473.664 rows=1305360 loops=1) - Seq Scan on provider p
[PERFORM] Does FILTER in SEQSCAN short-circuit AND?
Sample code: SELECT * FROM MyTable WHERE foo = 'bar' AND MySlowFunc(foo) = 'bar' Let's say this required a SEQSCAN because there were no indexes to support column foo. For every row where foo <> 'bar', would the filter on the SEQSCAN short-circuit the AND and return false right away, or would it still execute MySlowFunc(foo)? Thanks! Carlo -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
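One way to answer this empirically (a sketch only - my_slow_func and the test table here are hypothetical stand-ins, not objects from the poster's database, and plpgsql is assumed to be installed) is to make the slow function observable with pg_sleep and compare runtimes: if the filter stops evaluating the AND as soon as foo = 'bar' fails, total time should track the number of matching rows rather than the full row count.

-- a deliberately slow function: roughly 10 ms per call
CREATE OR REPLACE FUNCTION my_slow_func(t text) RETURNS text
LANGUAGE plpgsql AS $$
BEGIN
    PERFORM pg_sleep(0.01);
    RETURN t;
END;
$$;

-- 10,000 rows, of which only 100 have foo = 'bar'
CREATE TABLE shortcircuit_test AS
SELECT CASE WHEN g % 100 = 0 THEN 'bar' ELSE 'baz' END AS foo
FROM generate_series(1, 10000) AS g;

-- on the order of 1 second if the slow call only runs for matching rows,
-- on the order of 100 seconds if it runs for every row
EXPLAIN ANALYZE
SELECT * FROM shortcircuit_test
WHERE foo = 'bar' AND my_slow_func(foo) = 'bar';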
Re: [PERFORM] Function scan/Index scan to nested loop
Thanks Scott, This is almost always due to caching. First time the data aren't in the cache, second time they are. I had assumed that it was caching, but I don't know from where because of the inexplicable delay. Hardware? O/S (Linux)? DB? From the function, which is IMMUTABLE? I am concerned that there is such a lag between all the index and function scans' start/complete times and the nested loops starting. I have reformatted the SLOW PLAN results below to make them easier to read. Can you tell me if this makes any sense to you? I can understand that EXPLAIN might inject some waste, but the delay being shown here is equivalent to the delay in real query times - I don't think EXPLAIN components would inject 15 second waits... would they? Your row estimates are WAY off. A nested loop might not be the best choice. I tried to run this with set enable_nestloop to off and it built this truly impressively complex plan! However, the cache had already spun up. The thing that makes testing so difficult is that once the caches are loaded, you have to flail around trying to find query parameters that DON'T hit the cache, making debugging difficult. The row estimates being off is a chronic problem with our DB. I don't think the 3000 row ANALYZE is getting a proper sample set and would love to change the strategy, even if at the expense of speed of execution of ANALYZE. I don't know what it is about our setup that makes our PG servers so hard to tune, but I think it's time to call the cavalry (gotta find serious PG server tuning experts in NJ). Carlo SLOW PLAN Sort (cost=42869.40..42869.59 rows=77 width=18) (actual time=26316.495..26322.102 rows=9613 loops=1) Sort Key: p.provider_id, zips_in_mile_range.distance Sort Method: quicksort Memory: 1136kB -> Nested Loop (cost=0.00..42866.98 rows=77 width=18) (actual time=126.354..26301.027 rows=9613 loops=1) -> Nested Loop (cost=0.00..42150.37 rows=122 width=18) (actual time=117.369..15349.533 rows=13247 loops=1) -> Function Scan on zips_in_mile_range (cost=0.00..52.50 rows=67 width=40) (actual time=104.196..104.417 rows=155 loops=1) Filter: (zip <> ''::text) -> Index Scan using provider_practice_default_base_zip_country_idx on provider_practice pp (cost=0.00..628.30 rows=2 width=19) (actual time=1.205..98.231 rows=85 loops=155) Index Cond: ((pp.default_country_code = 'US'::bpchar) AND (substr((pp.default_postal_code)::text, 1, 5) = zips_in_mile_range.zip) AND (pp.is_principal = 'Y'::bpchar)) Filter: (COALESCE(pp.record_status, 'A'::bpchar) = 'A'::bpchar) -> Index Scan using provider_provider_id_provider_status_code_idx on provider p (cost=0.00..5.86 rows=1 width=4) (actual time=0.823..0.824 rows=1 loops=13247) Index Cond: ((p.provider_id = pp.provider_id) AND (p.provider_status_code = 'A'::bpchar)) Filter: (p.is_visible = 'Y'::bpchar) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
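For the row-estimate side of this (an illustration; the specific column is a guess based on the join condition in the plan above, and the function's argument types are assumed), the sample size can be raised per column without touching the global default, and the planner's guess for the set-returning function can be corrected directly on 8.3:

-- sample this column much more deeply than the default, then re-analyze just that table
ALTER TABLE mdx_core.provider_practice
    ALTER COLUMN default_postal_code SET STATISTICS 1000;
ANALYZE mdx_core.provider_practice;

-- tell the planner roughly how many rows the zip-radius function really returns
-- (the plan estimated 67 and got 155)
ALTER FUNCTION mdx_core.zips_in_mile_range(text, numeric) ROWS 150;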
[PERFORM] Function scan/Index scan to nested loop
Hello all, A query ran twice in succession performs VERY poorly the first time as it iterates through the nested loop. The second time, it rips. Please see SQL, SLOW PLAN and FAST PLAN below. I don't know why these nested loops are taking so long to execute. - Nested Loop (cost=0.00..42866.98 rows=77 width=18) (actual time=126.354..26301.027 rows=9613 loops=1) - Nested Loop (cost=0.00..42150.37 rows=122 width=18) (actual time=117.369..15349.533 rows=13247 loops=1) The loop members appear to be finished quickly. I suspect that the results for the function aren't really as fast as reported, and are actually taking much longer to comeplete returning results. - Function Scan on zips_in_mile_range (cost=0.00..52.50 rows=67 width=40) (actual time=104.196..104.417 rows=155 loops=1) Filter: (zip ''::text) Is this possible? I can't see what other delay there could be. The second time the query runs, the loops are fast: - Nested Loop (cost=0.00..42866.98 rows=77 width=18) (actual time=97.073..266.826 rows=9613 loops=1) - Nested Loop (cost=0.00..42150.37 rows=122 width=18) (actual time=97.058..150.172 rows=13247 loops=1) Since it is fast the second time, I wonder if this is related at all to the function being IMMUTABLE? (Even though it's IMMUTABLE it reads a very static table) This DB is a copy of another DB, on the same server host, same drive but different tablespace. The original query has good performance, and is hit often by the live web server. With the copy - which performs poorly - the query is hit infrequently. Is there any evidence for why the nested loop is slow? Code and plans follow - regards and thanks! Carlo SQL: select pp.provider_practice_id, p.provider_id, distance, pp.is_principal, p.provider_id as sort_order from mdx_core.provider as p join mdx_core.provider_practice as pp on pp.provider_id = p.provider_id join (select * from mdx_core.zips_in_mile_range('75203', 15::numeric) where zip '') as nearby on nearby.zip = substr(pp.default_postal_code, 1, 5) where pp.default_country_code = 'US' and p.provider_status_code = 'A' and p.is_visible = 'Y' and pp.is_principal = 'Y' and coalesce(pp.record_status, 'A') = 'A' order by sort_order, distance SLOW PLAN: Sort (cost=42869.40..42869.59 rows=77 width=18) (actual time=26316.495..26322.102 rows=9613 loops=1) Sort Key: p.provider_id, zips_in_mile_range.distance Sort Method: quicksort Memory: 1136kB - Nested Loop (cost=0.00..42866.98 rows=77 width=18) (actual time=126.354..26301.027 rows=9613 loops=1) - Nested Loop (cost=0.00..42150.37 rows=122 width=18) (actual time=117.369..15349.533 rows=13247 loops=1) - Function Scan on zips_in_mile_range (cost=0.00..52.50 rows=67 width=40) (actual time=104.196..104.417 rows=155 loops=1) Filter: (zip ''::text) - Index Scan using provider_practice_default_base_zip_country_idx on provider_practice pp (cost=0.00..628.30 rows=2 width=19) (actual time=1.205..98.231 rows=85 loops=155) Index Cond: ((pp.default_country_code = 'US'::bpchar) AND (substr((pp.default_postal_code)::text, 1, 5) = zips_in_mile_range.zip) AND (pp.is_principal = 'Y'::bpchar)) Filter: (COALESCE(pp.record_status, 'A'::bpchar) = 'A'::bpchar) - Index Scan using provider_provider_id_provider_status_code_idx on provider p (cost=0.00..5.86 rows=1 width=4) (actual time=0.823..0.824 rows=1 loops=13247) Index Cond: ((p.provider_id = pp.provider_id) AND (p.provider_status_code = 'A'::bpchar)) Filter: (p.is_visible = 'Y'::bpchar) Total runtime: 26327.329 ms FAST PLAN: Sort (cost=42869.40..42869.59 rows=77 width=18) (actual time=278.722..284.326 
rows=9613 loops=1) Sort Key: p.provider_id, zips_in_mile_range.distance Sort Method: quicksort Memory: 1136kB - Nested Loop (cost=0.00..42866.98 rows=77 width=18) (actual time=97.073..266.826 rows=9613 loops=1) - Nested Loop (cost=0.00..42150.37 rows=122 width=18) (actual time=97.058..150.172 rows=13247 loops=1) - Function Scan on zips_in_mile_range (cost=0.00..52.50 rows=67 width=40) (actual time=97.013..97.161 rows=155 loops=1) Filter: (zip ''::text) - Index Scan using provider_practice_default_base_zip_country_idx on provider_practice pp (cost=0.00..628.30 rows=2 width=19) (actual time=0.017..0.236 rows=85 loops=155) Index Cond: ((pp.default_country_code = 'US'::bpchar) AND (substr((pp.default_postal_code)::text, 1, 5) = zips_in_mile_range.zip) AND (pp.is_principal = 'Y'::bpchar)) Filter: (COALESCE(pp.record_status, 'A'::bpchar) = 'A'::bpchar) - Index Scan using provider_provider_id_provider_status_code_idx on provider p (cost=0.00..5.86 rows=1 width=4) (actual
Re: [PERFORM] default_statistics_target
Hi Greg, Thanks for the insight. How much more of a server's resources will be consumed by an ANALYZE with default_statistics_target = 100? We have two environments hosting the same data. One is our live server, which serves the web site, and this hosts our published data, not more than 200 - 300 tables. PRODUCTION: The data warehouse consisting of our published data, as well as our input resources which are transformed via ETL processes into our published data. It is these input resources which currently consist of about 8,000 tables and growing. They don't really require analysis, as they are typically read linearly when importing, and rarely more than once. They are kept for auditing and rollbacks. LIVE: Hosts just the published data, copied over from the production server. Because the data does not get written to very often, older stats from ANALYZE are likely to still be valid. Our concern is that with the older setting of default_statistics_target = 10 it has not gone deep enough into these tables (numbering in the millions of rows) to really represent the data distribution properly. Given that it looks like you're running 8.3 from past messages I've seen from you, I'd also be concerned that you've overrun your max_fsm_pages, so that VACUUM is growing increasingly ineffective for you, and that's contributing to your headache. Below are the config values of our production server (those not listed are stubbed out). Sadly, in an attempt to improve the server's performance, someone wiped out all of the changes I had made to date, along with comments indicating previous values, reason for the change, etc. What do they call that again? Oh, yeah. Documentation. # CENTOS 5.4 # Linux mdx_octo 2.6.18-164.el5 #1 SMP Thu Sep 3 03:28:30 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux # pgsql 8.3.10, 8 CPUs, 48GB RAM # RAID 10, 4 Disks autovacuum = on # Enable autovacuum subprocess?
'on' autovacuum_analyze_scale_factor = 0.05 # fraction of table size before analyze autovacuum_analyze_threshold = 1000 autovacuum_naptime = 1min # time between autovacuum runs autovacuum_vacuum_cost_delay = 50 # default vacuum cost delay for autovacuum_vacuum_scale_factor = 0.2 # fraction of table size before vacuum autovacuum_vacuum_threshold = 1000 bgwriter_lru_maxpages = 100 # 0-1000 max buffers written/round checkpoint_segments = 128 # in logfile segments, min 1, 16MB each checkpoint_warning = 290s # 0 is off client_min_messages = debug1 # values in order of decreasing detail: datestyle = 'iso, mdy' default_statistics_target = 250 # range 1-1000 default_text_search_config = 'pg_catalog.english' lc_messages = 'C' # locale for system error message lc_monetary = 'C' # locale for monetary formatting lc_numeric = 'C' # locale for number formatting lc_time = 'C'# locale for time formatting listen_addresses = '*' # what IP address(es) to listen on; log_destination = 'stderr' # Valid values are combinations of log_error_verbosity = verbose # terse, default, or verbose messages log_line_prefix = '%t ' # special values: log_min_error_statement = debug1 # values in order of decreasing detail: log_min_messages = debug1 # values in order of decreasing detail: logging_collector = on # Enable capturing of stderr and csvlog maintenance_work_mem = 256MB max_connections = 100 # (change requires restart) max_fsm_relations = 1000 # min 100, ~70 bytes each max_locks_per_transaction = 128 # min 10 port = 5432# (change requires restart) shared_buffers = 4096MB shared_preload_libraries = '$libdir/plugins/plugin_debugger.so' # (change requires restart) track_counts = on vacuum_cost_delay = 5 # 0-1000 milliseconds wal_buffers = 4MB wal_sync_method = open_sync work_mem = 64MB Carlo Greg Smith g...@2ndquadrant.com wrote in message news:4b9e33af.2020...@2ndquadrant.com... Carlo Stonebanks wrote: The whole topic of messing with stats makes my head spin but I am concerned about some horridly performing queries that have had bad rows estimates and others which always choose seq scans when indexes are available. Reading up on how to improve planner estimates, I have seen references to default_statistics_target being changed from the default of 10 to 100. Our DB is large, with thousands of tables Stop right there for a second. Are you sure autovacuum is working well here? With thousands of tables, it wouldn't surprise me to discover your planner estimates are wrong because there hasn't been a recent enough ANALYZE on the relevant tables. If you haven't already, take a look at pg_stat_user_tables and make sure that tables that have the bad estimates have actually been analyzed recently. A look at the live/dead row counts there should be helpful as well. If all that's recent, but you're still getting bad estimates, only then would I suggest trying an increase to default_statistics_target. In the situation where
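To act on the pg_stat_user_tables suggestion above (an illustrative query; adjust the schema filter for your layout), something along these lines shows when each core table was last analyzed and how much dead-row bloat it is carrying:

SELECT schemaname, relname,
       last_analyze, last_autoanalyze,
       n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE schemaname = 'mdx_core'   -- the ~100 core tables; the import schemas rarely need this
ORDER BY n_dead_tup DESC;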
[PERFORM] Got that new server, now it's time for config!
Here we go again! Based on recommendations made here, I got my client to migrate off of our Windows 2003 Server x64 box to a new Linux box. # CENTOS 5.4 # Linux mdx_octo 2.6.18-164.el5 #1 SMP Thu Sep 3 03:28:30 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux # pgsql 8.3.10, 8 CPUs, 48GB RAM # RAID 10, 4 Disks Below are the config values of this production server (those not listed are stubbed out). Sadly, in an attempt to improve the server's performance, someone wiped out all of the changes I had made to date, along with comments indicating previous values, reason for the change, etc. This is a data warehouse production server, used for ETL. 500 GB database, approx 8000 tables and growing, although the vast majority of them are the original import resource tables and are rarely accessed. The actual core data is about 200 tables, consisting of millions of rows. Data importing and content management is done via a 15,000-line TCL import script and application base (as this is ETL with fuzzy logic, not just COPY... FROM...). So, we have the hardware, we have the O/S - but I think our config leaves much to be desired. Typically, our planner makes bad decisions, picking seq scan over index scan, where index scan has a better result. Can anyone see any obvious faults? Carlo autovacuum = on autovacuum_analyze_scale_factor = 0.05 autovacuum_analyze_threshold = 1000 autovacuum_naptime = 1min autovacuum_vacuum_cost_delay = 50 autovacuum_vacuum_scale_factor = 0.2 autovacuum_vacuum_threshold = 1000 bgwriter_lru_maxpages = 100 checkpoint_segments = 128 checkpoint_warning = 290s client_min_messages = debug1 datestyle = 'iso, mdy' default_statistics_target = 250 default_text_search_config = 'pg_catalog.english' lc_messages = 'C' lc_monetary = 'C' lc_numeric = 'C' lc_time = 'C' listen_addresses = '*' log_destination = 'stderr' log_error_verbosity = verbose log_line_prefix = '%t ' log_min_error_statement = debug1 log_min_messages = debug1 logging_collector = on maintenance_work_mem = 256MB max_connections = 100 max_fsm_relations = 1000 max_locks_per_transaction = 128 port = 5432 shared_buffers = 4096MB shared_preload_libraries = '$libdir/plugins/plugin_debugger.so' track_counts = on vacuum_cost_delay = 5 wal_buffers = 4MB wal_sync_method = open_sync work_mem = 64MB -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] default_statistics_target
Hi people, The whole topic of messing with stats makes my head spin but I am concerned about some horridly performing queries that have had bad rows estimates and others which always choose seq scans when indexes are available. Reading up on how to improve planner estimates, I have seen references to default_statistics_target being changed from the default of 10 to 100. Our DB is large, with thousands of tables, but the core schema has about 100 tables and the typical row counts are in the millions of rows for the whole table. We have been playing endless games with tuning this server - but with all of the suggestions, I don't think the issue of changing default_statistics_target has ever come up. Realizing that there is a performance hit associated with ANALYZE, are there any other downsides to increasing this value to 100, and is this a common setting for large DBs? Thanks, Carlo -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] New server to improve performance on our large and busy DB - advice?
Hi Greg, As a follow up to this suggestion: I don't see effective_cache_size listed there. If that's at the default, I wouldn't be surprised that you're seeing sequential scans instead of indexed ones far too often. I found an article written by you http://www.westnet.com/~gsmith/content/postgresql/pg-5minute.htm and thought this was pretty useful, and especially this comment: effective_cache_size should be set to how much memory is leftover for disk caching after taking into account what's used by the operating system, dedicated PostgreSQL memory, and other applications. If it's set too low, indexes may not be used for executing queries the way you'd expect. Setting effective_cache_size to 1/2 of total memory would be a normal conservative setting. You might find a better estimate by looking at your operating system's statistics. On UNIX-like systems, add the free+cached numbers from free or top. On Windows see the System Cache in the Windows Task Manager's Performance tab. Are these values to look at BEFORE starting PG? If so, how do I relate the values returned to setting the effective_cache_size values? Carlo PS Loved your 1995 era pages. Being a musician, it was great to read your recommendations on how to buy these things called CD's. I Googled the term, and they appear to be some ancient precursor to MP3s which people actually PAID for. What kind of stone were they engraved on? ;-D -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
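For what it's worth, those numbers are read on the running system under its normal load, not before starting PG. Putting the rule of thumb above into concrete terms for this box - an illustration with assumed figures, not measurements from the actual server - it comes out something like this:

# 48GB box: suppose `free` shows roughly 38GB in free+cached under normal load
effective_cache_size = 38GB    # or ~24GB (half of RAM) as the conservative fallback
                               # planner hint only -- no memory is actually allocated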
Re: [PERFORM] New server to improve performance on our large and busy DB - advice?
yeah, the values are at the end. Sounds like your vacuum settings are too non-aggressive. Generally this is the vacuum cost delay being too high. Of course, I have to ask: what's the down side? Yes! You can run vacuum verbose against the regular old postgres database (or just create one for testing with nothing in it) and you'll still get the fsm usage numbers from that! So, no need to run it against the big db. However, if regular vacuum verbose couldn't finish in a week, then you've likely got vacuum and autovacuum set to be too timid in their operation, and may be getting pretty bloated as we speak. Once the fsm gets too blown out of the water, it's quicker to dump and reload the whole DB than to try and fix it. My client reports this is what they actually do on a monthly basis. And the numbers are in: NOTICE: number of page slots needed (4090224) exceeds max_fsm_pages (204800) HINT: Consider increasing the configuration parameter max_fsm_pages to a value over 4090224. Gee, only off by a factor of 20. What happens if I go for this number (once again, what's the down side)? Carlo -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
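As to the down side (an illustration, not part of the original thread): acting on that HINT is a postgresql.conf change plus a restart, and the memory cost is modest - in 8.3 each FSM page slot takes roughly six bytes of shared memory, so covering the reported 4,090,224 slots is on the order of 25 MB. Something like:

max_fsm_pages = 4200000      # headroom over the 4090224 page slots VACUUM reported
max_fsm_relations = 10000    # with ~8000 tables and growing, 1000 relation slots is also too low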
Re: [PERFORM] Re: New server to improve performance on our large and busy DB - advice? (v2)
* A database that is of small to medium size (5 - 10 GB)? * Around 10 clients that perform constant write operations to the database (UPDATE/INSERT) * Around 10 clients that occasionally read from the database * Around 6000 tables in your database * A problem with tuning it all * Migration to new hardware and/or OS Is this all correct? Actually, the tablespace is very large, over 500GB. However, the actual production DB is 200GB. First thing that is noticeable is that you seem to have way too few drives in the server - not because of disk space required but because of speed. You didn't say what type of drives you have and you didn't say what you would consider desirable performance levels, but off hand (because of the '10 clients perform constant writes' part) you will probably want at least 2x-4x more drives. With only 4 drives, RAID 10 is the only thing usable here. What would be the optimum RAID level and number of disks? 2) Which Windows OS would you recommend? (currently 2008 x64 Server) Would not recommend Windows OS. We may be stuck as my client is only considering Red Hat Linux (still waiting to find out which version). If it turns out that this doesn't give better than a marginal improvement, then there is no incentive to create more complications in what is basically a Windows shop (although the project manager is a Linux advocate). Most importantly, you didn't say what you would consider desirable performance. The hardware and the setup you described will work, but not necessarily fast enough. Once again, it seems as though we are down to the number of drives... Have you tried decreasing random_page_cost in postgresql.conf? Or setting (as a last resort) enable_seqscan = off? In critical code sections, we do - we have stored procedures and code segments which save the current enable_seqscan value, set it to off (local to the transaction), then restore it after the code has run. Our current planner cost values are all default. Is this what you would choose for an Intel Core 2 Quads Quad with 48 GB RAM? # - Planner Cost Constants - #seq_page_cost = 1.0 # measured on an arbitrary scale #random_page_cost = 4.0 # same scale as above #cpu_tuple_cost = 0.01 # same scale as above #cpu_index_tuple_cost = 0.005 # same scale as above #cpu_operator_cost = 0.0025 # same scale as above #effective_cache_size = 128MB Thanks for the help, Carlo -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
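As an illustration of the 'local to the transaction' technique mentioned above (the actual stored procedures aren't shown in the thread), the pattern is simply SET LOCAL, which reverts automatically at COMMIT or ROLLBACK, so there is nothing to save and restore by hand:

BEGIN;
SET LOCAL enable_seqscan = off;   -- affects only the current transaction
-- ... the critical query goes here ...
COMMIT;                           -- the setting reverts automatically

-- a gentler alternative to try first, per the suggestion above:
SET random_page_cost = 2.0;       -- session level; the default is 4.0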
Re: [PERFORM] New server to improve performance on our large and busy DB - advice?
Hi Scott, Sorry for the very late reply on this post, but I'd like to follow up. The reason that I took so long to reply was due to this suggestion: Run vacuum verbose to see if you're overrunning the max_fsm_pages settings or the max_fsm_relations. My first thought was, does he mean against the entire DB? That would take a week! But, since it was recommended, I decided to see what would happen. So, I just ran VACUUM VERBOSE. After five days, it was still vacuuming and the server admin said they needed to bounce the server, which means the command never completed (I kept the log of the progress so far, but don't know if the values you needed would appear at the end. I confess I have no idea how to relate the INFO and DETAIL data coming back with regards to max_fsm_pages settings or the max_fsm_relations. So, now my questions are: 1) Did you really mean you wanted VACUUM VERBOSE to run against the entire DB? 2) Given my previous comments on the size of the DB (and my thinking that this is an exceptionally large and busy DB) were you expecting it to take this long? 3) I took no exceptional measures before running it, I didn't stop the automated import processes, I didn't turn off autovacuum. Would this have accounted for the time it is taking to THAT degree? 4) Any other way to get max_fsm_pages settings and max_fsm_relations? Carlo -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] New server to improve performance on our large and busy DB - advice?
My client just informed me that new hardware is available for our DB server. . Intel Core 2 Quads Quad . 48 GB RAM . 4 Disk RAID drive (RAID level TBD) I have put the ugly details of what we do with our DB below, as well as the postgres.conf settings. But, to summarize: we have a PostgreSQL 8.3.6 DB with very large tables and the server is always busy serving a constant stream of single-row UPDATEs and INSERTs from parallel automated processes. There are less than 10 users, as the server is devoted to the KB production system. My questions: 1) Which RAID level would you recommend 2) Which Windows OS would you recommend? (currently 2008 x64 Server) 3) If we were to port to a *NIX flavour, which would you recommend? (which support trouble-free PG builds/makes please!) 4) Is this the right PG version for our needs? Thanks, Carlo The details of our use: . The DB hosts is a data warehouse and a knowledgebase (KB) tracking the professional information of 1.3M individuals. . The KB tables related to these 130M individuals are naturally also large . The DB is in a perpetual state of serving TCL-scripted Extract, Transform and Load (ETL) processes . These ETL processes typically run 10 at-a-time (i.e. in parallel) . We would like to run more, but the server appears to be the bottleneck . The ETL write processes are 99% single row UPDATEs or INSERTs. . There are few, if any DELETEs . The ETL source data are import tables . The import tables are permanently kept in the data warehouse so that we can trace the original source of any information. . There are 6000+ and counting . The import tables number from dozens to hundreds of thousands of rows. They rarely require more than a pkey index. . Linking the KB to the source import date requires an audit table of 500M rows, and counting. . The size of the audit table makes it very difficult to manage, especially if we need to modify the design. . Because we query the audit table different ways to audit the ETL processes decisions, almost every column in the audit table is indexed. . The maximum number of physical users is 10 and these users RARELY perform any kind of write . By contrast, the 10+ ETL processes are writing constantly . We find that internal stats drift, for whatever reason, causing row seq scans instead of index scans. . So far, we have never seen a situation where a seq scan has improved performance, which I would attribute to the size of the tables . We believe our requirements are exceptional, and we would benefit immensely from setting up the PG planner to always favour index-oriented decisions - which seems to contradict everything that PG advice suggests as best practice. Current non-default conf settings are: autovacuum = on autovacuum_analyze_scale_factor = 0.1 autovacuum_analyze_threshold = 250 autovacuum_naptime = 1min autovacuum_vacuum_scale_factor = 0.2 autovacuum_vacuum_threshold = 500 bgwriter_lru_maxpages = 100 checkpoint_segments = 64 checkpoint_warning = 290 datestyle = 'iso, mdy' default_text_search_config = 'pg_catalog.english' lc_messages = 'C' lc_monetary = 'C' lc_numeric = 'C' lc_time = 'C' log_destination = 'stderr' log_line_prefix = '%t ' logging_collector = on maintenance_work_mem = 16MB max_connections = 200 max_fsm_pages = 204800 max_locks_per_transaction = 128 port = 5432 shared_buffers = 500MB vacuum_cost_delay = 100 work_mem = 512MB -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Massive table (500M rows) update nightmare
Guys, I want to thank you for all of the advice - my client has just made a surprise announcement that he would like to start from scratch with a new server, so I am afraid that all of this great advice has to be seen in the context of whatever decision is made on that. I am out there, hat-in-hand, looking for advice under the PERFORM post: New server to improve performance on our large and busy DB - advice? Thanks again! Carlo Scott Marlowe scott.marl...@gmail.com wrote in message news:dcc563d11001071740q572cdae2re410788fe790d...@mail.gmail.com... On Thu, Jan 7, 2010 at 2:48 PM, Carlo Stonebanks stonec.regis...@sympatico.ca wrote: Doing the updates in smaller chunks resolved these apparent freezes - or, more specifically, when the application DID freeze, it didn't do it for more than 30 seconds. In all likelihood, this is the OS and the DB thrashing. It might well be checkpoints. Have you tried cranking up checkpoint segments to something like 100 or more and seeing how it behaves then? -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] New server to improve performance on our large and busy DB - advice? (v2)
My client just informed me that new hardware is available for our DB server. . Intel Core 2 Quads Quad . 48 GB RAM . 4 Disk RAID drive (RAID level TBD) I have put the ugly details of what we do with our DB below, as well as the postgres.conf settings. But, to summarize: we have a PostgreSQL 8.3.6 DB with very large tables and the server is always busy serving a constant stream of single-row UPDATEs and INSERTs from parallel automated processes. There are less than 10 users, as the server is devoted to the KB production system. My questions: 1) Which RAID level would you recommend 2) Which Windows OS would you recommend? (currently 2008 x64 Server) 3) If we were to port to a *NIX flavour, which would you recommend? (which support trouble-free PG builds/makes please!) 4) Is this the right PG version for our needs? Thanks, Carlo The details of our use: . The DB hosts is a data warehouse and a knowledgebase (KB) tracking the professional information of 1.3M individuals. . The KB tables related to these 130M individuals are naturally also large . The DB is in a perpetual state of serving TCL-scripted Extract, Transform and Load (ETL) processes . These ETL processes typically run 10 at-a-time (i.e. in parallel) . We would like to run more, but the server appears to be the bottleneck . The ETL write processes are 99% single row UPDATEs or INSERTs. . There are few, if any DELETEs . The ETL source data are import tables . The import tables are permanently kept in the data warehouse so that we can trace the original source of any information. . There are 6000+ and counting . The import tables number from dozens to hundreds of thousands of rows. They rarely require more than a pkey index. . Linking the KB to the source import date requires an audit table of 500M rows, and counting. . The size of the audit table makes it very difficult to manage, especially if we need to modify the design. . Because we query the audit table different ways to audit the ETL processes decisions, almost every column in the audit table is indexed. . The maximum number of physical users is 10 and these users RARELY perform any kind of write . By contrast, the 10+ ETL processes are writing constantly . We find that internal stats drift, for whatever reason, causing row seq scans instead of index scans. . So far, we have never seen a situation where a seq scan has improved performance, which I would attribute to the size of the tables . We believe our requirements are exceptional, and we would benefit immensely from setting up the PG planner to always favour index-oriented decisions - which seems to contradict everything that PG advice suggests as best practice. Current non-default conf settings are: autovacuum = on autovacuum_analyze_scale_factor = 0.1 autovacuum_analyze_threshold = 250 autovacuum_naptime = 1min autovacuum_vacuum_scale_factor = 0.2 autovacuum_vacuum_threshold = 500 bgwriter_lru_maxpages = 100 checkpoint_segments = 64 checkpoint_warning = 290 datestyle = 'iso, mdy' default_text_search_config = 'pg_catalog.english' lc_messages = 'C' lc_monetary = 'C' lc_numeric = 'C' lc_time = 'C' log_destination = 'stderr' log_line_prefix = '%t ' logging_collector = on maintenance_work_mem = 16MB max_connections = 200 max_fsm_pages = 204800 max_locks_per_transaction = 128 port = 5432 shared_buffers = 500MB vacuum_cost_delay = 100 work_mem = 512MB -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Massive table (500M rows) update nightmare
My bad - I thought I had, so it has been re-posted with a (v2) disclaimer in the title... like THAT will stop the flaming! g Thanks for your patience! Craig James craig_ja...@emolecules.com wrote in message news:4b4f8a49.7010...@emolecules.com... Carlo Stonebanks wrote: Guys, I want to thank you for all of the advice - my client has just made a surprise announcement that he would like to set start from scratch with a new server, so I am afraid that all of this great advice has to be seen in the context of whatever decision is made on that. I am out there, hat-in-hand, looking for advice under the PERFORM post: New server to improve performance on our large and busy DB - advice? You might start this as a new topic with a relevant title, and reiterate your database requirements. Otherwise it will get submerged as just a footnote to your original question. It's really nice to be able to quickly find the new-equipment discussions. Craig -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] New server to improve performance on our large and busy DB - advice?
Pretty much everyone thinks their requirements are exceptional. It's funny how infrequently that's actually true. The techniques that favor index-use aren't that unique: collect better stats, set basic parameters correctly, adjust random_page_cost, investigate plans that don't do what you want to figure out why. It's easy to say there's something special about your data rather than follow fundamentals here; I'd urge you to avoid doing that. The odds that the real issue is that you're feeding the optimizer bad data is more likely than most people think, which brings us to: I understand that. And the answer is usually to go and do an ANALYZE manually (if it isn't this, it will be some dependency on a set-returning stored function we wrote before we could specify the rows and cost). My question is really - why do I need this constant intervention? When we rarely do aggregates, when our queries are (nearly) always single row queries (and very rarely more than 50 rows) out of tables that have hundreds of thousands to millions of rows, what does it take to NOT have to intervene? Which brings me to your next point: I don't see effective_cache_size listed there. If that's at the default, I wouldn't be surprised that you're seeing sequential scans instead of indexed ones far too often. Nice to know - I suspect someone has been messing around with stuff they don't understand. I do know that after some screwing around they got the server to the point that it wouldn't restart and tried to back out until it would. max_connections = 200 work_mem = 512MB This is a frightening combination by the way. Looks like it's connected to the above issue. The real max connection value is 1/10th of that. Thanks Greg! Carlo -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Massive table (500M rows) update nightmare
crank it up more and delay the checkpoints as much as possible during these updates. 64 segments is already 1024M. We have 425M rows, total table size is 78GB, so we can imagine a worst case UPDATE write is less than 200 bytes * number of rows specified in the update (is that logic correct?). Interestingly, the total index size is 148GB, twice that of the table, which may be an indication of where the performance bottleneck is.
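The 200-byte estimate checks out: 78 GB over 425 M rows works out to roughly 197 bytes per row. For anyone wanting to reproduce the heap-versus-index numbers, a query along these lines does it (the table name is taken from the EXPLAIN output elsewhere in this thread; adjust as needed):

SELECT pg_size_pretty(pg_relation_size('mdx_core.audit_impt'))       AS heap_size,
       pg_size_pretty(pg_total_relation_size('mdx_core.audit_impt')
                      - pg_relation_size('mdx_core.audit_impt'))     AS indexes_plus_toast,
       (SELECT reltuples::bigint FROM pg_class
        WHERE oid = 'mdx_core.audit_impt'::regclass)                 AS estimated_rows;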
Re: [PERFORM] Massive table (500M rows) update nightmare
I thought that post mentioned that the plan was one statement in an iteration, and that the cache would have been primed by a previous query checking whether there were any rows to update. If that was the case, it might be worthwhile to look at the entire flow of an iteration. This is the only SQL query in the code in question - the rest of the code manages the looping and commit. The code was copied to PgAdminIII and values written in for the WHERE clause. In order for me to validate that rows would have been updated, I had to run a SELECT with the same WHERE clause in PgAdminIII first to see how many rows would have qualified. But this was for testing purposes only. The SELECT statement does not exist in the code. The vast majority of the rows that will be processed will be updated as this is a backfill to synch the old rows with the values being filled into new columns now being inserted. Also, if you ever responded with version and configuration information, I missed it. This is hosted on a new server the client set up so I am waiting for the exact OS and hardware config. PG Version is PostgreSQL 8.3.6, compiled by Visual C++ build 1400, OS appears to be Windows 2003 x64 Server. More than anything, I am more concerned with the long-term use of the system. This particular challenge with the 500M row update is one thing, but I am concerned about the exceptional effort required to do this. Is it REALLY this exceptional to want to update 500M rows of data in this day and age? Or is the fact that we are considering dumping and restoring and dropping indexes, etc to do all an early warning that we don't have a solution that is scaled to the problem? Config data follows (I am assuming commented values which I did not include are defaulted). Carlo autovacuum = on autovacuum_analyze_scale_factor = 0.1 autovacuum_analyze_threshold = 250 autovacuum_naptime = 1min autovacuum_vacuum_scale_factor = 0.2 autovacuum_vacuum_threshold = 500 bgwriter_lru_maxpages = 100 checkpoint_segments = 64 checkpoint_warning = 290 datestyle = 'iso, mdy' default_text_search_config = 'pg_catalog.english' lc_messages = 'C' lc_monetary = 'C' lc_numeric = 'C' lc_time = 'C' log_destination = 'stderr' log_line_prefix = '%t ' logging_collector = on maintenance_work_mem = 16MB max_connections = 200 max_fsm_pages = 204800 max_locks_per_transaction = 128 port = 5432 shared_buffers = 500MB vacuum_cost_delay = 100 work_mem = 512MB -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Massive table (500M rows) update nightmare
Got an explain analyze of the delete query? UPDATE mdx_core.audit_impt SET source_table = 'mdx_import.'||impt_name WHERE audit_impt_id >= 31941 AND audit_impt_id <= 319400010 AND coalesce(source_table, '') = '' Index Scan using audit_impt_pkey on audit_impt (cost=0.00..92.63 rows=1 width=608) (actual time=0.081..0.244 rows=10 loops=1) Index Cond: ((audit_impt_id >= 31941) AND (audit_impt_id <= 319400010)) Filter: ((COALESCE(source_table, ''::character varying))::text = ''::text) Total runtime: 372.141 ms Hard to tell how reliable these numbers are, because the caches are likely spun up for the WHERE clause - in particular, SELECT queries have been run to test whether the rows actually qualify for the update. The coalesce may be slowing things down slightly, but is a necessary evil. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Massive table (500M rows) update nightmare
If it is possible to lock this audit table exclusively (may be during off peak hours) I would look into - create new_audit_table as select col1, col2, col3 ... col9, 'new_col_value' from old_audit_table; - create all indexes - drop old_audit_table - rename new_audit_table to old_audit_table That is probably the fastest method you can do, even if you have to join the new_col_value from an extra helper-table with the corresponding id. Remember, databases are born to join. This has all been done before - the production team was crippled while they waited for this and the SECOND the table was available again, they jumped on it - even though it meant recreating the bare minimum of the indexes. You could also try to just update the whole table in one go, it is probably faster than you expect. Possibly, but with such a large table you have no idea of the progress, you cannot interrupt it without rolling back everything. Worse, you have applications stalling and users wanting to know what is going on - is the OS and the DB/MVCC thrashing while it does internal maintenance? Have you reached some sort of deadlock condition that you can't see because the server status is not helpful with so many uncommitted pending updates? And of course, there is the file bloat. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
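Spelled out as SQL (a sketch only - the column list is abbreviated exactly as in the quoted suggestion, and the exclusive-lock/off-peak caveats above still apply), the rebuild approach looks roughly like this:

BEGIN;
LOCK TABLE old_audit_table IN ACCESS EXCLUSIVE MODE;   -- off-peak only
CREATE TABLE new_audit_table AS
  SELECT col1, col2, col3, /* ... */ col9,
         'new_col_value' AS new_col
  FROM old_audit_table;
-- create the indexes on new_audit_table here, then swap:
DROP TABLE old_audit_table;
ALTER TABLE new_audit_table RENAME TO old_audit_table;
COMMIT;

Doing the whole thing in one transaction keeps writers out until the swap is complete; the trade-off, as noted above, is that the table is unavailable for the duration.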
Re: [PERFORM] Massive table (500M rows) update nightmare
What is the rationale behind this? How about doing 10k rows in one update, and committing every time?

When we did 10K updates, the application would sometimes appear to have frozen, and we were concerned that there was a deadlock condition because of the number of locked rows. While we may have the patience to sit around and wait five minutes to see if the update would continue, we couldn't risk having other applications appear frozen if that was the case. In fact, there is no reason for any user or application to write to the same records we are writing to - but the audit table is read frequently. We are not explicitly locking anything, or writing any additional code to arbitrate the locking model anywhere - it's all default UPDATE and SELECT syntax.

Doing the updates in smaller chunks resolved these apparent freezes - or, more specifically, when the application DID freeze, it didn't do it for more than 30 seconds. In all likelihood, this is the OS and the DB thrashing.

We have since modified the updates to process 1000 rows at a time with a commit every 10 pages. Just this morning, though, the IS manager asked me to stop the backfill because of the load effect on other processes.

You could try making the condition on the ctid column, to not have to use the index on ID, and process the rows in physical order.

An interesting idea, if I can confirm that the performance problem is because of the WHERE clause, not the UPDATE.

You could also add 'where new_column is null' to the conditions.

Already being done, albeit with a coalesce(val, '') = '' - it's quite possible that this is hurting the WHERE clause; the EXPLAIN shows the table using the pkey and then filtering on the COALESCE as one would expect.

Carlo
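The ctid suggestion could be prototyped without touching the existing loop. The sketch below is not from the thread and the batch size is arbitrary; it drives one batch by physical row location instead of the primary key, and whether the outer condition gets a TID scan on this version should be verified with EXPLAIN before adopting it:

UPDATE mdx_core.audit_impt
   SET source_table = 'mdx_import.' || impt_name
 WHERE ctid = ANY (ARRAY(
           SELECT ctid
             FROM mdx_core.audit_impt
            WHERE coalesce(source_table, '') = ''
            LIMIT 10000));
-- COMMIT here in the driving script, then repeat until no rows are updated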
Re: [PERFORM] Massive table (500M rows) update nightmare
It might well be checkpoints. Have you tried cranking up checkpoint segments to something like 100 or more and seeing how it behaves then?

No I haven't, although it certainly makes sense - watching the process run, you get this sense that the system occasionally pauses to take a deep, long breath before returning to work frantically ;D

checkpoint_segments is currently set to 64. The DB is large and is in a constant state of receiving single-row updates as multiple ETL and refinement processes run continuously.

Would you expect going to 100 or more to make an appreciable difference, or should I be more aggressive?
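For reference, a possible starting point on 8.3 - the numbers are guesses to be validated against disk space and recovery-time requirements, not recommendations from this thread:

checkpoint_segments = 128            # more WAL between checkpoints, fewer flush storms
checkpoint_completion_target = 0.8   # 8.3+: spread checkpoint writes over more of the interval
checkpoint_warning = 300
log_checkpoints = on                 # 8.3+: log each checkpoint so frequency and duration are visible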
Re: [PERFORM] Massive table (500M rows) update nightmare
Already done in an earlier post, Kevin - I have included it again below. As you can see, it's pretty well what you would expect, index scan plus a filter. One note: updates where no rows qualify run appreciably faster than the ones that do. That is, the update itself appears to be consuming a good deal of the processing time. This may be due to the 6 indexes.

UPDATE mdx_core.audit_impt
   SET source_table = 'mdx_import.'||impt_name
 WHERE audit_impt_id >= 31941 AND audit_impt_id <= 319400010
   AND coalesce(source_table, '') = ''

Index Scan using audit_impt_pkey on audit_impt (cost=0.00..92.63 rows=1 width=608) (actual time=0.081..0.244 rows=10 loops=1)
  Index Cond: ((audit_impt_id >= 31941) AND (audit_impt_id <= 319400010))
  Filter: ((COALESCE(source_table, ''::character varying))::text = ''::text)
Total runtime: 372.141 ms

Kevin Grittner kevin.gritt...@wicourts.gov wrote in message news:4b46256302250002d...@gw.wicourts.gov...
Carlo Stonebanks stonec.regis...@sympatico.ca wrote:
An interesting idea, if I can confirm that the performance problem is because of the WHERE clause, not the UPDATE.
If you could show EXPLAIN ANALYZE output for one iteration, with related queries and maybe more info on the environment, it would take most of the guesswork out of things.
-Kevin
[PERFORM] Massive table (500M rows) update nightmare
Our DB has an audit table which is 500M rows and growing. (FYI the objects being audited are grouped semantically, not individual field values.)

Recently we wanted to add a new feature and we altered the table to add a new column. We are backfilling this varchar(255) column by writing a TCL script to page through the rows (where every update is a UPDATE ... WHERE id >= x AND id < x+10, and a commit is performed after every 1000 update statements, i.e. every 10,000 rows).

We have 10 columns, six of which are indexed. Rough calculations suggest that this will take two to three weeks to complete on an 8-core CPU with more than enough memory.

As a ballpark estimate - is this sort of performance for 500M updates what one would expect of PG given the table structure (detailed below), or should I dig deeper to look for performance issues?

As always, thanks!

Carlo

Table/index structure:

CREATE TABLE mdx_core.audit_impt
(
  audit_impt_id serial NOT NULL,
  impt_session integer,
  impt_version character varying(255),
  impt_name character varying(255),
  impt_id integer,
  target_table character varying(255),
  target_id integer,
  target_op character varying(10),
  note text,
  source_table character varying(255),
  CONSTRAINT audit_impt_pkey PRIMARY KEY (audit_impt_id)
)

CREATE INDEX audit_impt_impt_id_idx ON mdx_core.audit_impt USING btree (impt_id);
CREATE INDEX audit_impt_impt_name ON mdx_core.audit_impt USING btree (impt_name, impt_version);
CREATE INDEX audit_impt_session_idx ON mdx_core.audit_impt USING btree (impt_session);
CREATE INDEX audit_impt_source_table ON mdx_core.audit_impt USING btree (source_table);
CREATE INDEX audit_impt_target_id_idx ON mdx_core.audit_impt USING btree (target_id, audit_impt_id);
CREATE INDEX audit_impt_target_table_idx ON mdx_core.audit_impt USING btree (target_table, target_id, audit_impt_id);
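One lever that follows directly from the structure above: every backfilled row has to maintain all six indexes, while the backfill's WHERE clause only uses the primary key. A hedged sketch - whether production can live without the dropped indexes for the duration is the real question, and the two shown here are only examples:

-- before the backfill: drop indexes the backfill itself does not need
DROP INDEX mdx_core.audit_impt_source_table;
DROP INDEX mdx_core.audit_impt_impt_name;

-- ... run the chunked UPDATE loop ...

-- afterwards: rebuild them (each rebuild is one long sequential pass)
CREATE INDEX audit_impt_source_table ON mdx_core.audit_impt USING btree (source_table);
CREATE INDEX audit_impt_impt_name ON mdx_core.audit_impt USING btree (impt_name, impt_version);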
Re: [PERFORM] How to avoid hashjoin and mergejoin
Larry, Considering these recommendations, let's try setting shared_buffers to 2GB and work_mem to 16MB. The thing is that work_mem is per connection, and if we get too aggressive and we get a lot of simultaneous users, we can potentially eat up a lot of memory. So 2GB + (100 * 16MB) = 3.6GB total RAM eaten up under peak load for these two values alone. If we wanted to get more aggressive, we can raise work_mem. Carlo -Original Message- From: Scott Marlowe [mailto:[EMAIL PROTECTED] Sent: November 1, 2007 5:39 PM To: Carlo Stonebanks Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] How to avoid hashjoin and mergejoin On 11/1/07, Carlo Stonebanks [EMAIL PROTECTED] wrote: I am comparing the same query on two different PG 8.2 servers, one Linux (8GB RAM) and one Windows (32GB RAM). Both have similar drives and CPU's. The Windows posgrestsql.config is pretty well tuned but it looks like someone had wiped out the Linux config so the default one was re-installed. All performance-related memory allocation values seem to be set to the defaults, but mods have been made: max_connections = 100 and shared_buffers = 32MB. The performance for this query is terrible on the Linux server, and good on the Windows server - presumably because the original Linux PG config has been lost. This query requires: that set enable_seqscan to 'off'; Have you run analyze on the server yet? A few general points on performance tuning. With 8.2 you should set shared_buffers to a pretty big chunk of memory on linux, up to 25% or so. That means 32 Meg shared buffers is REAL low for a linux server. Try running anywhere from 512Meg up to 1Gig for starters and see if that helps too. Also turn up work_mem to something like 16 to 32 meg then restart the server after making these changes. Then give us the explain analyze output with all the enable_xxx set to ON. summary: analyze, increase shared_buffers and work_mem, give us explain analyze.
[PERFORM] How to avoid hashjoin and mergejoin
I am comparing the same query on two different PG 8.2 servers, one Linux (8GB RAM) and one Windows (32GB RAM). Both have similar drives and CPUs.

The Windows postgresql.conf is pretty well tuned, but it looks like someone had wiped out the Linux config so the default one was re-installed. All performance-related memory allocation values seem to be set to the defaults, but mods have been made: max_connections = 100 and shared_buffers = 32MB.

The performance for this query is terrible on the Linux server, and good on the Windows server - presumably because the original Linux PG config has been lost. This query requires that we set enable_seqscan to 'off';

Still, the Linux server did not create the same, fast plan as the Windows server. In order to get the same plan we had to:
set enable_hashjoin to 'off';
set enable_mergejoin to 'off';

The plans were now similar, using nested loops and bitmapped heap scans. Now the Linux query outperformed the Windows query.

Question: Can anyone tell me which config values would have made PG select hash joins and merge joins when the nested loop/bitmap heap scan combination was faster?

Carlo ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
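A minimal way to reproduce the comparison in one psql session, assuming the statistics are current - the SELECT is only a stand-in for the real query, and the settings revert (or are reset) so nothing is changed globally:

ANALYZE;                      -- make sure the planner has current statistics
SET work_mem = '16MB';        -- session-only override; 8.2 accepts unit suffixes
SET enable_seqscan = off;     -- session-only, for plan comparison rather than production use
EXPLAIN ANALYZE SELECT 1;     -- replace with the query under test
RESET enable_seqscan;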
Re: [PERFORM] How to avoid hashjoin and mergejoin
This is just about never the appropriate way to solve a performance problem, as it will inevitably create performance problems in other queries. In this particular example, this was done to force the query on the Linux box to use the same plan as on the Windows box to prove that - once the correct plan was chosen - the Linux box could at least MATCH the Windows box. That being said, I should mention this: we take certain core queries that we know are essential and embed them in a plpgsql SRF's that save the various settings, modify them as required for the query, then restore them after the rows are returned. Does this address the problem you mentioned? What I'm wondering is whether the tables have been ANALYZEd recently, This is SUPPOSED to be done after a restore - but I will verify, thanks for the reminder. and also whether there are any nondefault postgresql.conf settings in use on the other server. Definitely - this is what alerted me to the fact that there was something suspicious. We try to optimize our memory settings (based on various tuning docs, advice from here, and good old trial-and-error). Since the new config had barely any changes, I knew something was wrong. Carlo -Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: November 1, 2007 5:42 PM To: Carlo Stonebanks Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] How to avoid hashjoin and mergejoin Carlo Stonebanks [EMAIL PROTECTED] writes: Still, the Linux server did not create the same, fast plan as the Windows server. In order to get the same plan we had to: set enable_hashjoin to 'off'; set enable_mergejoin to 'off'; This is just about never the appropriate way to solve a performance problem, as it will inevitably create performance problems in other queries. What I'm wondering is whether the tables have been ANALYZEd recently, and also whether there are any nondefault postgresql.conf settings in use on the other server. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
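As a sketch of the save/modify/restore idea mentioned above, set_config()'s is_local flag scopes an override to the surrounding transaction, so nothing has to be restored by hand. The function name, signature and query body below are made up for illustration; only the technique is the point, and note that the overrides stay in force for the rest of the caller's transaction after the function returns.

CREATE OR REPLACE FUNCTION mdx_core.core_query_wrapper()
RETURNS SETOF mdx_core.provider_practice AS $$
DECLARE
    r mdx_core.provider_practice%ROWTYPE;
BEGIN
    -- third argument true = behaves like SET LOCAL: reverts at end of transaction
    PERFORM set_config('enable_hashjoin', 'off', true);
    PERFORM set_config('enable_mergejoin', 'off', true);

    FOR r IN
        SELECT pp.*
          FROM mdx_core.provider_practice pp
         WHERE pp.facility_address_id IS NULL   -- stand-in for the embedded core query
    LOOP
        RETURN NEXT r;
    END LOOP;
    RETURN;
END;
$$ LANGUAGE plpgsql;

Called as SELECT * FROM mdx_core.core_query_wrapper(); this keeps the planner overrides out of the application code entirely.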
Re: [PERFORM] REPOST: Nested loops row estimates always too high
Has anyone offered any answers to you? No one else has replied to this post. Ow Mun Heng [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] On Thu, 2007-09-20 at 11:02 -0400, Carlo Stonebanks wrote: (SORRY FOR THE REPOST, I DON'T SEE MY ORIGINAL QUESTION OR ANY ANSWERS HERE) I am noticing that my queries are spending a lot of time in nested loops. The table/index row estimates are not bad, but the nested loops can be off by a factor of 50. In any case, they are always too high. Are the over-estimations below significant, and if so, is this an indication of a general configuration problem? Sounds much like the issue I was seeing as well. Unique (cost=67605.91..67653.18 rows=4727 width=16) (actual time=8634.618..8637.918 rows=907 loops=1) You can to rewrite the queries to individual queries to see it if helps. In my case, I was doing select a.a,b.b,c.c from (select a from x where) a --- Put as a SRF left join ( select b from y where ) b --- Put as a SRF on a.a = b.a ---(end of broadcast)--- TIP 6: explain analyze is your friend ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[PERFORM] Acceptable level of over-estimation?
Is there a rule of thumb about when the planner's row estimates are too high? In particular, when should I be concerned that planner's estimated number of rows estimated for a nested loop is off? By a factor of 10? 100? 1000? Carlo ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] REPOST: Nested loops row estimates always too high
My problem is that I think that SRF's are causing my problems. The SRF's gets an automatic row estimate of 1000 rows. Add a condition to it, the planner guesses 333 rows. Even at 333, this is an overestimate of the number of rows returned. I'm really disappointed - SRF's are a great way to place the enterprise's db-centric business logic at the server. Carlo -Original Message- From: Ow Mun Heng [mailto:[EMAIL PROTECTED] Sent: September 24, 2007 8:51 PM To: Carlo Stonebanks Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] REPOST: Nested loops row estimates always too high On Mon, 2007-09-24 at 14:12 -0400, Carlo Stonebanks wrote: Has anyone offered any answers to you? No one else has replied to this post. Overestimate of selectivity. I guess it's mainly due to my one to many table relationships. I've tried everything from concatenated join columns and indexing it to creating all sorts of indexes and splitting the (1) tables into multiple tables and upping the indexes to 1000 and turning of nestloops/enabling geqo/ tweaking the threshold/effort and much much more (as much as I was asked to/suggested to) but still no luck. In my case, the individual queries were fast. So, In then end, I made a SRF and used the SRFs to join each other. This worked better. Ow Mun Heng [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] On Thu, 2007-09-20 at 11:02 -0400, Carlo Stonebanks wrote: (SORRY FOR THE REPOST, I DON'T SEE MY ORIGINAL QUESTION OR ANY ANSWERS HERE) I am noticing that my queries are spending a lot of time in nested loops. The table/index row estimates are not bad, but the nested loops can be off by a factor of 50. In any case, they are always too high. Are the over-estimations below significant, and if so, is this an indication of a general configuration problem? Sounds much like the issue I was seeing as well. Unique (cost=67605.91..67653.18 rows=4727 width=16) (actual time=8634.618..8637.918 rows=907 loops=1) You can to rewrite the queries to individual queries to see it if helps. In my case, I was doing select a.a,b.b,c.c from (select a from x where) a --- Put as a SRF left join ( select b from y where ) b --- Put as a SRF on a.a = b.a ---(end of broadcast)--- TIP 6: explain analyze is your friend ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
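On the 1000-row default (and the 333-row guess with a condition): 8.2 has no per-function way to override it, but 8.3 added a ROWS clause to CREATE/ALTER FUNCTION, so once an upgrade is possible something like the following gives the planner a sane SRF estimate. The argument types here are guessed and must match the real signature:

ALTER FUNCTION mdx_core.zips_in_mile_range(text, numeric) ROWS 100;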
[PERFORM] REPOST: Nested loops row estimates always too high
(SORRY FOR THE REPOST, I DON'T SEE MY ORIGINAL QUESTION OR ANY ANSWERS HERE) I am noticing that my queries are spending a lot of time in nested loops. The table/index row estimates are not bad, but the nested loops can be off by a factor of 50. In any case, they are always too high. Are the over-estimations below significant, and if so, is this an indication of a general configuration problem? Carlo select pp.provider_id, pp.provider_practice_id, nearby.distance from mdx_core.provider_practice as pp join mdx_core.facility as f on f.facility_id = pp.facility_id join (select * from mdx_core.zips_in_mile_range('08820', 10) where zip '') as nearby on f.default_country_code = 'US' and f.default_postal_code = nearby.zip and pp.facility_address_id is NULL union select pp.provider_id, pp.provider_practice_id, nearby.distance from mdx_core.provider_practice as pp join mdx_core.facility_address as fa on fa.facility_address_id = pp.facility_address_id join mdx_core.address as a on a.address_id = fa.address_id join (select * from mdx_core.zips_in_mile_range('08820', 10) where zip '') as nearby on a.country_code = 'US' and a.postal_code = nearby.zip Unique (cost=67605.91..67653.18 rows=4727 width=16) (actual time=8634.618..8637.918 rows=907 loops=1) - Sort (cost=67605.91..67617.73 rows=4727 width=16) (actual time=8634.615..8635.651 rows=907 loops=1) Sort Key: provider_id, provider_practice_id, distance - Append (cost=0.00..67317.41 rows=4727 width=16) (actual time=176.056..8632.429 rows=907 loops=1) - Nested Loop (cost=0.00..38947.07 rows=3143 width=16) (actual time=176.054..7867.962 rows=872 loops=1) - Nested Loop (cost=0.00..11520.79 rows=8121 width=12) (actual time=169.372..3041.010 rows=907 loops=1) - Function Scan on zips_in_mile_range (cost=0.00..15.00 rows=333 width=40) (actual time=151.479..151.671 rows=66 loops=1) Filter: (zip ''::text) - Index Scan using facility_country_postal_code_idx on facility f (cost=0.00..34.25 rows=24 width=15) (actual time=4.969..43.740 rows=14 loops=66) Index Cond: ((f.default_country_code = 'US'::bpchar) AND ((f.default_postal_code)::text = zips_in_mile_range.zip)) - Index Scan using provider_practice_facility_idx on provider_practice pp (cost=0.00..3.36 rows=1 width=12) (actual time=4.915..5.316 rows=1 loops=907) Index Cond: (f.facility_id = pp.facility_id) Filter: (facility_address_id IS NULL) - Nested Loop (cost=0.00..28323.07 rows=1584 width=16) (actual time=170.310..762.472 rows=35 loops=1) - Nested Loop (cost=0.00..7791.77 rows=1579 width=12) (actual time=170.289..612.579 rows=36 loops=1) - Nested Loop (cost=0.00..2595.96 rows=712 width=12) (actual time=167.017..354.261 rows=29 loops=1) - Function Scan on zips_in_mile_range (cost=0.00..15.00 rows=333 width=40) (actual time=150.188..150.312 rows=66 loops=1) Filter: (zip ''::text) - Index Scan using address_country_postal_code_address_idx on address a (cost=0.00..7.73 rows=2 width=17) (actual time=2.483..3.086 rows=0 loops=66) Index Cond: ((a.country_code = 'US'::bpchar) AND ((a.postal_code)::text = zips_in_mile_range.zip)) - Index Scan using facility_address_address_idx on facility_address fa (cost=0.00..7.15 rows=12 width=8) (actual time=7.652..8.901 rows=1 loops=29) Index Cond: (a.address_id = fa.address_id) - Index Scan using provider_practice_facility_address_idx on provider_practice pp (cost=0.00..12.80 rows=16 width=12) (actual time=4.156..4.158 rows=1 loops=36) Index Cond: (fa.facility_address_id = pp.facility_address_id) Total runtime: 8639.066 ms ---(end of broadcast)--- TIP 7: You can help support 
the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[PERFORM] REPOST: Performance improves only after repeated VACUUM/ANALYZE
(SORRY FOR THE REPOST, I DON'T SEE MY ORIGINAL QUESTION OR ANY ANSWERS HERE) My client publishes an edition of their DB from his production site to his hosted web/db server. This is done by FTPing a backup of the DB to his hosting provider. Immediately after a publication (restore to web/db server) we immediately run VACUUM ANALYZE to make sure the statistics and row estimates are correct. The problem is, after this initial VACUUM ANALYZE, the row estimates in query plans are off by several orders of magnitude. For example, a disastrous plan was created because the planner estimated 4K rows when in fact it returned 980K rows. Sometimes - a day or two later - the plans return to normal and row estimates are closer to realistic values. Guessing that there may be background events that are correcting the row estimates over time, I ran an ANALYZE on the DB - and sure enough - the row estimates corrected themselves. The puzzling thing is, there have been no writes of any sort to the data - there is no reason for the stats to have changed. I believe that a VACUUM may not be necessary for a newly restored DB, but I assumed that VACUUM ANALYZE and ANALYZE have the same net result. Am I wrong? If I am not wrong (i.e. VACUUM ANALYZE and ANALYZE should produce the same results) why would the performance improve on a DB that has seen no transactional activity only after the SECOND try? PG 8.2.4 on RH LINUX 1GB RAM SCSI RAID 1 Carlo ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PERFORM] Nested loops row estimates always too high
I am noticing that my queries are spending a lot of time in nested loops. The table/index row estimates are not bad, but the nested loops can be off by a factor of 50. In any case, they are always too high. If this is always occurring, is this an indication of a general configuration problem? Carlo select pp.provider_id, pp.provider_practice_id, nearby.distance from mdx_core.provider_practice as pp join mdx_core.facility as f on f.facility_id = pp.facility_id join (select * from mdx_core.zips_in_mile_range('08820', 10) where zip '') as nearby on f.default_country_code = 'US' and f.default_postal_code = nearby.zip and pp.facility_address_id is NULL union select pp.provider_id, pp.provider_practice_id, nearby.distance from mdx_core.provider_practice as pp join mdx_core.facility_address as fa on fa.facility_address_id = pp.facility_address_id join mdx_core.address as a on a.address_id = fa.address_id join (select * from mdx_core.zips_in_mile_range('08820', 10) where zip '') as nearby on a.country_code = 'US' and a.postal_code = nearby.zip Unique (cost=67605.91..67653.18 rows=4727 width=16) (actual time=8634.618..8637.918 rows=907 loops=1) - Sort (cost=67605.91..67617.73 rows=4727 width=16) (actual time=8634.615..8635.651 rows=907 loops=1) Sort Key: provider_id, provider_practice_id, distance - Append (cost=0.00..67317.41 rows=4727 width=16) (actual time=176.056..8632.429 rows=907 loops=1) - Nested Loop (cost=0.00..38947.07 rows=3143 width=16) (actual time=176.054..7867.962 rows=872 loops=1) - Nested Loop (cost=0.00..11520.79 rows=8121 width=12) (actual time=169.372..3041.010 rows=907 loops=1) - Function Scan on zips_in_mile_range (cost=0.00..15.00 rows=333 width=40) (actual time=151.479..151.671 rows=66 loops=1) Filter: (zip ''::text) - Index Scan using facility_country_postal_code_idx on facility f (cost=0.00..34.25 rows=24 width=15) (actual time=4.969..43.740 rows=14 loops=66) Index Cond: ((f.default_country_code = 'US'::bpchar) AND ((f.default_postal_code)::text = zips_in_mile_range.zip)) - Index Scan using provider_practice_facility_idx on provider_practice pp (cost=0.00..3.36 rows=1 width=12) (actual time=4.915..5.316 rows=1 loops=907) Index Cond: (f.facility_id = pp.facility_id) Filter: (facility_address_id IS NULL) - Nested Loop (cost=0.00..28323.07 rows=1584 width=16) (actual time=170.310..762.472 rows=35 loops=1) - Nested Loop (cost=0.00..7791.77 rows=1579 width=12) (actual time=170.289..612.579 rows=36 loops=1) - Nested Loop (cost=0.00..2595.96 rows=712 width=12) (actual time=167.017..354.261 rows=29 loops=1) - Function Scan on zips_in_mile_range (cost=0.00..15.00 rows=333 width=40) (actual time=150.188..150.312 rows=66 loops=1) Filter: (zip ''::text) - Index Scan using address_country_postal_code_address_idx on address a (cost=0.00..7.73 rows=2 width=17) (actual time=2.483..3.086 rows=0 loops=66) Index Cond: ((a.country_code = 'US'::bpchar) AND ((a.postal_code)::text = zips_in_mile_range.zip)) - Index Scan using facility_address_address_idx on facility_address fa (cost=0.00..7.15 rows=12 width=8) (actual time=7.652..8.901 rows=1 loops=29) Index Cond: (a.address_id = fa.address_id) - Index Scan using provider_practice_facility_address_idx on provider_practice pp (cost=0.00..12.80 rows=16 width=12) (actual time=4.156..4.158 rows=1 loops=36) Index Cond: (fa.facility_address_id = pp.facility_address_id) Total runtime: 8639.066 ms
[PERFORM] Performance improves only after repeated VACUUM/ANALYZE
My client publishes an edition of their DB from his production site to his hosted web/db server. This is done by FTPing a backup of the DB to his hosting provider. Immediately after a publication (restore to web/db server) we immediately run VACUUM ANALYZE to make sure the statistics and row estimates are correct. The problem is, after this initial VACUUM ANALYZE, the row estimates in query plans are off by several orders of magnitude. For example, a disastrous plan was created because the planner estimated 4K rows when in fact it returned 980K rows. Sometimes - a day or two later - the plans return to normal and row estimates are closer to realistic values. Guessing that there may be background events that are correcting the row estimates over time, I ran an ANALYZE on the DB - and sure enough - the row estimates corrected themselves. The puzzling thing is, there have been no writes of any sort to the data - there is no reason for the stats to have changed. I believe that a VACUUM may not be necessary for a newly restored DB, but I assumed that VACUUM ANALYZE and ANALYZE have the same net result. Am I wrong? If I am not wrong (i.e. VACUUM ANALYZE and ANALYZE should produce the same results) why would the performance improve on a DB that has seen no transactional activity only after the SECOND try? PG 8.2.4 on RH LINUX 1GB RAM SCSI RAID 1 Carlo ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[PERFORM] Query works when kludged, but would prefer best practice solution
Hi all,

Please see the section marked as PROBLEM in the ORIGINAL QUERY plan below. You can see it's pretty slow. Oddly enough, an index for facility_address_id is available but not being used, but I suspect it's questionable whether it would be an improvement.

I knew that the filter was best applied to the results of the join - my attempts to restructure the query with subqueries, etc. didn't fool the planner - it always figured out a plan that had this problem SEQ SCAN + FILTER in it.

Finally, I hid the condition from the planner with a coalesce function - see SOLUTION in the KLUDGED QUERY plan below. Sure enough, a new plan appeared with a remarkable performance improvement!

The purpose of this query is to find facilities within a geographical area when the complete address data is missing (hence the facility_address_id is NULL).

PG is 8.2.4 on an RH Linux server with 1GB RAM, HDD is RAID 1.

I don't like kludging like this - so any and all help or advice is appreciated!

Carlo

ORIGINAL QUERY

select pp.provider_id, pp.provider_practice_id, nearby.distance
from mdx_core.provider_practice as pp
join mdx_core.facility as f
  on f.facility_id = pp.facility_id
join (select * from mdx_core.zips_in_mile_range('08820', 10)) as nearby
  on f.default_country_code = 'US'
 and f.default_postal_code = nearby.zip
where facility_address_id is null

Hash Join (cost=30258.99..107702.53 rows=9438 width=16) (actual time=169.516..3064.188 rows=872 loops=1)
  Hash Cond: (pp.facility_id = f.facility_id)
  PROBLEM:
  ->  Seq Scan on provider_practice pp (cost=0.00..74632.55 rows=724429 width=12) (actual time=0.039..1999.457 rows=728396 loops=1)
        Filter: (facility_address_id IS NULL)
  ->  Hash (cost=29954.15..29954.15 rows=24387 width=12) (actual time=156.668..156.668 rows=907 loops=1)
        ->  Nested Loop (cost=0.00..29954.15 rows=24387 width=12) (actual time=149.891..155.343 rows=907 loops=1)
              ->  Function Scan on zips_in_mile_range (cost=0.00..12.50 rows=1000 width=40) (actual time=149.850..149.920 rows=66 loops=1)
              ->  Index Scan using facility_country_postal_code_idx on facility f (cost=0.00..29.64 rows=24 width=15) (actual time=0.015..0.048 rows=14 loops=66)
                    Index Cond: ((f.default_country_code = 'US'::bpchar) AND ((f.default_postal_code)::text = zips_in_mile_range.zip))
Total runtime: 3065.338 ms

KLUDGED QUERY

select pp.provider_id, pp.provider_practice_id, nearby.distance
from mdx_core.provider_practice as pp
join mdx_core.facility as f
  on f.facility_id = pp.facility_id
join (select * from mdx_core.zips_in_mile_range('08820', 10)) as nearby
  on f.default_country_code = 'US'
 and f.default_postal_code = nearby.zip
 and coalesce(pp.facility_address_id, -1) = -1

Nested Loop (cost=0.00..112618.87 rows=180 width=16) (actual time=149.680..167.261 rows=872 loops=1)
  ->  Nested Loop (cost=0.00..29954.15 rows=24387 width=12) (actual time=149.659..155.018 rows=907 loops=1)
        ->  Function Scan on zips_in_mile_range (cost=0.00..12.50 rows=1000 width=40) (actual time=149.620..149.698 rows=66 loops=1)
        ->  Index Scan using facility_country_postal_code_idx on facility f (cost=0.00..29.64 rows=24 width=15) (actual time=0.015..0.045 rows=14 loops=66)
              Index Cond: ((f.default_country_code = 'US'::bpchar) AND ((f.default_postal_code)::text = zips_in_mile_range.zip))
  SOLUTION:
  ->  Index Scan using provider_practice_facility_idx on provider_practice pp (cost=0.00..3.38 rows=1 width=12) (actual time=0.007..0.009 rows=1 loops=907)
        Index Cond: (f.facility_id = pp.facility_id)
        Filter: (COALESCE(facility_address_id, -1) = -1)
Total runtime: 168.275 ms
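One non-kludge alternative worth testing (not something suggested in the thread): a partial index covering exactly the "no address row" case, so the original IS NULL condition becomes cheap to combine with the facility join. The index name is invented:

CREATE INDEX provider_practice_no_address_idx
    ON mdx_core.provider_practice (facility_id)
    WHERE facility_address_id IS NULL;

-- the original, un-kludged query can then probe this much smaller index
-- for "facility_address_id is null" rows keyed by facility_id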
Re: [PERFORM] Query works when kludged, but would prefer best practice solution
Well, there goes my dream of getting a recommendation that will deliver a blinding insight into how to speed up all of my queries a thousand-fold. Thanks Merlin! -Original Message- From: Merlin Moncure [mailto:[EMAIL PROTECTED] Sent: September 17, 2007 8:03 PM To: Carlo Stonebanks Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Query works when kludged, but would prefer best practice solution On 9/17/07, Carlo Stonebanks [EMAIL PROTECTED] wrote: Hi all, Please see the section marked as PROBLEM in ORIGINAL QUERY plan below. You can see it's pretty slow. Oddly enough, an index for facility_address_id is available but not being used, but I suspect it's questionable whether it would be an improvement. This looks like it might be the problem tom caught and rigged a solution to: http://people.planetpostgresql.org/dfetter/index.php?/archives/134-PostgreSQ L-Weekly-News-September-03-2007.html (look fro band-aid). If that's the case, the solution is to wait for 8.2.5 (coming soon). merlin ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Query works when kludged, but would prefer best practice solution
Thanks, it worked. Client happy. Big bonus in the mail. -Original Message- From: Merlin Moncure [mailto:[EMAIL PROTECTED] Sent: September 17, 2007 8:18 PM To: Carlo Stonebanks Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Query works when kludged, but would prefer best practice solution On 9/17/07, Carlo Stonebanks [EMAIL PROTECTED] wrote: Well, there goes my dream of getting a recommendation that will deliver a blinding insight into how to speed up all of my queries a thousand-fold. that's easy...delete your data! :-) merlin ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[PERFORM] random_page_costs - are defaults of 4.0 realistic for SCSI RAID 1
Can anyone answer this for me: Although I realize my client's disk subsystem (SCSI/RAID Smart Array E200 controller using RAID 1) is less than impressive - is the default setting of 4.0 realistic or could it be lower? Thanks! ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Performance on 8CPU's and 32GB of RAM
Wow - it's nice to hear someone say that... out loud. Thanks, you gave me hope! -Original Message- From: James Mansion [mailto:[EMAIL PROTECTED] Sent: September 6, 2007 4:55 PM To: Carlo Stonebanks Cc: Scott Marlowe; Alvaro Herrera; pgsql-performance@postgresql.org Subject: Re: [PERFORM] Performance on 8CPU's and 32GB of RAM Carlo Stonebanks wrote: Isn't it just easier to assume that Windows Server can't do anything right? ;-) Well, avoiding the ;-) - people do, and its remarkably foolish of them. Its a long-standing whinge that many people with a UNIX-background seem to just assume that Windows sucks, but you could run 40,000 sockets from a single Win32 process for a while and some well-known UNIX systems would still struggle to do this, libevent or no. Admitedly, the way a Win32 app is architected would be rather different from a typical POSIX one. Windows has been a cheap target bt its remarkably adequate and the TPC results speak for themselves. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Performance on 8CPU's and 32GB of RAM
If what you mean is that pg has a design that's heavily oriented towards things that tend to be cheap on POSIX and doesn't use the core Win32 features effectively, then let's track that as an optimisation opportunity for the Win32 port. Isn't it just easier to assume that Windows Server can't do anything right? ;-) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Performance on 8CPU's and 32GB of RAM
Unfortunately, LINUX is not an option at this time. We looked into it; there is no *NIX expertise in the enterprise. However, I have raised this issue in various forums before, and when pressed no one was willing to say that *NIX *DEFINITELY* outperforms Windows for what my client is doing (or if it did outperform Windows, that it would outperform so significantly that it merited the move). Was this incorrect? Can my client DEFINITELY expect a significant improvement in performance for what he is doing? DISK subsystem reports: SCSI/RAID Smart Array E200 controller using RAID 1. -Original Message- From: Scott Marlowe [mailto:[EMAIL PROTECTED] Sent: September 4, 2007 7:15 PM To: Alvaro Herrera Cc: Carlo Stonebanks; pgsql-performance@postgresql.org Subject: Re: [PERFORM] Performance on 8CPU's and 32GB of RAM On 9/4/07, Alvaro Herrera [EMAIL PROTECTED] wrote: Carlo Stonebanks wrote: A client is moving their postgresql db to a brand new Windows 2003 x64 server with 2 quad cores and 32GB of RAM. It is a dedicated server to run 8.2.4. Large shared_buffers and Windows do not mix. Perhaps you should leave the shmem config low, so that the kernel can cache the file pages. Egads, I'd completely missed the word Windows up there. I would highly recommend building the postgresql server on a unixish OS. Even with minimum tuning, I'd expect the same box running linux or freebsd to stomp windows pretty heavily in the performance department. But yeah, the I/O, that's the big one. If it's just a single or a couple of IDE drives, it's not gonna be able to handle much load. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Performance on 8CPU's and 32GB of RAM
Right, additionally NTFS is really nothing to use on any serious disc array. Do you mean that I will not see any big improvement if I upgrade the disk subsystem because the client is using NTFS (i.e. Windows) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Performance on 8CPU's and 32GB of RAM
Large shared_buffers and Windows do not mix. Perhaps you should leave the shmem config low, so that the kernel can cache the file pages. Is there a problem BESIDES the one that used to cause windows to fail to allocate memory in blocks larger than 1.5GB? The symptom of this problem was that postgresql would just refuse to restart. Microsoft released a patch for this problem and we can now start postgresql with larger shared buffers. If this is indeed the problem that you refer to - and it has indeed been solved by Microsoft - is there a down side to this? It sounds like you will need a huge lot of vacuuming effort to keep up. Maybe you should lower autovac scale factors so that your tables are visited more frequently. A vacuum_delay of 40 sounds like too much though. Does autovacuum not impede performance while it is vacuuming a table? ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] Performance on 8CPU's and 32GB of RAM
A client is moving their postgresql db to a brand new Windows 2003 x64 server with 2 quad cores and 32GB of RAM. It is a dedicated server to run 8.2.4. The server typically will have less than 10 users. The primary use of this server is to host a database that is continuously being updated by data consolidation and matching software software that hits the server very hard. There are typically eight such processes running at any one time. The software extensively exploits postgresql native fuzzy string for data matching. The SQL is dynamically generated by the software and consists of large, complex joins. (the structure of the joins change as the software adapts its matching strategies). I would like to favour the needs of the data matching software, and the server is almost exclusivly dedicated to PostgreSQL. I have made some tentative modifications to the default postgres.config file (see below), but I don't think I've scratched the surface of what this new system is capable of. Can I ask - given my client's needs and this new, powerful server and the fact that the server typically has a small number of extremely busy processes, what numbers they would change, and what the recommendations would be? Thanks! Carlo max_connections = 100 shared_buffers = 10 work_mem = 100 max_fsm_pages = 204800 max_fsm_relations = 1500 vacuum_cost_delay = 40 bgwriter_lru_maxpages = 100 bgwriter_all_maxpages = 100 checkpoint_segments = 64 checkpoint_warning = 290 effective_cache_size = 375000 stats_command_string = on stats_start_collector = on stats_row_level = on autovacuum = on autovacuum_naptime = 1min autovacuum_vacuum_threshold = 500 autovacuum_analyze_threshold = 250 autovacuum_vacuum_scale_factor = 0.2 autovacuum_analyze_scale_factor = 0.1 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[PERFORM] Fast tsearch2, trigram matching on short phrases
I have read that trigram matching (similarity()) performance degrades when the matching is on longer strings such as phrases. I need to quickly match strings and rate them by similiarity. The strings are typically one to seven words in length - and will often include unconventional abbreviations and misspellings. I have a stored function which does more thorough testing of the phrases, including spelling correction, abbreviation translation, etc... and scores the results - I pick the winning score that passes a pass/fail constant. However, the function is slow. My solution was to reduce the number of rows that are passed to the function by pruning obvious mismatches using similarity(). However, trigram matching on phrases is slow as well. I have experimented with tsearch2 but I have two problems: 1) I need a score so I can decide if match passed or failed. trigram similarity() has a fixed result that you can test, but I don't know if rank() returns results that can be compared to a fixed value 2) I need an efficient methodology to create vectors based on trigrams, and a way to create an index to support it. My tsearch2 experiment with normal vectors used gist(text tsvector) and an on insert/update trigger to populate the vector field. Any suggestions on where to go with this project to improve performance would be greatly appreciated. Carlo ---(end of broadcast)--- TIP 6: explain analyze is your friend
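For context, a minimal pg_trgm setup of the kind being weighed here (table and column names follow the poster's schema mentioned later in the thread; whether it behaves well on multi-word phrases is exactly the open question). It assumes the contrib/pg_trgm module is installed in the database:

-- indexable trigram matching on the name column
CREATE INDEX institution_name_trgm_idx
    ON institution USING gist (name gist_trgm_ops);

SELECT set_limit(0.5);   -- raise the % operator's cutoff from the default 0.3

SELECT name, similarity(name, 'veterans affairs') AS sim
  FROM institution
 WHERE name % 'veterans affairs'     -- % is indexable; bare similarity() is not
 ORDER BY sim DESC
 LIMIT 20;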
Re: [PERFORM] Fast tsearch2, trigram matching on short phrases
In December I had tried this; it caused a tremendous slowdown in our system. I have avoided it since then.

Do you expect pg_trgm to work with phrases? I had read a post from an earlier support question that suggested that I SHOULD expect performance to degrade and that pg_trgm was oriented towards word matching, not phrase matching.

Carlo

Steinar H. Gunderson [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
On Wed, Aug 22, 2007 at 12:02:54PM -0400, Carlo Stonebanks wrote:
Any suggestions on where to go with this project to improve performance would be greatly appreciated.
I'm a bit unsure from reading your mail -- have you tried pg_trgm with a GiST index?
/* Steinar */
-- Homepage: http://www.sesse.net/
Re: [PERFORM] Fast tsearch2, trigram matching on short phrases
Hi Oleg,

you didn't show us explain analyze of your select.

I didn't because I didn't expect any reaction to it - my understanding is that trigram matching for phrases is not recommended because of the performance. Do you believe that I SHOULD expect good performance from trigram matching on phrases (as opposed to words)?

Carlo
Re: [PERFORM] Fast tsearch2, trigram matching on short phrases
The problem is in idea, not in performance.

Oh, I think we both agree on that! ;-D

This is why I didn't post any EXPLAINs or anything like that. I thought the problem was in the entire method of how to best zero in on the set of records best suited for closer analysis by my phrase-matching function.

Since you asked about an EXPLAIN ANALYZE, I put together some results for you. I added a pg_trgm index to the table to show the performance of GiST indexes, and did another based exclusively on similarity(). But I don't think that you will be surprised by what you see. As you say, the problem is in the idea - but no matter what, I need to be able to match phrases that will have all sorts of erratic abbreviations and misspellings - and I have to do it at very high speeds.

I would appreciate any suggestions you might have.

Carlo

select similarity('veterans''s affairs', name) as sim, name
from institution
where name % 'veterans''s affairs'
order by sim desc

Sort (cost=4068.21..4071.83 rows=1446 width=23) (actual time=4154.962..4155.006 rows=228 loops=1)
  Sort Key: similarity('veterans''s affairs'::text, (name)::text)
  ->  Bitmap Heap Scan on institution (cost=75.07..3992.31 rows=1446 width=23) (actual time=4152.825..4154.754 rows=228 loops=1)
        Recheck Cond: ((name)::text % 'veterans''s affairs'::text)
        ->  Bitmap Index Scan on institution_name_trgm_idx (cost=0.00..74.71 rows=1446 width=0) (actual time=4152.761..4152.761 rows=228 loops=1)
              Index Cond: ((name)::text % 'veterans''s affairs'::text)
Total runtime: 4155.127 ms

select name
from institution
where similarity('veterans''s affairs', name) > 0.5
order by similarity('veterans''s affairs', name) > 0.5

Sort (cost=142850.08..144055.17 rows=482036 width=23) (actual time=12603.745..12603.760 rows=77 loops=1)
  Sort Key: (similarity('veterans''s affairs'::text, (name)::text) > 0.5::double precision)
  ->  Seq Scan on institution (cost=0.00..97348.81 rows=482036 width=23) (actual time=2032.439..12603.370 rows=77 loops=1)
        Filter: (similarity('veterans''s affairs'::text, (name)::text) > 0.5::double precision)
Total runtime: 12603.818 ms
Re: [PERFORM] Is Vacuum/analyze destroying my performance?
Matthew O'Connor matthew@zeut.net wrote in message news:[EMAIL PROTECTED] Just a wild guess, but the performance problem sounds like maybe as your data changes, eventually the planner moves some query from an index scan to a sequential scan, do you have any details on what queries are taking so long when things are running slow? You can turn on the GUC var log_min_duration_statement and see what queries are slow and then manually check them with an explain analyze, that might help. Matt This is pretty well what I think is happening - I expect all queries to eventually move from seq scans to index scans. I actually have a SQL logging opion built into the import app. I just can't figure out how the planner can be so wrong. We are running a 4 CPU server (two dual core 3.2 GHz Xeons) with 4GB RAM and Windows Server 2003 x64 and a PERC RAID subsystem (I don't know the RAID type). I know that the metrics for the planner can be changed - is the default config for postgesql not suitable for our setup? For this server, we would like to be optimised for high speed over a few connections, rather than the classic balanced speed over many connections. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
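A hedged way to act on the log_min_duration_statement suggestion without changing anything else - the 250 ms threshold is arbitrary and should be tuned to what counts as "slow" here:

# postgresql.conf: log any statement slower than 250 ms, then EXPLAIN ANALYZE the offenders
log_min_duration_statement = 250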
[PERFORM] Is Vacuum/analyze destroying my performance?
I have always been frustrated by the wildly erratic performance of our postgresql 8 server. We run a program that does heavy data importing via a heuristics-based import program. Sometimes records being imported would just fly by, sometimes they would crawl.

The import program imports records from a flat table and uses heuristics to normalise and dedupe. This is done via a sequence of updates and inserts bracketed by a start-end transaction.

At a certain checkpoint representing about 1,000,000 rows read and imported, I ran a vacuum/analyze on all of the tables in the target schema. To my horror, performance reduced to less than TEN percent of what it was before the vacuum/analyze.

I thought that turning autovacuum off and doing my own vacuuming would improve performance, but it seems to be killing it. I have since turned autovacuum on and am tearing my hair out watching the imported records crawl by. I have tried vacuuming the entire DB as well as rebuilding indexes. Nothing.

Any ideas what could have happened? What is the right thing to do?

Carlo
Re: [PERFORM] Is Vacuum/analyze destroying my performance?
Update on this issue: I solved my problem by doing the following:
1) Stopped the import, and did a checkpoint backup on my import target schema
2) Dropped the import target schema
3) Restored a backup from a previous checkpoint when the tables were much smaller
4) Performed a VACUUM/ANALYZE on all of the tables in the import target schema in that smaller state
5) Dropped the import target schema again
6) Restored the checkpoint backup of the larger data set referred to in step 1
7) Restarted the import from where it left off

The result: the import is flying again, with 10-20 times the performance. The import runs as 4 different TCL scripts in parallel, importing different segments of the table.

The problem that I have when the import runs at this speed is that I have to constantly watch for lock-ups. Previously I had reported that when these multiple processes are running at high speed, PostgreSQL occasionally freezes one or more of the processes by never returning from a COMMIT. I look at the target tables, and it seems that the commit has gone through. This used to be a disaster because I thought I had to restart every frozen process by killing the script and restarting at the last imported row. Now I have found a way to un-freeze the program: I find the frozen process via PgAdmin III and send a CANCEL. To my surprise, the import continues as if nothing happened. Still incredibly inconvenient and laborious, but at least it's a little less tedious.

Could these two problems - the weird slowdowns after a VACUUM/ANALYZE and the frequent lockups when the import process is running quickly - be related?

Carlo
Re: [PERFORM] commit so slow program looks frozen
Ben Trewern [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] It might be worth turning off hyperthreading if your Xeons are using it. There have been reports of this causing inconsistent behaviour with PostgreSQL. Yes, this issue comes up often - I wonder if the Woodcrest Xeons resolved this? Have these problems been experienced on both Linux and Windows (we are running Windows 2003 x64) Carlo ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] commit so slow program looks frozen
I do think we need some better instrumentation for this kind of thing.

Well, one thing's for sure - I have little other information to offer. The problem is that the lockups occur after hours of operation and thousands of rows being digested (which is the nature of the program). If better instrumentation implies tools to inspect the state of the db server's process and to know what it's waiting for from the OS, I agree.

Then again, I can't even tell you whether the postgres process is at fault or the TCL interface - which would be odd, because it's one of the most mature interfaces postgres has.

So, here's a thought: is there any way for me to inspect the state of a postgres process to see if it's responsive - even if it's serving another connection?

Carlo
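For what it's worth, a couple of starting points for inspecting a possibly stuck backend from another connection. Column names are the pre-9.2 ones used by 8.x, and current_query is only populated when stats_command_string is on:

-- what every backend is doing right now
SELECT procpid, usename, waiting, current_query
  FROM pg_stat_activity;

-- any lock requests that have not been granted, and what they point at
SELECT l.pid, l.mode, l.granted, c.relname
  FROM pg_locks l
  LEFT JOIN pg_class c ON c.oid = l.relation
 WHERE NOT l.granted;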
Re: [PERFORM] commit so slow program looks frozen
I can just see the postgresql group getting together at the next O'Reilley's conference and creating that band. And it will all be your fault. Finally, a chance for me to wear my black leather pants. A context switch storm is when your machine spends more time trying to figure out what to do than actually doing anything. The CPU spends most it's time switching between programs than running them. Is thatl likely on a new 4 CPU server that has no clients connected and that is only running four (admittedly heavy) TCL data load scripts? Seeing as PostgreSQL runs one thread / process per connection, it's pretty unlikely that the problem here is one hungry thread. Do all four CPUs show busy, or just one? Do you have a way of measuring how much time is spent waiting on I/O on a windows machine like top / vmstat does in unix? Before optimising the queries, all four CPU's were pinned to max performance (that's why I only run four imports at a time). After opimisation, all four CPU's are busy, but usage is spikey (which looks more normal), but all are obviously busy. I have this feeling that when an import app freezes, one CPU goes idle while the others stay busy - I will confirm that with the next import operation. I suspect that the server has the Xeon processors that were of a generation which PostgreSQL had a problem with - should a postgresql process be able to distrivute its processing load across CPU's? (i.e. When I see one CPU at 100% while all others are idle?) Note that if you have an import process that needs a big chunk of memory, you can set just that one connection to use a large setting and leave the default smaller. Total memory usage is below the max available. Each postgresql process takes up 500MB, there are four running and I have 4GB of RAM. Carlo ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] commit so slow program looks frozen
This is pretty interesting - where can I read more on this? Windows isn't actually hanging, one single command line window is - from its behaviour, it looks like the TCL postgresql package is waiting for pg_exec to come back from the commit (I believe the commit has actually gone through). It could even be that there's something wrong with the TCL package, but from my understanding it is one of the most complete interfaces out there - which is weird, because TCL seems to be the most unpopular language in the community. Caro Rocco Altier [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] I seem to remember Oleg/Teodor recently reporting a problem with Windows hanging on a multi-processor machine, during a heavy load operation. In their case it seemed like a vacuum would allow it to wake up. They did commit a patch that did not make it into the last minor version for lack of testing. Perhaps you could see if that patch might work for you, which would also help ease the argument against the patches lack of testing. -rocco ---(end of broadcast)--- TIP 6: explain analyze is your friend ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] commit so slow program looks frozen
when it happens, make sure to query pg_locks and see what is going on there; lock issues are not supposed to manifest on a commit, which releases locks, but you never know.

There aren't any pending locks (assuming that pgAdmin is using pg_locks to display pending locks).

There have been reports of inconsistent lock-ups on windows (especially multi-processor) which you might be experiencing. Make sure you have the very latest version of pg 8.1.x. Also consider checking out 8.2 and see if you can reproduce the behavior there... this will require compiling postgresql.

Are these associated with any type of CPU?

Carlo
[PERFORM] commit so slow program looks frozen
(I tried this question on the interface forum and got no result, but I don't know how to tell if it's an interface issue or not) I have a TCL app which typically takes hours to complete. I found out that it is taking longer than it should because it occasionally stalls inexplicably (for tens of minute at a time) then usually continues. There are a minimum of four apps running at the same time, all reading different sections of the same table, all writing to the same db and the same tables. The other apps seem unaffected by the one app that freezes. This happens running pg_exec $conn commit from within a TCL script on a client app. The delays are so long that I used to think the app was hopelessly frozen. By accident, I left the app alone in its frozen state and came back a good deal later and seen that it was running again. Sometimes I decide it *IS* frozen and have to restart. Because Ctrl-C will not cause the script to break, it appears the app is stuck in non-TCL code (either waiting for postgres or stuck in the interface code?) The application loops through an import file, reading one row at a time, and issues a bunch of inserts and updates to various tables. There's a simple pg_exec $conn start transaction at the beginning of the loop and the commit at the end. The commit actually appears to be going through. There are no messages of any significance in the log. There do not appear to be any outstanding locks or transactions. I am not doing any explicit locking, all transaction settings are set to default. Any thoughts on the cause and possible solutions would be appreciated. Carlo ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] commit so slow program looks frozen
You may try to figure out what's the process doing (the backend obviously, not the frontend (Tcl) process) by attaching to it with strace. It's so sad when us poor Windows guys get helpful hints from people assume that we're smart enough to run *NIX... ;-) Maybe it's swamped by a context switch storm (but in that case, probably the other processes would be affected as well). What is a context switch storm? (and what a great name for a heavy metal rock band!) Interestingly enough, last night (after the original post) I watched three of the processes slow down, one after the other - and then stall for so long that I had assumed they had frozen. They were all stalled on a message that I had put in the script that indicated they had never returned from a commit. I have looked into this, and I believe the commits are actually going through. The remaining 4th process continued to run, and actually picked up speed as the CPU gave its cycles over. The Windows task manager shows the postgresql processes that (I assume) are associated with the stalled processes as consuming zero CPU time. Sometimes I have seen all of the apps slow down and momentarrily freeze at the same time... but then continue. I have autovacuum off, although stats_row_level and stats_start_collector remain on (I thought these were only relevant if autovacuum was on). I have seen the apps slow down (and perhaps stall) when specifical tables have vacuum/analyze running, and that makes sense. I did notice that on one occasion a frozen app came back to life after I shut down EMS PostgreSQL manager in another session. Maybe a coincidence, or maybe an indication that the apps are straining resources... on a box with two twin-core XEONs and 4GB of memory? Mind you, the config file is confgiured for the database loading phase weare in now - with lots of resources devoted to a few connections. I wouldn't expect it to be stuck on locks, because if it's only on commit, then it probably has all the locks it needs. But try to see if you can find something not granted in pg_locks that it may be stuck on. Looking at the pgadmin server status pages, no locks or transactions are pending when this happens. Carlo ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] commit so slow program looks frozen
I have a question for you: did you have a long-running query keeping open a transaction? I've just noticed the same problem here, but things cleaned up immediately when I aborted the long-running transaction. No, the only processes are those from the import applications themselves: short transactions never lasting more than a fraction of a second. Carlo
Re: [PERFORM] Is ODBC that slow?
Try Command Prompt's ODBC driver. Lately it has been measured to be consistently faster than psqlODBC. http://projects.commandprompt.com/public/odbcng Thanks, I tried this, but via Access it always reports a login (username/password) failure to the db. However, this is an Alpha - is there an official release I should be waiting for? It's not clear to me whether this is a commercial product or not. Carlo
Re: [PERFORM] Is ODBC that slow?
carlo: please, please, get your mail server to quit telling me your mailbox is full :) Merlin, sorry about that. This is the first I've heard of it. Carlo
Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL
Sorry, I didn't point it out because an earlier post included the query with documentation - that post got lost... or at least *I* can't see it. The other half of the union renders the facilities that DO have addresses. As for the performance problem, I have finally sorted it out by creating indexes which are more explicit - my oversight, really! The original query was a slightly more complex outer join, which I then decomposed to an explicit union with two halves - one half handling the explicit facility_address_id is null portion, the other half handling the is not null portion (implicitly, because of the normal join between facility and facility_address). I hadn't considered the not exists option - it's obvious when you look at the sub-query by itself, but didn't strike me before I broke it out of the union and you mentioned it. I was just under the impression that getting this sub-query to work would have produced the clearest, most straightforward ANALYZE results. Carlo Shaun Thomas [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] On Monday 16 October 2006 16:37, Carlo Stonebanks wrote: The facility_address_id is null statement is necessary, as this is a sub-query from a union clause and I want to optimise the query with the original logic intact. The value is not hard coded to true but rather to null. Heh, you neglect to mention that this query is discovering facilities which do *not* have an address entry, which makes the is null a major necessity. With that, how did a not exists (blabla facility_address blabla) subquery to get the same effect treat you? How about an IN (blabla LIMIT 1) ? -- Shaun Thomas Database Administrator Leapfrog Online 807 Greenwood Street Evanston, IL 60201 Tel. 847-440-8253 Fax. 847-570-5750 www.leapfrogonline.com
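For reference, a sketch of the not exists form being discussed - not the query actually used, just an illustration with the column list shortened and the postal codes taken from the earlier example:

    select f.facility_id, null as facility_address_id, null as address_id,
           f.facility_type_code, f.name
    from mdx_core.facility as f
    where f.default_country_code = 'US'
      and f.default_postal_code in ('14224-1945', '14224')
      and not exists (select 1
                      from mdx_core.facility_address as fa
                      where fa.facility_id = f.facility_id);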
Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL
you have a two-part key on facility(country code, postal code), right? Well, I'm glad you pointed it out, because I THOUGHT I had created it, but apparently I haven't -- I only noticed that it was missing after I listed all the other indexes. Looks like this query is one of the victims of a db structure corruption I suffered when transferring the schema over from development into production. (Well, that's my excuse and I'm sticking to it!) Thanks for all the help - I've reduced the execution time to 1/10 of its original time. Carlo
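For the record, the missing two-part key would look something like this (the index name here is only illustrative; the column names are the ones used elsewhere in the thread):

    create index facility_country_postal_code_idx
        on mdx_core.facility (default_country_code, default_postal_code);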
Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL
Can you try temporarily disabling bitmap scans and see what comes up? Well, that's slowing everything down. I've got a couple of results, below: 1) Bitmap scan off, but seq scan enabled. 2) Bitmap scan and seq scan off. 3) Bitmap scan back on, seq scan back on, and a new index created. 4) VACUUM VERBOSE on the tables involved. 5) Original SQL with original EXPLAIN to show the code that started this. Carlo

1) Bitmap scan off, but seq scan enabled. It created a surprisingly expensive seq scan. Nested Loop Left Join (cost=0.00..34572.43 rows=109 width=71) (actual time=1536.827..1536.827 rows=0 loops=1) Filter: (inner.facility_address_id IS NULL) - Seq Scan on facility f (cost=0.00..34146.91 rows=109 width=71) (actual time=621.100..1536.606 rows=7 loops=1) Filter: ((default_country_code = 'US'::bpchar) AND (((default_postal_code)::text = '14224-1945'::text) OR ((default_postal_code)::text = '14224'::text))) - Index Scan using facility_address_facility_address_address_type_idx on facility_address fa (cost=0.00..3.89 rows=1 width=8) (actual time=0.020..0.023 rows=1 loops=7) Index Cond: (fa.facility_id = outer.facility_id) Total runtime: 1536.957 ms

2) So I turned both bitmap scan and seq scan off - now we get index scans, and the performance is surprisingly horrible: Nested Loop Left Join (cost=0.00..39286.55 rows=109 width=71) (actual time=3598.462..3598.462 rows=0 loops=1) Filter: (inner.facility_address_id IS NULL) - Index Scan using facility_pkey on facility f (cost=0.00..38861.03 rows=109 width=71) (actual time=1500.690..3598.201 rows=7 loops=1) Filter: ((default_country_code = 'US'::bpchar) AND (((default_postal_code)::text = '14224-1945'::text) OR ((default_postal_code)::text = '14224'::text))) - Index Scan using facility_address_facility_address_address_type_idx on facility_address fa (cost=0.00..3.89 rows=1 width=8) (actual time=0.024..0.027 rows=1 loops=7) Index Cond: (fa.facility_id = outer.facility_id) Total runtime: 3598.600 ms

3) So I turned bitmap scan back on, seq scan back on, and created an index EXPLICITLY to satisfy this condition. Intuitively, I thought that combinations of other indexes should have satisfied the optimizer, but figured better overkill than nothing. I thought this would solve it - but no. Why is using a BRAND NEW INDEX, which is unlikely to be corrupt, so expensive?
Nested Loop Left Join (cost=25300.96..26043.67 rows=110 width=71) (actual time=1339.216..1339.216 rows=0 loops=1) Filter: (inner.facility_address_id IS NULL) - Bitmap Heap Scan on facility f (cost=25300.96..25614.42 rows=110 width=71) (actual time=1339.043..1339.066 rows=7 loops=1) Recheck Cond: (((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224-1945'::text)) OR ((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224'::text))) - BitmapOr (cost=25300.96..25300.96 rows=110 width=0) (actual time=1339.027..1339.027 rows=0 loops=1) - Bitmap Index Scan on facility_facility_country_state_postal_code_idx (cost=0.00..12650.48 rows=55 width=0) (actual time=796.146..796.146 rows=7 loops=1) Index Cond: ((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224-1945'::text)) - Bitmap Index Scan on facility_facility_country_state_postal_code_idx (cost=0.00..12650.48 rows=55 width=0) (actual time=542.873..542.873 rows=0 loops=1) Index Cond: ((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224'::text)) - Index Scan using facility_address_facility_address_address_type_idx on facility_address fa (cost=0.00..3.89 rows=1 width=8) (actual time=0.014..0.016 rows=1 loops=7) Index Cond: (fa.facility_id = outer.facility_id) Total runtime: 1339.354 ms 4) VACUUM VERBOSE on the tables involved. Note how much more painful in elapsed time it is to vacuum facility vs facility_address, even though the number of rows is comparable: INFO: vacuuming mdx_core.facility INFO: index facility_pkey now contains 964123 row versions in 3682 pages DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.03s/0.03u sec elapsed 0.18 sec. INFO: index facility_country_state_city_idx now contains 964188 row versions in 7664 pages DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.25s/0.17u sec elapsed 84.14 sec. INFO: index facility_country_state_postal_code_idx now contains 964412 row versions in 7689 pages DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.42s/0.10u sec elapsed 137.12 sec. INFO: index facility_facility_country_state_city_idx now contains 964493 row versions in 6420 pages DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.17s/0.09u sec elapsed 2.23 sec. INFO: index facility_facility_country_state_postal_code_idx now contains 964494 row versions in
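For anyone reproducing the experiments above, the planner toggles are plain session-level settings; a minimal sketch:

    set enable_bitmapscan = off;   -- case 1: bitmap scans off
    set enable_seqscan = off;      -- case 2: both off, forcing index scans
    -- run EXPLAIN ANALYZE on the query under test here
    reset enable_bitmapscan;       -- back to normal planning for case 3
    reset enable_seqscan;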
Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL
what is the facility_address_id is null all about? remove it since you hardcode it to true in select. The facility_address_id is null statement is necessary, as this is a sub-query from a union clause and I want to optimise the query with the original logic intact. The value is not hard coded to true but rather to null. Admittedly, it's redundant, but I put it there to make sure that I matched up the columns from the other select in the union clause. you have a two-part key on facility(country code, postal code), right? The indexes and constraints are below. If you see redundancy, this was from vain attempts to please the optimiser gods. Carlo

ALTER TABLE mdx_core.facility ADD CONSTRAINT facility_pkey PRIMARY KEY(facility_id);
CREATE INDEX facility_country_state_city_idx ON mdx_core.facility USING btree (default_country_code, default_state_code, lower(default_city::text));
CREATE INDEX facility_country_state_postal_code_idx ON mdx_core.facility USING btree (default_country_code, default_state_code, default_postal_code);
CREATE INDEX facility_facility_country_state_city_idx ON mdx_core.facility USING btree (facility_id, default_country_code, default_state_code, lower(default_city::text));
CREATE INDEX facility_facility_country_state_postal_code_idx ON mdx_core.facility USING btree (facility_id, default_country_code, default_state_code, default_postal_code);

Merlin Moncure [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] On 10/15/06, Carlo Stonebanks [EMAIL PROTECTED] wrote: that contains full address data */ select f.facility_id, null as facility_address_id, null as address_id, f.facility_type_code, f.name, null as address, f.default_city as city, f.default_state_code as state_code, f.default_postal_code as postal_code, f.default_country_code as country_code, null as parsed_unit from mdx_core.facility as f left outer join mdx_core.facility_address as fa on fa.facility_id = f.facility_id where facility_address_id is null and f.default_country_code = 'US' and (f.default_postal_code = '14224-1945' or f.default_postal_code = '14224') what is the facility_address_id is null all about? remove it since you hardcode it to true in select. you have a two-part key on facility(country code, postal code), right? merlin
Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL
I think there are two things that would help this case. First, partition on country. You can either do this on a table level or on an index level by putting where clauses on the indexes (the index method would be the fastest one to test, since it's just new indexes). That should shrink the size of that index noticeably. I'm afraid I don't quite understand this, or how to 'partition' this at a table level. Right now, the table consists of ONLY US addresses, so I don't know if I would expect a performance improvement in changing the table or the indexes, as the indexes would not reduce anything. The other thing is to try and get the planner to not double-scan the index. If you add the following, I think it will scan the index once for the LIKE, and then just filter whatever it finds to match the other conditions. and f.default_postal_code LIKE '14224%' I did try this - nothing significant came from the results (see below). Thanks, Carlo

explain analyze select f.facility_id, null as facility_address_id, null as address_id, f.facility_type_code, f.name, null as address, f.default_city as city, f.default_state_code as state_code, f.default_postal_code as postal_code, f.default_country_code as country_code, null as parsed_unit from mdx_core.facility as f left outer join mdx_core.facility_address as fa on fa.facility_id = f.facility_id where facility_address_id is null and f.default_country_code = 'US' and f.default_postal_code like '14224%' and (f.default_postal_code = '14224-1945' or f.default_postal_code = '14224')

Nested Loop Left Join (cost=26155.38..26481.58 rows=1 width=71) (actual time=554.138..554.138 rows=0 loops=1) Filter: (inner.facility_address_id IS NULL) - Bitmap Heap Scan on facility f (cost=26155.38..26477.68 rows=1 width=71) (actual time=554.005..554.025 rows=7 loops=1) Recheck Cond: (((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224-1945'::text)) OR ((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224'::text))) Filter: ((default_postal_code)::text ~~ '14224%'::text) - BitmapOr (cost=26155.38..26155.38 rows=113 width=0) (actual time=553.983..553.983 rows=0 loops=1) - Bitmap Index Scan on facility_facility_country_state_postal_code_idx (cost=0.00..13077.69 rows=57 width=0) (actual time=313.156..313.156 rows=7 loops=1) Index Cond: ((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224-1945'::text)) - Bitmap Index Scan on facility_facility_country_state_postal_code_idx (cost=0.00..13077.69 rows=57 width=0) (actual time=240.819..240.819 rows=0 loops=1) Index Cond: ((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224'::text)) - Index Scan using facility_address_facility_address_address_type_idx on facility_address fa (cost=0.00..3.89 rows=1 width=8) (actual time=0.010..0.012 rows=1 loops=7) Index Cond: (fa.facility_id = outer.facility_id) Total runtime: 554.243 ms
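The index-level "partitioning" suggested above is what PostgreSQL calls a partial index; a hedged sketch (the index name is illustrative, not from the thread):

    -- only the 'US' rows are indexed, so the index is smaller and cheaper to scan
    create index facility_us_postal_code_idx
        on mdx_core.facility (default_postal_code)
        where default_country_code = 'US';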
Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL
Hi Merlin, Well, I'm back. First of all, thanks for your dogged determination to help me out - it is much appreciated. I owe you a beer or twelve. The import has been running for a week. The import program got faster as I tuned things. I capture the dynamic SQL statements generated by the app, as well as an accompanying EXPLAIN - and put it out to an XML file. I turned off seq scan in the config, and ran a trial import. I knew that with seq scan off, if I saw a seq scan in my log it's because there were no indexes available to satisfy the query - I adjusted accordingly and this worked really well. When the import runs against an empty or small db, it's blisteringly fast (considering that it's a heuristically based process). This proved that it wasn't the app or the SQL connection that was slow. Once again, though, as the data in the db grows, it slows down. Now it's crawling again. All of the queries appear to be fine, taking advantage of the indexes. There is ONE query, though, that seems to be the troublemaker - the same one I had brought up before. I believe that it is one sub-query that is causing the problem, taking what appears to be 500 to 1000+ms to run every time. (See below.) Curiously, it's using index scans, and it really looks like a simple query to me. I am completely baffled. The two tables in question have about 800K rows each - not exactly an incredible number. The EXPLAIN is simple, but the performance is dreadful. All the other queries run much faster than this - does ANYTHING about this query strike you as odd? Carlo

/* Find all facilities that do not have full address information but do have default location information that indicates the facility's US ZIP code. NULL values cast as columns are placeholders to allow this sub-query to be unioned with another subquery that contains full address data */ select f.facility_id, null as facility_address_id, null as address_id, f.facility_type_code, f.name, null as address, f.default_city as city, f.default_state_code as state_code, f.default_postal_code as postal_code, f.default_country_code as country_code, null as parsed_unit from mdx_core.facility as f left outer join mdx_core.facility_address as fa on fa.facility_id = f.facility_id where facility_address_id is null and f.default_country_code = 'US' and (f.default_postal_code = '14224-1945' or f.default_postal_code = '14224')

Nested Loop Left Join (cost=22966.70..23594.84 rows=93 width=71) (actual time=662.075..662.075 rows=0 loops=1) Filter: (inner.facility_address_id IS NULL) - Bitmap Heap Scan on facility f (cost=22966.70..23231.79 rows=93 width=71) (actual time=661.907..661.929 rows=7 loops=1) Recheck Cond: (((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224-1945'::text)) OR ((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224'::text))) - BitmapOr (cost=22966.70..22966.70 rows=93 width=0) (actual time=661.891..661.891 rows=0 loops=1) - Bitmap Index Scan on facility_country_state_postal_code_idx (cost=0.00..11483.35 rows=47 width=0) (actual time=374.284..374.284 rows=7 loops=1) Index Cond: ((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224-1945'::text)) - Bitmap Index Scan on facility_country_state_postal_code_idx (cost=0.00..11483.35 rows=47 width=0) (actual time=287.599..287.599 rows=0 loops=1) Index Cond: ((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224'::text)) - Index Scan using facility_address_facility_address_address_type_idx on
facility_address fa (cost=0.00..3.89 rows=1 width=8) (actual time=0.014..0.016 rows=1 loops=7) Index Cond: (fa.facility_id = outer.facility_id) Total runtime: 662.203 ms
Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL
Hey Tom, thanks for jumping in. Nothing on TV on a Sunday afternoon? ;-) Appreciate teh input. Here is vacuum verbose output for both the tables in question. Carlo INFO: vacuuming mdx_core.facility INFO: index facility_pkey now contains 832399 row versions in 3179 pages DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.09s/0.04u sec elapsed 0.21 sec. INFO: index facility_country_state_city_idx now contains 832444 row versions in 6630 pages DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.15s/0.07u sec elapsed 43.81 sec. INFO: index facility_country_state_postal_code_idx now contains 832499 row versions in 6658 pages DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.23s/0.07u sec elapsed 0.37 sec. INFO: facility: found 0 removable, 832398 nonremovable row versions in 15029 pages DETAIL: 0 dead row versions cannot be removed yet. There were 0 unused item pointers. 0 pages are entirely empty. CPU 0.67s/0.32u sec elapsed 44.71 sec. INFO: vacuuming pg_toast.pg_toast_58570311 INFO: index pg_toast_58570311_index now contains 0 row versions in 1 pages DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: pg_toast_58570311: found 0 removable, 0 nonremovable row versions in 0 pages DETAIL: 0 dead row versions cannot be removed yet. There were 0 unused item pointers. 0 pages are entirely empty. CPU 0.00s/0.00u sec elapsed 0.00 sec. Query returned successfully with no result in 44875 ms. INFO: vacuuming mdx_core.facility_address INFO: index facility_address_pkey now contains 772770 row versions in 2951 pages DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.10s/0.04u sec elapsed 9.73 sec. INFO: index facility_address_address_idx now contains 772771 row versions in 2750 pages DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.04s/0.04u sec elapsed 0.34 sec. INFO: index facility_address_facility_address_address_type_idx now contains 772773 row versions in 3154 pages DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.01s/0.04u sec elapsed 0.06 sec. INFO: facility_address: found 0 removable, 772747 nonremovable row versions in 7969 pages DETAIL: 0 dead row versions cannot be removed yet. There were 0 unused item pointers. 0 pages are entirely empty. CPU 0.39s/0.18u sec elapsed 10.70 sec. Query returned successfully with no result in 10765 ms. Tom Lane [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Carlo Stonebanks [EMAIL PROTECTED] writes: Curiously, it's using index scans, and it really looks like a simple query to me. I am completely baffled. The two tables in question have about 800K rows each - not exactly an incredible number. The EXPLAIN is simple, but the performance is dreadful. All the other queries run much faster than this - does ANYTHING about this query strike you as odd? Lots of dead rows perhaps? The EXPLAIN estimates look a bit out of line --- 11483 cost units to fetch 47 index entries is an order or two of magnitude higher than it ought to be. The real time also seems to be concentrated in that index scan. What are the physical sizes of the table and index? (VACUUM VERBOSE output for the facility table might tell something.) 
regards, tom lane
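A small sketch of how the physical sizes Tom asks about can be checked (pg_relation_size and pg_size_pretty are available from 8.1 onward), and how a bloated index can be rebuilt if that turns out to be the cause:

    select pg_size_pretty(pg_relation_size('mdx_core.facility')) as table_size,
           pg_size_pretty(pg_relation_size('mdx_core.facility_country_state_postal_code_idx')) as index_size;

    -- if the index is far larger than its row count justifies, rebuild it
    reindex index mdx_core.facility_country_state_postal_code_idx;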
Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL
This didn't work right away, but DID work after running a VACUUM FULL. In other words, I was still stuck with a sequential scan until after the vacuum. I turned autovacuum off in order to help with the import, but was performing an ANALYZE with every 500 rows imported. With autovacuum off for imports, how frequently should I VACUUM? Merlin Moncure [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] On 10/5/06, Carlo Stonebanks [EMAIL PROTECTED] wrote: do we have an multi-column index on facility_address(facility_id, address_id)? did you run analyze? There is an index on facility_address on facility_id. I didn't create an index on facility_address.address_id because I expected joins to go in the other direction (from facility_address to address). Nor did I create a multi-column index on facility_id, address_id because I had yet to come up with a query that required that. right. well, since you are filtering on address, I would consider added an index on address_id or a multi column on address_id, facility_id (in addition to facility_id). also, I'd consider removing all the explicit joins like this: explain analyze select f.facility_id, fa.facility_address_id, a.address_id, f.facility_type_code, f.name, a.address, a.city, a.state_code, a.postal_code, a.country_code from mdx_core.facility f, mdx_core.facility_address fa, mdx_core.address a where fa.facility_id = f.facility_id and a.address_id = fa.address_id and a.country_code = 'US' and a.state_code = 'IL' and a.postal_code like '60640-5759'||'%' order by facility_id; yet another way to write that where clause is: (fa_address_id, fa.facility_id) = (a.address_id, f.facility_id) and a.country_code = 'US' and a.state_code = 'IL' and a.postal_code like '60640-5759'||'%' order by facility_id; I personally only use explicit joins when doing outer joins and even them push them out as far as possible. I like the row constructor style better because it shows the key relationships more clearly. I don't think it makes a difference in execution (go ahead and try it). If you do make a multi column key on facility_address, though, make sure to put they key fields in left to right order in the row constructor. Try adding a multi key on address_id and facility_id and run it this way. In a proper design you would have a primary key on these fields but with imported data you obviously have to make compromises :). However, I still have a lot to learn about how SQL chooses its indexes, how multi-column indexes are used, and when to use them (other than the obvious - i.e. sort orders or relational expressions which request those columns in one search expression) well, it's kind of black magic but if the database is properly laid out the function usually follows form pretty well. Analyse is actually run every time a page of imported data loads into the client program. This is currently set at 500 rows. ok. merlin
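Not from the thread, but as a rough rule of thumb: with autovacuum off, VACUUM is needed once a table has accumulated dead rows (from UPDATEs and DELETEs) amounting to a noticeable fraction of its size, while insert-only tables mostly just need ANALYZE so the planner sees their growth. A minimal sketch, run from a maintenance session between import batches:

    vacuum analyze mdx_core.facility;
    vacuum analyze mdx_core.facility_address;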
Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL
how did you determine that it is done every 500 rows? this is the default autovacuum parameter. The import program pages the import table - it is currently set at 500 rows per page. With each page, I run an ANALYZE. if you followed my earlier recommendations, you are aware that autovacuum (which also analyzes) is not running during bulk inserts, right? It's intuitively obvious, but I can't do bulk inserts. It's just not the nature of what we are doing with the data. imo, best way to do big data import/conversion is to: 1. turn off all extra features, like stats, logs, etc done 2. use copy interface to load data into scratch tables with probably all text fields done 3. analyze (just once) I think this doesn't apply in our case, because we aren't doing bulk inserts. 4. use big queries to transform, normalize, etc This is currently being done programmatically. The nature of what we're doing is suited for imperative, navigational logic rather than declarative, data set logic; just the opposite of what SQL likes, I know! If there's some way to replace thousands of lines of analysis and decision trees with ultrafast queries - great... important feature of analyze is to tell the planner approx. how big the tables are. But the tables grow as the process progresses - would you not want the server to re-evaluate its strategy periodically? Carlo merlin
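For completeness, the copy-into-scratch-tables step Merlin describes looks something like this (the table, columns and file path are made up for illustration):

    -- scratch table with all text fields, as suggested above
    create table scratch_facility (facility_id text, name text, postal_code text);
    -- server-side path; from a client, psql's \copy does the same thing
    copy scratch_facility from '/tmp/facility_import.csv' with csv;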
Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL
do we have an multi-column index on facility_address(facility_id, address_id)? did you run analyze? There is an index on facility_address on facility_id. I didn't create an index on facility_address.address_id because I expected joins to go in the other direction (from facility_address to address). Nor did I create a multi-column index on facility_id, address_id because I had yet to come up with a query that required that. However, I still have a lot to learn about how SQL chooses its indexes, how multi-column indexes are used, and when to use them (other than the obvious - i.e. sort orders or relational expressions which request those columns in one search expression) Analyse is actually run every time a page of imported data loads into the client program. This is currently set at 500 rows. Carlo explain analyze select f.facility_id, fa.facility_address_id, a.address_id, f.facility_type_code, f.name, a.address, a.city, a.state_code, a.postal_code, a.country_code from mdx_core.facility as f join mdx_core.facility_address as fa on fa.facility_id = f.facility_id join mdx_core.address as a on a.address_id = fa.address_id where (a.country_code, a.state_code, mdx_core.zip_trunc(a.postal_code)) = ('US', 'IL', mdx_core.zip_trunc('60640-5759')) order by facility_id Sort (cost=6474.78..6474.84 rows=25 width=103) (actual time=217.279..217.311 rows=65 loops=1) Sort Key: f.facility_id - Nested Loop (cost=2728.54..6474.20 rows=25 width=103) (actual time=35.828..217.059 rows=65 loops=1) - Hash Join (cost=2728.54..6384.81 rows=25 width=72) (actual time=35.801..216.117 rows=65 loops=1) Hash Cond: (outer.address_id = inner.address_id) - Seq Scan on facility_address fa (cost=0.00..3014.68 rows=128268 width=12) (actual time=0.007..99.072 rows=128268 loops=1) - Hash (cost=2728.50..2728.50 rows=19 width=64) (actual time=33.618..33.618 rows=39 loops=1) - Bitmap Heap Scan on address a (cost=48.07..2728.50 rows=19 width=64) (actual time=2.569..33.491 rows=39 loops=1) Recheck Cond: ((country_code = 'US'::bpchar) AND ((state_code)::text = 'IL'::text)) Filter: (mdx_core.zip_trunc(postal_code) = '60640'::text) - Bitmap Index Scan on address_country_state_zip_trunc_idx (cost=0.00..48.07 rows=3846 width=0) (actual time=1.783..1.783 rows=3554 loops=1) Index Cond: ((country_code = 'US'::bpchar) AND ((state_code)::text = 'IL'::text)) - Index Scan using facility_pkey on facility f (cost=0.00..3.56 rows=1 width=35) (actual time=0.009..0.010 rows=1 loops=65) Index Cond: (outer.facility_id = f.facility_id) Total runtime: 217.520 ms ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL
Just to clarify: if I join two tables that I expect to benefit from index scans, I should create indexes on the joined columns on BOTH sides? Carlo Tom Lane [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Carlo Stonebanks [EMAIL PROTECTED] writes: I didn't create an index on facility_address.address_id because I expected joins to go in the other direction (from facility_address to address). Well, that's your problem right there ... regards, tom lane
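In this particular case, the missing index on the other side of the join would be something along these lines (the index name is only illustrative):

    create index facility_address_address_id_idx
        on mdx_core.facility_address (address_id);

With that in place, the planner can drive the join from address into facility_address with an index scan instead of hashing or sequentially scanning the whole table.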
Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL
Oh you hate explicit joins too? I started in Oracle and was dismayed to find out what the SQL standard was. I especially miss the simplicity of += outer joins. I'll try adding the address_id index to facility_address and see what I get! Carlo Merlin Moncure [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] On 10/5/06, Carlo Stonebanks [EMAIL PROTECTED] wrote: do we have an multi-column index on facility_address(facility_id, address_id)? did you run analyze? There is an index on facility_address on facility_id. I didn't create an index on facility_address.address_id because I expected joins to go in the other direction (from facility_address to address). Nor did I create a multi-column index on facility_id, address_id because I had yet to come up with a query that required that. right. well, since you are filtering on address, I would consider added an index on address_id or a multi column on address_id, facility_id (in addition to facility_id). also, I'd consider removing all the explicit joins like this: explain analyze select f.facility_id, fa.facility_address_id, a.address_id, f.facility_type_code, f.name, a.address, a.city, a.state_code, a.postal_code, a.country_code from mdx_core.facility f, mdx_core.facility_address fa, mdx_core.address a where fa.facility_id = f.facility_id and a.address_id = fa.address_id and a.country_code = 'US' and a.state_code = 'IL' and a.postal_code like '60640-5759'||'%' order by facility_id; yet another way to write that where clause is: (fa_address_id, fa.facility_id) = (a.address_id, f.facility_id) and a.country_code = 'US' and a.state_code = 'IL' and a.postal_code like '60640-5759'||'%' order by facility_id; I personally only use explicit joins when doing outer joins and even them push them out as far as possible. I like the row constructor style better because it shows the key relationships more clearly. I don't think it makes a difference in execution (go ahead and try it). If you do make a multi column key on facility_address, though, make sure to put they key fields in left to right order in the row constructor. Try adding a multi key on address_id and facility_id and run it this way. In a proper design you would have a primary key on these fields but with imported data you obviously have to make compromises :). However, I still have a lot to learn about how SQL chooses its indexes, how multi-column indexes are used, and when to use them (other than the obvious - i.e. sort orders or relational expressions which request those columns in one search expression) well, it's kind of black magic but if the database is properly laid out the function usually follows form pretty well. Analyse is actually run every time a page of imported data loads into the client program. This is currently set at 500 rows. ok. merlin ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL
can you do explain analyze on the two select queries on either side of the union separately? the subquery is correctly written and unlikely to be a problem (in fact, good style imo). so let's have a look at both sides of the facil query and see where the problem is. Sorry for the delay, the server was down yesterday and I couldn't get anything. I have modified the sub-queries a little, trying to get the index scans to fire - all the tables involved here are large enough to benefit from index scans over sequential scans. I am mystified as to why PART 1 is giving me: Seq Scan on facility_address fa (cost=0.00..3014.68 rows=128268 width=12) (actual time=0.007..99.033 rows=128268 loops=1) which I assume is for the: join mdx_core.facility_address as fa on fa.facility_id = f.facility_id Then again, I am not sure how to read the EXPLAIN ANALYSE performance numbers. The other part of the UNION (PART 2) I have also modified; I think it's working nicely. Let me know if I'm mistaken in thinking that! The one remaining problem is that the UNION of these two sub-queries has a column which is a call to a custom TCL function that does a lexical analysis on the results, ranking the result names by their proximity to the imported name. This definitely eats up the performance, and I hope that my decision to call this function on the results of the union (assuming union deletes redundant rows) is the correct one. Thanks! Carlo /* PART 1. The redundant expression facility_address_id is NULL was removed because only an OUTER join would have made this meaningful. We use only INNER joins in this sub-query. Both facility_address and address have seq scans, even though there is an index for facility_address(facility_id) and an index for address(country_code, postal_code, address). The like operator appears to be making things expensive. This is used because we have to take into account that perhaps the import row is using the 5-number US ZIP, not the 9-number US ZIP+4 standard (although this is not the case in this sample).
*/ explain analyse select f.facility_id, fa.facility_address_id, a.address_id, f.facility_type_code, f.name, a.address, a.city, a.state_code, a.postal_code, a.country_code from mdx_core.facility as f join mdx_core.facility_address as fa on fa.facility_id = f.facility_id join mdx_core.address as a on a.address_id = fa.address_id where a.country_code = 'US' and a.state_code = 'IL' and a.postal_code like '60640-5759'||'%' order by facility_id

Sort (cost=6392.50..6392.50 rows=1 width=103) (actual time=189.133..189.139 rows=12 loops=1) Sort Key: f.facility_id - Nested Loop (cost=2732.88..6392.49 rows=1 width=103) (actual time=14.006..188.967 rows=12 loops=1) - Hash Join (cost=2732.88..6388.91 rows=1 width=72) (actual time=13.979..188.748 rows=12 loops=1) Hash Cond: (outer.address_id = inner.address_id) - Seq Scan on facility_address fa (cost=0.00..3014.68 rows=128268 width=12) (actual time=0.004..98.867 rows=128268 loops=1) - Hash (cost=2732.88..2732.88 rows=1 width=64) (actual time=6.430..6.430 rows=3 loops=1) - Bitmap Heap Scan on address a (cost=62.07..2732.88 rows=1 width=64) (actual time=2.459..6.417 rows=3 loops=1) Recheck Cond: ((country_code = 'US'::bpchar) AND ((state_code)::text = 'IL'::text)) Filter: ((postal_code)::text ~~ '60640-5759%'::text) - Bitmap Index Scan on address_country_state_postal_code_address_idx (cost=0.00..62.07 rows=3846 width=0) (actual time=1.813..1.813 rows=3554 loops=1) Index Cond: ((country_code = 'US'::bpchar) AND ((state_code)::text = 'IL'::text)) - Index Scan using facility_pkey on facility f (cost=0.00..3.56 rows=1 width=35) (actual time=0.012..0.013 rows=1 loops=12) Index Cond: (outer.facility_id = f.facility_id) Total runtime: 189.362 ms

/* PART 2 - can you see anything that could work faster? */ explain analyse select f.facility_id, null as facility_address_id, null as address_id, f.facility_type_code, f.name, null as address, f.default_city as city, f.default_state_code as state_code, f.default_postal_code as postal_code, f.default_country_code as country_code from mdx_core.facility as f left outer join mdx_core.facility_address as fa on fa.facility_id = f.facility_id where fa.facility_address_id is null and f.default_country_code = 'US' and f.default_state_code = 'IL' and '60640-5759' like f.default_postal_code||'%'

Nested Loop Left Join (cost=0.00..6042.41 rows=32 width=73) (actual time=14.923..14.923 rows=0 loops=1) Filter: (inner.facility_address_id IS NULL) - Index Scan using