Re: [PERFORM] Why does the query planner use two full indexes, when a dedicated partial index exists?
The partial index is highly leveraged. If every tuple in the table is updated once, that amounts to every tuple in the index being updated 25,000 times. How so? That sounds like O(n_2) behaviour. If the table has 5 million rows while the index has 200 (active) rows at any given time, then to update every row in the table to null and back again would be 100% turn over of the table. But each such change would lead to an addition and then a deletion from the index. So 100% turnover of the table would be a 5 million / 200 = 25,000 fold turn of the index. Sorry, I was being dense. I misread that as: every time a single tuple in the table is updated, the entire index (every row) is updated. Yes, of course your explanation makes sense. There is some code that allows a btree index entry to get killed (and so the slot to be reused) without any vacuum, if a scan follows that entry and finds the corresponding tuple in the table no longer visible to anyone. I have not examined this code, and don't know whether it is doing its job but just isn't enough to prevent the bloat, or if for some reason it is not applicable to your situation. It looks like my solution is going to be a REINDEX invoked from cron, or maybe just every 100k inserts. In terms of trying to improve this behaviour for other PG users in the future, are there any more diagnostics I can do for you? Having found a special case, I'd like to help permanently resolve it if I can. Thanks very much again. Best wishes, Richard -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
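For concreteness, the setup under discussion might look like the following sketch. The index name and the rebuild schedule are hypothetical; the table and column names follow the thread:

```sql
-- The partial index at issue: tiny (only rows awaiting an exit_state),
-- but with enormous entry turnover relative to its size.
CREATE INDEX tbl_tracker_exit_state_null_idx
    ON tbl_tracker (parcel_id_code)
    WHERE exit_state IS NULL;

-- The workaround Richard proposes: rebuild periodically (from cron, or
-- after every ~100k inserts) to shed the accumulated bloat.  Note that
-- plain REINDEX holds an exclusive lock on the table while it runs.
REINDEX INDEX tbl_tracker_exit_state_null_idx;
```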
[PERFORM] Improve performance for writing
Hello, please do not consider this email as yet another question about how to speed up writing. The situation is different: my algorithm stores the result of a computation as tuples in a DB. The tuples, in addition to normal values (e.g. a, b), contain SQL statements that fetch values (for instance the geometry attribute) from another table (e.g. orig_table), e.g.

INSERT INTO dest_table ( SELECT a, b, s.geometry, s.length FROM orig_table s WHERE s.id = ? )

The number of inserts depends on the size of the result and varies from 10,000 to 1,000,000. My question is: how can I speed up such inserts? Plain COPY statements won't work, since I need the additional values, and INSERT statements take a long time (even when batched). What do you suggest in such a situation? Would it be better to:

- first use COPY to store the values in a new table
- second, update the new table with values from the origin table

Thanks for your hints / suggestions. Cheers, Markus
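The two-step approach Markus suggests at the end could be sketched like this. The staging table, file path, and `orig_id` column are hypothetical illustrations, not part of the original post:

```sql
-- Step 1: bulk-load the computed values with COPY into a staging table.
CREATE TABLE staging (a int, b int, orig_id int);
COPY staging FROM '/tmp/results.csv' WITH (FORMAT csv);

-- Step 2: pull the remaining columns from orig_table in one set-based
-- statement, replacing 10,000-1,000,000 single-row inserts with one
-- INSERT ... SELECT join.
INSERT INTO dest_table (a, b, geometry, length)
SELECT st.a, st.b, s.geometry, s.length
FROM   staging st
JOIN   orig_table s ON s.id = st.orig_id;
```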
Re: [PERFORM] Improve performance for writing
Markus, Have you looked over here: http://www.postgresql.org/docs/9.2/static/populate.html

From: markus.innereb...@inf.unibz.it Subject: [PERFORM] Improve performance for writing Date: Thu, 27 Dec 2012 14:10:40 +0100 To: pgsql-performance@postgresql.org

[quoted text of the original question trimmed]
Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
Date: Wed, 26 Dec 2012 23:03:33 -0500 Subject: Re: [PERFORM] Performance on Bulk Insert to Partitioned Table From: jeff.ja...@gmail.com To: charle...@outlook.com CC: ondrej.iva...@gmail.com; pgsql-performance@postgresql.org

On Monday, December 24, 2012, Charles Gomes wrote:

I think your performance bottleneck is almost certainly the dynamic SQL. Using C to generate that dynamic SQL isn't going to help much, because it is still the SQL engine that has to parse, plan, and execute it.

Jeff, I've changed the code from dynamic to:

CREATE OR REPLACE FUNCTION quotes_insert_trigger() RETURNS trigger AS $$
DECLARE
    r_date text;
BEGIN
    r_date = to_char(new.received_time, 'YYYY_MM_DD');
    case r_date
        when '2012_09_10' then
            insert into quotes_2012_09_10 values (NEW.*) using new;
            return;
        ...

However I've got no speed improvement. I need to keep two weeks' worth of partitions at a time; that's why all the WHEN statements.

The 'using new' and the RETURN without an argument are syntax errors. When I run a model system with those fixed, I get about a 2-fold improvement over the dynamic SQL performance. Even if your performance did not go up, did your CPU usage go down? Perhaps you have multiple bottlenecks all sitting at about the same place, and so tackling any one of them at a time doesn't get you anywhere.

I've run a small test with the fixes you mentioned and it changed from 1H:20M to 1H:30M to insert 39,600 rows.

If there was another bottleneck, performance when targeting the partitions directly would not be twice as fast.

I've run another long insert test and it takes 4H:15M to complete using triggers to distribute the inserts. When targeting the partitions directly it completes in 1H:55M. That's both for 70 simultaneous workers with the same data and 118,800 rows. The tests that Emmanuel did translating the trigger to C showed a great performance improvement, and his code is general enough that it could work for anyone using CHECKs for partitioning.
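With the two syntax errors Jeff points out corrected, the static trigger would look something like this sketch (the second partition name and the ellipsis stand in for the two weeks of WHEN branches; this is an illustration, not Charles's exact code):

```sql
CREATE OR REPLACE FUNCTION quotes_insert_trigger() RETURNS trigger AS $$
DECLARE
    r_date text;
BEGIN
    r_date := to_char(NEW.received_time, 'YYYY_MM_DD');
    CASE r_date
        WHEN '2012_09_10' THEN
            INSERT INTO quotes_2012_09_10 VALUES (NEW.*);  -- no "using new"
        WHEN '2012_09_11' THEN
            INSERT INTO quotes_2012_09_11 VALUES (NEW.*);
        -- ... one WHEN per partition, two weeks' worth ...
    END CASE;
    RETURN NULL;  -- suppress the insert into the parent table
END;
$$ LANGUAGE plpgsql;
```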
I'm still working on fixing it so it's compatible with 9.2. So far I'm having a hard time using the C triggers anyway:

ERROR: could not load library /var/lib/pgsql/pg_trigger_example.so: /var/lib/pgsql/pg_trigger_example.so: failed to map segment from shared object: Operation not permitted

I will do more reading on it. I think getting it to work again would bring some value, so more people can be aware of the performance improvement from using C instead of PL/pgSQL.

How do both the dynamic and the CASE versions scale with the number of threads? I think you said you had something like 70 sessions, but only 8 CPUs. That will probably do bad things with contention, and I don't see how using more connections than CPUs is going to help you here. If the CASE starts out faster single-threaded but then flatlines and the EXECUTE catches up, that suggests a different avenue of investigation than if they are always the same.

I didn't see a significant change in CPU utilization (it seems to be a bit less, but not that much); however IO is still idling. Wish postgres could automate the partition process natively like the other SQL DBs.

More automated would be nice (i.e. one operation to make both the check constraints and the trigger, so they can't get out of sync), but would not necessarily mean faster. I don't know what you mean about other DBs. Last time I looked at partitioning in MySQL, it was only about breaking up the underlying storage into separate files (without regard to the contents of the rows), so that is the same as what postgres does automatically. And in Oracle, their partitioning seemed about the same as postgres's as far as administrative tedium was concerned. I'm not familiar with how the MS product handles it, and maybe my experience with the other two is out of date.
The other free SQL DB supports a more elaborate scheme, for example:

CREATE TABLE ti (id INT, amount DECIMAL(7,2), tr_date DATE)
    ENGINE=INNODB
    PARTITION BY HASH( MONTH(tr_date) )
    PARTITIONS 6;

It also supports partitioning by RANGE, LIST or KEY. The paid one uses a very similar style:

CREATE TABLE dept (deptno NUMBER, deptname VARCHAR(32))
    PARTITION BY HASH(deptno) PARTITIONS 16;

Also:

CREATE TABLE sales ( prod_id NUMBER(6), cust_id NUMBER, time_id DATE, quantity_sold NUMBER(3) )
    PARTITION BY RANGE (time_id)
    ( PARTITION sales_q1_2006 VALUES LESS THAN (TO_DATE('01-APR-2006','dd-MON-')) TABLESPACE tsa
    , PARTITION sales_q2_2006 VALUES LESS THAN (TO_DATE('01-JUL-2006','dd-MON-')) TABLESPACE tsb
    ...

Cheers, Jeff
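For comparison, the PostgreSQL 9.x equivalent of that range-partitioned sales table is the manual inheritance scheme the thread is lamenting: child tables, hand-written CHECK constraints, and a routing trigger kept in sync by hand. A sketch (names mirror the Oracle example above; the types are approximations):

```sql
CREATE TABLE sales (
    prod_id       numeric(6),
    cust_id       numeric,
    time_id       date,
    quantity_sold numeric(3)
);

-- One child table per range, with a CHECK constraint the planner can use
-- for constraint exclusion.
CREATE TABLE sales_q1_2006 (
    CHECK (time_id >= DATE '2006-01-01' AND time_id < DATE '2006-04-01')
) INHERITS (sales);

CREATE TABLE sales_q2_2006 (
    CHECK (time_id >= DATE '2006-04-01' AND time_id < DATE '2006-07-01')
) INHERITS (sales);

-- ...plus an INSERT trigger on "sales" to route rows to the children,
-- which is exactly the piece that can silently drift out of sync with
-- the CHECK constraints.
```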
[PERFORM] Why does the query planner use two full indexes, when a dedicated partial index exists?
On Thursday, December 20, 2012, Jeff Janes wrote: On Thursday, December 20, 2012, Richard Neill wrote:

- Bitmap Index Scan on tbl_tracker_exit_state_idx (cost=0.00..8.36 rows=151 width=0) (actual time=7.946..7.946 rows=20277 loops=1)

This is finding 100 times more rows than it thinks it will. If that could be fixed, surely this plan would not look as good. But then, it would probably just switch to another plan that is not the one you want, either.

I guess the issue here is that the histogram postgres uses to estimate the number of rows that will be found is based on visible rows, and it is correctly estimating the number of visible rows that will be found. And that is the relevant thing to pass up to a higher join for its estimation. But for estimating the number of blocks a given index scan will access, the right thing would be the number of tuples visited, not the number of them found to be visible. So that is where this plan goes systematically wrong.

I guess the correct thing would be for postgres to keep two histograms, one of all tuples and one of all visible tuples, and to produce different selectivity estimates for different purposes. But I don't see that change getting made. It is only meaningful in cases where there is a fundamental skew in distribution between visible tuples and invisible-but-as-yet-unvacuumed tuples. I think that that fundamental skew is the source of both the underestimation of the bitmap scan cost, and the overestimation of the partial index scan (although I can't get it to overestimate it by anywhere near the amount you were seeing).

I still think your best bet is to get rid of the partial index and trade the full one on (parcel_id_code) for one on (parcel_id_code, exit_state). I think that will be much less fragile than reindexing in a cron job.

Cheers, Jeff
Re: [PERFORM] Why does the query planner use two full indexes, when a dedicated partial index exists?
On 27/12/12 16:17, Jeff Janes wrote:

I still think your best bet is to get rid of the partial index and trade the full one on (parcel_id_code) for one on (parcel_id_code,exit_state). I think that will be much less fragile than reindexing in a cron job.

So, at the moment, I have 3 indexes:

1. full: parcel_id_code
2. full: exit_state
3. partial: parcel_id_code where exit_state is null

Am I right that when you suggest just a single, joint index (parcel_id_code, exit_state) instead of all 3 of the others, it will allow me to optimally run all of the following?

1. SELECT * from tbl_tracker WHERE parcel_id_code=22345 AND exit_state IS NULL (this is the one we've been discussing)
2. SELECT * from tbl_tracker where parcel_id_code=44533
3. SELECT * from tbl_tracker where exit_code = 2

(2 and 3 are examples of queries I need to run for other purposes, unrelated to this thread, but which use the other indexes.)

Thanks, Richard
Re: [PERFORM] Why does the query planner use two full indexes, when a dedicated partial index exists?
Richard Neill rn...@richardneill.org writes:

So, at the moment, I have 3 indexes:
1. full: parcel_id_code
2. full: exit_state
3. partial: parcel_id_code where exit_state is null
Am I right that when you suggest just a single, joint index (parcel_id_code,exit_state) instead of all 3 of the others,

I think he was just recommending replacing the first and third indexes.

it will allow me to optimally run all of the following?
1. SELECT * from tbl_tracker WHERE parcel_id_code=22345 AND exit_state IS NULL
2. SELECT * from tbl_tracker where parcel_id_code=44533
3. SELECT * from tbl_tracker where exit_code = 2

You will need an index with exit_state as the first column to make #3 perform well --- at least, assuming that an index is going to help at all anyway. The rule of thumb is that if a query is going to fetch more than a few percent of a table, an index is not useful, because it's going to be touching most table pages anyway, so a seqscan will win. I've forgotten now what you said the stats for exit_code values other than null were, but it's quite possible that an index is useless for #3.

These considerations are mostly covered in the manual: http://www.postgresql.org/docs/9.2/static/indexes.html

regards, tom lane
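One reading of the combined advice in this subthread, as DDL (index names are hypothetical; the thread never names Richard's existing indexes):

```sql
-- Replace indexes 1 and 3 with a single composite index, per Jeff's
-- suggestion as Tom interprets it.
DROP INDEX IF EXISTS tbl_tracker_parcel_idx;           -- old #1, full
DROP INDEX IF EXISTS tbl_tracker_exit_state_null_idx;  -- old #3, partial

CREATE INDEX tbl_tracker_parcel_exit_idx
    ON tbl_tracker (parcel_id_code, exit_state);       -- serves queries 1 and 2

-- Index #2 on (exit_state) stays only if the stats justify it for
-- query 3 at all, per Tom's caveat about fetching a large fraction
-- of the table.
```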
Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly
On 2012-12-27, at 12:10, Nikolas Everett wrote:

We just upgraded from 8.3 to 9.1 and we're seeing some performance problems. When we EXPLAIN ANALYZE our queries, the explain result claims that the queries are reasonably fast, but the wall clock time is way, way longer. Does anyone know why this might happen? Like so:

db=# \timing
db=# EXPLAIN ANALYZE SELECT max(id) FROM foo WHERE blah_id = 1209123;

The plan is sensible. The estimates are sensible. The actual DB time reads like it is very sensible. But the wall clock time is like 11 seconds and the \timing report confirms it. Any ideas?

Could you post the actual plans? On both versions? That would help a lot. Also, http://explain.depesz.com/ helps readability.

Bye, François
Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly
On 27/12/12 17:21, François Beausoleil wrote: Le 2012-12-27 à 12:10, Nikolas Everett a écrit :

We just upgraded from 8.3 to 9.1 and we're seeing some performance problems. When we EXPLAIN ANALYZE our queries the explain result claims that the queries are reasonably fast but the wall clock time is way way longer. Does anyone know why this might happen?

Is it possible you missed an optimisation setting in the migration process? I made that mistake, and much later found performance was somewhat degraded (but surprisingly not as much as I'd expected) by my having failed to set effective_cache_size.

Also, if you just did a dump/restore, it might help to run ANALYZE once (it seems that ANALYZE is normally run automatically via vacuum, but the first time you insert the data, it may not happen). A side-effect of ANALYZE is that it will pull all the tables into the OS memory cache (or try to), which may give significantly faster results (try running the same query twice in succession: it's often 5x faster the 2nd time).

HTH, Richard
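The post-restore housekeeping described above boils down to something like this (the effective_cache_size value is a placeholder to be sized for the machine, not a recommendation):

```sql
-- Rebuild planner statistics after a dump/restore; autovacuum would get
-- there eventually, but running it by hand removes the guesswork.
ANALYZE;

-- Check what the server currently believes about the OS cache size.
-- If it is still the default, set effective_cache_size in
-- postgresql.conf (e.g. 96GB on a 128GB box -- placeholder) and reload.
SHOW effective_cache_size;
```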
Re: [PERFORM] backend suddenly becomes slow, then remains slow
On 12/26/2012 11:03 PM, Jeff Janes wrote: On Fri, Dec 14, 2012 at 10:40 AM, Andrew Dunstan andrew.duns...@pgexperts.com wrote:

One of my clients has an odd problem. Every so often a backend will suddenly become very slow. The odd thing is that once this has happened it remains slowed down, for all subsequent queries. Zone reclaim is off. There is no IO or CPU spike, no checkpoint issues or stats timeouts, no other symptom that we can see.

By no spike, do you mean that the system as a whole is not using an unusual amount of IO or CPU, or that this specific slow backend is not using an unusual amount?

Both, really.

Could you strace it and see what it is doing?

Not very easily, because it's a pool connection and we've lowered the pool session lifetime as part of the amelioration :-) So it's not happening very much any more. The problem was a lot worse than it is now, but two steps have alleviated it mostly, though not completely: much less aggressive autovacuuming, and reducing the maximum lifetime of backends in the connection pooler to 30 minutes.

Do you have a huge number of tables? Maybe over the course of a long-lived connection, it touches enough tables to bloat the relcache / syscache. I don't know how the autovac would be involved in that, though.

Yes, we do indeed have a huge number of tables. This seems a plausible thesis.

cheers, andrew
Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly
Thanks!

http://explain.depesz.com/s/yfs

Looks like we're running a load of about 6. The machines have two physical cores hyperthreaded to 32 cores. Interesting - the data is stored on NFS on a NetApp. We don't seem to have a ton of NFS traffic. Also we've got shared memory set to 48 gigs, which is comfortably less than the 146 gigs in the machine.

On Thu, Dec 27, 2012 at 12:21 PM, François Beausoleil franc...@teksol.info wrote:

[quoted text trimmed]
Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly
New news - the hot slave seems to be performing as expected, with no long pauses. It looks like we're using an archive_timeout of 60 seconds and the default checkpoint_timeout and checkpoint_completion_target. I didn't do any of the research on this. It seems like we're asking postgres to clear all of the dirty buffers every 60 seconds. With 48 gigs of shared buffers we could have quite a few buffers to clear. Is there some place I could check on how all that is going?

On Thu, Dec 27, 2012 at 12:45 PM, Nikolas Everett nik9...@gmail.com wrote:

[quoted text trimmed]
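One place to check how checkpoints and dirty-buffer writing are actually going is the background writer statistics view (available on 9.1):

```sql
-- Many "timed" checkpoints plus a large buffers_checkpoint count would
-- support the theory that most dirty buffers are written at checkpoint
-- time rather than by the background writer or the backends themselves.
SELECT checkpoints_timed,
       checkpoints_req,
       buffers_checkpoint,
       buffers_clean,
       buffers_backend
FROM   pg_stat_bgwriter;
```

Setting `log_checkpoints = on` in postgresql.conf additionally records each checkpoint's timing and buffer counts in the server log.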
Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly
Actually that last paragraph doesn't make much sense. Please ignore it.

On Thu, Dec 27, 2012 at 12:58 PM, Nikolas Everett nik9...@gmail.com wrote:

[quoted text trimmed]
Re: [PERFORM] Why does the query planner use two full indexes, when a dedicated partial index exists?
On Thursday, December 27, 2012, Richard Neill wrote:

On 27/12/12 16:17, Jeff Janes wrote: I still think your best bet is to get rid of the partial index and trade the full one on (parcel_id_code) for one on (parcel_id_code,exit_state). I think that will be much less fragile than reindexing in a cron job.

So, at the moment, I have 3 indexes:
1. full: parcel_id_code
2. full: exit_state
3. partial: parcel_id_code where exit_state is null
Am I right that when you suggest just a single, joint index (parcel_id_code,exit_state) instead of all 3 of the others,

No, just instead of 1 and 3. You still need an index on (exit_state) in order to efficiently satisfy query 3 below.

Alternatively, you could keep index 1, and replace 2 and 3 with one on (exit_state, parcel_id_code). And in fact this might be the better way to go, because a big problem you are facing is that the (exit_state) index is looking falsely attractive, and the easiest way to overcome that is to get rid of that index and replace it with one that can do everything it can do, but more.

Theoretically there is a technique called a loose scan or skip scan which could allow you to make one index, (exit_state, parcel_id_code), replace all 3 of the above, but postgres does not yet implement that technique. I think there is a way to achieve the same thing using recursive SQL, but I doubt it would be worth it, as too much index maintenance is not your root problem.

3. SELECT * from tbl_tracker where exit_code = 2

Cheers, Jeff
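The recursive-SQL emulation of a loose (skip) scan that Jeff alludes to might look like this sketch: a recursive CTE that hops from one distinct exit_state to the next, letting each step use the leading column of a composite index instead of scanning every entry. Column and table names follow the thread:

```sql
-- Emulated loose scan: enumerate the distinct exit_state values using
-- one cheap index probe per distinct value, rather than a full scan.
WITH RECURSIVE t AS (
    SELECT min(exit_state) AS es FROM tbl_tracker
    UNION ALL
    SELECT (SELECT min(exit_state) FROM tbl_tracker
            WHERE exit_state > t.es)
    FROM t WHERE t.es IS NOT NULL
)
SELECT es FROM t WHERE es IS NOT NULL;
```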
[PERFORM] Performance on Bulk Insert to Partitioned Table
On Wednesday, December 26, 2012, Pavel Stehule wrote: 2012/12/27 Jeff Janes jeff.ja...@gmail.com:

More automated would be nice (i.e. one operation to make both the check constraints and the trigger, so they can't get out of sync), but would not necessarily mean faster.

snip some benchmarking

A native implementation could evaluate expressions significantly more efficiently, mainly simple expressions (this is significant for a large number of partitions), and could probably do tuple forwarding faster than a heavyweight INSERT statement (the question is whether some overhead could be reduced with more sophisticated syntax, by removing the record expansion).

If the main goal is to make it faster, I'd rather see all of plpgsql get faster, rather than just a special case of partitioning triggers. For example, right now a CASE expression statement with 100 branches is about the same speed as an equivalent list of 100 elsifs. So it seems to be doing a linear search, when it could be doing a hash that should be a lot faster.

So a native implementation could carry a significant speed-up, mainly if we can distribute tuples without expression evaluation (evaluated by the executor).

Making partitioning inserts native does open up other opportunities to make it faster, and also to make it administratively easier; but do we want to try to tackle both of those goals simultaneously? I think the administrative aspects would come first. (But I doubt I will be the one to implement either, so my vote doesn't count for much here.)

Cheers, Jeff
Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
On Monday, December 24, 2012, Charles Gomes wrote:

By the way, I've just re-written the code to target the partitions individually and I've got almost a 4 times improvement. Shouldn't it be faster to process the trigger? I would understand if there was no CPU left, but there is lots of CPU to chew.

Once you turned off hyperthreading, it was reporting 75% CPU usage. Assuming that that accounting is perfect, that means you could only get 33% faster if you were to somehow start using all of the CPU. So I don't think I'd call that a lot of CPU left. And if you have 70 processes fighting for 8 cores, I'm not surprised you can't get above that CPU usage.

It seems that there will be no other way to speed this up unless the insert code is partition-aware.

There may be other ways, but that one will probably get you the most gain, especially if you use COPY or \copy. Since the main goal of partitioning is to allow your physical storage layout to conspire with your bulk operations, it is hard to see how you can get the benefits of partitioning without having your bulk loading participate in that conspiracy.

Cheers, Jeff
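In practice, the partition-aware loading Jeff describes can be as simple as pointing each bulk load at the right child table. The file path and partition name here are hypothetical, matching the thread's naming scheme:

```sql
-- The loader, not a trigger, routes each batch: every row in this file
-- is known in advance to belong to the 2012-09-10 partition.
COPY quotes_2012_09_10 FROM '/data/quotes/2012_09_10.csv' WITH (FORMAT csv);

-- Or, from psql with a client-side file, \copy streams it over the wire:
-- \copy quotes_2012_09_10 from '2012_09_10.csv' with (format csv)
```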
Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
Pavel, I've been trying to port the work of Emmanuel: http://archives.postgresql.org/pgsql-hackers/2008-12/msg01221.php His implementation is pretty straightforward: a simple trigger doing constraint checks, with caching for bulk inserts. So far this is what I have: http://www.widesol.com/~charles/pgsql/partition.c I had some issues, as he uses HeapTuples and on 9.2 I see a Slot.

From: pavel.steh...@gmail.com Date: Thu, 27 Dec 2012 19:46:12 +0100 Subject: Re: [PERFORM] Performance on Bulk Insert to Partitioned Table To: jeff.ja...@gmail.com CC: charle...@outlook.com; ondrej.iva...@gmail.com; pgsql-performance@postgresql.org

2012/12/27 Jeff Janes jeff.ja...@gmail.com:

[quoted exchange trimmed]

If the main goal is to make it faster, I'd rather see all of plpgsql get faster, rather than just a special case of partitioning triggers. For example, right now a CASE expression statement with 100 branches is about the same speed as an equivalent list of 100 elsifs. So it seems to be doing a linear search, when it could be doing a hash that should be a lot faster.

The bottleneck is not in PL/pgSQL directly; it is in the PostgreSQL expression executor. Personally I don't see any simple optimization - maybe some variant of JIT (for the expression executor) could improve performance. Any other optimization would require a significant redesign of PL/pgSQL, which is not work I would start myself, because using plpgsql triggers for partitioning is a bad use of plpgsql - and I believe that after a native implementation any such work would be useless. Designing some generic C trigger, or a really full implementation, is better work. Moreover, there is still the expensive INSERT statement - forwarding the tuple at the C level should be significantly faster, because it doesn't have to be generic.

So a native implementation can carry a significant speed-up, mainly if we can distribute tuples without expression evaluation (evaluated by the executor).

Making partitioning inserts native does open up other opportunities to make it faster, and also to make it administratively easier; but do we want to try to tackle both of those goals simultaneously? I think the administrative aspects would come first. (But I doubt I will be the one to implement either, so my vote doesn't count for much here.)

Anybody who starts work on a native implementation will have my support (it is a feature that lots of customers need). I have customers that can support development, and I believe there are others. Actually it needs only one tenacious man, because it is work for two years.

Regards, Pavel

Cheers, Jeff
Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
* Jeff Janes (jeff.ja...@gmail.com) wrote:

If the main goal is to make it faster, I'd rather see all of plpgsql get faster, rather than just a special case of partitioning triggers. For example, right now a CASE expression statement with 100 branches is about the same speed as an equivalent list of 100 elsifs. So it seems to be doing a linear search, when it could be doing a hash that should be a lot faster.

That's a nice thought, but I'm not sure that it'd really be practical. CASE statements in plpgsql are completely general and really behave more like an if/elsif tree than a C-style switch() statement or similar. For one thing, the expressions need not use the same variables, could be complex multi-variable conditionals, etc. Figuring out that you could build a dispatch table for a given CASE statement, and then building it, storing it, and remembering to use it, wouldn't be cheap.

On the other hand, I've actually *wanted* a simpler syntax on occasion. I have no idea if there'd be a way to make it work, but this would be kind of nice:

CASE OF x -- or whatever
  WHEN 1 THEN blah blah
  WHEN 2 THEN blah blah
  WHEN 3 THEN blah blah
END

which would be possible to build into a dispatch table by looking at the type of x and the literals used in the overall CASE statement. Even so, there would likely be some number of WHEN conditions required before it'd actually be more efficient to use, though perhaps getting rid of the expression evaluation (if that'd be possible) would make up for it.

Thanks, Stephen
Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly
Nikolas Everett nik9...@gmail.com writes:

We just upgraded from 8.3 to 9.1 and we're seeing some performance problems. When we EXPLAIN ANALYZE our queries the explain result claim that the queries are reasonably fast but the wall clock time is way way longer. Does anyone know why this might happen? Like so: db=\timing db=EXPLAIN ANALYZE SELECT max(id) FROM foo WHERE blah_id = 1209123; The plan is sensible. The estimates are sensible. The actual DB time reads like it is very sensible. But the wall clock time is like 11 seconds and the \timing report confirms it.

Seems like the extra time would have to be in parsing/planning, or in waiting to acquire AccessShareLock on the table. It's hard to believe the former for such a simple query, unless the table has got thousands of indexes or something silly like that. Lock waits are surely possible if there is something else contending for exclusive lock on the table, but it's hard to see how the wait time would be so consistent.

BTW, the explain.depesz.com link you posted clearly does not correspond to the above query (it's not doing a MAX), so another possibility is confusion about what query is really causing trouble. We've seen people remove essential details before while trying to anonymize their query.

regards, tom lane
Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
2012/12/27 Stephen Frost sfr...@snowman.net:

[quoted text trimmed]

I understand, but I am not happy with it. CASE is relatively complex: there is the SQL CASE too, and this would be a third variant of CASE. Maybe some simple CASE statements could be supported by the parser, with a local optimization (probably only for numerics - without casting). But it needs a relatively large amount of new code. Would such code be accepted?
Regards,
Pavel

> Thanks,
> Stephen
Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly
Sorry for the confusion around the queries. Both queries are causing trouble. We've noticed that just EXPLAINing the very simple queries takes forever. After more digging, it looks like this table has an inordinate number of indexes (10-ish). There are a whole bunch of conditional indexes for other columns that we're not checking. The particular column that is causing us trouble exists in both a regular index (con_id) and a composite index (con_id, something_else).

We checked on locks and don't see any ungranted locks. Would waiting on the AccessShareLock not appear in pg_locks?

Thanks!

Nik

On Thu, Dec 27, 2012 at 2:21 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Seems like the extra time would have to be in parsing/planning, or in
> waiting to acquire AccessShareLock on the table. It's hard to believe the
> former for such a simple query, unless the table has got thousands of
> indexes or something silly like that. [...]
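On the pg_locks question: lock waits do show up there — a blocked backend's request appears with granted = false. A diagnostic query of the usual form (column list trimmed for the example):

```sql
-- Show ungranted lock requests, including waits for AccessShareLock.
SELECT locktype, relation::regclass, mode, pid
FROM pg_locks
WHERE NOT granted;
```

If this returns no rows while the query is stalled, the time is going somewhere other than lock waits.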
Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly
One other thing - the query seems to get faster after the first time we plan it. I'm not sure that this is the case, but I think it might be.

On Thu, Dec 27, 2012 at 2:28 PM, Nikolas Everett nik9...@gmail.com wrote:
> Sorry for the confusion around the queries. Both queries are causing
> trouble. We've noticed that just EXPLAINing the very simple queries takes
> forever. After more digging, it looks like this table has an inordinate
> number of indexes (10-ish). [...]
Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly
Nikolas Everett nik9...@gmail.com writes:
> After more digging, it looks like this table has an inordinate number of
> indexes (10-ish).

10 doesn't sound like a lot.

> There are a whole bunch of conditional indexes for other columns that
> we're not checking. The particular column that is causing us trouble
> exists in both a regular index (con_id) and a composite index (con_id,
> something_else).

You're not being at all clear here. Are you trying to say that only queries involving WHERE col = constant for a particular column seem to be slow? If so, maybe the column has a weird datatype, or a wildly out-of-line statistics target? (Still hard to see how you get to 11-ish seconds in planning, though.)

One thing you might do is watch the backend process in top or local equivalent, and see if it's spending most of the 11 seconds sleeping, accumulating CPU time, or accumulating I/O. That info would eliminate a lot of possibilities.

regards, tom lane
[PERFORM] sched_migration_cost for high-connection counts
Hey guys,

I recently stumbled over a Linux scheduler setting that has outright shocked me. After reading through this:

http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html

it became readily apparent we were hitting the same wall. I could run pgbench and increase the connection count by 100 every iteration, and eventually performance just fell off a proverbial cliff and never recovered. For our particular systems, this barrier is somewhere around 800 processes. Select-only performance on a 3600-scale pgbench database that fits in cache falls from about 70k TPS to about 12k TPS after crossing that line. Worse, sar shows over 70% CPU dedicated to system overhead.

After some fiddling around, I changed sched_migration_cost from its default of 50 to 500, and performance returned to linear scaling immediately. It's literally night and day. Setting it back to 50 reverts to the terrible performance. In addition, setting the migration cost to a higher value does not negatively affect any other performance metric I've checked.

This is on an Ubuntu 12.04 system, and I'd love it if someone out there could independently verify this, because frankly, I find it difficult to believe. If legit, high-connection systems would benefit greatly.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@optionshouse.com
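For anyone wanting to try reproducing this, a sketch of the change as described above (run as root). Note the knob name and units vary by kernel version — on newer kernels it is kernel.sched_migration_cost_ns, in nanoseconds — and the value below is simply the one reported in this post, so inspect the current value on your own system first:

```shell
# Inspect the current setting.
sysctl kernel.sched_migration_cost

# Apply the value reported in this post (verify units for your kernel!).
sysctl -w kernel.sched_migration_cost=500

# To persist across reboots, add the line to /etc/sysctl.conf:
# kernel.sched_migration_cost = 500
```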
Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly
Nikolas Everett nik9...@gmail.com writes:
> We straced the backend during the explain and it looked like the open
> calls were taking several seconds each.

Kind of makes me wonder if you have a whole lot of tables ("whole lot" in this context probably means tens of thousands) and are storing the database on a filesystem that doesn't scale well to lots of files in one directory. If that's the explanation, the reason the 8.3 installation was okay was likely that it was stored on a more modern filesystem.

BTW, please keep the list cc'd on replies.

regards, tom lane
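For reference, the kind of strace invocation used above can be sketched as follows; `<backend_pid>` is a placeholder for the real backend process ID, and on newer kernels the syscall may be openat rather than open:

```shell
# Attach to a running backend and time each file-open syscall.
# -T appends the time spent inside each call; -e limits the trace.
strace -T -e trace=open -p <backend_pid>
```

Multi-second times on individual open calls, as seen here, point at the filesystem or storage layer rather than at PostgreSQL itself.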
Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly
On Thu, Dec 27, 2012 at 4:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Kind of makes me wonder if you have a whole lot of tables ("whole lot" in
> this context probably means tens of thousands) and are storing the
> database on a filesystem that doesn't scale well to lots of files in one
> directory. [...]

We have 1897 files for our largest database, which really isn't a whole lot. The old servers were EXT3 over FC to a NetApp, running RHEL5 PPC. The new servers are on NFS to the same NetApp, running RHEL5 Intel. We've failed from our physical primary to a virtual secondary, both of which seem to have the same problem. We're in the process of rebuilding a hot slave on EXT3 over iSCSI; we'll fail over to it as soon as we can.

We never tried stracing on the PPC infrastructure, but it obviously didn't have this problem. We also have another cluster running with an identical setup which doesn't seem to have the problem. In fact, the problem never came up during correctness testing either - it specifically required load before it came up.

> BTW, please keep the list cc'd on replies.

Itchy reply finger.
Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly
Nikolas Everett nik9...@gmail.com writes:
> We have 1897 files for our largest database, which really isn't a whole
> lot.

OK...

> The old servers were EXT3 over FC to a NetApp, running RHEL5 PPC. The
> new servers are on NFS to the same NetApp, running RHEL5 Intel.

Now I'm wondering about network glitches or NFS configuration problems. This is a bit outside my expertise unfortunately, but it seems clear that your performance issue is somewhere in that area.

regards, tom lane
Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly
It looks like it was a problem with NFS. We're not really sure what was wrong with it, but once we failed over to an iSCSI mount for the data, everything is running just fine.

On Thu, Dec 27, 2012 at 6:12 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Now I'm wondering about network glitches or NFS configuration problems.
> This is a bit outside my expertise unfortunately, but it seems clear that
> your performance issue is somewhere in that area. [...]
Re: [PERFORM] backend suddenly becomes slow, then remains slow
On Thursday, December 27, 2012, Andrew Dunstan wrote:
> On 12/26/2012 11:03 PM, Jeff Janes wrote:
>> Do you have a huge number of tables? Maybe over the course of a
>> long-lived connection, it touches enough tables to bloat the relcache /
>> syscache. I don't know how the autovac would be involved in that,
>> though.
>
> Yes, we do indeed have a huge number of tables.

This seems a plausible thesis. All of the syscache hash tables have hard-coded numbers of buckets, at most 2048, and once those are exceeded the resulting collision resolution becomes essentially linear. It is not hard to exceed 2048 tables by a substantial multiple, and even less hard to exceed 2048 columns (summed over all tables).

I don't know why syscache doesn't use dynahash; whether it is older than dynahash and was simply never converted out of inertia, or whether there are extra features that don't fit the dynahash API. If the former, then converting it to use dynahash should give automatic resizing for free. Maybe that conversion should be a To Do item?

Cheers, Jeff
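A back-of-envelope sketch of the fixed-bucket problem described above: once the number of cached entries exceeds the hard-coded bucket count, average collision-chain length (and hence lookup cost) grows linearly with the entry count. The numbers below are illustrative, not measurements:

```python
NBUCKETS = 2048  # the largest hard-coded syscache bucket count cited above

def avg_chain_length(n_entries, n_buckets=NBUCKETS):
    # With a well-behaved hash function, entries spread evenly across the
    # buckets, so each bucket's chain holds about n_entries / n_buckets
    # items, all of which may have to be walked on a lookup.
    return n_entries / n_buckets

# Below the bucket count, chains stay short (amortized O(1) lookups)...
assert avg_chain_length(2_000) < 1

# ...but a long-lived backend touching 200k catalog entries walks a
# ~98-element chain per lookup on average.
assert round(avg_chain_length(200_000)) == 98
```

A resizing table like dynahash avoids this by growing the bucket array as it fills, keeping chains short no matter how many tables and columns a connection accumulates.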
Re: [PERFORM] Why does the query planner use two full indexes, when a dedicated partial index exists?
On Thursday, December 20, 2012, Jeff Janes wrote:
> On Thursday, December 20, 2012, Tom Lane wrote:
>
> What I did to try to duplicate Richard's situation was to create a test
> table in which all the exit_state values were NULL, then build the index,
> then UPDATE all but a small random fraction of the rows to 1, then
> vacuum. This results in a rather bloated partial index, but I think
> that's probably what he's got, given that every record initially enters
> the table with NULL exit_state. It would take extremely frequent
> vacuuming to keep the partial index from accumulating a lot of dead
> entries.

Once I cranked up default_statistics_target, I could start reproducing the very high estimates (5000) for the partial index in 9.1. As you say, switching to 9.2 or above lowers it quite a bit, but I still get some pretty high estimates, ~100 when 8 would be more accurate.

The problem is in genericcostestimate:

    if (index->pages > 1 && index->tuples > 1)
        numIndexPages = ceil(numIndexTuples * index->pages / index->tuples);

The index->pages figure should probably not include index pages which are empty. Even with aggressive vacuuming, most of the pages in the partial index seem to be empty at any given time. However, I don't know if that number is exposed readily, and it seems to be updated too slowly to be useful, if pg_freespace is to be believed. But I wonder if it couldn't be clamped so that there can be no more pages than there are tuples:

    numIndexPages = ceil(numIndexTuples * Min(1, index->pages / index->tuples));

Cheers, Jeff
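The effect of the proposed clamp can be sketched numerically. Here `pages` and `tuples` stand in for index->pages and index->tuples, and the figures are invented to mimic a bloated partial index (many mostly-empty pages, few live tuples in the planner's accounting):

```python
import math

def num_index_pages(num_index_tuples, pages, tuples, clamped):
    # Mirrors the genericcostestimate line quoted above: charge index page
    # fetches in proportion to the index's pages-per-tuple ratio.
    ratio = pages / tuples
    if clamped:
        # Jeff's proposed clamp: never charge more pages than tuples.
        ratio = min(1.0, ratio)
    return math.ceil(num_index_tuples * ratio)

# A bloated partial index: 100 pages but only 20 tuples. Fetching 8 tuples
# gets charged 40 page reads unclamped, 8 with the clamp.
print(num_index_pages(8, pages=100, tuples=20, clamped=False))  # -> 40
print(num_index_pages(8, pages=100, tuples=20, clamped=True))   # -> 8
```

For a healthy index (more tuples than pages) the ratio is already below 1 and the clamp changes nothing; it only bites in the bloated case, which is exactly the scenario producing the inflated estimates.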