Re: [PERFORM] Locking vs. Exceptions
Robins wrote:
> Hi, The documentation says that function blocks with exceptions are far
> costlier than without one.

I recommend against using exceptions. There is a memory leak in the exception handler that will cause headaches if it is called many times in the transaction. In plpgsql, I would use:

SELECT ... FOR UPDATE;
IF FOUND THEN
    UPDATE ...;
ELSE
    INSERT ...;
END IF;

If you have multiple transactions doing this process at the same time, you'll need explicit locking of the table to avoid a race condition.

--
Benjamin Minshall <[EMAIL PROTECTED]>
Senior Developer -- Intellicon, Inc. http://www.intellicon.biz

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
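A minimal sketch of the locking pattern Benjamin describes, assuming a hypothetical counters(key, hits) table; the table-level lock is what closes the race between concurrent callers:

```sql
-- Hypothetical table: counters(key text primary key, hits integer)
CREATE OR REPLACE FUNCTION bump_counter(p_key text) RETURNS void AS $$
BEGIN
    -- Without this lock, two concurrent callers can both see "not found"
    -- and both attempt the INSERT; one then fails on the unique constraint.
    LOCK TABLE counters IN SHARE ROW EXCLUSIVE MODE;

    PERFORM 1 FROM counters WHERE key = p_key FOR UPDATE;
    IF FOUND THEN
        UPDATE counters SET hits = hits + 1 WHERE key = p_key;
    ELSE
        INSERT INTO counters (key, hits) VALUES (p_key, 1);
    END IF;
END;
$$ LANGUAGE plpgsql;
```

SHARE ROW EXCLUSIVE conflicts with itself, so only one transaction runs the check-then-insert sequence at a time; that serialization is the price of avoiding the exception handler.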
Re: [PERFORM] VACUUMs take twice as long across all nodes
On Oct 27, 2006, at 2:07 PM, Tom Lane wrote:
> 8.2, but in existing releases I can't see much you can do about it
> except REINDEX when things get slow.

This will be so nice for me. I have one huge table with a massive amount of churn and bulk deletes. I have to reindex it once every other month. It takes about 60 to 75 minutes per index (times two indexes), else I'd do it monthly. It shaves nearly 1/3 of the relpages off of the index size.
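For reference, the periodic maintenance described above amounts to something like this (the index and table names are hypothetical):

```sql
-- Rebuild one index at a time; REINDEX takes an exclusive lock on the
-- table, so schedule it during a quiet window.
REINDEX INDEX bigtable_pkey;
REINDEX INDEX bigtable_lookup_idx;

-- Compare index size (relpages) before and after the rebuild:
SELECT relname, relpages
  FROM pg_class
 WHERE relname IN ('bigtable_pkey', 'bigtable_lookup_idx');
```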
Re: [PERFORM] Query plan for "heavy" SELECT with "lite" sub-SELECTs
Dave Dutcher wrote:
> > -Original Message-
> > From: [EMAIL PROTECTED] Nikolay Samokhvalov
> >
> > What should I do to make Postgres work properly in such cases (I have
> > a lot of similar queries; surely, they are executed w/o seqscans, but
> > the overall picture is the same - I see that starting from sub-selects
> > dramatically decreases performance)?
>
> How about this:
>
> explain analyze
> select (select typname from pg_type where pg_type.oid=mainq.prorettype limit 1)
> from (select * from pg_proc offset 1500 limit 1) mainq;

What's the use of such a query? One would think that in the real world, you'd at least have an ORDER BY somewhere in the subqueries. Performance analysis of strange queries is useful, but the input queries have to be meaningful as well. Otherwise you end up optimizing bizarre and useless cases.

--
Alvaro Herrera    http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [PERFORM] Setting "nice" values
[Madison Kelly - Thu at 10:25:07AM -0500]
> Will the priority of the script pass down to the pgsql queries it calls?
> I figured (likely incorrectly) that because the queries were executed by
> the psql server the queries ran with the server's priority.

I think you are right, and in any case, I don't think the niceness value will help much if the bottleneck is iowait.

In our application, I've made a special function for doing low-priority transactions which I believe is quite smart - though maybe not always. Before introducing this logic, we observed we had a tipping point: with too many queries, the database wouldn't swallow them fast enough, and the database server just jammed up, trying to work on too many queries at once and yielding results far too slowly.

In the config file, I now have these two flags set:

stats_start_collector = on
stats_command_string = on

This will unfortunately cause some CPU load, but the benefit is great - one can actually check what the server is working on at any time:

select * from pg_stat_activity

With those flags, it is possible to check the view pg_stat_activity - it will contain all the queries the database is working on right now. My idea is to peek into this view: if there are no active queries, the database is idle, and it's safe to start our low-priority transaction. If this view is full of stuff, one should certainly not run any low-priority transactions, but rather sleep a bit and try again later.

select count(*) from pg_stat_activity where not current_query like '%' and query_start+?
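The query at the end of that message is truncated; a sketch of what such an idle test might look like is below. The `'<IDLE>'` filter and the age cutoff are assumptions about what the original query did, and `stats_command_string = on` is required for current_query to be populated at all:

```sql
-- Count backends that are actually running a query: skip idle
-- connections, and ignore queries that started only a moment ago.
SELECT count(*)
  FROM pg_stat_activity
 WHERE current_query NOT LIKE '<IDLE>%'
   AND query_start < now() - interval '100 milliseconds';
-- If the count is 0, the server is idle enough to start the
-- low-priority transaction; otherwise sleep briefly and retry.
```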
Re: [PERFORM] Database-wide vacuum can take a long time, during which tables are not being analyzed
Steven Flatt wrote:
> Sorry, I think there's a misunderstanding here. Our system is not doing
> near that number of transactions per second. I meant that the duration of
> a single DB-wide vacuum takes on the order of a couple of weeks. The time
> between DB-wide vacuums is a little over a year, I believe.

I wonder if this is using some vacuum delay setting? If that's the case, I think you could manually run a database-wide vacuum with a zero vacuum delay setting, so that said vacuum takes less time to finish (say, once every 8 months). (8.2 pretty much solves this issue BTW, by not requiring database-wide vacuums.)

--
Alvaro Herrera    http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
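Concretely, Alvaro's suggestion amounts to something like the following, run from a dedicated session or cron job:

```sql
-- Disable cost-based vacuum delay for this session only, so the manual
-- database-wide VACUUM runs at full speed instead of throttling itself.
SET vacuum_cost_delay = 0;

-- With no table name, VACUUM processes every table in the database.
VACUUM VERBOSE;
```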
Re: [PERFORM] Setting "nice" values
On Thu, 2006-11-02 at 09:25, Madison Kelly wrote:
> Scott Marlowe wrote:
> > On Thu, 2006-11-02 at 09:14, Madison Kelly wrote:
> >> Hi all,
> >>
> >> I've got a script (perl, in case it matters) that I need to run once
> >> a month to prepare statements. This script queries and updates the
> >> database a *lot*. I am not concerned with the performance of the SQL
> >> calls so much as I am about the impact it has on the server's load.
> >>
> >> Is there a way to limit query speed (ie: set a low 'nice' value on
> >> a query)? This might be an odd question, or I could be asking the
> >> question the wrong way, but hopefully you get the idea. :)
> >
> > While you can safely set the priority lower on the calling perl script,
> > setting db backend priorities lower can result in problems caused by
> > "priority inversion". Look up that phrase on the pgsql admin, perform,
> > general, or hackers lists for an explanation, or go here:
> >
> > http://en.wikipedia.org/wiki/Priority_inversion
> >
> > I have a simple script that grabs raw data from an oracle db and shoves
> > it into a postgresql database for reporting purposes. Every 100 rows I
> > put into postgresql, I usleep 10 or so, and the load caused by that
> > script on both systems is minimal. You might try something like that.
>
> Will the priority of the script pass down to the pgsql queries it calls?
> I figured (likely incorrectly) that because the queries were executed by
> the psql server the queries ran with the server's priority. If this
> isn't the case, then perfect. :)

Nope, the priorities don't pass down. You connect via a client lib to the server, which spawns a backend process that does the work for you. The backend process inherits its priority from the postmaster that spawns it, and they all run at the same priority.

> Thanks for the tip, too, it's something I will try.

Sometimes it's the simple solutions that work best. :) Welcome to the world of pgsql, btw...
Re: [PERFORM] Setting "nice" values
Scott Marlowe wrote:
> On Thu, 2006-11-02 at 09:14, Madison Kelly wrote:
> > Hi all,
> >
> > I've got a script (perl, in case it matters) that I need to run once
> > a month to prepare statements. This script queries and updates the
> > database a *lot*. I am not concerned with the performance of the SQL
> > calls so much as I am about the impact it has on the server's load.
> >
> > Is there a way to limit query speed (ie: set a low 'nice' value on
> > a query)? This might be an odd question, or I could be asking the
> > question the wrong way, but hopefully you get the idea. :)
>
> While you can safely set the priority lower on the calling perl script,
> setting db backend priorities lower can result in problems caused by
> "priority inversion". Look up that phrase on the pgsql admin, perform,
> general, or hackers lists for an explanation, or go here:
>
> http://en.wikipedia.org/wiki/Priority_inversion
>
> I have a simple script that grabs raw data from an oracle db and shoves
> it into a postgresql database for reporting purposes. Every 100 rows I
> put into postgresql, I usleep 10 or so, and the load caused by that
> script on both systems is minimal. You might try something like that.

Will the priority of the script pass down to the pgsql queries it calls? I figured (likely incorrectly) that because the queries were executed by the psql server the queries ran with the server's priority. If this isn't the case, then perfect. :)

Thanks for the tip, too, it's something I will try.

Madi
Re: [PERFORM] Setting "nice" values
On Thu, 2006-11-02 at 09:14, Madison Kelly wrote:
> Hi all,
>
> I've got a script (perl, in case it matters) that I need to run once
> a month to prepare statements. This script queries and updates the
> database a *lot*. I am not concerned with the performance of the SQL
> calls so much as I am about the impact it has on the server's load.
>
> Is there a way to limit query speed (ie: set a low 'nice' value on
> a query)? This might be an odd question, or I could be asking the
> question the wrong way, but hopefully you get the idea. :)

While you can safely set the priority lower on the calling perl script, setting db backend priorities lower can result in problems caused by "priority inversion". Look up that phrase on the pgsql admin, perform, general, or hackers lists for an explanation, or go here:

http://en.wikipedia.org/wiki/Priority_inversion

I have a simple script that grabs raw data from an oracle db and shoves it into a postgresql database for reporting purposes. Every 100 rows I put into postgresql, I usleep 10 or so, and the load caused by that script on both systems is minimal. You might try something like that.
Re: [PERFORM] Database-wide vacuum can take a long time, during which tables are not being analyzed
Sorry, I think there's a misunderstanding here. Our system is not doing near that number of transactions per second. I meant that the duration of a single DB-wide vacuum takes on the order of a couple of weeks. The time between DB-wide vacuums is a little over a year, I believe.

> Every couple of weeks implies a transaction rate of ~~500tps, so I'd be
> interested to hear more about your system.
>
> --
> Simon Riggs
> EnterpriseDB http://www.enterprisedb.com
[PERFORM] Setting "nice" values
Hi all,

I've got a script (perl, in case it matters) that I need to run once a month to prepare statements. This script queries and updates the database a *lot*. I am not concerned with the performance of the SQL calls so much as I am about the impact it has on the server's load.

Is there a way to limit query speed (ie: set a low 'nice' value on a query)? This might be an odd question, or I could be asking the question the wrong way, but hopefully you get the idea. :)

Thanks!

Madi
Re: [PERFORM] Query plan for "heavy" SELECT with "lite" sub-SELECTs
> -Original Message-
> From: [EMAIL PROTECTED] Nikolay Samokhvalov
>
> What should I do to make Postgres work properly in such cases (I have
> a lot of similar queries; surely, they are executed w/o seqscans, but
> the overall picture is the same - I see that starting from sub-selects
> dramatically decreases performance)?

How about this:

explain analyze
select (select typname from pg_type where pg_type.oid=mainq.prorettype limit 1)
from (select * from pg_proc offset 1500 limit 1) mainq;

QUERY PLAN
-----------------------------------------------------------------------
Subquery Scan mainq (cost=50.99..56.85 rows=1 width=4) (actual time=13.646..13.659 rows=1 loops=1)
  -> Limit (cost=50.99..51.02 rows=1 width=310) (actual time=13.575..13.579 rows=1 loops=1)
    -> Seq Scan on pg_proc (cost=0.00..62.34 rows=1834 width=310) (actual time=0.014..7.297 rows=1501 loops=1)
  SubPlan
    -> Limit (cost=0.00..5.82 rows=1 width=64) (actual time=0.038..0.043 rows=1 loops=1)
      -> Index Scan using pg_type_oid_index on pg_type (cost=0.00..5.82 rows=1 width=64) (actual time=0.028..0.028 rows=1 loops=1)
        Index Cond: (oid = $0)
Total runtime: 13.785 ms

I would expect you to get closer to 2 ms on that query. My machine takes 13 ms to do just the seq scan of pg_proc.

Dave
[PERFORM] Locking vs. Exceptions
Hi,

The documentation says that function blocks with exceptions are far costlier than without one. So if I need to implement an INSTEAD OF trigger (after checking for unique constraint violations), which way should I go?

1. Get a table lock
2. Use 'Select ... For Update' (which could be used to lock only the desired recordsets)
3. Use Exceptions

Any advice / experiences or even pointers would be helpful.

Thanks
Robins Tharakan
Re: [PERFORM] Query plan for "heavy" SELECT with "lite" sub-SELECTs
Nikolay Samokhvalov wrote:
> 2. explain analyze select *, (select typname from pg_type where
> pg_type.oid=pg_proc.prorettype limit 1) from pg_proc offset 1500 limit 1;
> "Limit (cost=8983.31..8989.30 rows=1 width=365) (actual time=17.648..17.649 rows=1 loops=1)"
> "  -> Seq Scan on pg_proc (cost=0.00..13486.95 rows=2252 width=365) (actual time=0.100..16.851 rows=1501 loops=1)"
> "  SubPlan"
> "    -> Limit (cost=0.00..5.91 rows=1 width=64) (actual time=0.006..0.007 rows=1 loops=1501)"
> "      -> Index Scan using pg_type_oid_index on pg_type (cost=0.00..5.91 rows=1 width=64) (actual time=0.004..0.004 rows=1 loops=1501)"
> "        Index Cond: (oid = $0)"
> "Total runtime: 17.784 ms"
>
> We see that in the 2nd example Postgres starts with "Index Scan using
> pg_type_oid_index" (1501 iterations!).

No, what you see here is that the inner loop is the index scan over pg_type_oid_index. It's running a sequential scan on pg_proc and then runs 1501 index scans against pg_type.

> My understanding of SQL tells me that the simplest (and, in this case -
> and probably in *most* cases - the fastest) way to perform such queries
> is to start from the main SELECT and then, when we already have rows
> from the "main" table, perform the "lite" sub-selects. So, I expected
> something near 2.156 ms + 0.079 ms, but got 17.784 ms... For large
> tables this is killing behaviour.

You've forgotten about the cost of matching up the two sets of rows. Now, if the first part of the query outputs only one row then you might be right, but I'm not sure that the SQL standard allows the subquery to be delayed to that stage without explicitly organising the query that way. From memory, the OFFSET/LIMIT takes place at the very end of the query processing.

> What should I do to make Postgres work properly in such cases (I have a
> lot of similar queries; surely, they are executed w/o seqscans, but the
> overall picture is the same - I see that starting from sub-selects
> dramatically decreases performance)?

Do you have a real example? That might be more practical.
--
Richard Huxton
Archonet Ltd
[PERFORM] Query plan for "heavy" SELECT with "lite" sub-SELECTs
Hello,

I do not understand why Postgres very often starts execution from the sub-select instead of doing the main select and then proceeding to the "lite" sub-selects. For example (the example is quite weird, but it demonstrates the problem):

1. explain analyze select * from pg_proc offset 1500 limit 1;
"Limit (cost=116.91..116.99 rows=1 width=365) (actual time=2.111..2.112 rows=1 loops=1)"
"  -> Seq Scan on pg_proc (cost=0.00..175.52 rows=2252 width=365) (actual time=0.034..1.490 rows=1501 loops=1)"
"Total runtime: 2.156 ms"

3. explain analyze select oid,* from pg_type where oid=2277 limit 1;
"Limit (cost=0.00..5.91 rows=1 width=816) (actual time=0.021..0.022 rows=1 loops=1)"
"  -> Index Scan using pg_type_oid_index on pg_type (cost=0.00..5.91 rows=1 width=816) (actual time=0.018..0.018 rows=1 loops=1)"
"    Index Cond: (oid = 2277::oid)"
"Total runtime: 0.079 ms"

2. explain analyze select *, (select typname from pg_type where pg_type.oid=pg_proc.prorettype limit 1) from pg_proc offset 1500 limit 1;
"Limit (cost=8983.31..8989.30 rows=1 width=365) (actual time=17.648..17.649 rows=1 loops=1)"
"  -> Seq Scan on pg_proc (cost=0.00..13486.95 rows=2252 width=365) (actual time=0.100..16.851 rows=1501 loops=1)"
"  SubPlan"
"    -> Limit (cost=0.00..5.91 rows=1 width=64) (actual time=0.006..0.007 rows=1 loops=1501)"
"      -> Index Scan using pg_type_oid_index on pg_type (cost=0.00..5.91 rows=1 width=64) (actual time=0.004..0.004 rows=1 loops=1501)"
"        Index Cond: (oid = $0)"
"Total runtime: 17.784 ms"

We see that in the 2nd example Postgres starts with "Index Scan using pg_type_oid_index" (1501 iterations!). My understanding of SQL tells me that the simplest (and, in this case - and probably in *most* cases - the fastest) way to perform such queries is to start from the main SELECT and then, when we already have rows from the "main" table, perform the "lite" sub-selects. So, I expected something near 2.156 ms + 0.079 ms, but got 17.784 ms... For large tables this is killing behaviour.
What should I do to make Postgres work properly in such cases? (I have a lot of similar queries; surely, they are executed w/o seqscans, but the overall picture is the same - I see that starting from sub-selects dramatically decreases performance.)

--
Best regards,
Nikolay
Re: [PERFORM] Help w/speeding up range queries?
On Tue, 2006-10-31 at 18:18 -0500, John Major wrote:
> # I am a biologist, and work with large datasets (tables with millions of
> # rows are common).
> # These datasets often can be simplified as features with a name, and a
> # start and end position (ie: a range along a number line. GeneX is on
> # some chromosome from position 10->40)

Do you know about www.biopostgres.org? I believe they provide some additional indexing mechanisms for just this type of data.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
Re: [PERFORM] Help w/speeding up range queries?
> Ie: select FeatureID from SIMPLE_TABLE where FeatureChromosomeName like
> 'chrX' and StartPosition > 1000500 and EndPosition < 200;

How about this (it assumes that StartPosition <= EndPosition):

select FeatureID from SIMPLE_TABLE
where FeatureChromosomeName like 'chrX'
and StartPosition > 1000500 and StartPosition < 200
and EndPosition > 1000500 and EndPosition < 200;

This at least should help the planner with estimating the number of rows. Also think twice when you assume that a query with ILIKE will use an index. Read about varchar_pattern_ops. Make an index on (FeatureChromosomeName, StartPosition), and all should be fine.

Greetings
Marcin
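Marcin's index suggestion, spelled out (table and column names taken from the thread; the second, pattern-ops index is only needed if the queries really do prefix matching with LIKE rather than plain equality):

```sql
-- Composite index: equality on chromosome name, range scan on position.
CREATE INDEX simple_table_chrom_start_idx
    ON SIMPLE_TABLE (FeatureChromosomeName, StartPosition);

-- In a non-C locale, LIKE 'chrX%' can only use an index built with the
-- pattern operator class:
CREATE INDEX simple_table_chrom_pattern_idx
    ON SIMPLE_TABLE (FeatureChromosomeName varchar_pattern_ops);
```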
Re: [PERFORM] Database-wide vacuum can take a long time, during which tables are not being analyzed
On Wed, 2006-11-01 at 14:15 -0500, Steven Flatt wrote:
> Here is a potential problem with the auto-vacuum daemon, and I'm
> wondering if anyone has considered this. To avoid transaction ID
> wraparound, the auto-vacuum daemon will periodically determine that it
> needs to do a DB-wide vacuum, which takes a long time. On our system,
> it is on the order of a couple of weeks. (The system is very busy and
> there is a lot of I/O going on pretty much 24/7.) During this period
> of time, there is nothing to automatically analyze any of the tables,
> leading to further performance problems. What are your thoughts on
> having the DB-wide vacuum running on a separate thread so that the
> daemon can concurrently wake up and take care of analyzing tables?

Yes, do it. Every couple of weeks implies a transaction rate of ~~500tps, so I'd be interested to hear more about your system.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com