[PERFORM] Table partitioning
Sorry for not responding directly to your question and for changing the subject ... ;-) On 4 March 2011 18:18, Landreville landrevi...@deadtreepages.com wrote: That is partitioned into about 3000 tables by the switchport_id (FK to a lookup table), each table has about 30 000 rows currently (a row is inserted every 5 minutes into each table). Does such partitioning really make sense? My impression is that the biggest benefit with table partitioning is to keep old inactive data out of the caches. If so, then it doesn't seem to make much sense to split a table into 3000 active partitions ... unless, maybe, almost all queries goes towards a specific partitioning. According to http://www.postgresql.org/docs/current/interactive/ddl-partitioning.html ... Query performance can be improved dramatically in certain situations, particularly when most of the heavily accessed rows of the table are in a single partition or a small number of partitions. The partitioning substitutes for leading columns of indexes, reducing index size and making it more likely that the heavily-used parts of the indexes fit in memory. All constraints on all partitions of the master table are examined during constraint exclusion, so large numbers of partitions are likely to increase query planning time considerably. Partitioning using these techniques will work well with up to perhaps a hundred partitions; don't try to use many thousands of partitions. We have started an archiving project internally in our company since our database is outgrowing the available memory, I'm advocating that we should look into table partitioning before we do the archiving, though it seems to be out of the scope of the project group looking into the archiving. I'm not sure if I should continue nagging about it or forget about it ;-) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Table partitioning
On 5 March 2011 12:59, Mark Thornton mthorn...@optrak.co.uk wrote: If your partitions a loosely time based and you don't want to discard old data, then surely the number of partitions will grow without limit. True, but is it relevant? With monthly table partitioning it takes hundreds of years before having thousands of partitions. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Why we don't want hints Was: Slow count(*) again...
2011/2/11 Віталій Тимчишин tiv...@gmail.com: If the list is hard-coded, you can create partial index on account_transaction(account_id, created desc) where trans_type_id in ( ... long, hard-coded list ...) My idea as well, though it looks ugly and it would be a maintenance head-ache (upgrading the index as new transaction types are added would mean costly write locks on the table, and we can't rely on manual processes to get it right ... we might need to set up scripts to either upgrade the index or alert us if the index needs upgrading). -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Why we don't want hints Was: Slow count(*) again...
2011/2/11 Vitalii Tymchyshyn tiv...@gmail.com: My idea as well, though it looks ugly and it would be a maintenance head-ache (upgrading the index as new transaction types are added would mean costly write locks on the table, Create new one concurrently. Concurrently? Are there any ways to add large indexes without blocking inserts to the table for the time it takes to create the index? Yep. Another option could be to add query rewrite as select * from ( select * from account_transaction where trans_type_id =type1 and account_id=? order by created desc limit 25 union all select * from account_transaction where trans_type_id =type2 and account_id=? order by created desc limit 25 union all ... union all select * from account_transaction where trans_type_id =typeN and account_id=? order by created desc limit 25 ) a order by created desc limit 25 I actually considered that. For the test case given it works very fast. Not sure if it would work universally ... it scales well when having extreme amounts of transactions outside the given transaction list (the case we have problems with now), but it wouldn't scale if some user has an extreme amount of transactions within the list. However, I think our extreme amount of transactions-problem is mostly limited to the transaction types outside the list. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Why we don't want hints Was: Slow count(*) again...
On 4 February 2011 04:46, Josh Berkus j...@agliodbs.com wrote: Optimizer hints are used to work around problems in the optimizer and introduce upgrade and maintenance issues. We would rather have the problems reported and fixed. We have discussed a more sophisticated system of per-class cost adjustment instead, but a specification remains to be developed. I have no clue about how hints works in Oracle ... I've never been working enterprise level on anything else than Postgres. Anyway, today I just came over an interesting problem in our production database today - and I think it would be a benefit to be able to explicitly tell the planner what index to use (the dev team is adding redundant attributes and more indexes to solve the problem - which worries me, because we will run into serious problems as soon as there won't be enough memory for all the frequently-used indexes). We have users and transactions, and we have transaction types. The transaction table is huge. The users are able to interactively check their transaction listings online, and they have some simple filter options available as well. Slightly simplified, the queries done looks like this: select * from account_transaction where account_id=? order by created desc limit 25; select * from account_transaction where trans_type_id in ( ... long, hard-coded list ...) and account_id=? order by created desc limit 25; and we have indexes on: account_transaction(account_id, created) account_transaction(account_id, trans_type_id, created) (At this point, someone would probably suggest to make three single-key indexes and use bitmap index scan ... well, pulling 25 rows from the end of an index may be orders of magnitude faster than doing bitmap index mapping on huge indexes) For the second query, the planner would chose the first index - and maybe it makes sense - most of our customers have between 10-30% of the transactions from the long list of transaction types, slim indexes are good and by average the slimmer index would probably do the job a bit faster. The problem is with the corner cases - for some of our extreme customers thousands of transaction index tuples may need to be scanned before 25 rows with the correct transaction type is pulled out, and if the index happens to be on disk, it may take tens of seconds to pull out the answer. Tens of seconds of waiting leads to frustration, it is a lot nowadays in an interactive session. Also, I haven't really checked it up, but it may very well be that this is exactly the kind of customers we want to retain. To summarize, there are two things the planner doesn't know - it doesn't know that there exists such corner cases where the real cost is far larger than the estimated cost, and it doesn't know that it's more important to keep the worst-case cost on a reasonable level than to minimize the average cost. In the ideal world postgres would have sufficiently good statistics to know that for user #7 it is better to chose the second index, but I suppose it would be easier if I was able to explicitly hide the account_transaction(account_id, created) index for this query. Well, I know of one way to do it ... but I suppose it's not a good idea to put drop index foo; select ...; rollback; into production ;-) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] table partitioning and select max(id)
[Greg Smith] Here's the comment from that describing the main technique used to fix it: This module tries to replace MIN/MAX aggregate functions by subqueries of the form (SELECT col FROM tab WHERE ... ORDER BY col ASC/DESC LIMIT 1) Huh ... that sounds a bit like pg 8.0 to me ;-) I remember on 7.x one had to write select id from table order by id desc limit 1 to force through a quick index scan. This was fixed in 8.0 IIRC. I did test select id from table order by id desc limit 1 on my parent table yesterday, it would still do the seq-scan. Even adding a where-restriction to make sure only one partition was queried I still got the seq-scan. Unfortunately that change ends a series of 6 commits of optimizer refactoring in this area, so it's not the case that you just apply this one commit as a bug-fix to a 9.0 system. I have a project in process to do the full backport needed I might be able to share with you if that works out, and you're willing to run with a customer patched server process. In this particular case, wait for 9.1 seems to be the best option :-) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] table partitioning and select max(id)
I implemented table partitioning, and it caused havoc with a select max(id) on the parent table - the query plan has changed from a lightningly fast backwards index scan to a deadly seq scan. Both partitions are set up with primary key index and draws new IDs from the same sequence ... select max(id) on both partitions are fast. Are there any tricks I can do to speed up this query? I can't add the ID to the table constraints, we may still get in old data causing rows with fresh IDs to get into the old table. (I decided to keep this short rather than include lots of details - but at least worth mentioning that we're using PG9) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Full Text index is not using during OR operation
Just a general note re the subject, I've also had troubles with postgres being unable to optimize a query with OR. The work-around, although a bit messy, was to use a UNION-query instead. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Difference between explain analyze and real execution time
[Tom Lane] EXPLAIN ANALYZE doesn't account for all of the runtime involved. In this case, I'd bet that session startup/shutdown is a big part of the difference. The session startup/shutdown should be the same for the real SQL and the broken SQL, shouldn't it? [Artur Zając] time psql -c 'explain analyze SELECT te.idt FROM t_positions AS te JOIN t_st AS stm ON (te.idt=stm.idt AND 4=stm.idm) WHERE te.idtr IN (347186)' Is this weidness only observed for this query? What happens with other queries? explain analyze select 1? explain analyze select * from t_positions where idtr=347816? plain select without explain analyze? etc? -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Memory usage - indexes
I just got this crazy, stupid or maybe genius idea :-) One thing that I've learned in this thread is that fat indexes (i.e. some index on some_table(a,b,c,d,e,f)) is to be avoided as much as possible. One of our biggest indexes looks like this: acc_trans(customer_id, trans_type, created) For the very most of the time an index like this would do: acc_trans(customer_id, trans_type, created) But then there are those few troublesome customers that have tens of thousands of transactions, they interactively inspect transaction listings through the web, sometimes the query give me my 20 most recent transactions of trans_type 6 gets stuck, maybe the customer has no transactions of trans type 6 and all the transactions needs to be scanned through. Since this is done interactively and through our front-end web page, we want all queries to be lightning fast. Now, my idea is to drop that fat index and replace it with conditional indexes for a dozen of heavy users - like those: acc_trans(trans_type, created) where customer_id=224885; acc_trans(trans_type, created) where customer_id=643112; acc_trans(trans_type, created) where customer_id=15; or maybe like this: acc_trans(customer_id, trans_type, created) where customer_id in ( ... ); Any comments? My sysadmin is worried that it would be a too big hit on performance when doing inserts. It may also cause more overhead when planning the queries. Is that significant? Is this idea genius or stupid or just somewhere in between? -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Memory usage - indexes
On 29 September 2010 10:03, Mark Kirkwood mark.kirkw...@catalyst.net.nz Yeah, I think the idea of trying to have a few smaller indexes for the 'hot' customers is a good idea. However I am wondering if just using single column indexes and seeing if the bitmap scan/merge of smaller indexes is actually more efficient is worth testing - i.e: acc_trans(trans_type); acc_trans(created); acc_trans(customer_id); My gut feeling tells me that it's not a good idea - consider that we want to pull out 20 rows from a 60M table. If I'm not mistaken, with bitmapping it's needed to do operations on the whole indexes - 60M bits is still 7.5 megabytes. Well, I suppose that nowadays it's relatively fast to bitmap 7.5 Mb of memory, but probably some orders of magnitude more than the few milliseconds it takes to pick out the 20 rows directly from the specialized index. Well, why rely on gut feelings - when things can be measured. I didn't take those figures from the production database server though, but at least it gives a hint on what to expect. First, using the three-key index for select * from acc_trans where customer_id=? and trans_type=? order by created desc limit 20. I chose one of the users with most transactions, and I tested with the most popular transaction type as well as one transaction type where he has just a handful of transactions. Both took significantly less than 1 ms to run. Then I deleted all indexes and created the three suggested indexes. Using the popular transaction type, it took 123 ms. Well, that's 500 times as much time, but still acceptable. Here is the query plan: = explain analyze select * from acc_trans where customer_id=67368 and trans_type=8 order by created desc limit 20; QUERY PLAN - Limit (cost=1486.23..1486.28 rows=20 width=200) (actual time=123.685..123.687 rows=3 loops=1) - Sort (cost=1486.23..1486.34 rows=43 width=200) (actual time=123.684..123.685 rows=3 loops=1) Sort Key: created Sort Method: quicksort Memory: 25kB - Bitmap Heap Scan on acc_trans (cost=1313.90..1485.08 rows=43 width=200) (actual time=121.350..123.669 rows=3 loops=1) Recheck Cond: ((trans_type = 8) AND (customer_id = 67368)) - BitmapAnd (cost=1313.90..1313.90 rows=43 width=0) (actual time=120.342..120.342 rows=0 loops=1) - Bitmap Index Scan on account_transaction_on_type (cost=0.00..256.31 rows=13614 width=0) (actual time=12.200..12.200 rows=43209 loops=1) Index Cond: (trans_type = 8) - Bitmap Index Scan on account_transaction_on_user (cost=0.00..1057.31 rows=56947 width=0) (actual time=104.578..104.578 rows=59133 loops=1) Index Cond: (users_id = 67368) Total runtime: 123.752 ms (12 rows) With the most popular trans type it chose another plan and it took more than 3s (totally unacceptable): = explain analyze select * from acc_trans where customer_id=67368 and trans_type=6 order by created desc limit 20; QUERY PLAN --- Limit (cost=0.00..44537.82 rows=20 width=200) (actual time=1746.288..3204.029 rows=20 loops=1) - Index Scan Backward using account_transaction_on_created on acc_trans (cost=0.00..55402817.90 rows=24879 width=200) (actual time=1746.285..3204.021 rows=20 loops=1) Filter: ((customer_id = 67368) AND (trans_type = 6)) Total runtime: 3204.079 ms (4 rows) Although this customer has several tens of thousands of transactions, dropping the three-key-index and use an index on users_id,created is clearly a better option than running out of memory: = explain analyze select * from acc_trans where customer_id=67368 and trans_type=8 order by created desc limit 20; QUERY PLAN --- Limit (cost=0.00..98524.88 rows=20 width=200) (actual time=0.669..197.012 rows=3 loops=1) - Index Scan Backward using account_transaction_by_user_ts on acc_trans (cost=0.00..211828.49 rows=43 width=200) (actual time=0.668..197.006 rows=3 loops=1) Index Cond: (customer_id = 67368) Filter: (trans_type = 8) Total runtime: 197.066 ms (5 rows) 0.2s sounds acceptable, it's just that this may be just a small part of building the web page, so it adds up ... and probably (I didn't check how profitable this customer is) this is probably exactly the kind of customer we wouldn't want to get annoyed with several seconds page load time. -- Sent via pgsql-performance mailing list
Re: [PERFORM] Memory usage - indexes
On 25 September 2010 00:00, Greg Smith g...@2ndquadrant.com wrote: Overindexed tables containing more columns than are actually selective is a very popular source of PostgreSQL slowdowns. It's easy to say oh, I look this data up using columns a,b,c, so lets put an index on a,b,c. But if an index on a alone is 1% selective, that's probably wrong; just index it instead, so that you have one lean, easy to maintain index there that's more likely to be in RAM at all times. Let the CPU chew on filtering out which of those 1% matches also match the (b,c) criteria instead. Hm ... yes, we have quite many of those indexes. Some of them we can't live without. Digging out 1% out of a fat 100M table (1M rows) when one really just needs 20 rows is just too costly. Well, I guess we should try to have a serious walk-through to see what indexes really are needed. After all, that really seems to be our main problem nowadays - some frequently used indexes doesn't fit very snuggly into memory. Every drop an index in a transaction block just to see how a query plan changes if it's not there anymore, then rollback so it never really went away? Great fun for this sort of experiment, try it sometime. Yes, I was playing a bit with it long time ago ... but it seems a bit risky to do this in the production environment ... wouldn't want inserts to get stuck due to locks. There is also the problem that we don't really have an overview of which queries would be affected if dropping an index. Best thing we can do is to drop an index and monitor the stats on seq scans, new slow queries popping up, etc. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Memory usage - indexes
On 24 September 2010 18:23, Bob Lunney bob_lun...@yahoo.com wrote: Consult pg_statio_user_indexes to see which indexes have been used and how much. What is the main differences between pg_statio_user_indexes and pg_stat_user_indexes? Indexes with comparitively low usages rates aren't helping you much and are candidates for elimination. No doubt about that - but the question was, would it really help us to drop those indexes? I think the valid reasons for dropping indexes would be: 1) To speed up inserts, updates and deletes 2) To spend less disk space 3) Eventually, speed up nightly vacuum (it wouldn't be an issue with autovacuum though) 4) To spend less memory resources? I'm not at all concerned about 1 and 2 above - we don't have any performance issues on the write part, and we have plenty of disk capacity. We are still doing the nightly vacuum thing, and it does hurt us a bit since it's dragging ever more out in time. Anyway, it's number four I'm wondering most about - is it anything to be concerned about or not for the least frequently used indexes? An index that aren't being used would just stay on disk anyway, right? And if there are limited memory resources, the indexes that are most frequently used would fill up the cache space anyway? That's my thoughts at least - are they way off? We did have similar experiences some years ago - everything was running very fine all until one day when some semi-complicated very-frequently-run selects started taking several seconds to run rather than tens of milliseconds. I found that we had two slightly overlapping indexes like this ... account_transaction(customer_id, trans_type) account_transaction(customer_id, trans_type, created) both of those indexes where heavily used. I simply dropped the first one, and the problems disappeared. I assume that both indexes up to some point fitted snuggly into memory, but one day they were competing for the limited memory space, dropping the redundant index solved the problem all until the next hardware upgrade. I would never have found those indexes searching for the least used indexes in the pg_stat(io)_user_indexes view. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Memory usage - indexes
On 24 September 2010 00:12, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote: All good questions! Before (or maybe as well as) looking at index sizes vs memory I'd check to see if any of your commonly run queries have suddenly started to use different plans due to data growth, e.g: - index scan to seq scan (perhaps because effective_cache_size is too small now) - hash agg to sort (work_mem too small now) Would be trivial if we had a handful of different queries and knew the plans by heart ... but our setup is slightly more complex than that. I would have to log the plans, wouldn't I? How would you go about it? I was having some thoughts to make up some script to scan through the postgres log, extract some stats on the queries run, and even do some explains and store query plans. We've started to chase down on seq scans (causing us to create even more indexes and eating up more memory...). I have set up a simple system for archiving stats from pg_stat_user_tables now, like this: insert into tmp_pg_stat_user_tables select *,now() as snapshot from pg_stat_user_tables ; NBET= \d tmp_delta_pg_stat_user_tables View public.tmp_delta_pg_stat_user_tables Column | Type | Modifiers --+--+--- duration | interval | relname | name | seq_scan | bigint | seq_tup_read | bigint | idx_scan | bigint | idx_tup_fetch| bigint | n_tup_ins| bigint | n_tup_upd| bigint | n_tup_del| bigint | n_tup_hot_upd| bigint | n_live_tup | bigint | n_dead_tup | bigint | last_vacuum | timestamp with time zone | last_autovacuum | timestamp with time zone | last_analyze | timestamp with time zone | last_autoanalyze | timestamp with time zone | View definition: SELECT now() - b.snapshot AS duration, a.relname, a.seq_scan - b.seq_scan AS seq_scan, a.seq_tup_read - b.seq_tup_read AS seq_tup_read, a.idx_scan - b.idx_scan AS idx_scan, a.idx_tup_fetch - b.idx_tup_fetch AS idx_tup_fetch, a.n_tup_ins - b.n_tup_ins AS n_tup_ins, a.n_tup_upd - b.n_tup_upd AS n_tup_upd, a.n_tup_del - b.n_tup_del AS n_tup_del, a.n_tup_hot_upd - b.n_tup_hot_upd AS n_tup_hot_upd, a.n_live_tup, a.n_dead_tup, a.last_vacuum, a.last_autovacuum, a.last_analyze, a.last_autoanalyze FROM pg_stat_user_tables a, tmp_pg_stat_user_tables b WHERE b.snapshot = (( SELECT max(tmp_pg_stat_user_tables.snapshot) AS max FROM tmp_pg_stat_user_tables)) AND b.relname = a.relname; -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Memory usage - indexes
On 24 September 2010 19:16, Brad Nicholson bnich...@ca.afilias.info wrote: [Brad Nicholson] Why is the vacuum dragging out over time? Is the size of your data increasing, are you doing more writes that leave dead tuples, or are your tables and/or indexes getting bloated? Digressing a bit here ... but the biggest reason is the data size increasing. We do have some bloat-problems as well - every now and then we decide to shut down the operation, use pg_dump to dump the entire database to an sql file and restore it. The benefits are dramatic, the space requirement goes down a lot, and often some of our performance-problems goes away after such an operation. Also, is there a reason why you do nightly vacuums instead of letting autovacuum handle the work? If it was to me, we would have had autovacuum turned on. We've had one bad experience when the autovacuumer decided to start vacuuming one of the biggest table at the worst possible moment - and someone figured autovacuum was a bad idea. I think we probably still would need regular vacuums to avoid that happening, but with autovacuum on, maybe we could have managed with regular vacuums only once a week or so. We started doing far less vacuuming when we let autovacuum handle things. What do you mean, that you could run regular vacuum less frequently, or that the regular vacuum would go faster? -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Memory usage - indexes
On 24 September 2010 21:06, Bob Lunney bob_lun...@yahoo.com wrote: First off, what version of PostgreSQL are you running? If you have 8.4, nightly vacuuming shouldn't be necessary with properly tuned autovacuum jobs. 8.3. We'll upgrade to 9.0 during the December holidays fwiw. But point taken, I will continue to push for autovacuum to be turned on. Anyway, I think the nightly vacuuming does have some merit. For some of the queries, most of the daytime we're quite sensitive to latency. Well, I guess the proper solution to that is to tune the autovacuum configuration so it acts less aggressively at the times of the day where we need low latency... You're correct, unused indexes will remain on disk, but indexes that don't completely fit into memory must be read from disk for each index scan, and that hurts performance. (In fact, it will suddenly drop like a rock. BTDT.) Sounds quite a lot like our problems nowadays - as well as previous time when I found that overlapping index that could be dropped. By making smaller equivalent indexes on partitioned data the indexes for individual partitions are more likely to stay in memory, which is particularly important when multiple passes are made over the index by a query. I was looking a bit into table partitioning some years ago, but didn't really find any nice way to partition our tables. One solution would probably be to partition by creation date and set up one partition for each year, but it seems like a butt ugly solution, and I believe it would only help if the select statement spans a date range on the creation time. You are correct on all the points you make concerning indexes, but point 4 is the one I'm referring to. You discovered this independently yourself, according to your anecdote about the overlapping indexes. Yes, but that was the heavily used index ... my belief is that the _unused_ index, or infrequently used index wouldn't cause such memory problems. (Then again, I suppose it would be faster to scan a non-optimal index that is in memory than an optimal index that is on disk?) Well, if both you and Greg Smith recommends to drop those indexes, I suppose we probably should do that ... ;-) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Memory usage - indexes
On 24 September 2010 00:12, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote: Re index size, you could try indexes like: some_table(a) some_table(b) which may occupy less space, and the optimizer can bitmap and/or them to work like the compound index some_table(a,b). Hm ... never considered that ... but is it cost effective on large indexes? I guess I should do some testing ... -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Memory usage - indexes
On 24 September 2010 21:24, Brad Nicholson bnich...@ca.afilias.info wrote: The pertinent difference between pg_stat_user_indexes and pg_statio_user_indexes is the latter shows the number of blocks read from disk or found in the cache. I have a minor, but very important correction involving this point. The pg_statio tables show you what blocks are found in the Postgres buffer cache, and what ones are not. Right. Then, studying how the pg_statio table develops over time would probably give a hint on my first question in my original post ... how to check the hypothesis that we're running out of memory. That said, I've sent an email to our sysadmin asking him to consider the pg_buffercache module suggested by Greg Smith. Increasing the shared_buffers on the cost of OS caches would then have one selling point ... better possibilities to monitor the memory usage. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Memory usage - indexes
Thanks for spending your time on this ... amidst all the useful feedback I've received, I'd rate your post as the most useful post. 1) Are there any good ways to verify my hypothesis? You can confim easily whether the contents of the PostgreSQL buffer cache contain when you think they do by installing pg_buffercache. My paper and sample samples at http://www.pgcon.org/2010/schedule/events/218.en.html go over that. I've asked the sysadmin to consider installing it. From what I understood from other posts, the pg_statio_user_indexes and pg_statio_user_tables would also indicate to what extent data is found in shared buffers and not. Monitoring it over time could possibly help us predicting the tipping points before they happen. Though still, if most of the cacheing takes place on the OS level, one wouldn't learn that much from studying the shared buffers usage ... You can also integrate that with a look at the OS level information by using pgfincore: http://www.pgcon.org/2010/schedule/events/261.en.html ... ah, right ... :-) I've found that if shared_buffers is set to a largish size, you can find out enough information from look at it to have a decent idea what's going on without going to that depth. But it's available if you want it. Haven't studied it in details yet, but the information value in itself may be a selling point for increasing the buffer size. I have easily measurable improvements on client systems increasing shared_buffers into the 4GB - 8GB range. Popular indexes move into there, stay there, and only get written out at checkpoint time rather than all the time. Ours is at 12 GB, out of 70 GB total RAM. However, if you write heavily enough that much of this space gets dirty fast, you may not be be able to go that high before checkpoint issues start to make such sizes impractical. I think we did have some issues at some point ... we do have some applications that are very sensitive towards latency. Though, I think the problem was eventually solved. I think I somehow managed to deliver the message that it was not a good idea to store keep-alive-messages sent every second from multiple clients into the main production database, and that it was an equally bad idea to disconnect the clients after a three seconds timeout :-) Anyway, today we have mostly issues with read access, not write access. Using minimal values works for some people, particularly on Windows, Huh ... does it mean Windows have better OS cache handling than Linux? To me it sounds insane to run a database under a buggy GUI ... but I suppose I should keep that to myself :-) Yes, it would run slower, because now it has to sort through blocks in a larger index in order to find anything. How significant that is depends on the relative size of the indexes. To give a simple example, if (a) is 1GB, while (a,b) is 2GB, you can expect dropping (a) to halve the speed of index lookups. Fatter indexes just take longer to navigate through. Linear relationship between the time it takes to do index lookups vs the fatness of the index? That's not what I guessed in the first place ... but I suppose you're right. P.S. You seem busy re-inventing pgstatspack this week: http://pgfoundry.org/projects/pgstatspack/ does all of this take a snapshot of the stats and store it in the database for future analysis work for you. Working on that intead of continuing to hack individual storage/retrieve scripts for each statistics counter set would be a better contribution to the PostgreSQL community. Sometimes it takes more work to implement work already done by others than to reimplement the logics ... but anyway, I will have a look before I make more snapshot tables ;-) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] locking issue on simple selects?
On 15 September 2010 12:05, Tobias Brox tobi...@gmail.com wrote: Recently we've frequently encountered issues where some simple selects (meaning, selects doing an index lookup and fetching one row) have become stuck for several minutes. Apparently all requests on one exact table gets stuck, all requests not related to said table are going through without any problems. Now I've set up all kind of logging regarding locks, so it seems like we're having issues that aren't lock-related. I just did a bit of research into one situation today. All while having this problem, there was one heavy query running in parallell ... not sure if that's relevant. Then comes one query that requires a seq scan on the problem table (that won't happen again - I just added a new index). Four seconds later comes another query requiring a simple index lookup. Still more queries comes in, most of them simple index lookups, but on different indexes. After one minute there are 25 queries in the pg_stat_activity view towards this table. It's not a particularly huge table. Moments later all 25 queries have been executed. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Memory usage - indexes
We've come to a tipping point with one of our database servers, it's generally quite loaded but up until recently it was handling the load well - but now we're seeing that it struggles to process all the selects fast enough. Sometimes we're observing some weird lock-like behaviour (see my other post on that), but most of the time the database server is just not capable of handling the load fast enough (causing the queries to pile up in the pg_stat_activity-view). My main hypothesis is that all the important indexes would fit snuggly into the memory before, and now they don't. We'll eventually get the server moved over to new and improved hardware, but while waiting for that to happen we need to do focus on reducing the memory footprint of the database. I have some general questions now ... 1) Are there any good ways to verify my hypothesis? Some months ago I thought of running some small memory-gobbling program on the database server just to see how much memory I could remove before we would see indications of the database being overloaded. It seems a bit radical, but I think the information learned from such an experiment would be very useful ... and we never managed to set up any testing environment that faithfully replicates production traffic. Anyway, it's sort of too late now that we're already observing performance problems even without the memory gobbling script running. 2) I've seen it discussed earlier on this list ... shared_buffers vs OS caches. Some claims that it has very little effect to adjust the size of the shared buffers. Anyway, isn't it a risk that memory is wasted because important data is stored both in the OS cache and the shared buffers? What would happen if using almost all the available memory for shared buffers? Or turn it down to a bare minimum and let the OS do almost all the cache handling? 3) We're discussing to drop some overlapping indexes ... i.e. to drop one out of two indexes looking like this: some_table(a) some_table(a,b) Would the query select * from some_table where a=? run slower if we drop the first index? Significantly? (in our situation I found that the number of distinct b's for each a is low and that the usage stats on the second index is quite low compared with the first one, so I think we'll drop the second index). 4) We're discussing to drop other indexes. Does it make sense at all as long as we're not experiencing problems with inserts/updates? I suppose that if the index isn't used it will remain on disk and won't affect the memory usage ... but what if the index is rarely used ... wouldn't it be better to do a seqscan on a table that is frequently accessed and mostly in memory than to consult an index that is stored on the disk? Sorry for all the stupid questions ;-) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] locking issue on simple selects?
On 23 September 2010 22:55, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Have you turned on checkpoint logging? Yes ... it seems so: 13:19:13.840 - LOG: checkpoint complete: wrote 3849 buffers (0.2%); 0 transaction log file(s) added, 0 removed, 5 recycled; write=269.551 s, sync=0.103 s, total=269.953 s 13:19:13.841 - LOG: checkpoint starting: xlog 13:19:33 - the seq scan query towards the affected table started 13:20:31.454 - one of the index lookup queries towards the affected table was finished 13:20:43.176 - LOG: checkpoint complete: wrote 108199 buffers (6.9%); 0 transaction log file(s) added, 0 removed, 16 recycled; write=11.521 s, sync=77.533 s, total=89.335 s You might want to see if these are happening at some particular point in the checkpoint processing. If so, look through the archives for posts from Greg Smith on how to tune that -- he's worked out a nice methodology to iteratively improve your configuration in this regard. Thank you, I will ... hmm ... I found this blog post: http://blog.2ndquadrant.com/en/2010/01/measuring-postgresql-checkpoin.html Of course I'm doing it my own way: select *,now() as snapshot into tmp_pg_stat_bgwriter from pg_stat_bgwriter ; create view tmp_delta_pg_stat_bgwriter as select a.checkpoints_timed-b.checkpoints_timed as checkpoints_timed,a.checkpoints_req-b.checkpoints_req as checkpoints_req,a.buffers_checkpoint-b.buffers_checkpoint as buffers_checkpoint,a.buffers_clean-b.buffers_clean as buffers_clean,a.maxwritten_clean-b.maxwritten_clean as maxwritten_clean,a.buffers_backend-b.buffers_backend as buffers_backend,a.buffers_alloc-b.buffers_alloc as buffers_alloc, now()-b.snapshot as interval from pg_stat_bgwriter a , (select * from tmp_pg_stat_bgwriter order by snapshot desc limit 1) as b; Checkpoint timeout is set to 5 minutes. Right now we're having relatively low activity. I'm not sure how to read the stats below, but they look OK to me: select * from tmp_delta_pg_stat_bgwriter ; checkpoints_timed | checkpoints_req | buffers_checkpoint | buffers_clean | maxwritten_clean | buffers_backend | buffers_alloc | interval ---+-++---+--+-+---+- 3 | 0 | 8277 | 15 |0 | 185 | 18691 | 00:12:02.988842 (1 row) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] locking issue on simple selects?
We have a production database server ... it's quite busy but usually working completely fine, simple queries taking a fraction of a millisecond to run. Recently we've frequently encountered issues where some simple selects (meaning, selects doing an index lookup and fetching one row) have become stuck for several minutes. Apparently all requests on one exact table gets stuck, all requests not related to said table are going through without any problems. According to the pg_stat_activity view, all queries getting stuck was read-queries (selects), no updates or anything like that (some of the transactions were doing updates and/or inserts though). The obvious thought seems to be that this is a locking issue ... but it doesn't seem so. For one thing, AFAIK locking shouldn't affect selects, only updates? I've also looked through tons of logs without finding any obvious locking issues. In one of the instances, I could find that there were some open transactions doing updates on one row in the table and then later becoming stuck (as all other transactions) when doing a select on another row in the said table. My second thought was that the database is on the edge of being overloaded and that the memory situation is also just on the edge ... important indexes that used to be in memory now has to be accessed from the disk. Still, it doesn't make sense, we're not seeing any serious impact on the CPU iowait status, and it seems improbable that it should take minutes to load an index? There aren't any long-lasting transactions going on when the jam occurs. I haven't checked much up, usually the jam seems to resolve itself pretty instantly, but I think that at some point it took half a minute from the first query was finished until the pg_stat_activity view got back to normal (meaning typically 0-5 simultaneously processed queries). FWIW, we're running pg 8.3.11, transaction isolation level serializable. The machine is quad-core hyperthreaded (16 CPUs visible to the OS), a SAN is used for storage, and different RAIDs are used for the root partition, pg data and pg wals. Any ideas? I'm aware that some configuration (i.e. checkpoint interval etc) may cause significant delay on write-queries ... but this is only read-queries. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] locking issue on simple selects?
On 15 September 2010 15:39, Tom Lane t...@sss.pgh.pa.us wrote: An exclusive lock will block selects too. Have you looked into pg_locks for ungranted lock requests? Well - I thought so, we have a logging script that logs the content of the pg_locks table, it didn't log anything interesting but it may be a problem with the script itself. It does an inner join on pg_locks.relation = pg_class.oid but when I check now this join seems to remove most of the rows in the pg_locks table. Does it make sense at all to join pg_class with pg_locks? I will ask the sysadm to change to an outer join as for now. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] locking issue on simple selects?
On 15 September 2010 21:28, Greg Smith g...@2ndquadrant.com wrote: There are some useful examples of lock views on the wiki: http://wiki.postgresql.org/wiki/Lock_Monitoring http://wiki.postgresql.org/wiki/Lock_dependency_information http://wiki.postgresql.org/wiki/Find_Locks Thanks. I think those pages probably should be merged ... hmm ... if I manage to solve my locking issues I should probably try and contribute to the wiki. Reading the wiki pages, for me it boils down to three things: 1) the current query we're logging seems good enough except that we should do an outer join except for inner join towards pg_class, so I've asked our sysadm to fix it. 2) the middle query on http://wiki.postgresql.org/wiki/Lock_Monitoring seems very useful, and I've asked our sysadm to set up logging of this one as well. 3) That log_lock_waits config option that you and Brad points to seems very useful, so I've asked our sysadm to enable it. I also discovered that there is an attribute pg_stat_activity.waiting - I suppose it is 't' if a query is waiting for a lock? It seems quite useful ... reducing deadlock_timeout. It's set to one second, and some of the jams we have been experiencing has lasted for several minutes. I also think it should say in the pg log if there is a deadlock situation? I grepped for deadlock in the logs without finding anything. Well, we'll improve the logging, and wait for the next jam to occur ... and I'll make sure to post an update if/when I figure out something. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Disable WAL completely
[Erik Jones] Right. Without the xlog directory you'll have very little chance of ever doing any kind of clean stop/start of your database. If you don't need the reliability offered by Postgres's use of transaction logs you'll probably be much better served with a different database or even a completely different storage scheme than trying to make Postgres fit that bill. We actually have some postgres databases that are read-only, others that can be rebuilt by a script or from some old backup, and yet others that can be wiped completely without ill effects ... and others where we would prefer to keep all the data, but it would be no disaster if we lose some. Maybe we would be better off not using postgres for those purposes, but it's oh so much easier for us to stick to one database system ;-) We've considered both running postgres from a ram-disk and to have the fsync turned off for some of our databases, but right now we're running all off one host, fsync didn't reduce the performance that much, and after one disasterous power failure we considered that it was not worth the effort to have fsync turned off. That being said, postgres is probably not an optimal solution for an embedded system running on flash memory ;-) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Anyone using a SAN?
[Peter Koczan - Wed at 10:56:54AM -0600] We're considering setting up a SAN where I work. Is there anyone using a SAN, for postgres or other purposes? If so I have a few questions for you. Some time ago, my boss was planning to order more hardware - including a SAN - and coincidentally, SANs were discussed at this list as well. The consensus on this list seemed to be that running postgres on SAN is not cost efficiently - one would get better performance for a lower cost if the database host is connected directly to the disks - and also, buying the wrong SAN can cause quite some problems. My boss (with good help of the local SAN-pusher) considered that the arguments against the SAN solution on this list was not really valid for an enterprise user. The SAN-pusher really insisted that through a state-of-the-art SAN theoretically it should be possible to achieve far better bandwidth as well as lower latency to the disks. Personally, I don't have the clue, but all my colleagues believes him, so I guess he is right ;-) What I'm told is that the state-of-the-art SAN allows for an insane amount of hard disks to be installed, much more than what would fit into any decent database server. We've ended up buying a SAN, the physical installation was done last week, and I will be able to tell in some months if it was a good idea after all, or not. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Anyone using a SAN?
[Arjen van der Meijden] Your SAN-pusher should have a look at the HP-submissions for TPC-C... The recent Xeon systems are all without SAN's and still able to connect hundreds of SAS-disks. Yes, I had a feeling that the various alternative solutions for direct connection hadn't been investigated fully. I was pushing for it, but hardware is not my thing. Anyway, most likely the only harm done by chosing SAN is that it's more expensive than an equivalent solution with direct connected disks. Well, not my money anyway. ;-) Obviously its a bit difficult to share those 628 harddrives amongst several systems, but the argument your colleagues have for SAN isn't a very good one. As far as I've heard, you cannot really benefit much from this with postgres, one cannot have two postgres servers on two hosts sharing the same data (i.e. using one for failover or for CPU/memory-bound read queries). Having the SAN connected to several hosts gives us two benefits, if the database host goes down but not the SAN, it will be quite fast to start up a new postgres instance on a different host - and it will also be possible to take out backups real-time from the SAN without much performance-hit. Anyway, with a warm standby server as described on http://www.postgresql.org/docs/current/interactive/warm-standby.html one can achieve pretty much the same without a SAN. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] autovacuum: recommended?
[EMAIL PROTECTED] The table was quite huge (say 20k of products along with detailed descriptions etc.) and was completely updated and about 12x each day, i.e. it qrew to about 12x the original size (and 11/12 of the rows were dead). This caused a serious slowdown of the application each day, as the database had to scan 12x more data. The tables we had problems with are transaction-type tables with millions of rows and mostly inserts to the table ... and, eventually some few attributes being updated only on the most recent entries. I tried tuning a lot, but gave it up eventually. Vacuuming those tables took a long time (even if only a very small fraction of the table was touched) and the performance of the inserts to the table was reduced to a level that could not be accepted. By now we've just upgraded the hardware, so it could be worth playing with it again, but our project manager is both paranoid and conservative and proud of it, so I would have to prove that autovacuum is good for us before I'm allowed to turn it on again ;-) ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] autovacuum: recommended?
[Gábor Farkas - Fri at 10:40:43AM +0100] my question is: is it recommended to use it? or in other words, should i only use autovacuum? or it's better to use manual-vacuuming? which one is the way of the future :) ? or should i use both auto-vacuum and manual-vacuum? Nightly vacuums are great if the activity on the database is very low night time. A combination is also good, the autovacuum will benefit from the nightly vacuum. My gut feeling says it's a good idea to leave autovacuum on, regardless of whether the nightly vacuums have been turned on or not. That being said, we have some huge tables in our database and pretty much traffic, and got quite some performance problems when the autovacuum kicked in and started vacuuming those huge tables, so we're currently running without. Autovacuum can be tuned to not touch those tables, but we've chosen to leave it off. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] SAN vs Internal Disks
[Decibel! - Tue at 06:07:44PM -0500] It's still in the same chassis, though, which means if you lose memory or mobo you're still screwed. In a SAN setup for redundancy, there's very little in the way of a single point of failure; generally only the backplane, and because there's very little that's on there it's extremely rare for one to fail. Funny, the only time we lost a database server was due to a backplane failure ... ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] SAN vs Internal Disks
We're also considering to install postgres on SAN - that is, my boss is convinced this is the right way to go. Advantages: 1. Higher I/O (at least the salesman claims so) 2. Easier to upgrade the disk capacity 3. Easy to set up warm standby functionality. (Then again, if the postgres server fails miserably, it's likely to be due to a disk crash). Also, my boss states that all big enterprises uses SAN nowadays. Disadvantages: 1. Risky? One gets the impression that there are frequent problems with data integrity when reading some of the posts in this thread. 2. Expensive 3. Single point of failure ... but that you have either it's a SAN or a local disk, one will anyway need good backup systems (and eventually warm standby-servers running from physically separated disks). 4. More complex setup? 5. If there are several hosts with write permission towards the same disk, I can imagine the risks being higher for data integrity breakages. Particularly, I can imagine that if two postgres instances is started up towards the same disk (due to some sysadmin mistake), it could be disasterous. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Long running transaction in pg_activity_log
[Sachchida Ojha - Wed at 04:40:09PM -0400] I see some long running transaction in my pg_activity_log table. My app becomes almost unusable. My question is How can I query the database to see what sql these transactions are running. IDLE in transaction means that no sql query is running at the moment. Most probably it's a problem with the application, it starts a transaction, but does not close it (through a commit or rollback). This is very harmful for the performance, as the hours, days and weeks pass by any database supporting transactions will get problems. Restarting the application and vacuuming is a one-time-shot which should solve the problem for a short while. For a permanent fix, the application needs to be fixed, or you'll have to ensure that the autocommit feature is used. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Performance Testing Utility
[Christo Du Preez - Wed at 12:25:20PM +0200] Is there some kind of performance testing utility available for postgresql Something I can run after installing postgresql to help me identify if my installation is optimal. I've been battling for days now trying to sort out performance issues and something like that may just identify issues I'm not even aware of or considering at this stage. If you are really having performance problems, my general experience is that you should look into the queries and usage patterns rather than the configuration. The server configuration can only give marginal benefits, compared to query and usage tuning. It often a good idea to turn on the stats collector, even if it slows down postgres a bit. One of the things the stats collector gives is the pg_stat_activity view, where you can find everything the server is working on exactly now; checking up this view while you are actually experiencing problems can give a lot of information. Another thing I've noticed, is that the activity from our applications often can come in bursts; the server can be like 70% idle most of the time, but when the server is struck by 4-5 IO-heavy queries at the same time in addition to the steady flow of simple queries, it can easily get trashed. I've made up an algorithm to stop this from happening, before running a transaction which is either heavy or not considered very important, the activity view will be scanned, and if the server is already working with many queries, the application will sleep a bit and try again - and eventually return an error message (please try again later) if it's doing interactive stuff. Another thing that we've experienced - be aware of pending transactions! It's important to commit or roll back every transaction within reasonable time - if (i.e. due to a programming mistake or a DBA starting a transaction in psql) a transaction is pending for several hours or even ays, it is really very bad for the performance. Another experience we have is that autovacuum can be quite naughty when one has some really huge tables. This can be tweaked by disabling autovacuum at those tables, and running a nightly vacuum instead. Apologies for not replying to your question, though ;-) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] pg_stat_* collection
[Alexander Staubo - Thu at 04:52:55PM +0200] I have been considering tarring them up as a proper release at some point. Anyone interested? Yes. Eventually I have my own collection as well: db_activity - counts the number of (all, slow, very slow, stuck idle in transaction) queries in progress; this is one of the better indicators on how busy/overloaded the database is. (I also have a separate script dumping the contents from pg_stat_activity to a log file, which I frequentlymonitor by tail -F). db_commits + db_rollbacks pr database - I'm not sure if those are useful for anything, will eventually remove them. Maybe nice to be able to compare the activity between different databases running on the same host, if they are comparable. db_connections - num of connections compared to max connections. Useful for alarms. db_hanging_transactions - age of oldest transaction. Useful for alarms, since hanging transactions can be very bad for the db performance. db_locks - monitors the number of locks. I've never actually needed this for anything, maybe I should remove it. db_num_backends - number of backends, sorted by databases. Probably not so useful. db_space (one for each database) - monitors space usage, found this script through google. db_xid_wraparound - gives alarms if the databases aren't beeing vacuumed. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] pg_stat_* collection
[Greg Smith - Fri at 12:53:55AM -0400] Munin is a very interesting solution to this class of problem. They've managed to streamline the whole data collection process by layering clever Perl hacks three deep. It's like the anti-SNMP--just build the simplest possible interface that will work and then stop designing. The result is so easy to work with that it's no surprise people like Munin. It's fairly easy to throw in new graphs, and I like that. One of the drawbacks is that it spends a lot of CPU building the graphs etc - if I continue adding graphs in my current speed, and we set up even more servers, soon it will take us more than five minutes generating the graphs. Also, local configuration can be tricky. Locally I fix this by loading a config file with a hard-coded path. Luckily, as long as the postgres munin plugins are run at localhost as the postgres user, most of them don't need any configuration. Still, it can be useful to tune the alarm thresholds. It's also completely inappropriate for any environment I work in, because there really is no thought of security whatsoever in the whole thing. What I'm still thinking about is whether it's possible to fix that issue while still keeping the essential simplicity that makes Munin so friendly. What layers of security do you need? We're using https, basic auth and ssh-tunnels. We've considered the munin data to be regarded as confidential, at the other hand it's nothing ultra-secret there; i.e. securing the backups of the production database probably deserves more attention. ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[PERFORM] Long running transactions again ...
We had problems again, caused by long running transactions. I'm monitoring the pg_stat_activity view, checking the query_start of all requests that are not idle - but this one slipped under the radar as the application was running frequent queries towards the database. That's not what concerns me most. We had two databases running under postgres at this host - like, main production database (A) and a separate smaller database for a separate project (B). As far as I understood postgres philosophy, the databases should be isolated from each other, i.e. one are not allowed to create a query that goes across the database borders (select * from A.customers join B.logins ...). So, I was surprised to see that the application working towards database B managed to jam up database A, to the extent that we couldn't get A vacuumed properly. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: OT: Munin (was Re: [PERFORM] Determining server load from client)
I have my postgres munin monitoring script at http://oppetid.no/~tobixen/pg_activity.munin.txt (had to suffix it with .txt to make the local apache happy). I would like to see what others have done as well. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: OT: Munin (was Re: [PERFORM] Determining server load from client)
[Erik Jones - Wed at 09:31:48AM -0500] I use cacti (http://cacti.net) which does the same thing that munin does but in php instead. Here's what I use to db stats to it (again, php): I haven't tried cacti, but our sysadm has done a little bit of research and concluded cacti is better. Maybe some day we'll move over. Munin is generating all the graphs statically every fifth minute, while cacti generates them on demand as far as I've understood. The munin approach is pretty bloat, since one usually would watch the graphs much more seldom than what they are generated (at least, we do). That's not really an argument since CPU is cheap nowadays - but a real argument is that the munin approach is less flexible. One would like to adjust the graph (like, min/max values for both axis) while watching quite some times. $actives_sql = SELECT COUNT(*) FROM pg_stat_activity WHERE current_query NOT ILIKE 'idle' AND now() - query_start '1 second';; So this one is quite similar to mine ... $db_stat_sql = SELECT {$argv[2]} FROM pg_stat_database WHERE datname='$db_name';; I was not aware of this view - it can probably be useful for us. I will add this one when I get the time ... (I'm at vacation now). ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[PERFORM] extract(field from timestamp) vs date dimension
Does anyone have experience with using postgres for data warehousing? Right, I saw one post suggestion to use mysql for a mostly read-only database ... but anyway, I think it's not a question to change the database platform for this project, at least not today ;-) Ralph Kimball seems to be some kind of guru on data warehousing, and in his books he's strongly recommending to have a date dimension - simply a table describing all dates in the system, and having attributes for what day of the week it is, month, day of the month, week number, bank holiday, anything special, etc. Well, it does make sense if adding lots of information there that cannot easily be pulled out from elsewhere - but as for now, I'm mostly only interessted in grouping turnover/profit by weeks/months/quarters/years/weekdays. It seems so much bloated to store this information, my gut feeling tells it should be better to generate them on the fly. Postgres even allows to create an index on an expression. The question is ... I'm curious about what would yield the highest performance - when choosing between: select extract(week from created), ... from some_table where ... group by extract(week from created), ... sort by extract(week from created), ... and: select date_dim.week_num, ... from some_table join date_dim ... where ... group by date_dim.week_num, ... sort by date_dim, week_num, ... The date_dim table would eventually cover ~3 years of operation, that is less than 1000 rows. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] extract(field from timestamp) vs date dimension
[Chad Wagner - Tue at 08:24:34AM -0500] I guess go with your gut, but at some point the expressions are going to be too complicated to maintain, and inefficient. The layout of my system is quite flexible, so it should eventually be fairly trivial to throw in a date dimension at a later stage. Calendar tables are very very common, because traditional date functions simply can't define business logic (especially things like month end close, quarter end close, and year end close) that doesn't have any repeating patterns (every 4th friday, 1st monday in the quarter, etc). Sure you can stuff it into a function, but it just isn't as maintainable as a table. So far I haven't been bothered with anything more complex than clean weeks, months, quarters, etc. I suppose the strongest argument for introducing date dimensions already now is that I probably will benefit from having conform and well-designed dimensions when I will be introducing more data marts. As for now I have only one fact table and some few dimensions in the system. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PERFORM] Planner statistics, correlations
We have a table with a timestamp attribute (event_time) and a state flag which usually changes value around the event_time (it goes to 4). Now we have more than two years of events in the database, and around 5k of future events. It is important to frequently pick out overdue events, say: select * from events where state4 and event_timenow() This query would usually yield between 0 and 100 rows - however, the planner doesn't see the correlation betewen state and event_time - since most of the events have event_timenow, the planner also assumes most of the events with state4 has event_timenow, so the expected number of rows is closer to 5k. This matters, because I have a query with joins, and I would really benefit from nested loops. (I've tried replacing now() above with different timestamps from the future and the past. I'm using pg 8.2) Any suggestions? ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Planner statistics, correlations
[Peter Childs - Fri at 08:56:54AM +] Can you say what state might be rather than what it is not. I'm guess that state is an int but there is only a limited list of possible states, if you can say what it might be rather than what it is the index is more liklly to be used. explain select * from events where state in (1,2,3) and event_timenow() also estimates almost 5k of rows. I also tried: explain select * from events where state=2 and event_timenow() but get the same behaviour. Maybe it would help to partitionate the table every year? ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Planner statistics, correlations
[Richard Huxton - Fri at 09:17:48AM +] Try a partial index: CREATE INDEX my_new_index ON events (event_time) WHERE state in (1,2,3); I have that, the index is used and the query is lightning fast - the only problem is that the planner is using the wrong estimates. This becomes a real problem when doing joins and more complex queries. Now, if that doesn't work you might want to split the query into two... Hm, that's an idea - to use a two-pass query ... first: select max(event_time) from events where state in (1,2,3); and then use the result: select * from events where event_time? and event_timenow() and state in (1,2,3) This would allow the planner to get the estimates in the right ballpark (given that the events won't stay for too long in the lower states), and it would in any case not be significantly slower than the straight-ahead approach - but quite inelegant. ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Planner statistics, correlations
[Heikki Linnakangas - Fri at 10:41:34AM +] I thought about partitioning the table by state, putting rows with state=4 into one partition, and all others to another partition. That sounds like a good idea - but wouldn't that be costly when changing state? ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Missing the point of autovacuum
[Daryl Herzmann - Sat at 12:59:03PM -0600] As the months have gone by, I notice many of my tables having *lots* of unused item pointers. For example, Probably not the issue here, but we had some similar issue where we had many long-running transactions - i.e. some careless colleague entering begin into his psql shell and leaving it running for some days without entering commit or rollback, plus some instances where the applications started a transaction without closing it. Perhaps I shouldn't be concerned with this? In all, I've got around 400 GB of data on postgresql, but am not sure how much of it is old data. I didn't count the zeroes, but autovacuum does have rules saying it will not touch the table until some percentages of it needs to be vacuumed off. This is of course configurable. Do I need to be running old fashioned 'vacuumdb' routinely as well? I guess I just don't understand why autovacuum is not automatically doing this for me and I have tables with so many unused item pointers. If you have some period of the day with less activity than else, it is a good idea running an old-fashionated vacuum as well. The regular vacuum process will benefit from any work done by the autovacuum. ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[PERFORM] max/min and index usage
We're using 8.1 - I thought such a construct was safe in pg 8.1: select max(indexed_value) from huge_table; while earlier we had to use: select indexed_value from huge_table order by indexed_value desc limit 1; seems like I was wrong: mydb= explain analyze select indexed_value1 from mytable order by indexed_value1 desc limit 1; QUERY PLAN - Limit (cost=0.00..1.96 rows=1 width=4) (actual time=0.115..0.117 rows=1 loops=1) - Index Scan Backward using index1 on mytable (cost=0.00..23890756.52 rows=12164924 width=4) (actual time=0.111..0.111 rows=1 loops=1) Total runtime: 0.162 ms (3 rows) mydb= explain analyze select indexed_value2 from mytable order by indexed_value2 desc limit 1; QUERY PLAN - Limit (cost=0.00..0.04 rows=1 width=4) (actual time=0.128..0.130 rows=1 loops=1) - Index Scan Backward using index2 on mytable (cost=0.00..428231.16 rows=12164924 width=4) (actual time=0.124..0.124 rows=1 loops=1) Total runtime: 0.160 ms (3 rows) mydb= explain analyze select max(indexed_value2) from mytable; QUERY PLAN - Result (cost=0.04..0.05 rows=1 width=0) (actual time=11652.138..11652.139 rows=1 loops=1) InitPlan - Limit (cost=0.00..0.04 rows=1 width=4) (actual time=11652.122..11652.124 rows=1 loops=1) - Index Scan Backward using index2 on mytable (cost=0.00..428231.16 rows=12164924 width=4) (actual time=11652.117..11652.117 rows=1 loops=1) Filter: (indexed_value2 IS NOT NULL) Total runtime: 11652.200 ms (6 rows) mydb= explain analyze select max(indexed_value1) from mytable; QUERY PLAN - Result (cost=1.96..1.97 rows=1 width=0) (actual time=713.780..713.781 rows=1 loops=1) InitPlan - Limit (cost=0.00..1.96 rows=1 width=4) (actual time=713.767..713.770 rows=1 loops=1) - Index Scan Backward using index1 on mytable (cost=0.00..23890756.52 rows=12164924 width=4) (actual time=713.764..713.764 rows=1 loops=1) Filter: (indexed_value1 IS NOT NULL) Total runtime: 713.861 ms (6 rows) ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] max/min and index usage
[Tobias Brox - Wed at 04:01:56AM +0100] We're using 8.1 - I thought such a construct was safe in pg 8.1: select max(indexed_value) from huge_table; while earlier we had to use: select indexed_value from huge_table order by indexed_value desc limit 1; seems like I was wrong: The difference is all about those NULL values ... those columns are quite sparsely populated in the table. The second query gives NULL, which is not much useful :-) However, I made a partial index to solve this problem - this query is able to use the partial index: select indexed_value from huge_table where indexed_value is not NULL order by indexed_value desc limit 1; while this one is not: select max(indexed_value) from huge_table; I guess this is a bug? :-) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] max/min and index usage
[Tom Lane - Tue at 10:29:53PM -0500] These are not actually exactly the same thing. In particular, I suppose your table contains a lot of nulls? Yes; I'm sorry I was a bit quick with the first posting. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Defining performance.
[Paul Lathrop - Thu at 02:59:27PM -0800] growing disk space usage. The DBA had come to the conclusion that the VACUUM command did/does not work on these systems, because even after a VACUUM FULL, the size of the database was continually increasing. So, as things stand with the PG7.2 machines, vacuuming is run nightly, and whenever the database size reaches 40Gb on disk (the point at which performance has degraded below tolerance), the DBA exports the data, deletes the database, and then imports the data, shrinking it to the actual size of the dataset. We found one reason why vacuuming didn't always work for us - we had long running transactions - in addition to killing the vacuum, it did really nasty things to the performance in general. To check for those transactions, I think it's needed to turn on stats_command_string in the config. I use this query to check: select * from pg_stat_activity where current_query'IDLE' order by query_start ; If you spot any IDLE in transaction with an old query_start timestamp, then that's most probably the reason. Long running transactions doesn't have to be idle ... check the pg_locks view for the lowest transactionid and compare (through the pid) with the pg_stat_activity view to find the actual backend. However, we still are suffering a gradual decrease in performance over time - or so the application engineers claim. The DBA and I have been banging our heads against this for a month. We're having the same issues, so we do the dumping and restoring every now and then to be sure everything is properly cleaned up. With 8.1. 1) How does one define 'performance' anyway? Is it average time to complete a query? If so, what kind of query? Is it some other metric? We have the same kind of problem, and the project leader (I sometimes refer him as the bottleneck ;-) is most concerned about iowait at our cpu graphs. Anyway, we do have other measures: - our applications does log the duration of each request towards the application as well as each query towards the database. If the request (this is web servers) is taking too long time, it's logged as error instead of debug. If a significant number of such errors is due to database calls taking too much time, then the performance is bad. Unfortunately, we have no way to automate such checking. - I've setting up two scripts pinging that pg_stat_activity view every now and then, logging how much gruff it finds there. Those two scripts are eventually to be merged. One is simply logging what it finds, the other is a plugin system to the Munin graphing package. I've thrown the scripts we use out here: http://oppetid.no/~tobixen/pg_activity_log.txt http://oppetid.no/~tobixen/pg_activity.munin.txt (I had to rename them to .txt to get the web server to play along). Those are very as-is, should certainly be modified a bit to fit to any other production environment. :-) The pg_activity_log dumps a single number indicating the stress level of the database to a file. I think this stress number, when taking out i.e. the 20% worst numbers from the file for each day, can indicate something about the performance of the database server. However, I haven't had the chance to discuss it with the bottleneck yet. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Defining performance.
[Jeff Davis - Thu at 04:57:54PM -0800] We're having the same issues, so we do the dumping and restoring every now and then to be sure everything is properly cleaned up. With 8.1. What's causing that? Is it index bloat? I would think a REINDEX would avoid having to dump/restore, right? A CLUSTER might also be necessary, depending on what kind of performance degradation you're experiencing. Am I missing something? Just as with Paul Lathrops case, the performance degradation is something perceived by the application developers. We haven't had time to actually verify reliably that the performance is actually beeing degraded, neither that the reload beeing done helps (after we resolved the pending transaction issue, anyway), nor look into what the possible reasons of this percieved degradation could be. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Defining performance.
[EMAIL PROTECTED] - Thu at 06:37:12PM -0600] As my dataset has gotten larger I have had to throw more metal at the problem, but I have also had to rethink my table and query design. Just because your data set grows linearly does NOT mean that the performance of your query is guaranteed to grow linearly! A sloppy query that runs OK with 3000 rows in your table may choke horribly when you hit 5. Then some limit is hit ... either the memory cache, or that the planner is doing an unlucky change of strategy when hitting 5. Anyway, it's very important when testing queries that they actually are tested on a (copy of the) production database, and not on an empty database or a database containing some random test data. If testing queries off the production database, it's important to have equal hardware and configuration on both servers. ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Defining performance.
[Chris - Fri at 02:32:05PM +1100] Not really. A bad query is a bad query (eg missing a join element). It won't show up for 3000 rows, but will very quickly if you increase that by a reasonable amount. Even as simple as a missing index on a join column won't show up for a small dataset but will for a larger one. Ok, you're talking about O(n^2) and such stuff :-) It's a pretty common mistake to assume that a small dataset will behave exactly the same as a larger one - not always the case. No. :-) We had the worst experience when launching our product - it had been stress tested, but only by increasing the number of customers, not by increasing the overall size of the data set available for browsing. When opening the web pages for the public, this data set was already some ten-hundred times bigger than in the version enduring the stress tests - and the servers had no chances processing all the traffic. The worst bottle neck was not the database this time, but some horror algorithm one of the programmers had sneaked in ... poorly documented, cryptically programmed, slow ... and since I never understood that algorithm, I can only guess it must have been of O(n^2). ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Setting nice values
[Madison Kelly - Mon at 08:10:12AM -0500] to run, which puts it into your drawback section. The server in question is also almost under load of some sort, too. A great tip and one I am sure to make use of later, thanks! I must have been sleepy, listing up cons vs drawbacks ;-) Anyway, the central question is not the size of the job, but the size of the transactions within the job - if the job consists of many transactions, my test can be run before every transaction. Having transactions lasting for hours is a very bad thing to do, anyway. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Setting nice values
[Madison Kelly - Mon at 08:48:19AM -0500] Ah, sorry, long single queries is what you meant. No - long running single transactions :-) If it's only read-only queries, one will probably benefit by having one transaction for every query. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Setting nice values
[Madison Kelly - Thu at 10:25:07AM -0500] Will the priority of the script pass down to the pgsql queries it calls? I figured (likely incorrectly) that because the queries were executed by the psql server the queries ran with the server's priority. I think you are right, and in any case, I don't think the niceness value won't help much if the bottleneck is iowait. In our application, I've made a special function for doing low-priority transactions which I believe is quite smart - though maybe not always. Before introducing this logic, we observed we had a tipping point, too many queries, and the database wouldn't swallow them fast enough, and the database server just jammed up, trying to work at too many queries at once, yielding the results far too slow. In the config file, I now have those two flags set: stats_start_collector = on stats_command_string = on This will unfortunately cause some CPU-load, but the benefit is great - one can actually check what the server is working with at any time: select * from pg_stat_activity with those, it is possible to check a special view pg_stat_activity - it will contain all the queries the database is working on right now. My idea is to peek into this table - if there is no active queries, the database is idle, and it's safe to start our low-priority transaction. If this view is full of stuff, one should certainly not run any low-priority transactions, rather sleep a bit and try again later. select count(*) from pg_stat_activity where not current_query like 'IDLE%' and query_start+?now() The algorithm takes four parameters, the time value to put in above, the maximum number of queries allowed to run, the sleep time between each attempt, and the amount of attempts to try before giving up. So here are the cons and drawbacks: con: Given small queries and small transactions, one can tune this in such a way that the low priority queries (almost) never causes significant delay for the higher priority queries. con: can be used to block users of an interactive query application to cause disturbances on the production database. con: can be used for pausing low-priority batch jobs to execute only when the server is idle. drawback: unsuitable for long-running queries and transactions drawback: with fixed values in the parameters above, one risks that the queries never gets run if the server is sufficiently stressed. drawback: the stats collection requires some CPU drawback: the select * from pg_stats_activity query requires some CPU drawback: the pg_stats_activity-view is constant within the transaction, so one has to roll back if there is activity (this is however not a really bad thing, because one certainly shouldn't live an idle transaction around if the database is stressed). ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PERFORM] Swappiness setting on a linux pg server
I just came to think about /proc/sys/swappiness ... When this one is set to a high number (say, 100 - which is maximum), the kernel will aggressively swap out all memory that is not beeing accessed, to allow more memory for caches. For a postgres server, OS caches are good, because postgres relies on the OS to cache indices, etc. At the other hand, for any production server it's very bad to experience bursts of iowait when/if the swapped out memory becomes needed - particularly if the server is used for interactive queries, like serving a web application. I know there are much religion on this topic in general, I'm just curious if anyone has done any serious thoughts or (even better!) experimenting with the swappiness setting on a loaded postgres server. I would assume that the default setting (60) is pretty OK and sane, and that modifying the setting would have insignificant effect. My religious belief is that, however insignificant, a higher setting would have been better :-) We're running linux kernel 2.6.17.7 (debian) on the postgres server, and our memory stats looks like this: total used free sharedbuffers cached Mem: 6083M 5846M 236M 0 31M 5448M -/+ buffers/cache:366M 5716M Swap: 2643M 2M 2640M In addition to the postgres server we're running some few cronscripts and misc on it - nothing significant though. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Swappiness setting on a linux pg server
[Jim C. Nasby - Thu at 10:28:31AM -0500] I think it'd be much better to experiment with using much larger shared_buffers settings. The conventional wisdom there is from 7.x days when you really didn't want a large buffer, but that doesn't really apply with the new buffer management we got in 8.0. I know of one site that doubled their performance by setting shared_buffers to 50% of memory. Oh, that's interessting. I will give it a shot. Our config is inheritated from the 7.x-days, so we have a fairly low setting compared to available memory. From the 7.x-days the logic was that a lot of careful thought has been given when designing the OS cache/buffer subsystem, we don't really want to reinvent the wheel or something like that. Sadly it's not easy to measure the overall performance impact of such tunings in a production environment, so such a setting tends to be tuned by religion rather than science :-) Something else to consider is that many people will put pg_xlog on the same drives as the OS (and swap). It's pretty important that those drives not have much activity other than pg_xlog, so any swap activity would have an even larger than normal impact on performance. Hm ... that's actually our current setting, we placed the postgres database itself on a separate disk, not the xlog. So we should have done it the other way around? No wonder the performance is badly affected by backups etc ... What else, I gave the swappiness a second thought, compared to our actual memory usage statistics ... turning down the swappiness would have no significant effect since we're only using 2M of swap (hardly significant) and our total memory usage by applications (including the pg shared buffers) is less than 400M out of 6G. Maybe we could have moved some 50M of this to swap, but that's not really significant compared to our 6G of memory. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Swappiness setting on a linux pg server
[Jim C. Nasby - Thu at 10:28:31AM -0500] I think it'd be much better to experiment with using much larger shared_buffers settings. The conventional wisdom there is from 7.x days when you really didn't want a large buffer, but that doesn't really apply with the new buffer management we got in 8.0. I know of one site that doubled their performance by setting shared_buffers to 50% of memory. I've upped it a bit, but it would require a server restart to get the new setting into effect. This is relatively expensive for us. Does anyone else share the viewpoint of Nasby, and does anyone have recommendation for a good value? Our previous value was 200M, and I don't want to go to the extremes just yet. We have 6G of memory totally. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Swappiness setting on a linux pg server
[Jim C. Nasby - Thu at 11:31:26AM -0500] Yeah, test setups are a good thing to have... We would need to replicate the production traffic as well to do reliable tests. Well, we'll get to that one day ... The issue with pg_xlog is you don't need bandwidth... you need super-low latency. The best way to accomplish that is to get a battery-backed RAID controller that you can enable write caching on. Sounds a bit risky to me :-) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Swappiness setting on a linux pg server
[Jim C. Nasby - Thu at 11:45:32AM -0500] The issue with pg_xlog is you don't need bandwidth... you need super-low latency. The best way to accomplish that is to get a battery-backed RAID controller that you can enable write caching on. Sounds a bit risky to me :-) Well, you do need to understand what happens if the machine does lose power... namely you have a limited amount of time to get power back to the machine so that the controller can flush that data out. Other than that, it's not very risky. We have burned ourself more than once due to unreliable raid controllers ... quantities of memory. So in your case, 600M wouldn't be pushing things much at all. Even 1G wouldn't be that out of the ordinary. Also remember that the more memory for shared_buffers, the less for sorting/hashes/etc. (work_mem) What do you mean, a high value for the shared_buffers implicates I can/should lower the work_mem value? Or just that I should remember to have more than enough memory for both work_mem, shared_buffers and OS caches? What is a sane value for the work_mem? It's currently set to 8M. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Swappiness setting on a linux pg server
[Jim C. Nasby - Thu at 12:00:39PM -0500] What's reasonable for work_mem depends on your workload. If you've got some reporting queries that you know aren't run very concurrently they might benefit from large values of work_mem. For stats.distributed.net, I set work_mem to something like 2MB in the config file, but the nightly batch routines manually set it up to 256M or more, because I know that those only run one at a time, and having that extra memory means a lot of stuff that would otherwise have to spill to disk now stays in memory. That sounds like a good idea; indeed we do have some few heavy reporting queries and they are not run much concurrently (the application checks the pg_stat_activity table and will disallow reports to be taken out if there is too much activity there). We probably would benefit from raising the work mem just for those reports, and lower it for the rest of the connections. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Swappiness setting on a linux pg server
[Ron - Thu at 03:10:35PM -0400] Jim is correct that traditional 7.x folklore regarding shared buffer size is nowhere near as valid for 8.x. Jim tends to know what he is talking about when speaking about pg operational issues. I would not doubt it, but it's always better to hear it from more people :-) Nonetheless, YMMV. The only sure way to know what is best for your SW running on your HW under your load conditions is to test, test, test. Certainly. My time and possibilities for testing is not that great at the moment, and in any case I believe some small adjustments won't cause the really significant results. In any case, our database server is not on fire at the moment and people are not angry because of slow reports at the moment. (actually, I started this thread out of nothing but curiousity ... triggered by somebody complaining about his desktop windows computer swapping too much :-) So, for this round of tunings I'm more than satisfied just relying on helpful rules of the thumb. A= Find out how much RAM your OS image needs. Usually 1/3 to 2/3 of a GB is plenty. A quick glance on free already revealed we are using less than 400 MB out of 6G totally (with the 7.x-mindset that the OS should take care of cacheing), and according to our previous settings, the shared buffers was eating 200 MB of this - so most of our memory is free. B= Find out how much RAM pg tasks need during typical peak usage and how much each of those tasks is using. I believe we have quite good control of the queries ... there are safeguards to prevent most of the heavy queries to run concurrently, and the lighter queries shouldn't spend much memory, so it should be safe for us to bump up the setting a bit. In any case, I guess the OS is not that bad at handling the memory issue. Unused memory will be used relatively intelligently (i.e. buffering the temp files used by sorts) and overuse of memory will cause some swapping (which is probably quite much worse than using temp files directly, but a little bit of swapping is most probably not a disaster). ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [ADMIN] [PERFORM] autovacuum on a -mostly- r/o table
[Matthew T. O'Connor - Wed at 02:33:10PM -0400] In addition autovacuum respects the work of manual or cron based vacuums, so if you issue a vacuum right after a daily batch insert / update, autovacuum won't repeat the work of that manual vacuum. I was experimenting a bit with autovacuum now. To make the least effect possible, I started with a too high cost_delay/cost_limit-ratio. The effect of this was that autovacuum never finished the transactions it started with, and this was actually causing the nightly vacuum to not do it's job good enough. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [ADMIN] [PERFORM] autovacuum on a -mostly- r/o table
[Matthew T. O'Connor - Sun at 10:42:34AM -0400] Yeah, I think if the delay settings are too high it can cause problems, that's part of the reason we have yet to turn these on be default since we won't have enough data to suggest good values. Can you tell us what settings you finally settled on? I'm still not yet settled, and the project manager is critical to autovacuum (adds complexity, no obvious benefits from it, we see from the CPU graphs that it's causing iowait, iowait is bad). We're going to run autovacuum biweekly now to see what effect it has on the server load. I've been using the cost/delay-setting of 200/200 for a week now, and I'm going to continue with 100/150 for a while. Are there any known disadvantages of lowering both values to the extreme - say, 20/20 instead of 200/200? That would efficiently mean sleep as often as possible, and sleep for 1 ms for each cost unit spent if I've understood the system right. Are there any logs that can help me, and eventually, are there any ready-made scripts for checking when autovacuum is running, and eventually for how long it keeps its transactions? I'll probably write up something myself if not. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PERFORM] long running transactions
While doing a verbose vacuum, I'm constantly hitting things like: DETAIL: 3606 dead row versions cannot be removed yet. I believe this is a problem, because I still do have some empty tables requireing up to 3-400 ms just to check if the table is empty (see thread slow queue-like empty table). If pg_stat_activity.query_start actually is the start time of the transaction, then we've gotten rid of all the real long-running transactions. Then again, if pg_stat_activity.query_start actually was the start time of the transaction, the attribute would have been called pg_stat_activity.transaction_start, right? Is there any way to find the longest running transaction? ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] long running transactions
[Tom Lane - Tue at 12:23:40PM -0400] Look in pg_locks to see the lowest-numbered transaction ID --- each transaction will be holding exclusive lock on its own XID. You can correlate that back to pg_stat_activity via the PID. Thanks a lot for the quick reply - I've already identified one long-running transaction. (I'm not allowed to order by xid, and not allowed to cast it to anything, how come?) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] long running transactions
[Tobias Brox - Tue at 06:39:13PM +0200] Thanks a lot for the quick reply - I've already identified one long-running transaction. belonging to autovacuum ... how come? ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] long running transactions
[Tom Lane - Tue at 12:42:52PM -0400] belonging to autovacuum ... how come? Blocked on someone else's lock, maybe? hardly, the autovacuum is the only one having such a low transaction id, and also the only one hanging around when waiting a bit and rechecking the pg_locks table. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] long running transactions
[Tom Lane - Tue at 01:18:27PM -0400] Hmph. Is the autovac process actually doing anything (strace would be revealing)? It's definitively doing something; mostly reading, but also some few writes, semops and opens. If not, can you attach to the autovac process with gdb and get a stack trace to see where it's blocked? (gdb) bt #0 0xb7c599f8 in select () from /lib/tls/libc.so.6 #1 0x08253c53 in pg_usleep () #2 0x0812ee93 in vacuum_delay_point () #3 0x0812f2a5 in lazy_vacuum_rel () #4 0x0812ef7b in lazy_vacuum_rel () #5 0x0812b4b6 in vac_update_relstats () #6 0x0812a995 in vacuum () #7 0x0818d2ca in autovac_stopped () #8 0x0818ceae in autovac_stopped () #9 0x0818c848 in autovac_stopped () #10 0x0818c4e2 in autovac_start () #11 0x08192c11 in PostmasterMain () #12 0x08191dcf in PostmasterMain () #13 0x081541b1 in main () It'll come back after the autovacuum naptime. If it gets stuck again, please investigate. It seems stuck, has had the same transid for a long while, and the number of undeletable dead rows in our tables are increasing. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] long running transactions
[Tom Lane - Tue at 02:04:55PM -0400] It seems stuck, has had the same transid for a long while, and the number of undeletable dead rows in our tables are increasing. Perhaps you have overly aggressive vacuum cost delay settings? Perhaps, though I wouldn't expect it to sleep in the middle of a transaction - and also, it really did seem to me that it's doing work rather than only sleeping. The transaction id for the vacuum process is the same now as when I wrote the previous email, and the number of dead unremovable rows have increased steadily. The settings in effect are: autovacuum_vacuum_cost_delay = 500 autovacuum_vacuum_cost_limit = 200 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] long running transactions
[Tom Lane - Tue at 02:26:53PM -0400] autovacuum_vacuum_cost_delay = 500 autovacuum_vacuum_cost_limit = 200 Well, that's going to cause it to sleep half a second after every dozen or so page I/Os. I think you'd be well advised to reduce the delay. Modified it to 20/250, and it definitively helped. Sorry for the list verbosity; I should have been able to resolve this myself already some 2-3 emails ago :-) I wanted a soft introduction of autovac in production, and assumed that it was better to begin with too much sleep than too little! Well, well. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Simple join optimized badly?
[Jim C. Nasby - Mon at 04:18:27PM -0500] I can agree to that, but we'll never get any progress so long as every time hints are brought up the response is that they're evil and should never be in the database. I'll also say that a very simple hinting language (ie: allowing you to specify access method for a table, and join methods) would go a huge way towards enabling app developers to get stuff done now while waiting for all these magical optimizer improvements that have been talked about for years. Just a comment from the side line; can't the rough set enable_seqscan=off be considered as sort of a hint anyway? There have been situations where we've actually had to resort to such crud. Beeing able to i.e. force a particular index is something I really wouldn't put into the application except for as a very last resort, _but_ beeing able to force i.e. the use of a particular index in an interactive 'explain analyze'-query would often be ... if not outright useful, then at least very interessting. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] slow queue-like empty table
[Csaba Nagy - Thu at 10:45:35AM +0200] So you should check for idle in transaction sessions, those are bad... or any other long running transaction. Thank you (and others) for pointing this out, you certainly set us on the right track. We did have some few unclosed transactions; transactions not beeing ended by rollback or commit. We've been fixing this, beating up the programmers responsible and continued monitoring. I don't think it's only due to those queue-like tables, we've really seen a significant improvement on the graphs showing load and cpu usage on the database server after we killed all the idle in transaction. I can safely relax still some weeks before I need to do more optimization work :-) (oh, btw, we didn't really beat up the programmers ... too big geographical distances ;-) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Unsubscribe
To be a bit constructive, could it be an idea to add unsubscribe information as one of the standard tailer tips? Then unsubscribe info wouldn't appear in every mail, but often enough for people considering to unsubscribe. To be totally non-constructive, let me add a bit to the noise below: [Bruno] If you really can't figure out how to unsubscribe from a list, you should contact the list owner, not the list. The list members can't unsubscribe you (and it isn't their job to) and the owner may not be subscribed to the list. If he can't find out how to unsubscribe from the list, how can he be expected to figure out the owner address? [Joshua] It is ridiculous that this community expects people to read email headers to figure out how to unsubscribe from our lists. I always check the headers when I want to unsubscribe from any mailing list, and I think most people on this list have above average knowledge of such technical details. Of course, on a list with this many recepients there will always be some exceptions ... ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PERFORM] Multi-key index not beeing used - bug?
Look at this: NBET= explain select * from account_transaction where users_id=123456 order by created desc limit 10; QUERY PLAN --- Limit (cost=0.00..27.40 rows=10 width=213) - Index Scan Backward using account_transaction_on_user_and_timestamp on account_transaction (cost=0.00..1189.19 rows=434 width=213) Index Cond: (users_id = 123456) (3 rows) NBET= explain select * from account_transaction where users_id=123456 order by created desc, id desc limit 10; QUERY PLAN -- Limit (cost=1114.02..1114.04 rows=10 width=213) - Sort (cost=1114.02..1115.10 rows=434 width=213) Sort Key: created, id - Index Scan using account_transaction_by_users_id on account_transaction (cost=0.00..1095.01 rows=434 width=213) Index Cond: (users_id = 123456) (5 rows) In case the explains doesn't explain themself good enough: we have a transaction table with ID (primary key, serial), created (a timestamp) and a users_id. Some of the users have generated thousands of transactions, and the above query is a simplified version of the query used to show the users their last transactions. Since we have a large user base hammering our servers with this request, the speed is significant. We have indices on the users_id field and the (users_id, created)-tuple. The timestamp is set by the application and has a resolution of 1 second - so there may easily be several transactions sharing the same timestamp, but this is an exception not the rule. I suppose the developers needed to add the ID to the sort list to come around a bug, but still prefering to have the primary sorting by created to be able to use the index. One workaround here is to order only by id desc and create a new index on (users_id, id) - but I really don't like adding more indices to the transaction table. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Multi-key index not beeing used - bug?
[Tom Lane - Wed at 04:33:54PM -0400] We have indices on the users_id field and the (users_id, created)-tuple. Neither of those indexes can provide the sort order the query is asking for. Ah; that's understandable - the planner have two options, to do a index traversion without any extra sorting, or to take out everything and then sort. What I'd like postgres to do is to traverse the index and do some sorting for every unique value of created. Maybe such a feature can be found in future releases - like Postgres 56.3? ;-) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Poor performance on very simple query ?
[Arnaud Lesauvage - Tue at 01:25:10PM +0200] I have a performance problem, but I am not sure whether it really is a problem or not. QUERY PLAN -- Seq Scan on table1 (cost=0.00..23.69 rows=10 width=35) (actual time=0.023..0.734 rows=7 loops=1) Filter: ((gid = 33) OR (gid = 110) OR (gid = 65) OR (gid = 84) OR (gid = 92) OR (gid = 94) OR (gid = 13) OR (gid = 7) OR (gid = 68) OR (gid = 41)) Total runtime: 0.801 ms (3 rows) I have run VACUUM FULL on this table many times... I don't know what to try next ! What is wrong here (because I hope that something is wrong) ? Thanks a lot for your help ! Did you try analyze as well? It's weird it's using seq scan, since you have a primary key it's supposed to have an index ... though 500 rows is little. I just checked up our own production database, takes 0.08 ms to fetch a row by ID from one of our tables containing 176k with rows. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Poor performance on very simple query ?
[Tobias Brox - Tue at 02:10:04PM +0200] Did you try analyze as well? It's weird it's using seq scan, since you have a primary key it's supposed to have an index ... though 500 rows is little. I just checked up our own production database, takes 0.08 ms to fetch a row by ID from one of our tables containing 176k with rows. Oh, the gid is not primary key. I guess I should also apologize for adding noise here :-) Make an index here! :-) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Poor performance on very simple query ?
[Arnaud Lesauvage - Tue at 02:13:59PM +0200] Tobias Brox wrote: Oh, the gid is not primary key. I guess I should also apologize for adding noise here :-) Yes, it is a primary key, but I am the noise maker here ! ;-) Oh - it is. How can you have a default value on a primary key? Will it use the index if you do analyze? Is there an index on the table at all, do you get it up if you ask for a description of the table (\d tablename)? ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PERFORM] slow queue-like empty table
I have a query which really should be lightning fast (limit 1 from index), but which isn't. I've checked the pg_locks table, there are no locks on the table. The database is not under heavy load at the moment, but the query seems to draw CPU power. I checked the pg_locks view, but found nothing locking the table. It's a queue-like table, lots of rows beeing added and removed to the queue. The queue is currently empty. Have a look: NBET= vacuum verbose analyze my_queue; INFO: vacuuming public.my_queue INFO: index my_queue_pkey now contains 34058 row versions in 390 pages DETAIL: 288 index pages have been deleted, 285 are current ly reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: my_queue: found 0 removable, 34058 nonremovable row versions in 185 pages DETAIL: 34058 dead row versions cannot be removed yet. There were 0 unused item pointers. 0 pages are entirely empty. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: analyzing public.my_queue INFO: my_queue: scanned 185 of 185 pages, containing 0 live rows and 34058 dead rows; 0 rows in sample, 0 estimated total rows VACUUM NBET= explain analyze select bet_id from my_queue order by bet_id limit 1; QUERY PLAN --- Limit (cost=0.00..0.04 rows=1 width=4) (actual time=402.525..402.525 rows=0 loops=1) - Index Scan using my_queue_pkey on stats_bet_queue (cost=0.00..1314.71 rows=34058 width=4) (actual time=402.518..402.518 rows=0 loops=1) Total runtime: 402.560 ms (3 rows) NBET= select count(*) from my_queue; count --- 0 (1 row) It really seems like some transaction is still viewing the queue, since it found 38k of non-removable rows ... but how do I find the pid of the transaction viewing the queue? As said, the pg_locks didn't give me any hints ... ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] slow queue-like empty table
[Tobias Brox - Thu at 08:56:31AM +0200] It really seems like some transaction is still viewing the queue, since it found 38k of non-removable rows ... but how do I find the pid of the transaction viewing the queue? As said, the pg_locks didn't give me any hints ... Dropping the table and recreating it solved the immediate problem, but there must be some better solution than that? :-) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Merge Join vs Nested Loop
[Tom Lane - Tue at 06:09:56PM -0400] If your tables are small enough to fit (mostly) in memory, then the planner tends to overestimate the cost of a nestloop because it fails to account for cacheing effects across multiple scans of the inner table. This is addressed in 8.2, but in earlier versions about all you can do is reduce random_page_cost, and a sane setting of that (ie not less than 1.0) may not be enough to push the cost estimates where you want them. Still, reducing random_page_cost ought to be your first recourse. Thank you. Reducing the random page hit cost did reduce the nested loop cost significantly, sadly the merge join costs where reduced even further, causing the planner to favor those even more than before. Setting the effective_cache_size really low solved the issue, but I believe we rather want to have a high effective_cache_size. Eventually, setting the effective_cache_size to near-0, and setting random_page_cost to 1 could maybe be a desperate measure. Another one is to turn off merge/hash joins and seq scans. It could be a worthwhile experiment if nothing else :-) The bulk of our database is historical data that most often is not touched at all, though one never knows for sure until the queries have run all through - so table partitioning is not an option, it seems like. My general idea is that nested loops would cause the most recent data and most important part of the indexes to stay in the OS cache. Does this make sense from an experts point of view? :-) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Merge Join vs Nested Loop
[Scott Marlowe - Wed at 09:58:30AM -0500] Have you tried chaning the cpu_* cost options to see how they affect merge versus nested loop? As said in the original post, increasing any of them shifts the planner towards nested loops instead of merge_join. I didn't check which one of the cost constants made the most impact. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Merge Join vs Nested Loop
[Scott Marlowe - Wed at 10:19:24AM -0500] So, by decreasing them, you should move away from nested loops then, right? Has that not worked for some reason? I want to move to nested loops, they are empirically faster in many of our queries, and that makes sense since we've got quite big tables and most of the queries only touch a small partition of the data. I've identified that moving any of the cost constants (including random_page_cost) upwards gives me the right result, but I'm still wary if this is the right thing to do. Even if so, what constants should I target first? I could of course try to analyze a bit what constants give the biggest impact. Then again, we have many more queries hitting the database than the few I'm doing research into (and those I'm doing research into is even very simplified versions of the real queries). ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Merge Join vs Nested Loop
[Scott Marlowe - Wed at 10:31:35AM -0500] And remember, you can always change any of those settings in session for just this one query to force the planner to make the right decision. sure ... I could identify the most problematic queries, and hack up the software application to modify the config settings for those exact queries ... but it's a very ugly solution. :-) Particularly if Tom Lane is correct saying the preferance of merge join instead of nested loop is indeed a bug. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] autovacuum on a -mostly- r/o table
[Edoardo Ceccarelli - Wed at 06:49:23PM +0200] ...another thing is, how could autovacuum check for machine load, this is something I cannot imagine right now... One solution I made for our application, is to check the pg_stats_activity view. It requires some config to get the stats available in that view, though. When the application is to start a low-priority transaction, it will first do: select count(*) from pg_stat_activity where current_query not like 'IDL%' and query_start+?now(); if the returned value is high, the application will sleep a bit and try again later. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] autovacuum on a -mostly- r/o table
[Edoardo Ceccarelli - Wed at 06:08:30PM +0200] We also activated the autovacuum feature to give it a try and that's were our problems started. (...) How can I configure the vacuum to run after the daily batch insert/update? I think you shouldn't use autovacuum in your case. We haven't dared testing out autovacuum yet even though we probably should, so we're running vacuum at fixed times of the day. We have a very simple script to do this, the most important part of it reads: echo vacuum verbose analyze; | psql $DB_NAME $logdir/$filename 21 ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Merge Join vs Nested Loop
I found a way to survive yet some more weeks :-) One of the queries we've had most problems with today is principially something like: select A.*,sum(B.*) from A join B where A.createdx and ... order by A.created desc limit 32 group by A.* There is by average two rows in B for every row in A. Note the 'limit 32'-part. I rewrote the query to: select A.*,(select sum(B.*) from B ...) where A.createdx and ... order by A.created desc limit 32; And voila, the planner found out it needed just some few rows from A, and execution time was cutted from 1-2 minutes down to 20 ms. :-) I've also started thinking a bit harder about table partitioning, if we add some more redundancy both to the queries and the database, it may help us drastically reduce the real expenses of some of the merge joins... ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PERFORM] Merge Join vs Nested Loop
I have some odd cases here joining two tables - the planner insists on Merge Join, but Nested Loop is really faster - and that makes sense, since I'm selecting just a small partition of the data available. All planner constants seems to be set at the default values, the only way to get a shift towards Nested Loops seems to be to raise the constants. I believe our memory is big enough to hold the indices, and that the effective_cache_size is set to a sane value (but how to verify that, anyway?). What causes the nested loops to be estimated so costly - or is it the merge joins that are estimated too cheaply? Should I raise all the planner cost constants, or only one of them? Here are some sample explains: prod= explain analyze select * from ticket join users on users_id=users.id where ticket.created'2006-09-25 17:00'; QUERY PLAN -- Nested Loop (cost=0.00..67664.15 rows=10977 width=675) (actual time=0.038..202.877 rows=10627 loops=1) - Index Scan using ticket_on_created on ticket (cost=0.00..11665.94 rows=10977 width=80) (actual time=0.014..35.571 rows=10627 loops=1) Index Cond: (created '2006-09-25 17:00:00'::timestamp without time zone) - Index Scan using users_pkey on users (cost=0.00..5.00 rows=1 width=595) (actual time=0.007..0.008 rows=1 loops=10627) Index Cond: (outer.users_id = users.id) Total runtime: 216.612 ms (6 rows) prod= explain analyze select * from ticket join users on users_id=users.id where ticket.created'2006-09-25 16:00'; QUERY PLAN Merge Join (cost=12844.93..68580.37 rows=11401 width=675) (actual time=106.631..1712.458 rows=11554 loops=1) Merge Cond: (outer.id = inner.users_id) - Index Scan using users_pkey on users (cost=0.00..54107.38 rows=174508 width=595) (actual time=0.041..1215.221 rows=174599 loops=1) - Sort (cost=12844.93..12873.43 rows=11401 width=80) (actual time=105.753..123.905 rows=11554 loops=1) Sort Key: ticket.users_id - Index Scan using ticket_on_created on ticket (cost=0.00..12076.68 rows=11401 width=80) (actual time=0.074..65.297 rows=11554 loops=1) Index Cond: (created '2006-09-25 16:00:00'::timestamp without time zone) Total runtime: 1732.452 ms (8 rows) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] The need for full vacuum / reindex
By occation, we dropped the whole production database and refreshed it from a database backup - and all our performance problems seems to have gone. I suppose this means that to keep the database efficient, one eventually does have to do reindexing and/or full vacuum from time to time? -- Notice of Confidentiality: This email is sent unencrypted over the network, and may be stored on several email servers; it can be read by third parties as easy as a postcard. Do not rely on email for confidential information. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Swapping
[Carlos Henrique Reimer - Wed at 03:25:15PM -0300] I´m trying to tune a linux box with a 12 GB database and 4 GB RAM. First of all I would like to stop the swapping, so the shared_buffers and sort_mem were decreased but even so it started swapping two hours after DBMS started up. I would like to know some suggestions how to discover why is it swapping? I agree with Tom Lane, nothing to worry about. Swapping is not a problem per se, aggressive swapping is a problem. If you are absolutely sure you want to ban all swapping, use swapoff? I'd trust linux to handle swap/cache sensibly. Eventually, become involved with kernel hacking ;-) -- Notice of Confidentiality: This email is sent unencrypted over the network, and may be stored on several email servers; it can be read by third parties as easy as a postcard. Do not rely on email for confidential information. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Swapping
[Tobias Brox - Wed at 09:22:17PM +0200] I'd trust linux to handle swap/cache sensibly. Eventually, become involved with kernel hacking ;-) Of course, there are also some files in /proc/sys/vm that you may want to peek into, for tuning the swapping. Particularly, at later 2.6-kernels (I'm running 2.6.12) you have the file /proc/sys/vm/swappiness, where the number should be some percentage. I'm not completely sure how it works, but I suppose that the higher you set it, the more likely it is to swap out memory not beeing used. I think the default setting is probably sane, but you may want to google a bit about it. -- Notice of Confidentiality: This email is sent unencrypted over the network, and may be stored on several email servers; it can be read by third parties as easy as a postcard. Do not rely on email for confidential information. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Limit + group + join
[Tom Lane] I looked into this and (...) I've committed some changes that hopefully will let 8.1 be smarter about GROUP BY ... LIMIT queries. [Mark Kirkwood] Very nice :-) (...) This is 8.1devel from today. Splendid :-) Unfortunately we will not be upgrading for some monthes still, but anyway I'm happy. This provides yet another good argument for upgrading sooner. I'm also happy to see such a perfect match: - A problem that can be reduced from beeing complex and production-specific, to simple and easily reproducible. - Enthusiastic people testing it and pinpointing even more precisely what conditions will cause the condition - Programmers actually fixing the issue - Testers verifying that it was fixed Long live postgresql! :-) -- Notice of Confidentiality: This email is sent unencrypted over the network, and may be stored on several email servers; it can be read by third parties as easy as a postcard. Do not rely on email for confidential information. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[PERFORM] Limit + group + join
Consider this setup - which is a gross simplification of parts of our production system ;-) create table c (id integer primary key); create table b (id integer primary key, c_id integer); create index b_on_c on b(c_id) insert into c (select ... lots of IDs ...); insert into b (select id, id from c); /* keep it simple :-) */ Now, I'm just interessted in some few rows. All those gives good plans: explain select c.id from c order by c.id limit 1; explain select c.id from c group by c.id order by c.id limit 1; explain select c.id from c join b on c_id=c.id order by c.id limit 1; ... BUT ... combining join, group and limit makes havoc: explain select c.id from c join b on c_id=c.id group by c.id order by c.id desc limit 5; QUERY PLAN - Limit (cost=3809.65..3809.67 rows=5 width=4) - Group (cost=3809.65..3940.59 rows=26187 width=4) - Sort (cost=3809.65..3875.12 rows=26188 width=4) Sort Key: c.id - Hash Join (cost=559.34..1887.89 rows=26188 width=4) Hash Cond: (outer.id = inner.c_id) - Seq Scan on c (cost=0.00..403.87 rows=26187 width=4) - Hash (cost=403.87..403.87 rows=26187 width=4) - Seq Scan on b (cost=0.00..403.87 rows=26187 width=4) (9 rows) I get the same behaviour on pg 7.4.7 and pg 8.0.2. Of course, I can probably use subqueries instead of join - though, I would have wished the planner could do better ;-) -- Notice of Confidentiality: This information may be confidential, and blah-blah-blah - so please keep your eyes closed. Please delete and destroy this email. Failure to comply will cause my lawyer to yawn. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Limit + group + join
[Jeffrey W. Baker - Thu at 06:56:59PM -0700] explain select c.id from c join b on c_id=c.id group by c.id order by c.id desc limit 5; Where's b in this join clause? join b on c_id=c.id It just a funny way of writing: select c.id from c,b where c_id=c.id group by c.id order by c.id desc limit 5; It looks like a cartesian product to me. No. The query will return exactly the same as the simplest query: select c.id from c order by c.id desc limit 5; As said, this is a gross oversimplification of the production envorinment. In the production environment, I really need to use both join, group and limit. I tested a bit with subqueries, it was not a good solution (selecting really a lot of rows and aggregates from many of the tables). The next idea is to hack it up by manually finding out where the limit will cut, and place a restriction in the where-part of the query. -- Notice of Confidentiality: This information may be confidential, and blah-blah-blah - so please keep your eyes closed. Please delete and destroy this email. Failure to comply will cause my lawyer to yawn. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[PERFORM] partial index regarded more expensive
So, I have a table game with a timestamp attribute 'game_end', ranging from jan-2005 to present. The game table also have an attribute state, with live games beeing in state 2, and ended games beeing in state 4 (so, game_end+deltanow() usually means state=4). There are also an insignificant number of games in states 1,3. This query puzzles me: select * from game where game_end'2005-07-30' and state in (3,4); Now, one (at least me) should believe that the best index would be a partial index, resolved_game_by_date btree (game_end) WHERE ((state = 3) OR (state = 4)) NBET= explain analyze select * from game where game_end'2005-07-30' and state in (3,4); QUERY PLAN Index Scan using resolved_game_by_date on game (cost=0.00..7002.87 rows=7147 width=555) (actual time=0.220..86.234 rows=3852 loops=1) Index Cond: (game_end '2005-07-30 00:00:00'::timestamp without time zone) Filter: ((state = 3) OR (state = 4)) Total runtime: 90.568 ms (4 rows) Since state has only two significant states, I wouldn't believe this index to be any good: game_by_state btree (state) ...and it seems like I'm right: NBET= explain analyze select * from game where game_end'2005-07-30' and state in (3,4); QUERY PLAN Index Scan using game_by_state, game_by_state on game (cost=0.00..4413.78 rows=7147 width=555) (actual time=0.074..451.771 rows=3851 loops=1) Index Cond: ((state = 3) OR (state = 4)) Filter: (game_end '2005-07-30 00:00:00'::timestamp without time zone) Total runtime: 457.132 ms (4 rows) Now, how can the planner believe the game_by_state-index to be better? ('vacuum analyze game' did not significantly impact the numbers, and I've tried running the queries some times with and without the game_by_state-index to rule out cacheing effects) -- Tobias Brox This signature has been virus scanned, and is probably safe to read. This mail may contain confidential information, please keep your eyes closed. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] partial index regarded more expensive
[PFC - Wed at 08:15:13PM +0200] why not simply create an index on (game_end, state) ? No, the planner prefers to use the partial index (I dropped the index on game(state)). -- Tobias Brox, Nordicbet IT dept This signature has been virus scanned, and is probably safe to read. This mail may contain confidential information, please keep your eyes closed. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PERFORM] nice/low priority Query
The short question: Is there any ways to give postgresql a hint that a particular SQL call should be run at lower priority? Since every db connection has a pid, I can manually run renice to scheduele it by the OS - but of course I can't do it manually all the time. The long story: We have a constantly growing database, and hence also a constantly growing load on the database server. A hardware upgrade has for different reasons been postponed, and it's still beeing postponed. We were hitting the first capacity problems in June, though so far I've managed to keep the situation in check by tuning the configuration, adding indices, optimizing queries, doing cacheing in the application, and at one point in the code I'm even asking the database for explain plan, grepping out the estimated cost number, and referring the user to take contact with the IT-dept if he really needs the report. But I digress. Still there are lots of CPU power available - normally the server runs with 50-80% of the CPUs idle, it's just the spikes that kills us. We basically have two kind of queries that are significant - an ever-ongoing critical rush of simple queries, both reading and writing to the database, plus some few heavy non-critical read-only queries that may cause significant iowait. The problem comes when we are so unlucky that two or three heavy queries are run simultaneously; we get congestion problems - instead of the applications just running a bit slower, they run _much_ slower. Ideally, if it was trivial to give priorities, it should be possible to keep the CPUs running at 100% for hours without causing critical problems...? -- Tobias Brox, +47-91700050 Tromso, Norway ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] nice/low priority Query
[Tobias Brox - Tue at 06:04:34PM +0200] (...) and at one point in the code I'm even asking the database for explain plan, grepping out the estimated cost number, and referring the user to take contact with the IT-dept if he really needs the report. But I digress. I just came to think about some more dirty tricks I can do. I have turned on stats collection in the configuration; now, if I do: select count(*) from pg_stat_activity where not current_query like 'IDLE%'; or, eventually: select count(*) from pg_stat_activity where not current_query like 'IDLE%' and query_start+'1 second'now(); it will give a hint about how busy the database server is, thus I can eventually let the application sleep and retry if there are any other heavy queries in progress. -- Tobias Brox, +47-91700050 Nordicbet, IT dept ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org