[PERFORM] slow result
Hi,

I have a slow response from my PostgreSQL 7.4 database using the query
below on a table with 800,000 rows:

    select count(*) from tbl;

PostgreSQL returns the result in 28 sec every time, although MS-SQL
returns the result in 0.02 sec every time.

My server is a DELL PowerEdge 2600 with dual Xeon processors at 3.2 GHz
and 3 GB of RAM.

My PostgreSQL conf is:

    log_connections = yes
    syslog = 2
    effective_cache_size = 5
    sort_mem = 1
    max_connections = 200
    shared_buffers = 3000
    vacuum_mem = 32000
    wal_buffers = 8
    max_fsm_pages = 2000
    max_fsm_relations = 100

Can you tell me if there is a way to enhance performance?

Thank you

--
Laurent Manchon
Email: [EMAIL PROTECTED]
Re: [PERFORM] slow result
On 23.01.2007 at 11:34, Laurent Manchon wrote:

> Hi,
>
> I have a slow response from my PostgreSQL 7.4 database using the query
> below on a table with 800,000 rows:
>
>     select count(*) from tbl;

count(*) is doing a full table scan over all your 800,000 rows. This is
a well-known feature of Postgres :-/

So enhancing the performance is currently only possible by having faster
disk drives.

--
Heiko W. Rupp
[EMAIL PROTECTED], http://www.dpunkt.de/buch/3-89864-429-4.html

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match
Re: [PERFORM] slow result
On Tue, 23.01.2007 at 11:34:52 +0100, Laurent Manchon wrote the
following:

> Hi,
>
> I have a slow response from my PostgreSQL 7.4 database using the query
> below on a table with 800,000 rows:
>
>     select count(*) from tbl;

If I remember correctly, I saw this question yesterday on another
list... Answer: because PG forces a sequential scan. You can read a lot
about this in the archives. Here are some links to explanations:

http://www.pervasive-postgres.com/instantkb13/article.aspx?id=10117cNode=0T1L6L
http://sql-info.de/postgresql/postgres-gotchas.html#1_7
http://www.varlena.com/GeneralBits/49.php

Hope that helps,
Andreas

--
Andreas Kretschmer
Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (more: see header)
GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
Re: [PERFORM] slow result
On Tue, Jan 23, 2007 at 11:34:52AM +0100, Laurent Manchon wrote:

> I have a slow response from my PostgreSQL 7.4 database using the query
> below on a table with 800,000 rows:
>
>     select count(*) from tbl;

Contrary to your expectations, this is _not_ a query you'd expect to be
fast in Postgres. Try real queries from your application instead --
most likely, you'll find them to be much master. (If not, come back
with the query, the schema and the EXPLAIN ANALYZE output of your
query, and you'll usually get help nailing down the issues. :-) )

/* Steinar */
--
Homepage: http://www.sesse.net/
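[As an aside for archive readers: when an exact count is not required, a
common workaround is to read the planner's row estimate from the system
catalogs instead of scanning the table. This is only a sketch, not
advice from this thread; 'tbl' is the table name the poster used, and
reltuples is only as fresh as the last VACUUM or ANALYZE.]

```sql
-- Approximate row count from the catalogs; updated by VACUUM and
-- ANALYZE, not by every insert or delete, so it can lag reality.
SELECT reltuples::bigint AS approx_rows
FROM pg_class
WHERE relname = 'tbl';
```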
Re: [PERFORM] slow result
On Tue, Jan 23, 2007 at 11:55:41AM +0100, Steinar H. Gunderson wrote:

> you'll find them to be much master.

s/master/faster/

/* Steinar */
--
Homepage: http://www.sesse.net/
[PERFORM] slow result
Hi,

I have a slow response from my PostgreSQL 7.4 database using the query
below on a table with 800,000 rows:

    select count(*) from tbl;

PostgreSQL returns the result in 28 sec every time, although MS-SQL
returns the result in 0.02 sec every time.

My server is a DELL PowerEdge 2600 with dual Xeon processors at 3.2 GHz
and 3 GB of RAM.

My PostgreSQL conf is:

    log_connections = yes
    syslog = 2
    effective_cache_size = 5
    sort_mem = 1
    max_connections = 200
    shared_buffers = 3000
    vacuum_mem = 32000
    wal_buffers = 8
    max_fsm_pages = 2000
    max_fsm_relations = 100

Can you tell me if there is a way to enhance performance?

Thank you

--
Laurent Manchon
Email: [EMAIL PROTECTED]
Re: [PERFORM] slow result
On Tue, 23.01.2007 at 13:34:19 +0100, Laurent Manchon wrote the
following:

> Hi,
>
> I have a slow response from my PostgreSQL 7.4 database using the query
> below on a table with 800,000 rows:
>
>     select count(*) from tbl;

PLEASE READ THE ANSWERS TO YOUR OTHER MAILS.

Andreas

--
Andreas Kretschmer
Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (more: see header)
GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
[PERFORM] extract(field from timestamp) vs date dimension
Does anyone have experience with using Postgres for data warehousing?
Right, I saw one post suggesting mysql for a mostly read-only database
... but anyway, I think it's not a question of changing the database
platform for this project, at least not today ;-)

Ralph Kimball seems to be some kind of guru on data warehousing, and in
his books he strongly recommends having a date dimension: simply a
table describing all dates in the system, with attributes for what day
of the week it is, the month, the day of the month, the week number,
bank holidays, anything special, etc.

Well, it does make sense if you're adding lots of information there
that cannot easily be pulled out from elsewhere, but as for now I'm
mostly only interested in grouping turnover/profit by
weeks/months/quarters/years/weekdays. It seems bloated to store this
information; my gut feeling tells me it should be better to generate it
on the fly. Postgres even allows creating an index on an expression.

The question is ... I'm curious about what would yield the highest
performance when choosing between:

    select extract(week from created), ...
    from some_table
    where ...
    group by extract(week from created), ...
    order by extract(week from created), ...

and:

    select date_dim.week_num, ...
    from some_table join date_dim ...
    where ...
    group by date_dim.week_num, ...
    order by date_dim.week_num, ...

The date_dim table would eventually cover ~3 years of operation, that
is, less than 1000 rows.
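[For archive readers: the expression-index option mentioned above could
look roughly like this. A sketch only, using the some_table/created
names from the post; note that a query's expression must match the
indexed expression exactly for the planner to consider the index.]

```sql
-- Index the computed week so scans filtering or ordering on
-- extract(week from created) can use the index instead of a date_dim
-- join.  The inner parentheses are required for expression indexes.
CREATE INDEX some_table_week_idx
    ON some_table ((extract(week from created)));
```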
Re: [PERFORM] extract(field from timestamp) vs date dimension
On 1/23/07, Tobias Brox [EMAIL PROTECTED] wrote:

> Ralph Kimball seems to be some kind of guru on data warehousing, and
> in his books he strongly recommends having a date dimension: simply a
> table describing all dates in the system, and having

I would tend to agree with this line of thought.

> out from elsewhere - but as for now, I'm mostly only interested in
> grouping turnover/profit by weeks/months/quarters/years/weekdays. It
> seems bloated to store this information; my gut feeling tells me it
> should be better to generate it on the fly. Postgres even allows
> creating an index on an expression.

I guess go with your gut, but at some point the expressions are going
to be too complicated to maintain, and inefficient. Calendar tables are
very, very common, because traditional date functions simply can't
express business logic (especially things like month-end close,
quarter-end close, and year-end close) that doesn't have any repeating
pattern (every 4th Friday, 1st Monday in the quarter, etc). Sure, you
can stuff it into a function, but it just isn't as maintainable as a
table.

--
Chad
http://www.postgresqlforums.com/
Re: [PERFORM] extract(field from timestamp) vs date dimension
[Chad Wagner - Tue at 08:24:34AM -0500]

> I guess go with your gut, but at some point the expressions are going
> to be too complicated to maintain, and inefficient.

The layout of my system is quite flexible, so it should eventually be
fairly trivial to throw in a date dimension at a later stage.

> Calendar tables are very, very common, because traditional date
> functions simply can't express business logic (especially things like
> month-end close, quarter-end close, and year-end close) that doesn't
> have any repeating pattern (every 4th Friday, 1st Monday in the
> quarter, etc). Sure, you can stuff it into a function, but it just
> isn't as maintainable as a table.

So far I haven't been bothered with anything more complex than clean
weeks, months, quarters, etc. I suppose the strongest argument for
introducing a date dimension already now is that I will probably
benefit from having conformed and well-designed dimensions when I
introduce more data marts. As for now I have only one fact table and a
few dimensions in the system.
Re: [PERFORM] slow result
In response to Laurent Manchon [EMAIL PROTECTED]:

> I have a slow response from my PostgreSQL 7.4 database using the query
> below on a table with 800,000 rows:
>
>     select count(*) from tbl;
>
> PostgreSQL returns the result in 28 sec every time, although MS-SQL
> returns the result in 0.02 sec every time.
>
> My server is a DELL PowerEdge 2600 with dual Xeon processors at
> 3.2 GHz and 3 GB of RAM.

While there's truth in everything that's been said by others, the query
should not take _that_ long. I just tried a count(*) on a table with
460,000 rows, and it took less than a second. count(*) in PostgreSQL is
not likely to compare well to most other RDBMSs for the reasons others
have stated, but counting 800,000 rows shouldn't take 28 seconds.

The standard question applies: have you vacuumed recently?

> My PostgreSQL conf is:
>
>     log_connections = yes
>     syslog = 2
>     effective_cache_size = 5
>     sort_mem = 1
>     max_connections = 200
>     shared_buffers = 3000
>     vacuum_mem = 32000
>     wal_buffers = 8
>     max_fsm_pages = 2000
>     max_fsm_relations = 100
>
> Can you tell me if there is a way to enhance performance?

On our 4G machines, we use shared_buffers=24 (which equates to about
2G). The only reason I don't set it higher is that FreeBSD has a limit
on shared memory of 2G.

The caveat here is that I'm running a mix of 8.1 and 8.2. There have
been significant improvements in both the usage of shared memory and
the optimization of count(*) since 7.4, so the first suggestion I have
is to upgrade your installation.

--
Bill Moran
Collaborative Fusion Inc.
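[For archive readers: Bill's "have you vacuumed recently?" can be
checked and acted on roughly like this. A sketch only, reusing the
'tbl' name from the original post.]

```sql
-- Reclaim dead tuples and refresh planner statistics, then inspect the
-- table's size in the catalogs: a badly bloated table shows far more
-- pages (relpages) than its live row count (reltuples) justifies.
VACUUM ANALYZE tbl;
SELECT relpages, reltuples FROM pg_class WHERE relname = 'tbl';
```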
Re: [PERFORM] slow result
At 07:34 AM 1/23/2007, Laurent Manchon wrote:

> Hi,
>
> I have a slow response from my PostgreSQL 7.4 database using the query
> below on a table with 800,000 rows:

1= Upgrade to the latest stable version of pg. That would be 8.2.x.
You are very much in the Dark Ages pg-version-wise. pg 8.x has
significant IO enhancements, especially compared to 7.4.

>     select count(*) from tbl;
>
> PostgreSQL returns the result in 28 sec every time, although MS-SQL
> returns the result in 0.02 sec every time.

2= pg actually counts how many rows there are in a table. MS-SQL looks
up a count value from an internal data table... which can be wrong in
extraordinarily rare circumstances in an MVCC DBMS (which MS-SQL is
!not!; MS-SQL uses the older hierarchical locking strategy for data
protection).

Since pg actually scans the table for the count, pg's count will always
be correct, no matter what. Since MS-SQL does not use MVCC, it does not
have to worry about the corner MVCC cases that pg does. OTOH, MVCC
_greatly_ reduces the number of cases where one transaction can block
another compared to the locking strategy used in MS-SQL. This means
that in real day-to-day operation, pg is very likely to handle OLTP
loads and heavy loads better than MS-SQL will.

In addition, MS-SQL is a traditional Codd & Date table-oriented DBMS.
pg is an object-relational DBMS. Two very different products with very
different considerations and goals (and initially designed at very
different times historically).

Compare them under real loads using real queries if you are going to
compare them. Comparing pg and MS-SQL using fluff queries like count(*)
is both misleading and a waste of effort.

> My server is a DELL PowerEdge 2600 with dual Xeon processors at
> 3.2 GHz and 3 GB of RAM. My PostgreSQL conf is:
>
>     log_connections = yes
>     syslog = 2
>     effective_cache_size = 5
>     sort_mem = 1
>     max_connections = 200
>     shared_buffers = 3000
>     vacuum_mem = 32000
>     wal_buffers = 8
>     max_fsm_pages = 2000
>     max_fsm_relations = 100
>
> Can you tell me if there is a way to enhance performance?

There are extensive FAQs on what the above values should be for pg. The
lore is very different for pg 8.x vs pg 7.x.

> Thank you

You're welcome.

Ron Peacetree
[PERFORM] Postgres processes have a burst of CPU usage
Hello all,

I have a setup in which four client machines access a Postgres database
(8.1.1) on a Linux box. So, there are connections from each machine to
the database; hence, the Linux box has about 2 postgres processes
associated with each machine. I am using the JDBC driver
(postgresql-8.1-404.jdbc3.jar) to talk to the database. I am also using
the Spring framework (1.2.2) and Hibernate (3.0.5) on top of JDBC, and
Apache's DBCP database connection pool (1.2.1).

Now, there is one particular update that I make from one of the client
machines; this involves a reasonably large object graph (from the Java
point of view). It deletes a bunch of rows (around 20 rows in all) in
4-5 tables and inserts another bunch into the same tables.

When I do this, I see a big spike in the CPU usage of the postgres
processes associated with ALL the client machines, not just the one I
executed the delete/insert operation on. The spike seems to happen a
second or two AFTER the original update completes and lasts for a few
seconds.

Is it that this operation is forcibly clearing some client cache on ALL
the postgres processes? Why is there such an interdependency? Can I set
some parameter to turn this off?

Regards and thanks,
S. Aiylam
Re: [PERFORM] extract(field from timestamp) vs date dimension
On 1/23/07, Tobias Brox [EMAIL PROTECTED] wrote:

> Does anyone have experience with using Postgres for data warehousing?
>
> Ralph Kimball seems to be some kind of guru on data warehousing, and
> in his books he strongly recommends having a date dimension: simply a
> table describing all dates in the system, with attributes for what
> day of the week it is, the month, the day of the month, the week
> number, bank holidays, anything special, etc.
>
> It seems bloated to store this information; my gut feeling tells me
> it should be better to generate it on the fly. Postgres even allows
> creating an index on an expression.
>
> The date_dim table would eventually cover ~3 years of operation, that
> is, less than 1000 rows.

In my opinion, I would make a date_dim table for this case. I would
strongly advise against making a date_id field; just use the date
itself as the p-key (I wouldn't bother with RI links to the table,
though). I would also, however, make a function and use it to build the
records:

    create or replace function make_date_dim(in_date date)
      returns date_dim as $$
    [...]

And make date_dim records this way:

    insert into date_dim
      select * from make_date_dim('01/01/2001'::date);

(or pre-insert with generate_series).

Now you get the best of both worlds: you can join to the table for the
general case, or index via the function for special-case indexes. For
example, suppose you had to frequently count an account's sales by
fiscal quarter irrespective of year:

    create index q_sales_idx on account_sale
      (account_no, (make_date_dim(sale_date)).fiscal_quarter);

Also, you can use the function in place of a join if you want. In some
cases the join may be better, though.

merlin
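[For archive readers: the "pre-insert with generate_series" idea could
look roughly like this. A sketch only: it assumes the hypothetical
make_date_dim function and date_dim type from Merlin's post, and the
timestamp form of generate_series, which is built in from 8.0 on.]

```sql
-- Populate three years of dates in one statement; every name here
-- follows the hypothetical schema sketched above.
insert into date_dim
select (make_date_dim(d::date)).*
from generate_series('2001-01-01'::timestamp,
                     '2003-12-31'::timestamp,
                     interval '1 day') as g(d);
```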
Re: [PERFORM] slow result
On Tue, Jan 23, 2007 at 11:34:52 +0100, Laurent Manchon
[EMAIL PROTECTED] wrote:

> Hi,
>
> I have a slow response from my PostgreSQL 7.4 database using the query
> below on a table with 800,000 rows:
>
>     select count(*) from tbl;
>
> PostgreSQL returns the result in 28 sec every time, although MS-SQL
> returns the result in 0.02 sec every time.

Besides the other advice mentioned in this thread, check that you don't
have a lot of dead tuples in that table. 28 seconds seems a bit high
for even a sequential scan of 800,000 tuples unless they are pretty
large.
[PERFORM] Bad Row Count Estimate on View with 8.2
Hello,

I discovered a query which is taking 70 seconds on 8.2.1 which used to
take under a second on 8.1.2. I was digging into what was causing it,
and I believe the problem is a view which the planner estimates will
return 1 row when it actually returns 3500. When I join with the view,
the planner ends up using a nested loop because it thinks the right
branch will run once instead of 3500 times. I've analyzed all the
tables and played around with default_statistics_target, but still the
planner estimates 1 row. I was wondering if anybody else has any other
ideas?

Here is the query the view is defined as:

    SELECT foo.fund_id, foo.owner_trader_id, foo.strategy_id,
           foo.cf_account_id, foo.instrument_id,
           sum(foo.pos) AS pos, sum(foo.cost) AS cost
    FROM (
        SELECT om_position.fund_id, om_position.owner_trader_id,
               om_position.strategy_id, om_position.cf_account_id,
               om_position.instrument_id, om_position.pos,
               om_position.cost
        FROM om_position
        WHERE om_position.as_of_date = date(now())
        UNION ALL
        SELECT om_trade.fund_id, om_trade.owner_trader_id,
               om_trade.strategy_id, om_trade.cf_account_id,
               om_trade.instrument_id,
               om_trade.qty::numeric(22,9) AS pos, om_trade.cost
        FROM om_trade
        WHERE om_trade.process_state = 0 OR om_trade.process_state = 2
    ) foo
    GROUP BY foo.fund_id, foo.owner_trader_id, foo.strategy_id,
             foo.cf_account_id, foo.instrument_id;

Here is explain analyze from both 8.1.2 and 8.2.1 with
default_statistics_target=10 and tables freshly analyzed:

8.1.2:

    HashAggregate  (cost=4760.33..4764.95 rows=308 width=168) (actual time=56.873..71.293 rows=3569 loops=1)
      ->  Append  (cost=0.00..4675.85 rows=3072 width=54) (actual time=0.037..38.261 rows=3715 loops=1)
            ->  Index Scan using as_of_date_om_position_index on om_position  (cost=0.00..4637.10 rows=3071 width=54) (actual time=0.031..14.722 rows=3559 loops=1)
                  Index Cond: (as_of_date = date(now()))
            ->  Bitmap Heap Scan on om_trade  (cost=4.01..8.03 rows=1 width=48) (actual time=0.118..0.917 rows=156 loops=1)
                  Recheck Cond: ((process_state = 0) OR (process_state = 2))
                  ->  BitmapOr  (cost=4.01..4.01 rows=1 width=0) (actual time=0.079..0.079 rows=0 loops=1)
                        ->  Bitmap Index Scan on om_trade_partial_process_state_index  (cost=0.00..2.00 rows=1 width=0) (actual time=0.060..0.060 rows=156 loops=1)
                              Index Cond: (process_state = 0)
                        ->  Bitmap Index Scan on om_trade_partial_process_state_index  (cost=0.00..2.00 rows=1 width=0) (actual time=0.008..0.008 rows=0 loops=1)
                              Index Cond: (process_state = 2)
    Total runtime: 82.398 ms

8.2.1:

    HashAggregate  (cost=6912.51..6912.53 rows=1 width=200) (actual time=19.005..24.137 rows=3569 loops=1)
      ->  Append  (cost=0.00..6406.73 rows=28902 width=200) (actual time=0.037..11.569 rows=3715 loops=1)
            ->  Index Scan using as_of_date_om_position_index on om_position  (cost=0.00..4333.82 rows=2964 width=53) (actual time=0.035..4.884 rows=3559 loops=1)
                  Index Cond: (as_of_date = date(now()))
            ->  Bitmap Heap Scan on om_trade  (cost=464.40..1783.89 rows=25938 width=49) (actual time=0.060..0.380 rows=156 loops=1)
                  Recheck Cond: ((process_state = 0) OR (process_state = 2))
                  ->  BitmapOr  (cost=464.40..464.40 rows=308 width=0) (actual time=0.041..0.041 rows=0 loops=1)
                        ->  Bitmap Index Scan on om_trade_partial_process_state_index  (cost=0.00..225.72 rows=154 width=0) (actual time=0.032..0.032 rows=156 loops=1)
                              Index Cond: (process_state = 0)
                        ->  Bitmap Index Scan on om_trade_partial_process_state_index  (cost=0.00..225.72 rows=154 width=0) (actual time=0.003..0.003 rows=0 loops=1)
                              Index Cond: (process_state = 2)
    Total runtime: 27.193 ms

Here is explain analyze from 8.2.1 with default_statistics_target=1000
and tables freshly analyzed:

    HashAggregate  (cost=5344.36..5344.37 rows=1 width=200) (actual time=18.826..23.950 rows=3569 loops=1)
      ->  Append  (cost=0.00..5280.01 rows=3677 width=200) (actual time=0.031..11.606 rows=3715 loops=1)
            ->  Index Scan using as_of_date_om_position_index on om_position  (cost=0.00..5224.44 rows=3502 width=54) (actual time=0.029..4.903 rows=3559 loops=1)
                  Index Cond: (as_of_date = date(now()))
            ->  Bitmap Heap Scan on om_trade  (cost=9.91..18.79 rows=175 width=49) (actual time=0.069..0.394 rows=156 loops=1)
                  Recheck Cond: ((process_state = 0) OR (process_state = 2))
                  ->  BitmapOr  (cost=9.91..9.91 rows=2 width=0) (actual time=0.050..0.050 rows=0 loops=1)
                        ->  Bitmap Index Scan on om_trade_partial_process_state_index  (cost=0.00..5.57 rows=2 width=0) (actual time=0.039..0.039 rows=156 loops=1)
                              Index Cond: (process_state = 0)
                        ->  Bitmap Index Scan on
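[For archive readers: one finer-grained variant of what the poster
tried. Rather than raising default_statistics_target globally, the
statistics target can be raised per column and the tables re-analyzed.
A sketch only; the target values are illustrative, and the table/column
names come from the post above. This does not address the
estimate_num_groups issue discussed later in the thread.]

```sql
-- Collect larger per-column histograms for the columns the view
-- filters on, then refresh statistics for just those tables.
ALTER TABLE om_trade    ALTER COLUMN process_state SET STATISTICS 1000;
ALTER TABLE om_position ALTER COLUMN as_of_date    SET STATISTICS 1000;
ANALYZE om_trade;
ANALYZE om_position;
```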
Re: [PERFORM] slow result
Laurent Manchon wrote:

> Hi,
>
> I have a slow response from my PostgreSQL 7.4 database using the query
> below on a table with 800,000 rows:
>
>     select count(*) from tbl;
>
> PostgreSQL returns the result in 28 sec every time.

Can you post the results of:

    analyze verbose tbl;
    explain analyze select count(*) from tbl;

The first will give us some info about how many pages tbl has (in 7.4,
ISTR, it does not state the # of dead rows... but anyway); the second
should help us deduce why it is so slow. Also, as others have pointed
out, later versions are quite a bit faster for sequential scans...

Cheers
Mark
Re: [PERFORM] extract(field from timestamp) vs date dimension
Tobias Brox wrote:

> I suppose the strongest argument for introducing date dimensions
> already now is that I will probably benefit from having conformed and
> well-designed dimensions when I introduce more data marts. As for now
> I have only one fact table and a few dimensions in the system.

Another factor to consider is that end-user tools (and end users) may
find a date/time dimension helpful.

Best wishes
Mark
[PERFORM] how to plan for vacuum?
Hi,

Since I cannot find much information about how to use vacuum, I want to
ask for some general guidance on vacuum planning.

1. How do we know whether autovacuum is enough for my application, or
   whether I should set up a manual vacuum from cron?

2. How should the GUC parameters for autovacuum be set? There are two
   sets of parameters for autovacuum:

   - vacuum threshold and scale factor (500/0.2)
   - analyze threshold and scale factor (250/0.1)

   Is there any guideline for setting these parameters? When does it
   make sense to change the default values?

3. How should cost-based vacuum delay be tuned? I have searched the
   performance list; it seems that most of the practice is based on
   experience / a trial-and-error approach to meet the requirements of
   disk utilization or CPU utilization. Is there any other guideline
   for setting them?

When autovacuum is turned on by default, if the vacuum parameters have
not been set well, it will make the system rather unstable. So I wonder
if we should add a section to the manual with vacuum tips, so that
users can easily set the vacuum parameters for their systems.

Best Regards
Galy Lee
NTT OSS Center
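[For archive readers: the two parameter sets and the cost-based delay
the poster asks about all live in postgresql.conf in the 8.1/8.2 era.
The values below are illustrative examples only, not recommendations
from this thread.]

```
# postgresql.conf fragment (8.1/8.2 naming) -- example values
autovacuum = on
autovacuum_naptime = 60                 # seconds between checks of each database
autovacuum_vacuum_threshold = 500       # base tuple count before a table is vacuumed
autovacuum_vacuum_scale_factor = 0.2    # plus this fraction of the table's size
autovacuum_analyze_threshold = 250
autovacuum_analyze_scale_factor = 0.1
autovacuum_vacuum_cost_delay = 10       # ms; > 0 throttles vacuum's I/O impact
autovacuum_vacuum_cost_limit = 200      # cost accumulated before each sleep
```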
Re: [PERFORM] Bad Row Count Estimate on View with 8.2
Dave Dutcher [EMAIL PROTECTED] writes:

> I discovered a query which is taking 70 seconds on 8.2.1 which used
> to take under a second on 8.1.2. I was digging into what was causing
> it, and I believe the problem is a view which the planner estimates
> will return 1 row when it actually returns 3500.

This is evidently a failure of estimate_num_groups(). However, I do not
see any difference in that code between 8.1 and 8.2 branch tips. I do
notice a possibly-relevant change that was applied in 8.1.4:

    2006-05-02 00:34  tgl

        * src/backend/: optimizer/path/allpaths.c,
        utils/adt/selfuncs.c (REL8_1_STABLE): Avoid assuming that
        statistics for a parent relation reflect the properties of the
        union of its child relations as well.  This might have been a
        good idea when it was originally coded, but it's a fatally bad
        idea when inheritance is being used for partitioning.  It's
        better to have no stats at all than completely misleading
        stats.  Per report from Mark Liberman.

        The bug arguably exists all the way back, but I've only patched
        HEAD and 8.1 because we weren't particularly trying to support
        partitioning before 8.1.  Eventually we ought to look at
        deriving union statistics instead of just punting, but for now
        the drop kick looks good.

I think this was only meant to apply to table inheritance situations,
but on reflection it might affect UNION queries too. The question is
whether the numbers it was using before really mean anything --- they
seem to have been better-than-nothing in your particular case, but I'm
not sure that translates to a conclusion that we should depend on 'em.
In fact, since there isn't any parent relation in a UNION, I'm not sure
that this patch actually changed your results ... but I'm not seeing
what else would've ...

			regards, tom lane
Re: [PERFORM] how to plan for vacuum?
Just to have one example here:

    workload:      pgbench running 365x24x7
    database size: 100 GB
    workload distribution:
        06:00-24:00  100 tps
        00:00-06:00   20 tps

How should we plan vacuum for this situation to get the highest
performance?

Best regards
Galy

Galy Lee wrote:

> Hi,
>
> Since I cannot find much information about how to use vacuum, I want
> to ask for some general guidance on vacuum planning.
>
> 1. How do we know whether autovacuum is enough for my application, or
>    whether I should set up a manual vacuum from cron?
>
> 2. How should the GUC parameters for autovacuum be set? There are two
>    sets of parameters for autovacuum:
>
>    - vacuum threshold and scale factor (500/0.2)
>    - analyze threshold and scale factor (250/0.1)
>
>    Is there any guideline for setting these parameters? When does it
>    make sense to change the default values?
>
> 3. How should cost-based vacuum delay be tuned? I have searched the
>    performance list; it seems that most of the practice is based on
>    experience / a trial-and-error approach to meet the requirements
>    of disk utilization or CPU utilization. Is there any other
>    guideline for setting them?
>
> When autovacuum is turned on by default, if the vacuum parameters
> have not been set well, it will make the system rather unstable. So I
> wonder if we should add a section to the manual with vacuum tips, so
> that users can easily set the vacuum parameters for their systems.
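[For archive readers: one common pattern for a workload with a known
quiet window like 00:00-06:00 is a nightly manual pass from cron on top
of a throttled autovacuum. A sketch only: the database name "bench" and
user "postgres" are made up, and the right frequency depends on the
actual update rate.]

```
# crontab fragment -- hypothetical database "bench" and user "postgres"
# Run a database-wide VACUUM ANALYZE at 01:00, inside the 20 tps window.
0 1 * * *  vacuumdb --analyze --quiet -U postgres bench
```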