Re: [PERFORM] strange performance regression between 7.4 and 8.1
I would just like to note here that this is an example of inefficient strategy. We could all agree (up to a certain economical point) that Alex saved the most expensive one thousand dollars of his life. I don't know the financial status nor the size of your organization, but I'm sure that you have selected the path that has cost you more. In the future, an investment on memory for a (let's say) rather small database should be your first attempt. Yours, Rodrigo Madera On 3/6/07, Alex Deucher [EMAIL PROTECTED] wrote: On 3/6/07, Ron [EMAIL PROTECTED] wrote: At 10:25 AM 3/6/2007, Alex Deucher wrote: On 3/5/07, Guido Neitzer [EMAIL PROTECTED] wrote: On 05.03.2007, at 19:56, Alex Deucher wrote: Yes, I started setting that up this afternoon. I'm going to test that tomorrow and post the results. Good - that may or may not give some insight in the actual bottleneck. You never know but it seems to be one of the easiest to find out ... Well, the SAN appears to be the limiting factor. I set up the DB on the local scsi discs (software RAID 1) and performance is excellent (better than the old server). Thanks for everyone's help. Alex What kind of SAN is it and how many + what kind of HDs are in it? Assuming the answers are reasonable... It's a Hitachi WMS/Tagmastore. 105 hitachi SATA drives; 4 Gbps FC. Profile the table IO pattern your workload generates and start allocating RAID sets to tables or groups of tables based on IO pattern. For any table or group of tables that has a significant level of write IO, say = ~25% of the IO mix, try RAID 5 or 6 first, but be prepared to go RAID 10 if performance is not acceptable. Right now it's designed for max capacity: big RAID 5 groups. I expect I'll probably need RAID 10 for decent performance. Don't believe any of the standard lore regarding what tables to put where or what tables to give dedicated spindles to. Profile, benchmark, and only then start allocating dedicated resources. For instance, I've seen situations where putting pg_xlog on its own spindles was !not! the right thing to do. Right. Thanks for the advice. I'll post my results when I get around to testing some new SAN configurations. Alex ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] strange performance regression between 7.4 and 8.1
Rodrigo Madera wrote: I would just like to note here that this is an example of inefficient strategy. We could all agree (up to a certain economical point) that Alex saved the most expensive one thousand dollars of his life. I don't know the financial status nor the size of your organization, but I'm sure that you have selected the path that has cost you more. In the future, an investment on memory for a (let's say) rather small database should be your first attempt. Alex may have made the correct, rational choice, given the state of accounting at most corporations. Corporate accounting practices and the budgetary process give different weights to cash and labor. Labor is fixed, and can be grossly wasted without (apparently) affecting the quarterly bottom line. Cash expenditures come directly off profits. It's shortsighted and irrational, but nearly 100% of corporations operate this way. You can waste a week of your time and nobody complains, but spend a thousand dollars, and the company president is breathing down your neck. When we answer a question on this forum, we need to understand that the person who needs help may be under irrational, but real, constraints, and offer appropriate advice. Sure, it's good to fight corporate stupidity, but sometimes you just want to get the system back online. Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] strange performance regression between 7.4 and 8.1
At 01:34 PM 3/8/2007, Craig A. James wrote: Rodrigo Madera wrote: I would just like to note here that this is an example of inefficient strategy. We could all agree (up to a certain economical point) that Alex saved the most expensive one thousand dollars of his life. I don't know the financial status nor the size of your organization, but I'm sure that you have selected the path that has cost you more. In the future, an investment on memory for a (let's say) rather small database should be your first attempt. Alex may have made the correct, rational choice, given the state of accounting at most corporations. Corporate accounting practices and the budgetary process give different weights to cash and labor. Labor is fixed, and can be grossly wasted without (apparently) affecting the quarterly bottom line. Cash expenditures come directly off profits. It's shortsighted and irrational, but nearly 100% of corporations operate this way. You can waste a week of your time and nobody complains, but spend a thousand dollars, and the company president is breathing down your neck. When we answer a question on this forum, we need to understand that the person who needs help may be under irrational, but real, constraints, and offer appropriate advice. Sure, it's good to fight corporate stupidity, but sometimes you just want to get the system back online. Craig All good points. However, when we allow or help (even tacitly by looking the other way) our organizations to waste IT dollars we increase the risk that we are going to be paid less because there's less money. Or even that we will be unemployed because there's less money (as in we wasted enough money we went out of business). The correct strategy is to Speak Their Language (tm) to the accounting and management folks and give them the information needed to Do The Right Thing (tm) (or at least authorize you doing it ;-) ). They may still not be / act sane, but at that point your hands are clean. (...and if your organization has a habit of Not Listening to Reason (tm), strongly consider finding a new job before you are forced to by their fiscal or managerial irresponsibility.) Cap Ex may not be the same as Discretionary Expenses, but at the end of the day dollars are dollars. Any we spend in one place can't be spent in any other place; and there's a finite pile of them. Spending 10x as much in labor and opportunity costs (you can only do one thing at a time...) as you would on CapEx to address a problem is simply not smart money management nor good business. Even spending 2x as much in that fashion is probably not. Cheers, Ron Peacetree ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] strange performance regression between 7.4 and 8.1
I would just like to note here that this is an example of inefficient strategy. [ ... ] Alex may have made the correct, rational choice, given the state of accounting at most corporations. Corporate accounting practices and the budgetary process give different weights to cash and labor. Labor is fixed, and can be grossly wasted without (apparently) affecting the quarterly bottom line. Cash expenditures come directly off profits. It's shortsighted and irrational, but nearly 100% of corporations operate this way. You can waste a week of your time and nobody complains, but spend a thousand dollars, and the company president is breathing down your neck. When we answer a question on this forum, we need to understand that the person who needs help may be under irrational, but real, constraints, and offer appropriate advice. Sure, it's good to fight corporate stupidity, but sometimes you just want to get the system back online. Another thing --- which may or may not apply to Alex's case and to the particular state of the thread, but it's still related and IMHO important to take into account: There may be other consrtaints that makes it impossible to even consider a memory upgrade --- for example, us (our project). We *rent* the servers from a Web hoster (dedicated servers). This particular hoster does not even offer the possibility of upgrading the hardware --- 2GB of RAM, take it r leave it. Period. In other cases, the memory upgrade has a *monthly* cost (and quite often I find it excessive --- granted, that may be just me). So, $50 or $100 per month *additional* expenses may be considerable. Now, yet another thing that you (Craig) seem to be missing: you're simply putting the expense of all this time under the expenses column in exchange for solving the particular problem --- gaining the insight on the internals and performance tuning techniques for PG may well be worth tens of thousands of dollars for his company in the future. The quick and dirty solution is not giving a damn about knowledge but to the ability to solve the problem at hand *now*, at whatever petty cash cost because it looks more cost effective (when seen from the non-irrational accounting point of view, that is) --- but isn't going for the quick and dirty solution without learning anything from the experience also shortsighted ??? Carlos -- ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] strange performance regression between 7.4 and 8.1
Ron wrote: Speak Their Language (tm) [ ... ] Do The Right Thing (tm) [...] Not Listening to Reason (tm), [...] fiscal or managerial irresponsibility.) And *here*, of all the instances, you don't put a (TM) sign ?? Tsk-tsk-tsk :-) Carlos -- ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 3/5/07, Guido Neitzer [EMAIL PROTECTED] wrote: On 05.03.2007, at 19:56, Alex Deucher wrote: Yes, I started setting that up this afternoon. I'm going to test that tomorrow and post the results. Good - that may or may not give some insight in the actual bottleneck. You never know but it seems to be one of the easiest to find out ... Well, the SAN appears to be the limiting factor. I set up the DB on the local scsi discs (software RAID 1) and performance is excellent (better than the old server). Thanks for everyone's help. Alex ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] strange performance regression between 7.4 and 8.1
At 10:25 AM 3/6/2007, Alex Deucher wrote: On 3/5/07, Guido Neitzer [EMAIL PROTECTED] wrote: On 05.03.2007, at 19:56, Alex Deucher wrote: Yes, I started setting that up this afternoon. I'm going to test that tomorrow and post the results. Good - that may or may not give some insight in the actual bottleneck. You never know but it seems to be one of the easiest to find out ... Well, the SAN appears to be the limiting factor. I set up the DB on the local scsi discs (software RAID 1) and performance is excellent (better than the old server). Thanks for everyone's help. Alex What kind of SAN is it and how many + what kind of HDs are in it? Assuming the answers are reasonable... Profile the table IO pattern your workload generates and start allocating RAID sets to tables or groups of tables based on IO pattern. For any table or group of tables that has a significant level of write IO, say = ~25% of the IO mix, try RAID 5 or 6 first, but be prepared to go RAID 10 if performance is not acceptable. Don't believe any of the standard lore regarding what tables to put where or what tables to give dedicated spindles to. Profile, benchmark, and only then start allocating dedicated resources. For instance, I've seen situations where putting pg_xlog on its own spindles was !not! the right thing to do. Best Wishes, Ron Peacetree ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 3/6/07, Ron [EMAIL PROTECTED] wrote: At 10:25 AM 3/6/2007, Alex Deucher wrote: On 3/5/07, Guido Neitzer [EMAIL PROTECTED] wrote: On 05.03.2007, at 19:56, Alex Deucher wrote: Yes, I started setting that up this afternoon. I'm going to test that tomorrow and post the results. Good - that may or may not give some insight in the actual bottleneck. You never know but it seems to be one of the easiest to find out ... Well, the SAN appears to be the limiting factor. I set up the DB on the local scsi discs (software RAID 1) and performance is excellent (better than the old server). Thanks for everyone's help. Alex What kind of SAN is it and how many + what kind of HDs are in it? Assuming the answers are reasonable... It's a Hitachi WMS/Tagmastore. 105 hitachi SATA drives; 4 Gbps FC. Profile the table IO pattern your workload generates and start allocating RAID sets to tables or groups of tables based on IO pattern. For any table or group of tables that has a significant level of write IO, say = ~25% of the IO mix, try RAID 5 or 6 first, but be prepared to go RAID 10 if performance is not acceptable. Right now it's designed for max capacity: big RAID 5 groups. I expect I'll probably need RAID 10 for decent performance. Don't believe any of the standard lore regarding what tables to put where or what tables to give dedicated spindles to. Profile, benchmark, and only then start allocating dedicated resources. For instance, I've seen situations where putting pg_xlog on its own spindles was !not! the right thing to do. Right. Thanks for the advice. I'll post my results when I get around to testing some new SAN configurations. Alex ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PERFORM] strange performance regression between 7.4 and 8.1
Hello, I have noticed a strange performance regression and I'm at a loss as to what's happening. We have a fairly large database (~16 GB). The original postgres 7.4 was running on a sun v880 with 4 CPUs and 8 GB of ram running Solaris on local scsi discs. The new server is a sun Opteron box with 4 cores, 8 GB of ram running postgres 8.1.4 on Linux (AMD64) on a 4 Gbps FC SAN volume. When we created the new database it was created from scratch rather than copying over the old one, however the table structure is almost identical (UTF8 on the new one vs. C on the old). The problem is queries are ~10x slower on the new hardware. I read several places that the SAN might be to blame, but testing with bonnie and dd indicates that the SAN is actually almost twice as fast as the scsi discs in the old sun server. I've tried adjusting just about every option in the postgres config file, but performance remains the same. Any ideas? Thanks, Alex ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 01.03.2007, at 13:40, Alex Deucher wrote: I read several places that the SAN might be to blame, but testing with bonnie and dd indicates that the SAN is actually almost twice as fast as the scsi discs in the old sun server. I've tried adjusting just about every option in the postgres config file, but performance remains the same. Any ideas? As mentioned last week: Did you actually try to use the local drives for speed testing? It might be that the SAN introduces latency especially for random access you don't see on local drives. cug ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 3/5/07, Guido Neitzer [EMAIL PROTECTED] wrote: On 01.03.2007, at 13:40, Alex Deucher wrote: I read several places that the SAN might be to blame, but testing with bonnie and dd indicates that the SAN is actually almost twice as fast as the scsi discs in the old sun server. I've tried adjusting just about every option in the postgres config file, but performance remains the same. Any ideas? As mentioned last week: Did you actually try to use the local drives for speed testing? It might be that the SAN introduces latency especially for random access you don't see on local drives. Yes, I started setting that up this afternoon. I'm going to test that tomorrow and post the results. Alex cug ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] strange performance regression between 7.4 and 8.1
* Alex Deucher: I have noticed a strange performance regression and I'm at a loss as to what's happening. We have a fairly large database (~16 GB). Sorry for asking, but is this a typo? Do you mean 16 *TB* instead of 16 *GB*? If it's really 16 GB, you should check if it's cheaper to buy more RAM than to fiddle with the existing infrastructure. however the table structure is almost identical (UTF8 on the new one vs. C on the old). Locale settings make a huge difference for sorting and LIKE queries. We usually use the C locale and SQL_ASCII encoding, mostly for performance reasons. (Proper UTF-8 can be enforced through constraints if necessary.) -- Florian Weimer[EMAIL PROTECTED] BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99 ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] strange performance regression between 7.4 and 8.1
Florian Weimer wrote: * Alex Deucher: I have noticed a strange performance regression and I'm at a loss as to what's happening. We have a fairly large database (~16 GB). Sorry for asking, but is this a typo? Do you mean 16 *TB* instead of 16 *GB*? If it's really 16 GB, you should check if it's cheaper to buy more RAM than to fiddle with the existing infrastructure. This brings me to a related question: Do I need to specifically configure something to take advantage of such increase of RAM? In particular, is the amount of things that postgres can do with RAM limited by the amount of shared_buffers or some other parameter? Should shared_buffers be a fixed fraction of the total amount of physical RAM, or should it be the total amount minus half a gigabyte or so? As an example, if one upgrades a host from 1GB to 4GB, what would be the right thing to do in the configuration, assuming 8.1 or 8.2? (at least what would be the critical aspects?) Thanks, Carlos -- ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 3/2/07, Florian Weimer [EMAIL PROTECTED] wrote: * Alex Deucher: I have noticed a strange performance regression and I'm at a loss as to what's happening. We have a fairly large database (~16 GB). Sorry for asking, but is this a typo? Do you mean 16 *TB* instead of 16 *GB*? If it's really 16 GB, you should check if it's cheaper to buy more RAM than to fiddle with the existing infrastructure. Yes, 16 GB. I'd rather not shell out for more ram, if I'm not even sure that will help. The new system should be faster, or at least as fast, so I'd like to sort out what's going on before I buy more ram. however the table structure is almost identical (UTF8 on the new one vs. C on the old). Locale settings make a huge difference for sorting and LIKE queries. We usually use the C locale and SQL_ASCII encoding, mostly for performance reasons. (Proper UTF-8 can be enforced through constraints if necessary.) I suppose that might be a factor. How much of a performance difference do you see between utf-8 and C? Alex ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] strange performance regression between 7.4 and 8.1
At 08:56 AM 3/2/2007, Carlos Moreno wrote: Florian Weimer wrote: * Alex Deucher: I have noticed a strange performance regression and I'm at a loss as to what's happening. We have a fairly large database (~16 GB). Sorry for asking, but is this a typo? Do you mean 16 *TB* instead of 16 *GB*? If it's really 16 GB, you should check if it's cheaper to buy more RAM than to fiddle with the existing infrastructure. This brings me to a related question: Do I need to specifically configure something to take advantage of such increase of RAM? In particular, is the amount of things that postgres can do with RAM limited by the amount of shared_buffers or some other parameter? Should shared_buffers be a fixed fraction of the total amount of physical RAM, or should it be the total amount minus half a gigabyte or so? As an example, if one upgrades a host from 1GB to 4GB, what would be the right thing to do in the configuration, assuming 8.1 or 8.2? (at least what would be the critical aspects?) Thanks, Carlos Unfortunately, pg does not (yet! ;-) ) treat all available RAM as a common pool and dynamically allocate it intelligently to each of the various memory data structures. So if you increase your RAM, you will have to manually change the entries in the pg config file to take advantage of it. (and start pg after changing it for the new config values to take effect) The pertinent values are all those listed under Memory in the annotated pg conf file: shared_buffers, work_mem, maintenance_work_mem, etc. http://www.powerpostgresql.com/Downloads/annotated_conf_80.html Cheers, Ron Peacetree ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 3/1/07, Jeff Frost [EMAIL PROTECTED] wrote: On Thu, 1 Mar 2007, Alex Deucher wrote: here are some examples. Analyze is still running on the new db, I'll post results when that is done. Mostly what our apps do is prepared row selects from different tables: select c1,c2,c3,c4,c5 from t1 where c1='XXX'; old server: db=# EXPLAIN ANALYZE select c1,c2 from t1 where c2='6258261'; QUERY PLAN --- Index Scan using t1_c2_index on t1 (cost=0.00..166.89 rows=42 width=26) (actual time=5.722..5.809 rows=2 loops=1) Index Cond: ((c2)::text = '6258261'::text) Total runtime: 5.912 ms (3 rows) db=# EXPLAIN ANALYZE select c1,c2 from t1 where c1='6258261'; QUERY PLAN Index Scan using t1_c1_key on t1 (cost=0.00..286.08 rows=72 width=26) (actual time=12.423..12.475 rows=12 loops=1) Index Cond: ((c1)::text = '6258261'::text) Total runtime: 12.538 ms (3 rows) new server: db=# EXPLAIN ANALYZE select c1,c2 from t1 where c2='6258261'; QUERY PLAN Index Scan using t1_c2_index on t1 (cost=0.00..37.63 rows=11 width=26) (actual time=33.461..51.377 rows=2 loops=1) Index Cond: ((c2)::text = '6258261'::text) Total runtime: 51.419 ms (3 rows) db=# EXPLAIN ANALYZE select c1,c2 from t1 where c1='6258261'; QUERY PLAN Index Scan using t1_c1_index on t1 (cost=0.00..630.45 rows=2907 width=26) (actual time=45.733..46.271 rows=12 loops=1) Index Cond: ((c1)::text = '6258261'::text) Total runtime: 46.325 ms (3 rows) Notice the huge disparity here betwen the expected number of rows (2907) and the actual rows? That's indicative of needing to run analyze. The time is only about 4x the 7.4 runtime and that's with the analyze running merrily along in the background. It's probably not as bad off as you think. At least this query isn't 10x. :-) Run these again for us after analyze is complete. well, while the DB isn't 10x, the application using the DB shoes a 10x decrease in performance. Pages that used to take 5 seconds to load take 50 secs (I supposed the problem is compounded as there are several queries per page.). Anyway, new numbers after the analyze. Unfortunately, they are improved, but still not great: old server: db=# EXPLAIN ANALYZE select c1,c2 from t1 where c2='6258261'; QUERY PLAN --- Index Scan using t1_c2_index on t1 (cost=0.00..166.89 rows=42 width=26) (actual time=0.204..0.284 rows=2 loops=1) Index Cond: ((c2)::text = '6258261'::text) Total runtime: 0.421 ms (3 rows) db=# EXPLAIN ANALYZE select c1,c2 from t1 where c1='6258261'; QUERY PLAN -- Index Scan using t1_c1_key on t1 (cost=0.00..286.08 rows=72 width=26) (actual time=0.299..0.354 rows=12 loops=1) Index Cond: ((c1)::text = '6258261'::text) Total runtime: 0.451 ms (3 rows) new server: db=# EXPLAIN ANALYZE select c1,c2 from t1 where c2='6258261'; QUERY PLAN -- Index Scan using t1_c2_index on t1 (cost=0.00..37.63 rows=11 width=26) (actual time=0.126..0.134 rows=2 loops=1) Index Cond: ((c2)::text = '6258261'::text) Total runtime: 0.197 ms (3 rows) db=# EXPLAIN ANALYZE select c1,c2 from t1 where c1='6258261'; QUERY PLAN -- Index Scan using t1_c1_index on t1 (cost=0.00..630.45 rows=2907 width=26) (actual time=5.820..5.848 rows=12 loops=1) Index Cond: ((c1)::text = '6258261'::text) Total runtime: 5.899 ms (3 rows) Here's another example: old server: db=# EXPLAIN ANALYZE select c1,c2 from t1 where c2='601'; QUERY PLAN Index
Re: [PERFORM] strange performance regression between 7.4 and 8.1
At 10:16 AM 3/2/2007, Alex Deucher wrote: On 3/2/07, Florian Weimer [EMAIL PROTECTED] wrote: * Alex Deucher: I have noticed a strange performance regression and I'm at a loss as to what's happening. We have a fairly large database (~16 GB). Sorry for asking, but is this a typo? Do you mean 16 *TB* instead of 16 *GB*? If it's really 16 GB, you should check if it's cheaper to buy more RAM than to fiddle with the existing infrastructure. Yes, 16 GB. I'd rather not shell out for more ram, if I'm not even sure that will help. The new system should be faster, or at least as fast, so I'd like to sort out what's going on before I buy more ram. OK. You a= went from pg 7.4.x to 8.1.4 AND b= you changed from 4 SPARC CPUs (how many cores? If this is 4...) to 2 2C Opterons AND (SPEC and TPC bench differences between these CPUs?) c= you went from a Sun box to a white box AND (memory subsystem differences? other differences?) d= you went from local HD IO to a SAN (many differences hidden in that one line... ...and is the physical layout of tables and things like pg_xlog sane on the SAN?) ...and you did this by just pulling over the old DB onto the new HW? May I suggest that it is possible that your schema, queries, etc were all optimized for pg 7.x running on the old HW? (explain analyze shows the old system taking ~1/10 the time per row as well as estimating the number of rows more accurately) RAM is =cheap=. Much cheaper than the cost of a detective hunt followed by rework to queries, schema, etc. Fitting the entire DB into RAM is guaranteed to help unless this is an OLTP like application where HD IO is required to be synchronous.. If you can fit the entire DB comfortably into RAM, do it and buy yourself the time to figure out the rest of the story w/o impacting on production performance. Cheers, Ron Peacetree ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 3/2/07, Ron [EMAIL PROTECTED] wrote: At 10:16 AM 3/2/2007, Alex Deucher wrote: On 3/2/07, Florian Weimer [EMAIL PROTECTED] wrote: * Alex Deucher: I have noticed a strange performance regression and I'm at a loss as to what's happening. We have a fairly large database (~16 GB). Sorry for asking, but is this a typo? Do you mean 16 *TB* instead of 16 *GB*? If it's really 16 GB, you should check if it's cheaper to buy more RAM than to fiddle with the existing infrastructure. Yes, 16 GB. I'd rather not shell out for more ram, if I'm not even sure that will help. The new system should be faster, or at least as fast, so I'd like to sort out what's going on before I buy more ram. OK. You a= went from pg 7.4.x to 8.1.4 AND yes. b= you changed from 4 SPARC CPUs (how many cores? If this is 4...) to 2 2C Opterons AND (SPEC and TPC bench differences between these CPUs?) 4 single core 800 Mhz sparcs to 2 dual core 2.2 Ghz opterons. c= you went from a Sun box to a white box AND (memory subsystem differences? other differences?) The new hardware is Sun as well. X4100s running Linux. It should be faster all around because the old server is 5 years old. d= you went from local HD IO to a SAN (many differences hidden in that one line... ...and is the physical layout of tables and things like pg_xlog sane on the SAN?) ...and you did this by just pulling over the old DB onto the new HW? We rebuild the DB from scratch on the new server. Same table structure though. We reloaded from the source material directly. May I suggest that it is possible that your schema, queries, etc were all optimized for pg 7.x running on the old HW? (explain analyze shows the old system taking ~1/10 the time per row as well as estimating the number of rows more accurately) RAM is =cheap=. Much cheaper than the cost of a detective hunt followed by rework to queries, schema, etc. Fitting the entire DB into RAM is guaranteed to help unless this is an OLTP like application where HD IO is required to be synchronous.. If you can fit the entire DB comfortably into RAM, do it and buy yourself the time to figure out the rest of the story w/o impacting on production performance. Perhaps so. I just don't want to spend $1000 on ram and have it only marginally improve performance if at all. The old DB works, so we can keep using that until we sort this out. Alex Cheers, Ron Peacetree ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] strange performance regression between 7.4 and 8.1
Alex Deucher [EMAIL PROTECTED] writes: Anyway, new numbers after the analyze. Unfortunately, they are improved, but still not great: Why are the index names different between the old and new servers? Is that just cosmetic, or is 8.2 actually picking a different (and less suitable) index for the c1 queries? regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 3/2/07, Tom Lane [EMAIL PROTECTED] wrote: Alex Deucher [EMAIL PROTECTED] writes: Anyway, new numbers after the analyze. Unfortunately, they are improved, but still not great: Why are the index names different between the old and new servers? Is that just cosmetic, or is 8.2 actually picking a different (and less suitable) index for the c1 queries? That's just cosmetic. They are the same. Alex ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On Fri, 2007-03-02 at 10:03, Alex Deucher wrote: On 3/2/07, Ron [EMAIL PROTECTED] wrote: At 10:16 AM 3/2/2007, Alex Deucher wrote: d= you went from local HD IO to a SAN (many differences hidden in that one line... ...and is the physical layout of tables and things like pg_xlog sane on the SAN?) ...and you did this by just pulling over the old DB onto the new HW? We rebuild the DB from scratch on the new server. Same table structure though. We reloaded from the source material directly. I would REALLY recommend testing this machine out with a simple software RAID-1 pair of SCSI or SATA drives just to eliminate or confirm the SAN as the root problem. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] strange performance regression between 7.4 and 8.1
At 11:03 AM 3/2/2007, Alex Deucher wrote: On 3/2/07, Ron [EMAIL PROTECTED] wrote: May I suggest that it is possible that your schema, queries, etc were all optimized for pg 7.x running on the old HW? (explain analyze shows the old system taking ~1/10 the time per row as well as estimating the number of rows more accurately) RAM is =cheap=. Much cheaper than the cost of a detective hunt followed by rework to queries, schema, etc. Fitting the entire DB into RAM is guaranteed to help unless this is an OLTP like application where HD IO is required to be synchronous.. If you can fit the entire DB comfortably into RAM, do it and buy yourself the time to figure out the rest of the story w/o impacting on production performance. Perhaps so. I just don't want to spend $1000 on ram and have it only marginally improve performance if at all. The old DB works, so we can keep using that until we sort this out. Alex 1= $1000 worth of RAM is very likely less than the $ worth of, say, 10 hours of your time to your company. Perhaps much less. (Your =worth=, not your pay or even your fully loaded cost. This number tends to be = 4x what you are paid unless the organization you are working for is in imminent financial danger.) You've already put more considerably more than 10 hours of your time into this... 2= If the DB goes from not fitting completely into RAM to being completely RAM resident, you are almost 100% guaranteed a big performance boost. The exception is an OLTP like app where DB writes can't be done a-synchronously (doing financial transactions, real time control systems, etc). Data mines should never have this issue. 3= Whether adding enough RAM to make the DB RAM resident (and re-configuring conf, etc, appropriately) solves the problem or not, you will have gotten a serious lead as to what's wrong. ...and I still think looking closely at the actual physical layout of the tables in the SAN is likely to be worth it. Cheers, Ron Peacetree ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] strange performance regression between 7.4 and 8.1
Am Donnerstag 01 März 2007 21:44 schrieb Alex Deucher: Hello, I have noticed a strange performance regression and I'm at a loss as to what's happening. We have a fairly large database (~16 GB). The original postgres 7.4 was running on a sun v880 with 4 CPUs and 8 GB of ram running Solaris on local scsi discs. The new server is a sun Opteron box with 4 cores, 8 GB of ram running postgres 8.1.4 on Linux (AMD64) on a 4 Gbps FC SAN volume. When we created the new database it was created from scratch rather than copying over the old one, however the table structure is almost identical (UTF8 on the new one vs. C on the old). The problem is queries are ~10x slower on the new hardware. I read several places that the SAN might be to blame, but testing with bonnie and dd indicates that the SAN is actually almost twice as fast as the scsi discs in the old sun server. I've tried adjusting just about every option in the postgres config file, but performance remains the same. Any ideas? 1. Do you use NUMA ctl for locking the db on one node ? 2. do you use bios to interleave memeory ? 3. do you expand cache over mor than one numa node ? Thanks, Alex ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq -- ATRSoft GmbH Rosellstrasse 9 D 50354 Hürth Deutschland Tel .: +49(0)2233 691324 Geschäftsführer Anton Rommerskirchen Köln HRB 44927 STNR 224/5701 - 1010 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] strange performance regression between 7.4 and 8.1
Florian Weimer escribió: Locale settings make a huge difference for sorting and LIKE queries. We usually use the C locale and SQL_ASCII encoding, mostly for performance reasons. (Proper UTF-8 can be enforced through constraints if necessary.) Hmm, you are aware of varchar_pattern_ops and related opclasses, right? That helps for LIKE queries in non-C locales (though you do have to keep almost-duplicate indexes). -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 3/2/07, Ron [EMAIL PROTECTED] wrote: At 11:03 AM 3/2/2007, Alex Deucher wrote: On 3/2/07, Ron [EMAIL PROTECTED] wrote: May I suggest that it is possible that your schema, queries, etc were all optimized for pg 7.x running on the old HW? (explain analyze shows the old system taking ~1/10 the time per row as well as estimating the number of rows more accurately) RAM is =cheap=. Much cheaper than the cost of a detective hunt followed by rework to queries, schema, etc. Fitting the entire DB into RAM is guaranteed to help unless this is an OLTP like application where HD IO is required to be synchronous.. If you can fit the entire DB comfortably into RAM, do it and buy yourself the time to figure out the rest of the story w/o impacting on production performance. Perhaps so. I just don't want to spend $1000 on ram and have it only marginally improve performance if at all. The old DB works, so we can keep using that until we sort this out. Alex 1= $1000 worth of RAM is very likely less than the $ worth of, say, 10 hours of your time to your company. Perhaps much less. (Your =worth=, not your pay or even your fully loaded cost. This number tends to be = 4x what you are paid unless the organization you are working for is in imminent financial danger.) You've already put more considerably more than 10 hours of your time into this... 2= If the DB goes from not fitting completely into RAM to being completely RAM resident, you are almost 100% guaranteed a big performance boost. The exception is an OLTP like app where DB writes can't be done a-synchronously (doing financial transactions, real time control systems, etc). Data mines should never have this issue. 3= Whether adding enough RAM to make the DB RAM resident (and re-configuring conf, etc, appropriately) solves the problem or not, you will have gotten a serious lead as to what's wrong. ...and I still think looking closely at the actual physical layout of the tables in the SAN is likely to be worth it. How would I go about doing that? Thanks, Alex ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] strange performance regression between 7.4 and 8.1
At 02:43 PM 3/2/2007, Alex Deucher wrote: On 3/2/07, Ron [EMAIL PROTECTED] wrote: ...and I still think looking closely at the actual physical layout of the tables in the SAN is likely to be worth it. How would I go about doing that? Alex Hard for me to give specific advice when I don't know what SAN product we are talking about nor what kind of HDs are in it nor how those HDs are presently configured... I quote you in an earlier post: The RAID groups on the SAN were set up for maximum capacity rather than for performance. Using it for the databases just came up recently. That implies to me that the SAN is more or less set up as a huge 105 HD (assuming this number is correct? We all know how assume is spelled...) JBOD or RAID 5 (or 6, or 5*, or 6*) set. =IF= that is true, tables are not being given dedicated RAID groups. That implies that traditional lore like having pg_xlog on dedicated spindles is being ignored. Nor is the more general Best Practice of putting the most heavily used tables onto dedicated spindles being followed. In addition, the most space efficient RAID levels: 5* or 6*, are not the best performing one (RAID 10 striping your mirrors) In short, configuring a SAN for maximum capacity is exactly the wrong thing to do if one is planning to use it in the best way to support DB performance. I assume (there's that word again...) that there is someone in your organization who understands how the SAN is configured and administered. You need to talk to them about these issues. Cheers, Ron ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 3/2/07, Ron [EMAIL PROTECTED] wrote: At 02:43 PM 3/2/2007, Alex Deucher wrote: On 3/2/07, Ron [EMAIL PROTECTED] wrote: ...and I still think looking closely at the actual physical layout of the tables in the SAN is likely to be worth it. How would I go about doing that? Alex Hard for me to give specific advice when I don't know what SAN product we are talking about nor what kind of HDs are in it nor how those HDs are presently configured... I quote you in an earlier post: The RAID groups on the SAN were set up for maximum capacity rather than for performance. Using it for the databases just came up recently. That implies to me that the SAN is more or less set up as a huge 105 HD (assuming this number is correct? We all know how assume is spelled...) JBOD or RAID 5 (or 6, or 5*, or 6*) set. =IF= that is true, tables are not being given dedicated RAID groups. That implies that traditional lore like having pg_xlog on dedicated spindles is being ignored. Nor is the more general Best Practice of putting the most heavily used tables onto dedicated spindles being followed. In addition, the most space efficient RAID levels: 5* or 6*, are not the best performing one (RAID 10 striping your mirrors) In short, configuring a SAN for maximum capacity is exactly the wrong thing to do if one is planning to use it in the best way to support DB performance. I assume (there's that word again...) that there is someone in your organization who understands how the SAN is configured and administered. You need to talk to them about these issues. Ah OK. I see what you are saying; thank you for clarifying. Yes, the SAN is configured for maximum capacity; it has large RAID 5 groups. As I said earlier, we never intended to run a DB on the SAN, it just happened to come up, hence the configuration. Alex ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 02.03.2007, at 14:20, Alex Deucher wrote: Ah OK. I see what you are saying; thank you for clarifying. Yes, the SAN is configured for maximum capacity; it has large RAID 5 groups. As I said earlier, we never intended to run a DB on the SAN, it just happened to come up, hence the configuration. So why not dumping the stuff ones, importing into a PG configured to use local discs (Or even ONE local disc, you might have the 16GB you gave as a size for the db on the local machine, right?) and testing whether the problem is with PG connecting to the SAN. So you have one factor less to consider after all your changes. Maybe it's just that something in the chain from PG to the actual HD spindles kills your random access performance for getting the actual rows. cug ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On Fri, 2 Mar 2007, Guido Neitzer wrote: On 02.03.2007, at 14:20, Alex Deucher wrote: Ah OK. I see what you are saying; thank you for clarifying. Yes, the SAN is configured for maximum capacity; it has large RAID 5 groups. As I said earlier, we never intended to run a DB on the SAN, it just happened to come up, hence the configuration. So why not dumping the stuff ones, importing into a PG configured to use local discs (Or even ONE local disc, you might have the 16GB you gave as a size for the db on the local machine, right?) and testing whether the problem is with PG connecting to the SAN. So you have one factor less to consider after all your changes. Maybe it's just that something in the chain from PG to the actual HD spindles kills your random access performance for getting the actual rows. I am actually starting to think that the SAN may be introducing some amount of latency that is enough to kill your random IO which is what all of the queries in question are doing - look up in index - fetch row from table. If you have the time, it would be totally worth it to test with a local disk and see how that affects the speed. I would think that even with RAID5, a SAN with that many spindles would be quite fast in raw throughput, but perhaps it's just seek latency that's killing you. When you run the bonnie tests again, take note of what the seeks/sec is compared with the old disk. Also, you should run bonnie with the -b switch to see if that causes significant slowdown of the writes...maybe minor synced write activity to pg_xlog is bogging the entire system down. Is the system spending most of its time in IO wait? Also, another item of note might be the actual on disk DB size..I wonder if it has changed significantly going from SQL_ASCII to UTF8. In 8.1 you can do this: SELECT datname, pg_size_pretty(pg_database_size(datname)) AS size FROM pg_database; In 7.4, you'll need to install the dbsize contrib module to get the same info. -- Jeff Frost, Owner [EMAIL PROTECTED] Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 650-780-7908 FAX: 650-649-1954 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 3/2/07, Jeff Frost [EMAIL PROTECTED] wrote: On Fri, 2 Mar 2007, Guido Neitzer wrote: On 02.03.2007, at 14:20, Alex Deucher wrote: Ah OK. I see what you are saying; thank you for clarifying. Yes, the SAN is configured for maximum capacity; it has large RAID 5 groups. As I said earlier, we never intended to run a DB on the SAN, it just happened to come up, hence the configuration. So why not dumping the stuff ones, importing into a PG configured to use local discs (Or even ONE local disc, you might have the 16GB you gave as a size for the db on the local machine, right?) and testing whether the problem is with PG connecting to the SAN. So you have one factor less to consider after all your changes. Maybe it's just that something in the chain from PG to the actual HD spindles kills your random access performance for getting the actual rows. I am actually starting to think that the SAN may be introducing some amount of latency that is enough to kill your random IO which is what all of the queries in question are doing - look up in index - fetch row from table. If you have the time, it would be totally worth it to test with a local disk and see how that affects the speed. I would think that even with RAID5, a SAN with that many spindles would be quite fast in raw throughput, but perhaps it's just seek latency that's killing you. When you run the bonnie tests again, take note of what the seeks/sec is compared with the old disk. Also, you should run bonnie with the -b switch to see if that causes significant slowdown of the writes...maybe minor synced write activity to pg_xlog is bogging the entire system down. Is the system spending most of its time in IO wait? Also, another item of note might be the actual on disk DB size..I wonder if it has changed significantly going from SQL_ASCII to UTF8. In 8.1 you can do this: SELECT datname, pg_size_pretty(pg_database_size(datname)) AS size FROM pg_database; In 7.4, you'll need to install the dbsize contrib module to get the same info. I'm beginning the think the same thing. I'm planning to try the tests above next week. I'll let you know what I find out. Thanks! Alex ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] strange performance regression between 7.4 and 8.1
Hello, I have noticed a strange performance regression and I'm at a loss as to what's happening. We have a fairly large database (~16 GB). The original postgres 7.4 was running on a sun v880 with 4 CPUs and 8 GB of ram running Solaris on local scsi discs. The new server is a sun Opteron box with 4 cores, 8 GB of ram running postgres 8.1.4 on Linux (AMD64) on a 4 Gbps FC SAN volume. When we created the new database it was created from scratch rather than copying over the old one, however the table structure is almost identical (UTF8 on the new one vs. C on the old). The problem is queries are ~10x slower on the new hardware. I read several places that the SAN might be to blame, but testing with bonnie and dd indicates that the SAN is actually almost twice as fast as the scsi discs in the old sun server. I've tried adjusting just about every option in the postgres config file, but performance remains the same. Any ideas? Thanks, Alex ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On Thu, 1 Mar 2007, Joshua D. Drake wrote: Alex Deucher wrote: Hello, I have noticed a strange performance regression and I'm at a loss as to what's happening. We have a fairly large database (~16 GB). The original postgres 7.4 was running on a sun v880 with 4 CPUs and 8 GB of ram running Solaris on local scsi discs. The new server is a sun Opteron box with 4 cores, 8 GB of ram running postgres 8.1.4 on Linux (AMD64) on a 4 Gbps FC SAN volume. When we created the new database it was created from scratch rather than copying over the old one, however the table structure is almost identical (UTF8 on the new one vs. C on the old). The problem is queries are ~10x slower on the new hardware. I read several places that the SAN might be to blame, but testing with bonnie and dd indicates that the SAN is actually almost twice as fast as the scsi discs in the old sun server. I've tried adjusting just about every option in the postgres config file, but performance remains the same. Any ideas? Vacuum? Analayze? default_statistics_target? How many shared_buffers? effective_cache_size? work_mem? Also, an explain analyze from both the 7.4 and 8.1 systems with one of the 10x slower queries would probably be handy. What do you mean by created from scratch rather than copying over the old one? How did you put the data in? Did you run analyze after loading it? Is autovacuum enabled and if so, what are the thresholds? -- Jeff Frost, Owner [EMAIL PROTECTED] Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 650-780-7908 FAX: 650-649-1954 ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 3/1/07, Joshua D. Drake [EMAIL PROTECTED] wrote: Alex Deucher wrote: Hello, I have noticed a strange performance regression and I'm at a loss as to what's happening. We have a fairly large database (~16 GB). The original postgres 7.4 was running on a sun v880 with 4 CPUs and 8 GB of ram running Solaris on local scsi discs. The new server is a sun Opteron box with 4 cores, 8 GB of ram running postgres 8.1.4 on Linux (AMD64) on a 4 Gbps FC SAN volume. When we created the new database it was created from scratch rather than copying over the old one, however the table structure is almost identical (UTF8 on the new one vs. C on the old). The problem is queries are ~10x slower on the new hardware. I read several places that the SAN might be to blame, but testing with bonnie and dd indicates that the SAN is actually almost twice as fast as the scsi discs in the old sun server. I've tried adjusting just about every option in the postgres config file, but performance remains the same. Any ideas? Vacuum? Analayze? default_statistics_target? How many shared_buffers? effective_cache_size? work_mem? I'm running the autovacuum process on the 8.1 server. vacuuming on the old server was done manually. default_statistics_target and effective_cache_size are set to the the defaults on both. postgres 7.4 server: # - Memory - shared_buffers = 82000 # 1000min 16, at least max_connections*2, 8KB each sort_mem = 8000# 1024min 64, size in KB vacuum_mem = 32000 # 8192min 1024, size in KB # - Free Space Map - #max_fsm_pages = 2 # min max_fsm_relations*16, 6 bytes each #max_fsm_relations = 1000 # min 100, ~50 bytes each # - Kernel Resource Usage - #max_files_per_process = 1000 # min 25 postgres 8.1 server: # - Memory - shared_buffers = 10 # min 16 or max_connections*2, 8KB each temp_buffers = 2000 #1000 # min 100, 8KB each max_prepared_transactions = 100 #5 # can be 0 or more # note: increasing max_prepared_transactions costs ~600 bytes of shared memory # per transaction slot, plus lock space (see max_locks_per_transaction). work_mem = 1#1024 # min 64, size in KB maintenance_work_mem = 524288 #16384# min 1024, size in KB #max_stack_depth = 2048 # min 100, size in KB I've also tried using the same settings from the old server on the new one; same performance issues. Thanks, Alex Sincerely, Joshua D. Drake Thanks, Alex ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 3/1/07, Jeff Frost [EMAIL PROTECTED] wrote: On Thu, 1 Mar 2007, Joshua D. Drake wrote: Alex Deucher wrote: Hello, I have noticed a strange performance regression and I'm at a loss as to what's happening. We have a fairly large database (~16 GB). The original postgres 7.4 was running on a sun v880 with 4 CPUs and 8 GB of ram running Solaris on local scsi discs. The new server is a sun Opteron box with 4 cores, 8 GB of ram running postgres 8.1.4 on Linux (AMD64) on a 4 Gbps FC SAN volume. When we created the new database it was created from scratch rather than copying over the old one, however the table structure is almost identical (UTF8 on the new one vs. C on the old). The problem is queries are ~10x slower on the new hardware. I read several places that the SAN might be to blame, but testing with bonnie and dd indicates that the SAN is actually almost twice as fast as the scsi discs in the old sun server. I've tried adjusting just about every option in the postgres config file, but performance remains the same. Any ideas? Vacuum? Analayze? default_statistics_target? How many shared_buffers? effective_cache_size? work_mem? Also, an explain analyze from both the 7.4 and 8.1 systems with one of the 10x slower queries would probably be handy. I'll run some and get back to you. What do you mean by created from scratch rather than copying over the old one? How did you put the data in? Did you run analyze after loading it? Is autovacuum enabled and if so, what are the thresholds? Both the databases were originally created from xml files. We just re-created the new one from the xml rather than copying the old database over. I didn't manually run analyze on it, but we are running the autovacuum process: autovacuum = on #off# enable autovacuum subprocess? autovacuum_naptime = 360 #60# time between autovacuum runs, in secs autovacuum_vacuum_threshold = 1 #1000 # min # of tuple updates before # vacuum autovacuum_analyze_threshold = 5000 #500# min # of tuple updates before Thanks, Alex -- Jeff Frost, Owner [EMAIL PROTECTED] Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 650-780-7908 FAX: 650-649-1954 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On Thu, 1 Mar 2007, Alex Deucher wrote: Vacuum? Analayze? default_statistics_target? How many shared_buffers? effective_cache_size? work_mem? I'm running the autovacuum process on the 8.1 server. vacuuming on the old server was done manually. default_statistics_target and effective_cache_size are set to the the defaults on both. postgres 7.4 server: # - Memory - shared_buffers = 82000 # 1000min 16, at least max_connections*2, 8KB each sort_mem = 8000# 1024min 64, size in KB vacuum_mem = 32000 # 8192min 1024, size in KB # - Free Space Map - #max_fsm_pages = 2 # min max_fsm_relations*16, 6 bytes each #max_fsm_relations = 1000 # min 100, ~50 bytes each # - Kernel Resource Usage - #max_files_per_process = 1000 # min 25 postgres 8.1 server: # - Memory - shared_buffers = 10 # min 16 or max_connections*2, 8KB each temp_buffers = 2000 #1000 # min 100, 8KB each max_prepared_transactions = 100 #5 # can be 0 or more # note: increasing max_prepared_transactions costs ~600 bytes of shared memory # per transaction slot, plus lock space (see max_locks_per_transaction). work_mem = 1#1024 # min 64, size in KB maintenance_work_mem = 524288 #16384# min 1024, size in KB #max_stack_depth = 2048 # min 100, size in KB I've also tried using the same settings from the old server on the new one; same performance issues. If this is a linux system, could you give us the output of the 'free' command? Postgresql might be choosing a bad plan because your effective_cache_size is way off (it's the default now right?). Also, what was the block read/write speed of the SAN from your bonnie tests? Probably want to tune random_page_cost as well if it's also at the default. -- Jeff Frost, Owner [EMAIL PROTECTED] Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 650-780-7908 FAX: 650-649-1954 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 3/1/07, Jeff Frost [EMAIL PROTECTED] wrote: On Thu, 1 Mar 2007, Alex Deucher wrote: Vacuum? Analayze? default_statistics_target? How many shared_buffers? effective_cache_size? work_mem? I'm running the autovacuum process on the 8.1 server. vacuuming on the old server was done manually. default_statistics_target and effective_cache_size are set to the the defaults on both. postgres 7.4 server: # - Memory - shared_buffers = 82000 # 1000min 16, at least max_connections*2, 8KB each sort_mem = 8000# 1024min 64, size in KB vacuum_mem = 32000 # 8192min 1024, size in KB # - Free Space Map - #max_fsm_pages = 2 # min max_fsm_relations*16, 6 bytes each #max_fsm_relations = 1000 # min 100, ~50 bytes each # - Kernel Resource Usage - #max_files_per_process = 1000 # min 25 postgres 8.1 server: # - Memory - shared_buffers = 10 # min 16 or max_connections*2, 8KB each temp_buffers = 2000 #1000 # min 100, 8KB each max_prepared_transactions = 100 #5 # can be 0 or more # note: increasing max_prepared_transactions costs ~600 bytes of shared memory # per transaction slot, plus lock space (see max_locks_per_transaction). work_mem = 1#1024 # min 64, size in KB maintenance_work_mem = 524288 #16384# min 1024, size in KB #max_stack_depth = 2048 # min 100, size in KB I've also tried using the same settings from the old server on the new one; same performance issues. If this is a linux system, could you give us the output of the 'free' command? total used free sharedbuffers cached Mem: 80598528042868 16984 02287888648 -/+ buffers/cache: 1539927905860 Swap: 15631224 2164 15629060 Postgresql might be choosing a bad plan because your effective_cache_size is way off (it's the default now right?). Also, what was the block read/write yes it's set to the default. speed of the SAN from your bonnie tests? Probably want to tune random_page_cost as well if it's also at the default. --Sequential Output-- --Sequential Input- --Random- -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP luna12-san 16000M 58896 91 62931 9 35870 5 54869 82 145504 13 397.7 0 effective_cache_size is the default. Alex ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On Thu, 1 Mar 2007, Alex Deucher wrote: On 3/1/07, Jeff Frost [EMAIL PROTECTED] wrote: On Thu, 1 Mar 2007, Alex Deucher wrote: Vacuum? Analayze? default_statistics_target? How many shared_buffers? effective_cache_size? work_mem? I'm running the autovacuum process on the 8.1 server. vacuuming on the old server was done manually. default_statistics_target and effective_cache_size are set to the the defaults on both. postgres 7.4 server: # - Memory - shared_buffers = 82000 # 1000min 16, at least max_connections*2, 8KB each sort_mem = 8000# 1024min 64, size in KB vacuum_mem = 32000 # 8192min 1024, size in KB # - Free Space Map - #max_fsm_pages = 2 # min max_fsm_relations*16, 6 bytes each #max_fsm_relations = 1000 # min 100, ~50 bytes each # - Kernel Resource Usage - #max_files_per_process = 1000 # min 25 postgres 8.1 server: # - Memory - shared_buffers = 10 # min 16 or max_connections*2, 8KB each temp_buffers = 2000 #1000 # min 100, 8KB each max_prepared_transactions = 100 #5 # can be 0 or more # note: increasing max_prepared_transactions costs ~600 bytes of shared memory # per transaction slot, plus lock space (see max_locks_per_transaction). work_mem = 1#1024 # min 64, size in KB maintenance_work_mem = 524288 #16384# min 1024, size in KB #max_stack_depth = 2048 # min 100, size in KB I've also tried using the same settings from the old server on the new one; same performance issues. If this is a linux system, could you give us the output of the 'free' command? total used free sharedbuffers cached Mem: 80598528042868 16984 02287888648 -/+ buffers/cache: 1539927905860 Swap: 15631224 2164 15629060 So, I would set effective_cache_size = 988232 (7905860/8). Postgresql might be choosing a bad plan because your effective_cache_size is way off (it's the default now right?). Also, what was the block read/write yes it's set to the default. speed of the SAN from your bonnie tests? Probably want to tune random_page_cost as well if it's also at the default. --Sequential Output-- --Sequential Input- --Random- -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP luna12-san 16000M 58896 91 62931 9 35870 5 54869 82 145504 13 397.7 0 So, you're getting 62MB/s writes and 145MB/s reads. Just FYI, that write speed is about the same as my single SATA drive write speed on my workstation, so not that great. The read speed is decent, though and with that sort of read performance, you might want to lower random_page_cost to something like 2.5 or 2 so the planner will tend to prefer index scans. -- Jeff Frost, Owner [EMAIL PROTECTED] Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 650-780-7908 FAX: 650-649-1954 ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 3/1/07, Jeff Frost [EMAIL PROTECTED] wrote: On Thu, 1 Mar 2007, Alex Deucher wrote: On 3/1/07, Jeff Frost [EMAIL PROTECTED] wrote: On Thu, 1 Mar 2007, Alex Deucher wrote: Vacuum? Analayze? default_statistics_target? How many shared_buffers? effective_cache_size? work_mem? I'm running the autovacuum process on the 8.1 server. vacuuming on the old server was done manually. default_statistics_target and effective_cache_size are set to the the defaults on both. postgres 7.4 server: # - Memory - shared_buffers = 82000 # 1000min 16, at least max_connections*2, 8KB each sort_mem = 8000# 1024min 64, size in KB vacuum_mem = 32000 # 8192min 1024, size in KB # - Free Space Map - #max_fsm_pages = 2 # min max_fsm_relations*16, 6 bytes each #max_fsm_relations = 1000 # min 100, ~50 bytes each # - Kernel Resource Usage - #max_files_per_process = 1000 # min 25 postgres 8.1 server: # - Memory - shared_buffers = 10 # min 16 or max_connections*2, 8KB each temp_buffers = 2000 #1000 # min 100, 8KB each max_prepared_transactions = 100 #5 # can be 0 or more # note: increasing max_prepared_transactions costs ~600 bytes of shared memory # per transaction slot, plus lock space (see max_locks_per_transaction). work_mem = 1#1024 # min 64, size in KB maintenance_work_mem = 524288 #16384# min 1024, size in KB #max_stack_depth = 2048 # min 100, size in KB I've also tried using the same settings from the old server on the new one; same performance issues. If this is a linux system, could you give us the output of the 'free' command? total used free sharedbuffers cached Mem: 80598528042868 16984 02287888648 -/+ buffers/cache: 1539927905860 Swap: 15631224 2164 15629060 So, I would set effective_cache_size = 988232 (7905860/8). Postgresql might be choosing a bad plan because your effective_cache_size is way off (it's the default now right?). Also, what was the block read/write yes it's set to the default. speed of the SAN from your bonnie tests? Probably want to tune random_page_cost as well if it's also at the default. --Sequential Output-- --Sequential Input- --Random- -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP luna12-san 16000M 58896 91 62931 9 35870 5 54869 82 145504 13 397.7 0 So, you're getting 62MB/s writes and 145MB/s reads. Just FYI, that write speed is about the same as my single SATA drive write speed on my workstation, so not that great. The read speed is decent, though and with that sort of read performance, you might want to lower random_page_cost to something like 2.5 or 2 so the planner will tend to prefer index scans. Right, but the old box was getting ~45MBps on both reads and writes, so it's an improvement for me :) Thanks for the advice, I'll let you know how it goes. Alex ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] strange performance regression between 7.4 and 8.1
At 07:36 PM 3/1/2007, Jeff Frost wrote: On Thu, 1 Mar 2007, Alex Deucher wrote: Postgresql might be choosing a bad plan because your effective_cache_size is way off (it's the default now right?). Also, what was the block read/write yes it's set to the default. speed of the SAN from your bonnie tests? Probably want to tune random_page_cost as well if it's also at the default. --Sequential Output-- --Sequential Input- --Random- -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP luna12-san 16000M 58896 91 62931 9 35870 5 54869 82 145504 13 397.7 0 So, you're getting 62MB/s writes and 145MB/s reads. Just FYI, that write speed is about the same as my single SATA drive write speed on my workstation, so not that great. The read speed is decent, though and with that sort of read performance, you might want to lower random_page_cost to something like 2.5 or 2 so the planner will tend to prefer index scans. Right, but the old box was getting ~45MBps on both reads and writes, so it's an improvement for me :) Thanks for the advice, I'll let you know how it goes. Do you think that is because you have a different interface between you and the SAN? ~45MBps is pretty slow - your average 7200RPM ATA133 drive can do that and costs quite a bit less than a SAN. Is the SAN being shared between the database servers and other servers? Maybe it was just random timing that gave you the poor write performance on the old server which might be also yielding occassional poor performance on the new one. Remember that pg, even pg 8.2.3, has a known history of very poor insert speed (see comments on this point by Josh Berkus, Luke Lonergan, etc) For some reason, the code changes that have resulted in dramatic improvements in pg's read speed have not had nearly the same efficacy for writes. Bottom line: pg presently has a fairly low and fairly harsh upper bound on write performance. What exactly that bound is has been the subject of some discussion, but IIUC the fact of its existence is well established. Various proposals for improving the situation exist, I've even made some of them, but AFAIK this is currently considered one of the tough pg problems. Cheers, Ron Peacetree ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] strange performance regression between 7.4 and 8.1
\ Is the SAN being shared between the database servers and other servers? Maybe it was just random timing that gave you the poor write performance on the old server which might be also yielding occassional poor performance on the new one. The direct attached scsi discs on the old database server we getting 45MBps not the SAN. The SAN got 62/145Mbps, which is not as bad. How many spindles you got in that SAN? We have 4 servers on the SAN each with it's own 4 GBps FC link via an FC switch. I'll try and re-run the numbers when the servers are idle this weekend. Alex -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 3/1/07, Joshua D. Drake [EMAIL PROTECTED] wrote: \ Is the SAN being shared between the database servers and other servers? Maybe it was just random timing that gave you the poor write performance on the old server which might be also yielding occassional poor performance on the new one. The direct attached scsi discs on the old database server we getting 45MBps not the SAN. The SAN got 62/145Mbps, which is not as bad. How many spindles you got in that SAN? 105 IIRC. Alex ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On Thu, 1 Mar 2007, Alex Deucher wrote: On 3/1/07, Jeff Frost [EMAIL PROTECTED] wrote: On Thu, 1 Mar 2007, Alex Deucher wrote: Postgresql might be choosing a bad plan because your effective_cache_size is way off (it's the default now right?). Also, what was the block read/write yes it's set to the default. speed of the SAN from your bonnie tests? Probably want to tune random_page_cost as well if it's also at the default. --Sequential Output-- --Sequential Input- --Random- -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP luna12-san 16000M 58896 91 62931 9 35870 5 54869 82 145504 13 397.7 0 So, you're getting 62MB/s writes and 145MB/s reads. Just FYI, that write speed is about the same as my single SATA drive write speed on my workstation, so not that great. The read speed is decent, though and with that sort of read performance, you might want to lower random_page_cost to something like 2.5 or 2 so the planner will tend to prefer index scans. Right, but the old box was getting ~45MBps on both reads and writes, so it's an improvement for me :) Thanks for the advice, I'll let you know how it goes. Do you think that is because you have a different interface between you and the SAN? ~45MBps is pretty slow - your average 7200RPM ATA133 drive can do that and costs quite a bit less than a SAN. Is the SAN being shared between the database servers and other servers? Maybe it was just random timing that gave you the poor write performance on the old server which might be also yielding occassional poor performance on the new one. The direct attached scsi discs on the old database server we getting 45MBps not the SAN. The SAN got 62/145Mbps, which is not as bad. We have 4 servers on the SAN each with it's own 4 GBps FC link via an FC switch. I'll try and re-run the numbers when the servers are idle this weekend. Sorry, I thought the old server was also attached to the SAN. My fault for not hanging onto the entire email thread. I think you're mixing and matching your capitol and lower case Bs in your sentence above though. :-) I suspect what you really mean is The SAN got 62/145MBps (megabytes/sec) and teh FC link is 4Gbps (gigabits/sec) or 500MBps. Is that correct? If so, and seeing that you think there are 105 spindles on the SAN, I'd say you're either maxxing out the switch fabric of the SAN with your servers or you have a really poorly performing SAN in general, or you just misunderstood the . As a comparison With 8 WD Raptors configured in a RAID10 with normal ext3 I get about 160MB/s write and 305MB/s read performance. Hopefully the SAN has lots of other super nifty features that make up for the poor performance. :-( -- Jeff Frost, Owner [EMAIL PROTECTED] Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 650-780-7908 FAX: 650-649-1954 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 3/1/07, Jeff Frost [EMAIL PROTECTED] wrote: On Thu, 1 Mar 2007, Alex Deucher wrote: On 3/1/07, Jeff Frost [EMAIL PROTECTED] wrote: On Thu, 1 Mar 2007, Alex Deucher wrote: Postgresql might be choosing a bad plan because your effective_cache_size is way off (it's the default now right?). Also, what was the block read/write yes it's set to the default. speed of the SAN from your bonnie tests? Probably want to tune random_page_cost as well if it's also at the default. --Sequential Output-- --Sequential Input- --Random- -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP luna12-san 16000M 58896 91 62931 9 35870 5 54869 82 145504 13 397.7 0 So, you're getting 62MB/s writes and 145MB/s reads. Just FYI, that write speed is about the same as my single SATA drive write speed on my workstation, so not that great. The read speed is decent, though and with that sort of read performance, you might want to lower random_page_cost to something like 2.5 or 2 so the planner will tend to prefer index scans. Right, but the old box was getting ~45MBps on both reads and writes, so it's an improvement for me :) Thanks for the advice, I'll let you know how it goes. Do you think that is because you have a different interface between you and the SAN? ~45MBps is pretty slow - your average 7200RPM ATA133 drive can do that and costs quite a bit less than a SAN. Is the SAN being shared between the database servers and other servers? Maybe it was just random timing that gave you the poor write performance on the old server which might be also yielding occassional poor performance on the new one. The direct attached scsi discs on the old database server we getting 45MBps not the SAN. The SAN got 62/145Mbps, which is not as bad. We have 4 servers on the SAN each with it's own 4 GBps FC link via an FC switch. I'll try and re-run the numbers when the servers are idle this weekend. Sorry, I thought the old server was also attached to the SAN. My fault for not hanging onto the entire email thread. I think you're mixing and matching your capitol and lower case Bs in your sentence above though. :-) whoops :) I suspect what you really mean is The SAN got 62/145MBps (megabytes/sec) and teh FC link is 4Gbps (gigabits/sec) or 500MBps. Is that correct? If so, and seeing that you think there are 105 spindles on the SAN, I'd say you're either maxxing out the switch fabric of the SAN with your servers or you have a really poorly performing SAN in general, or you just misunderstood the . As a comparison With 8 WD Raptors configured in a RAID10 with normal ext3 I get about 160MB/s write and 305MB/s read performance. Hopefully the SAN has lots of other super nifty features that make up for the poor performance. :-( It's big and reliable (and compared to lots of others, relatively inexpensive) which is why we bought it. We bought it mostly as a huge file store. The RAID groups on the SAN were set up for maximum capacity rather than for performance. Using it for the databases just came up recently. Alex ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] strange performance regression between 7.4 and 8.1
Alex Deucher wrote: On 3/1/07, Joshua D. Drake [EMAIL PROTECTED] wrote: \ Is the SAN being shared between the database servers and other servers? Maybe it was just random timing that gave you the poor write performance on the old server which might be also yielding occassional poor performance on the new one. The direct attached scsi discs on the old database server we getting 45MBps not the SAN. The SAN got 62/145Mbps, which is not as bad. How many spindles you got in that SAN? 105 IIRC. You have 105 spindles are you are only get 62megs on writes? That seems about half what you should be getting. (at least). Joshua D. Drake Alex -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 3/1/07, Joshua D. Drake [EMAIL PROTECTED] wrote: Alex Deucher wrote: On 3/1/07, Joshua D. Drake [EMAIL PROTECTED] wrote: \ Is the SAN being shared between the database servers and other servers? Maybe it was just random timing that gave you the poor write performance on the old server which might be also yielding occassional poor performance on the new one. The direct attached scsi discs on the old database server we getting 45MBps not the SAN. The SAN got 62/145Mbps, which is not as bad. How many spindles you got in that SAN? 105 IIRC. You have 105 spindles are you are only get 62megs on writes? That seems about half what you should be getting. (at least). Take the numbers with grain of salt. They are by no means a thorough evaluation. I just ran bonnie a couple times to get a rough reference point. I can do a more thorough analysis. Alex Joshua D. Drake Alex ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq