Re: [HACKERS] [PERFORM] Estimation problem with a LIKE clause containing a /
Tom Lane [EMAIL PROTECTED] writes: This rule works for all the locales I have installed ... but I don't have any Far Eastern locales installed. Also, my test cases only cover ASCII characters, and I believe many locales have some non-ASCII letters that sort after 'Z'. I'm not sure how hard we need to try to cover those corner cases, though. It is ultimately only an estimate... If I understand correctly, what we're talking about is generating estimates for LIKE 'foo%' using the algorithm which makes sense for C locale, which means deriving the range of values that start with 'foo'. It seems to me the problematic situation is when the most-frequent-values come into play. Being off slightly in the histogram isn't going to generate very inaccurate estimates, but including or excluding a most-frequent-value could throw off the estimate severely. Could we use the bogus range only to calculate the histogram estimate, and apply the LIKE pattern directly to the most-frequent-values instead of applying the bogus range to them? Or would that be too much code re-organization for now? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training! ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
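To make the range trick being discussed concrete, here is a sketch of the transformation (table and column names are invented for illustration, and the derived upper bound is only valid under C-locale-style sort rules):

```sql
-- The planner estimates the selectivity of this prefix pattern:
SELECT * FROM t WHERE col LIKE 'foo%';
-- as if it were the range below, where 'fop' is 'foo' with its last
-- byte incremented -- the smallest string that no longer has the prefix:
SELECT * FROM t WHERE col >= 'foo' AND col < 'fop';
```

Gregory's concern is that this range is adequate for the histogram portion of the estimate but risky when it sweeps a most-common-value in or out.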
Re: [PERFORM] dell versus hp
* Scott Marlowe: If the right two disks fail in a RAID-10 you lose everything. Admittedly, that's a pretty remote possibility. It's not, unless you carefully lay out the RAID-1 subunits so that their drives aren't physically adjacent. 8-/ I don't think many controllers support that. -- Florian Weimer [EMAIL PROTECTED] BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99
Re: [PERFORM] PostgreSQL vs MySQL, and FreeBSD
On Nov 9, 2007 7:06 AM, Ivan Voras [EMAIL PROTECTED] wrote: I just read this document and thought I should share it with this list: http://people.freebsd.org/~kris/scaling/7.0%20Preview.pdf Among other things (FreeBSD advocacy, mostly :) ), it contains a direct comparison between MySQL and PostgreSQL on various platforms, with PostgreSQL winning! Nice presentation. Thanks for posting it here. -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation | fax: 732.331.1301 499 Thornall Street, 2nd Floor | [EMAIL PROTECTED] Edison, NJ 08837 | http://www.enterprisedb.com/
[PERFORM] PostgreSQL vs MySQL, and FreeBSD
Hi, I just read this document and thought I should share it with this list: http://people.freebsd.org/~kris/scaling/7.0%20Preview.pdf Among other things (FreeBSD advocacy, mostly :) ), it contains a direct comparison between MySQL and PostgreSQL on various platforms, with PostgreSQL winning!
Re: [PERFORM] PostgreSQL vs MySQL, and FreeBSD
Among other things (FreeBSD advocacy, mostly :) ), it contains a direct comparison between MySQL and PostgreSQL on various platforms, with PostgreSQL winning! Hello, If the queries are complex, this is understandable. I did a performance review of a Hibernate project (Java Object-Relational Mapping) using MySQL. ORM easily produces complex queries with joins and subqueries. MySQL uses nested loops for subqueries, which leads to performance issues as database size grows. They state in their documentation that improvements for this kind of query are planned for version 5.2. Best Regards Sebastian
Re: [PERFORM] PostgreSQL vs MySQL, and FreeBSD
On Nov 9, 2007, at 6:06 AM, Ivan Voras wrote: Hi, I just read this document and thought I should share it with this list: http://people.freebsd.org/~kris/scaling/7.0%20Preview.pdf Among other things (FreeBSD advocacy, mostly :) ), it contains a direct comparison between MySQL and PostgreSQL on various platforms, with PostgreSQL winning! Which is typical for those who aren't in on the FUD :) Erik Jones Software Developer | Emma® [EMAIL PROTECTED] 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
Re: [PERFORM] PostgreSQL vs MySQL, and FreeBSD
On Nov 9, 2007 9:41 AM, Sebastian Hennebrueder [EMAIL PROTECTED] wrote: If the queries are complex, this is understandable. I did a performance review of a Hibernate project (Java Object-Relational Mapping) using MySQL. ORM easily produces complex queries with joins and subqueries. MySQL uses nested loops for subqueries, which leads to performance issues as database size grows. They state in their documentation that improvements for this kind of query are planned for version 5.2. So, MySQL 5.2 will be catching up to version 7.1 or 7.2 of PostgreSQL in that regard?
Re: [PERFORM] dell versus hp
Apart from the disks, you might also investigate using Opterons instead of Xeons. There appears to be a significant difference in performance between Opteron and Xeon: Xeons appear to spend more time passing around ownership of memory cache lines in the case of a spinlock. It's not yet clear whether the issue has been worked around. You should at least investigate it a bit. We're using an HP DL385 ourselves, which performs quite well. -R- Tore Halset wrote: Hello. 1) Dell 2900 (5U) 8 * 146 GB SAS 15Krpm 3,5 8GB ram Perc 5/i. battery backup. 256MB ram. 2 * 4 Xeon 2,66GHz 2) Dell 2950 (2U) 8 * 146 GB SAS 10Krpm 2,5 (not really selectable, but I think the webshop is wrong..) 8GB ram Perc 5/i. battery backup. 256MB ram. 2 * 4 Xeon 2,66GHz 3) HP ProLiant DL380 G5 (2U) 8 * 146 GB SAS 10Krpm 2,5 8GB ram P400 raid controller. battery backup. 512MB ram. 2 * 2 Xeon 3GHz
Re: [PERFORM] PostgreSQL vs MySQL, and FreeBSD
On Fri, 9 Nov 2007, Sebastian Hennebrueder wrote: If the queries are complex, this is understandable. The queries used for this comparison are trivial. There's only one table involved and there are no joins. It's testing very low-level aspects of performance. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
Re: [HACKERS] [PERFORM] Estimation problem with a LIKE clause containing a /
Gregory Stark [EMAIL PROTECTED] writes: Could we not use the bogus range to calculate the histogram estimate but apply the LIKE pattern directly to the most-frequent-values instead of applying the bogus range? Or would that be too much code re-organization for now? We have already done that for quite some time. It won't help Guillaume's case anyhow: he's got no MCVs, presumably because the field is unique. regards, tom lane
Re: [HACKERS] [PERFORM] Estimation problem with a LIKE clause containing a /
On Nov 9, 2007 5:33 PM, Tom Lane [EMAIL PROTECTED] wrote: he's got no MCVs, presumably because the field is unique. It is. The ancestors field contains the current folder itself, so the id of the folder (which is the primary key) is in it. -- Guillaume
Re: [PERFORM] dell versus hp
Apart from the disks, you might also investigate using Opterons instead of Xeons. There appears to be a significant difference in performance between Opteron and Xeon: Xeons appear to spend more time passing around ownership of memory cache lines in the case of a spinlock. It's not yet clear whether the issue has been worked around. You should at least investigate it a bit. We're using an HP DL385 ourselves, which performs quite well. Not atm. Until new benchmarks are published comparing AMD's new quad-core with Intel's ditto, Intel has the edge. http://tweakers.net/reviews/657/6 -- regards Claus When lenity and cruelty play for a kingdom, the gentlest gamester is the soonest winner. Shakespeare
Re: [PERFORM] dell versus hp
On Nov 9, 2007 10:40 AM, Claus Guttesen [EMAIL PROTECTED] wrote: Apart from the disks, you might also investigate using Opterons instead of Xeons. There appears to be a significant difference in performance between Opteron and Xeon: Xeons appear to spend more time passing around ownership of memory cache lines in the case of a spinlock. It's not yet clear whether the issue has been worked around. You should at least investigate it a bit. We're using an HP DL385 ourselves, which performs quite well. Not atm. Until new benchmarks are published comparing AMD's new quad-core with Intel's ditto, Intel has the edge. http://tweakers.net/reviews/657/6 For 8 cores, it appears AMD has the lead, read this (stolen from another thread): http://people.freebsd.org/~kris/scaling/7.0%20Preview.pdf
[PERFORM] work_mem and shared_buffers
Does the amount of memory allocated to work_mem get subtracted from shared_buffers? Example: If work_mem is 1MB, there are 10 connections, and shared_buffers is 100MB, would that leave 90MB for shared_buffers? Or does the amount of memory allocated for work_mem have nothing to do with shared_buffers? Thanks, Lance Campbell Project Manager/Software Architect Web Services at Public Affairs University of Illinois 217.333.0382 http://webservices.uiuc.edu
Re: [PERFORM] work_mem and shared_buffers
Campbell, Lance wrote: Does the amount of memory allocated to work_mem get subtracted from shared_buffers? Example: If work_mem is 1M and there are 10 connections and shared_buffers is 100M then would the total be 90M left for shared_buffers? Or does the amount of memory allocated for work_mem have nothing to do with shared_buffers? No, they're completely separate. Note that a connection can use more than work_mem of memory. For example, if you run a query with multiple Sort or Hash nodes, each such node allocates up to work_mem of memory. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
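A hypothetical sketch of Heikki's point (table and column names are invented): a single backend can allocate several work_mem-sized chunks at once, none of which comes out of shared_buffers.

```sql
SET work_mem = '16MB';
-- If the planner chooses a merge join and has to sort both inputs,
-- each Sort node may use up to work_mem, so this one query could touch
-- roughly 2 x 16MB of sort memory in a single backend:
SELECT *
FROM t1
JOIN t2 USING (k)
ORDER BY t1.x;
```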
Re: [PERFORM] PostgreSQL vs MySQL, and FreeBSD
On Fri, 9 Nov 2007 11:11:18 -0500 (EST) Greg Smith [EMAIL PROTECTED] wrote: On Fri, 9 Nov 2007, Sebastian Hennebrueder wrote: If the queries are complex, this is understandable. The queries used for this comparison are trivial. There's only one table involved and there are no joins. It's testing very low-level aspects of performance. Actually, what it's really showing is parallelism, and I've always expected PostgreSQL to come out on top in that arena. -- Bill Moran Potential Technologies http://www.potentialtech.com
Re: [PERFORM] dell versus hp
On Fri, 9 Nov 2007, Scott Marlowe wrote: Not atm. Until new benchmarks are published comparing AMD's new quad-core with Intel's ditto, Intel has the edge. http://tweakers.net/reviews/657/6 For 8 cores, it appears AMD has the lead, read this (stolen from another thread): http://people.freebsd.org/~kris/scaling/7.0%20Preview.pdf This issue isn't simple, and it may be the case that both conclusions are correct in their domain but testing slightly different things. The sysbench test used by the FreeBSD benchmark is much simpler than what the tweakers.net benchmark simulates. Current generation AMD and Intel processors are pretty close in performance, but guessing which will work better involves a complicated mix of both CPU and memory issues. AMD's NUMA architecture does some things better, and Intel's memory access takes a second hit in designs that use FB-DIMMs. But Intel has enough of an advantage on actual CPU performance and CPU caching that current designs are usually faster regardless. For an interesting look at the low-level details here, the current mainstream parts are compared at http://techreport.com/articles.x/11443/13 and a similar comparison for the just-released quad-core Opterons is at http://techreport.com/articles.x/13176/12 Nowadays Intel vs. AMD is tight enough that I don't even worry about that part in the context of a database application (there was still a moderate gap when the Tweakers results were produced a year ago). On a real server, I'd suggest being more worried about how good the disk controller is, what the expansion options are there, and relative $/core. In the x86/x64 realm, I don't feel CPU architecture is a huge issue right now when you're running a database. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
Re: [PERFORM] work_mem and shared_buffers
Wow. That is a nice logging feature in 8.3! Thanks, Lance Campbell Project Manager/Software Architect Web Services at Public Affairs University of Illinois 217.333.0382 http://webservices.uiuc.edu -Original Message- From: Bill Moran [mailto:[EMAIL PROTECTED] Sent: Friday, November 09, 2007 2:08 PM To: Campbell, Lance Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] work_mem and shared_buffers On Fri, 9 Nov 2007 12:08:57 -0600 Campbell, Lance [EMAIL PROTECTED] wrote: How do you know when you should up the value of work_mem? Just play with the number. Is there a query I could do that would tell me if PostgreSQL is performing SQL that could use more memory for sorting? In 8.2 and older, it can be difficult to know, and I don't have a specific recommendation. 8.3 includes a parameter to log the usage of temporary files by Postgres. When a sort can't fit in the available memory, it uses a temp file, thus you could use this new feature to track when sorts don't fit in work_mem. -- Bill Moran Potential Technologies http://www.potentialtech.com
Re: [PERFORM] dell versus hp
On Nov 8, 2007, at 3:56 PM, Alan Hodgson wrote: You can't touch RAID 10 for performance or reliability. The only reason to use RAID 5 or RAID 6 is to get more capacity out of the same drives. Maybe you can't, but I can. I guess I have better toys than you :-)
Re: [PERFORM] work_mem and shared_buffers
On Fri, 9 Nov 2007 12:08:57 -0600 Campbell, Lance [EMAIL PROTECTED] wrote: How do you know when you should up the value of work_mem? Just play with the number. Is there a query I could do that would tell me if PostgreSQL is performing SQL that could use more memory for sorting? In 8.2 and older, it can be difficult to know, and I don't have a specific recommendation. 8.3 includes a parameter to log the usage of temporary files by Postgres. When a sort can't fit in the available memory, it uses a temp file, thus you could use this new feature to track when sorts don't fit in work_mem. -- Bill Moran Potential Technologies http://www.potentialtech.com
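The 8.3 parameter Bill is presumably referring to is log_temp_files; a hedged postgresql.conf sketch of how it would be used to catch sorts that spill to disk:

```
# postgresql.conf (8.3+), illustrative values:
log_temp_files = 0        # log every temporary file created
                          # (-1 disables; a positive value is a size
                          # threshold in kB, logging only larger files)
```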
Re: [PERFORM] work_mem and shared_buffers
On Nov 9, 2007 12:08 PM, Campbell, Lance [EMAIL PROTECTED] wrote: How do you know when you should up the value of work_mem? Just play with the number. Is there a query I could do that would tell me if PostgreSQL is performing SQL that could use more memory for sorting? Trial and error. Note that you can set work_mem for a given session. While it may seem that making work_mem bigger will always help, that's not necessarily the case. Using this query: select count(*) from (select * from myreporttable where lasttime > now() - interval '1 week' order by random() ) as l I did the following (I ran the query by itself once to fill the buffers / cache of the machine with the data):

work_mem   Time:
1000kB     29215.563 ms
4000kB     20612.489 ms
8000kB     18408.087 ms
16000kB    16893.964 ms
32000kB    17681.221 ms
64000kB    22439.988 ms
125MB      23398.891 ms
250MB      25461.797 ms

Note that my best time was at around 16 Meg work_mem. This data set is MUCH bigger than 16 Meg, it's around 300-400 Meg. But work_mem optimized out at 16 Meg. Btw, I tried it going as high as 768 Meg, and it was still slower than 16M. This machine has 2 Gigs ram and is optimized for IO not CPU performance.
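Scott's trial-and-error loop can be repeated per session without touching the server config; a sketch in psql (myreporttable and lasttime are from his post; the work_mem values are the ones to vary):

```sql
\timing on
SET work_mem = '16MB';     -- session-local; no server restart needed
SELECT count(*) FROM (
    SELECT * FROM myreporttable
    WHERE lasttime > now() - interval '1 week'
    ORDER BY random()
) AS l;
RESET work_mem;            -- back to the server default
```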
Re: [PERFORM] work_mem and shared_buffers
It is amazing how, after working with databases very actively for over 8 years, I am still learning things. Thanks, Lance Campbell Project Manager/Software Architect Web Services at Public Affairs University of Illinois 217.333.0382 http://webservices.uiuc.edu -Original Message- From: Scott Marlowe [mailto:[EMAIL PROTECTED] Sent: Friday, November 09, 2007 1:13 PM To: Campbell, Lance Cc: Heikki Linnakangas; pgsql-performance@postgresql.org Subject: Re: [PERFORM] work_mem and shared_buffers On Nov 9, 2007 12:08 PM, Campbell, Lance [EMAIL PROTECTED] wrote: How do you know when you should up the value of work_mem? Just play with the number. Is there a query I could do that would tell me if PostgreSQL is performing SQL that could use more memory for sorting? Trial and error. Note that you can set work_mem for a given session. While it may seem that making work_mem bigger will always help, that's not necessarily the case. Using this query: select count(*) from (select * from myreporttable where lasttime > now() - interval '1 week' order by random() ) as l I did the following (I ran the query by itself once to fill the buffers / cache of the machine with the data):

work_mem   Time:
1000kB     29215.563 ms
4000kB     20612.489 ms
8000kB     18408.087 ms
16000kB    16893.964 ms
32000kB    17681.221 ms
64000kB    22439.988 ms
125MB      23398.891 ms
250MB      25461.797 ms

Note that my best time was at around 16 Meg work_mem. This data set is MUCH bigger than 16 Meg, it's around 300-400 Meg. But work_mem optimized out at 16 Meg. Btw, I tried it going as high as 768 Meg, and it was still slower than 16M. This machine has 2 Gigs ram and is optimized for IO not CPU performance.
Re: [PERFORM] work_mem and shared_buffers
On Nov 9, 2007 1:19 PM, Campbell, Lance [EMAIL PROTECTED] wrote: It is amazing how, after working with databases very actively for over 8 years, I am still learning things. The fun thing about postgresql is that just when you've got it figured out, somebody will come along and improve it in such a way as to make your previously gathered knowledge obsolete. In a good way. I imagine in a few years, hardly anyone using postgresql will remember the ancient art of having eight apostrophes in a row inside your plpgsql functions...
Re: [PERFORM] work_mem and shared_buffers
On Nov 9, 2007, at 1:24 PM, Scott Marlowe wrote: On Nov 9, 2007 1:19 PM, Campbell, Lance [EMAIL PROTECTED] wrote: It is amazing how, after working with databases very actively for over 8 years, I am still learning things. The fun thing about postgresql is that just when you've got it figured out, somebody will come along and improve it in such a way as to make your previously gathered knowledge obsolete. In a good way. I imagine in a few years, hardly anyone using postgresql will remember the ancient art of having eight apostrophes in a row inside your plpgsql functions... Speaking of that devil, I started working with Postgres mere months after that particular evil went away, but we still have a good bit of plpgsql with it in production. I've been meaning to convert it and clean it up for a while now. Would you, or anybody, happen to know of any scripts out there that I could grab to make a quick, no-brains-required job of it? Erik Jones Software Developer | Emma® [EMAIL PROTECTED] 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
Re: [HACKERS] [PERFORM] Estimation problem with a LIKE clause containing a /
Tom, Just to confirm that your last commit fixed the problem:

lbo=# explain analyze select * from cms_items where ancestors LIKE '1062/%';
QUERY PLAN
---
Seq Scan on cms_items (cost=0.00..688.26 rows=*9097* width=103) (actual time=0.011..22.605 rows=11326 loops=1)
  Filter: ((ancestors)::text ~~ '1062/%'::text)
Total runtime: 30.022 ms
(3 rows)

Thanks for your time. -- Guillaume
Re: [PERFORM] work_mem and shared_buffers
On Nov 9, 2007 2:38 PM, Erik Jones [EMAIL PROTECTED] wrote: I imagine in a few years, hardly anyone using postgresql will remember the ancient art of having eight apostrophes in a row inside your plpgsql functions... Speaking of that devil, I started working with Postgres mere months after that particular evil went away but we still have a good bit of plpgsql with it in production. I've been meaning to convert it and clean it up for a while now. Would you, or anybody, happen to know of any scripts out there that I could grab to make a quick job, no brains required, of it? Man, I can't think of any. I'd assume you'd need to look for the longest occurrence of ' marks and replace it with one field, say $1$ or something, then the next smaller set with $2$ or something, and so on. I imagine one could write a script to do it. Luckily, we only had one or two levels of ' marks in any of our stored procs, so it was only a few minutes each time I edited one to switch it over.
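For readers who never met the pre-8.0 quoting style, a small invented example of what such a conversion would do:

```sql
-- Old style: the whole function body is a single-quoted literal, so an
-- embedded string literal's apostrophes double twice -- four in a row
-- here, and eight once you nest another quoting level:
CREATE FUNCTION greet() RETURNS text AS '
BEGIN
    RETURN ''it''''s done'';
END;
' LANGUAGE plpgsql;

-- Dollar-quoted (8.0+) equivalent: only ordinary SQL doubling remains.
CREATE OR REPLACE FUNCTION greet() RETURNS text AS $$
BEGIN
    RETURN 'it''s done';
END;
$$ LANGUAGE plpgsql;
```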
[PERFORM] Can I Determine if AutoVacuum Does Anything?
We've had our PostgreSQL 8.1.4 installation configured to autovacuum since January, but I suspect it might not be doing anything. Perhaps I can determine what happens through the log files? Is there a summary of which logging settings in postgresql.conf should be set to get at least table-level messages about yes/no decisions? The only message I see now is very terse, indicating that autovacuum does run: LOG: autovacuum: processing database dc_prod I suspect there's a problem because there appears to be 78% overhead in the database size, whereas I would expect 10-15% based on what I've read. This is not good for some Seq Scan operations on large tables (the root problem I'm starting to tackle). Notes: [+] Last week I restored a production backup into my development sandbox with a psql -f, then ran a vacuumdb -a -z on it. After that, I noticed that the size of the production database is 78% larger than development, using select pg_database_size('dc_prod') in pgAdmin3. Prod is 5.9GB, but my Dev is 3.3GB. [+] The worst table has about 2.7x overhead, according to select relpages/reltuples from pg_class queries. Here are the relevant postgresql.conf settings in production. I can't speak to their suitability, but I think they should reclaim some unused space for reuse.

#stats_start_collector = on
#stats_block_level = off
stats_row_level = on
#stats_reset_on_server_start = off
autovacuum = on
autovacuum_naptime = 360
autovacuum_vacuum_threshold = 1000
autovacuum_analyze_threshold = 500
autovacuum_vacuum_scale_factor = 0.04
autovacuum_analyze_scale_factor = 0.02
autovacuum_vacuum_cost_delay = 10
autovacuum_vacuum_cost_limit = -1

I was suspicious that stats_row_level might not work because stats_block_level is off. But I see pg_stat_user_tables.n_tup_ins, n_tup_upd and n_tup_del are all increasing (slowly but surely). Thanks, David Crane http://www.donorschoose.org Teachers Ask. You Choose. Students Learn.
Re: [PERFORM] Can I Determine if AutoVacuum Does Anything?
David Crane wrote: We've had our PostgreSQL 8.1.4 installation configured to autovacuum since January, but I suspect it might not be doing anything. Perhaps I can determine what happens through the log files? Is there a summary of which logging settings in postgresql.conf should be set to get at least table-level messages about yes/no decisions? The only message I see now is very terse, indicating that autovacuum does run: Yeah, you have to set log_min_messages to debug2 to get useful output for autovacuum. This is fixed in 8.3, but for earlier versions there is nothing short of patching the server. autovacuum = on autovacuum_naptime = 360 This is a bit on the high side, but it may not be very important. Keep in mind that in 8.2 and earlier it means how long between autovac checks, so if there are many databases, it could be long between one autovac run in a particular database and the next one. (In 8.3 it has been redefined to mean the interval between runs on every database.) autovacuum_vacuum_threshold = 1000 autovacuum_analyze_threshold = 500 These are the default values, but for small tables they seem high as well. IIRC your problem is actually with big tables, for which it doesn't make much of a difference. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
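Short of debug2 logging, one can approximate the 8.1-era autovacuum trigger (vacuum threshold + scale factor * reltuples, compared against accumulated dead rows) by hand. A hedged sketch using the settings quoted in this thread; 8.1 has no dead-tuple column in the stats views, so updates plus deletes serve as a rough proxy:

```sql
SELECT s.relname,
       s.n_tup_upd + s.n_tup_del AS upd_plus_del,   -- rough dead-row proxy
       1000 + 0.04 * c.reltuples AS approx_vacuum_trigger
FROM pg_stat_user_tables s
JOIN pg_class c ON c.oid = s.relid
ORDER BY s.n_tup_upd + s.n_tup_del DESC;
```

Tables whose upd_plus_del never approaches approx_vacuum_trigger would simply never be picked, which with the 0.04 scale factor can take a long time on big tables.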
Re: [PERFORM] Join performance
Pepe Barbe wrote: Hello, I am having an issue on PostgreSQL 8.0.12. In the past we had performance issues with the query planner for queries on some tables where we knew we had indexes and it was doing a sequential scan, and for this reason we issue SET enable_seqscan = FALSE for some queries. Recently we have stumbled upon one of these kinds of queries that is giving terrible performance because seqscan is disabled. I've reduced the problem to a command like this one: SELECT * from gsm_sector_metrics NATURAL JOIN gsm_amr_metrics INNER JOIN temp_busy_hr USING(start_time,bsc_id,sect_id); Where temp_busy_hr is a temporary table. Have you tried analyzing the temp_busy_hr table? Possibly adding an index to the temp table can help if you are doing lots of queries. If the previous is issued with seqscan TRUE, it runs within reasonable time, else it runs forever. The query plan for the previous query with enable_seqscan = TRUE: It would be worth knowing how far out the estimates are. Also, have you tried altering the statistics target for the relevant columns to increase the accuracy?

QUERY PLAN
Limit (cost=0.00..384555.98 rows=1 width=3092)
  -> Nested Loop (cost=0.00..384555.98 rows=1 width=3092)
     Join Filter: ((inner.bsc_id = outer.bsc_id) AND (inner.site_id = outer.site_id) AND (inner.sect_id = outer.sect_id))
     -> Nested Loop (cost=0.00..368645.64 rows=28 width=1192)
        Join Filter: ((outer.sect_id = inner.sect_id) AND (outer.bsc_id = inner.bsc_id))
        -> Seq Scan on temp_busy_hr (cost=0.00..24.00 rows=1400 width=24)
        -> Index Scan using gsm_amr_start_time_idx on gsm_amr_metrics (cost=0.00..226.66 rows=2094 width=1168)
           Index Cond: (outer.start_time = gsm_amr_metrics.start_time)
     -> Index Scan using gsm_sector_start_time_idx on gsm_sector_metrics t1 (cost=0.00..528.77 rows=1973 width=1936)
        Index Cond: (t1.start_time = outer.start_time)
(10 rows)

and the plan for enable_seqscan = FALSE:

QUERY PLAN
Limit (cost=10097.16..100720844.01 rows=1 width=3092)
  -> Nested Loop (cost=10097.16..100720844.01 rows=1 width=3092)
     Join Filter: ((inner.bsc_id = outer.bsc_id) AND (inner.site_id = outer.site_id) AND (inner.sect_id = outer.sect_id))
     -> Merge Join (cost=10097.16..100704933.67 rows=28 width=1192)
        Merge Cond: (outer.start_time = inner.start_time)
        Join Filter: ((inner.sect_id = outer.sect_id) AND (inner.bsc_id = outer.bsc_id))
        -> Index Scan using gsm_amr_start_time_idx on gsm_amr_metrics (cost=0.00..631211.45 rows=6005551 width=1168)
        -> Sort (cost=10097.16..10100.66 rows=1400 width=24)
           Sort Key: temp_busy_hr.start_time
           -> Seq Scan on temp_busy_hr (cost=1.00..10024.00 rows=1400 width=24)
     -> Index Scan using gsm_sector_start_time_idx on gsm_sector_metrics t1 (cost=0.00..528.77 rows=1973 width=1936)
        Index Cond: (t1.start_time = outer.start_time)
(12 rows)

Any ideas what could I try to fix this problem? Thanks, Pepe
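The advice in this reply boils down to something like the following; the index name and the statistics target value are illustrative guesses, not figures from the thread:

```sql
-- Give the planner real numbers for the temp table instead of the
-- default 1400-row guess it is using in both plans above:
ANALYZE temp_busy_hr;

-- An index on the join keys may help if the temp table is hit often:
CREATE INDEX temp_busy_hr_keys ON temp_busy_hr (start_time, bsc_id, sect_id);

-- Raise the statistics target on a join column, then re-ANALYZE so the
-- row estimates for the big tables get more accurate:
ALTER TABLE gsm_amr_metrics ALTER COLUMN start_time SET STATISTICS 100;
ANALYZE gsm_amr_metrics;
```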