Re: [PERFORM] Select performance vs. mssql
select count(*) from mtable where day='Mon' Results: 1. P3 600 512MB RAM MSSQL. It takes about 4-5 secs to run. If I run a few queries and everything is cached, it is sometimes just 1 second. 2. Athlon 1.3 Ghz 1GB RAM. PostgreSQL takes 7 seconds. I have played with the buffers setting and currently have it at 7500. At 2 it took over 20 seconds to run. 5 seconds vs 7 isn't that big of a deal, but 1 second vs 7 seconds is. Also, the slower performance is with much lesser hardware. Post the result of this for us: explain analyze select count(*) from mtable where day='Mon'; On both machines. Chris ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] Select performance vs. mssql
Post the result of this for us: explain analyze select count(*) from mtable where day='Mon'; On both machines. Hi Chris -- PostgreSQL Machine: Aggregate (cost=140122.56..140122.56 rows=1 width=0) (actual time=24516.000..24516.000 rows=1 loops=1) - Index Scan using day on mtable (cost=0.00..140035.06 rows=35000 width=0) (actual time=47.000..21841.000 rows=1166025 loops=1) Index Cond: (day = 'Mon'::bpchar) Total runtime: 24516.000 ms (Note this took 24 seconds after fresh reboot, next execution was 11, and execution without explain analyze was 6.7 seconds) MSSQL Machine: That Explain Analyze command doesn't work for MSSQL, but I did view the Query plan. 97% of it was Scanning a particular range of rows from a nonclustered index Thanks for your help --Mark __ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Select performance vs. mssql
mark durrant wrote: PostgreSQL Machine: Aggregate (cost=140122.56..140122.56 rows=1 width=0) (actual time=24516.000..24516.000 rows=1 loops=1) - Index Scan using day on mtable (cost=0.00..140035.06 rows=35000 width=0) (actual time=47.000..21841.000 rows=1166025 loops=1) Index Cond: (day = 'Mon'::bpchar) Total runtime: 24516.000 ms Have you run ANALYZE? Clustering the table on the day index (via the CLUSTER command) would be worth trying. -Neil ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Select performance vs. mssql
Mark, MSSQL Machine: That Explain Analyze command doesn't work for MSSQL, try this: set showplan_all on go select ... go Harald ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PERFORM] Need help to decide Mysql vs Postgres
Hi all, From whatever reading and surfing I have done, I have found that postgres is good. Actually I myself am a fan of postgres as compared to mysql. However I want to have some frank opinions before I decide something. Following are some of the aspects of my schema, and our concerns -- - We have around 150 tables on the DB - We have lot of foreign keys between the tables - Couple of tables are going to have around couple of hundereds of millions of records (300 Million right now and would grow). Few of this tables are fairly wide with around 32 columns, and have around 3-4 columns which are foreign keys and refer to other tables - Most of the DB usage is Selects. We would have some inserts but that would be like a nightly or a monthly process Our only concern with going with postgres is speed. I haven't done a speed test yet so I can't speak. But the major concern is that the selects and inserts are going to be much much slower on postgres than on mysql. I dont know how true this is. I know this is a postgres forum so everyone will say postgres is better but I am just looking for some help and advise I guess !!! I am not trying to start a mysql vs postgres war so please dont misunderstand me I tried to look around for mysql vs postgres articles, but most of them said mysql is better in speed. However those articles were very old so I dont know about recent stage. Please comment !!! Thanks, Amit ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Need help to decide Mysql vs Postgres
I am not trying to start a mysql vs postgres war so please dont misunderstand me I tried to look around for mysql vs postgres articles, but most of them said mysql is better in speed. However those articles were very old so I dont know about recent stage. Please comment !!! It is my experience that MySQL is faster under smaller load scenarios. Say 5 - 10 connections only doing simple SELECTS. E.g; a dymanic website. It is also my experience that PostgreSQL is faster and more stable under consistent and heavy load. I have customers you regularly are using up to 500 connections. Note that alot of this depends on how your database is designed. Foreign keys slow things down. I think it would be important for you to look at your overall goal of migration. MySQL is really not a bad product IF you are willing to work within its limitations. PostgreSQL is a real RDMS, it is like Oracle or DB2 and comes with a comparable feature set. Only you can decide if that is what you need. Sincerely, Joshua D. Drake Command Prompt, Inc. -- Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240 PostgreSQL Replication, Consulting, Custom Programming, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/ ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] Need help to decide Mysql vs Postgres
Hi Josh, Thanks for the prompt reply !! Actually migration is inevitable. We have a totally messed up schema, not normalized and stuff like that. So the goal of the migration is to get a new and better normalized schema. That part is done already. Now the decision point is, should we go with postgres or mysql. Thanks, Amit -Original Message- From: Joshua D. Drake [mailto:[EMAIL PROTECTED] Sent: Tuesday, May 24, 2005 1:15 PM To: Amit V Shah Cc: 'pgsql-performance@postgresql.org' Subject: Re: [PERFORM] Need help to decide Mysql vs Postgres I am not trying to start a mysql vs postgres war so please dont misunderstand me I tried to look around for mysql vs postgres articles, but most of them said mysql is better in speed. However those articles were very old so I dont know about recent stage. Please comment !!! It is my experience that MySQL is faster under smaller load scenarios. Say 5 - 10 connections only doing simple SELECTS. E.g; a dymanic website. It is also my experience that PostgreSQL is faster and more stable under consistent and heavy load. I have customers you regularly are using up to 500 connections. Note that alot of this depends on how your database is designed. Foreign keys slow things down. I think it would be important for you to look at your overall goal of migration. MySQL is really not a bad product IF you are willing to work within its limitations. PostgreSQL is a real RDMS, it is like Oracle or DB2 and comes with a comparable feature set. Only you can decide if that is what you need. Sincerely, Joshua D. Drake Command Prompt, Inc. -- Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240 PostgreSQL Replication, Consulting, Custom Programming, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/ ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Select performance vs. mssql
On Tue, May 24, 2005 at 08:36:36 -0700, mark durrant [EMAIL PROTECTED] wrote: --MSSQL's ability to hit the index only and not having to go to the table itself results in a _big_ performance/efficiency gain. If someone who's in development wants to pass this along, it would be a nice addition to PostgreSQL sometime in the future. I'd suspect that as well as making one query faster, it would make everything else faster/more scalable as the server load is so much less. This gets brought up a lot. The problem is that the index doesn't include information about whether the current transaction can see the referenced row. Putting this information in the index will add significant overhead to every update and the opinion of the developers is that this would be a net loss overall. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Need help to decide Mysql vs Postgres
I took an unbiased look and did some tests. Objectively for me MYSQL was not an improvement for speed. I had read the benchmarks in pcmagazine from a while back as well. I did some tests using ODBC, and .net connections and also used aqua studios (hooks up to both data bases) and found postgres a bit faster. I did spend more time getting a feeling for setup on postgres, but I was at a point of desperation as some queries were still too slow on postgres. I ended up re-engineering my app to use simpler(flattened) data sets. I still have a few I am working through, but all in all it is running better then when I was on MSSQL, and MYSQL was just slower on the tests I did. I loaded both MYSQL and postgres on both my 4 processor Dell running red hat AS3 and Windows XP on a optiplex. Joel Fradkin Wazagua, Inc. 2520 Trailmate Dr Sarasota, Florida 34243 Tel. 941-753-7111 ext 305 [EMAIL PROTECTED] www.wazagua.com Powered by Wazagua Providing you with the latest Web-based technology advanced tools. C 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc This email message is for the use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and delete and destroy all copies of the original message, including attachments. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Amit V Shah Sent: Tuesday, May 24, 2005 12:22 PM To: 'Joshua D. Drake' Cc: 'pgsql-performance@postgresql.org' Subject: Re: [PERFORM] Need help to decide Mysql vs Postgres Hi Josh, Thanks for the prompt reply !! Actually migration is inevitable. We have a totally messed up schema, not normalized and stuff like that. So the goal of the migration is to get a new and better normalized schema. That part is done already. Now the decision point is, should we go with postgres or mysql. Thanks, Amit -Original Message- From: Joshua D. Drake [mailto:[EMAIL PROTECTED] Sent: Tuesday, May 24, 2005 1:15 PM To: Amit V Shah Cc: 'pgsql-performance@postgresql.org' Subject: Re: [PERFORM] Need help to decide Mysql vs Postgres I am not trying to start a mysql vs postgres war so please dont misunderstand me I tried to look around for mysql vs postgres articles, but most of them said mysql is better in speed. However those articles were very old so I dont know about recent stage. Please comment !!! It is my experience that MySQL is faster under smaller load scenarios. Say 5 - 10 connections only doing simple SELECTS. E.g; a dymanic website. It is also my experience that PostgreSQL is faster and more stable under consistent and heavy load. I have customers you regularly are using up to 500 connections. Note that alot of this depends on how your database is designed. Foreign keys slow things down. I think it would be important for you to look at your overall goal of migration. MySQL is really not a bad product IF you are willing to work within its limitations. PostgreSQL is a real RDMS, it is like Oracle or DB2 and comes with a comparable feature set. Only you can decide if that is what you need. Sincerely, Joshua D. Drake Command Prompt, Inc. -- Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240 PostgreSQL Replication, Consulting, Custom Programming, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/ ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] Need help to decide Mysql vs Postgres
Amit, - We have lot of foreign keys between the tables Do you need these keys to be enforced? Last I checked, MySQL was still having trouble with foriegn keys. - Most of the DB usage is Selects. We would have some inserts but that would be like a nightly or a monthly process So transaction integrity is not a real concern? This sounds like a data warehouse; wanna try Bizgres? (www.bizgres.org) Our only concern with going with postgres is speed. I haven't done a speed test yet so I can't speak. But the major concern is that the selects and inserts are going to be much much slower on postgres than on mysql. I dont know how true this is. I know this is a postgres forum so everyone will say postgres is better but I am just looking for some help and advise I guess Well, the relative speed depends on what you're doing. You want slow, try a transaction rollback on a large InnoDB table ;-) PostgreSQL/Bizgres will also be implementing bitmapped indexes and table partitioning very soon, so we're liable to pull way ahead of MySQL on very large databases. I am not trying to start a mysql vs postgres war so please dont misunderstand me I tried to look around for mysql vs postgres articles, but most of them said mysql is better in speed. Also I'll bet most of those articles were based on either website use or single-threaded simple-sql tests. Not a read data warehousing situatiion. It's been my personal experience that MySQL does not scale well beyond about 75GB without extensive support from MySQL AB. PostgreSQL more easily scales up to 200GB, and to as much as 1TB with tuning expertise. -- --Josh Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] Need help to decide Mysql vs Postgres
- Most of the DB usage is Selects. We would have some inserts but that would be like a nightly or a monthly process So transaction integrity is not a real concern? This sounds like a data warehouse; wanna try Bizgres? (www.bizgres.org) I took a look at this. I have a few concerns with bizgres though -- I am using jetspeed portal engine and Hibernate as my O/R Mapping layer. I know for sure that they dont support bizgres. Now the question is what difference is there between bizgres and postgres ... I guess I will try to look around the website more and find out, but if there is something you would like to comment, that would be very helpful ... Thanks, Amit ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Need help to decide Mysql vs Postgres
On Tue, May 24, 2005 at 01:56:54PM -0400, Amit V Shah wrote: I took a look at this. I have a few concerns with bizgres though -- I am using jetspeed portal engine and Hibernate as my O/R Mapping layer. If you have problems with performance, you might want to look into using JDBC directly instead of using Hibernate. I know groups of people who are rather less-than-happy with it performance-wise :-) /* Steinar */ -- Homepage: http://www.sesse.net/ ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Need help to decide Mysql vs Postgres
Amit, I took a look at this. I have a few concerns with bizgres though -- I am using jetspeed portal engine and Hibernate as my O/R Mapping layer. I know for sure that they dont support bizgres. Now the question is what difference is there between bizgres and postgres ... I guess I will try to look around the website more and find out, but if there is something you would like to comment, that would be very helpful ... Bizgres is PostgreSQL. Just a different packaging of it, with some patches which are not yet in the main PostgreSQL. Also, it's currently beta. --Josh -- __Aglio Database Solutions___ Josh BerkusConsultant josh@agliodbs.comwww.agliodbs.com Ph: 415-752-2500Fax: 415-752-2387 2166 Hayes Suite 200San Francisco, CA ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] Select performance vs. mssql
I'm far from an expert, so this may be off-base... but perhaps a suggestion would be to allow a hint to be sent to the optimizer if the user doesn't care that the result is approximate maybe then this wouldn't require adding more overhead to the indexes. MSSQL has something like this with (nolock) i.e. select count(*) from customers (nolock) where name like 'Mark%' Regardless, I'm very impressed with PostgreSQL and I think we're moving ahead with it. Mark --- Bruno Wolff III [EMAIL PROTECTED] wrote: On Tue, May 24, 2005 at 08:36:36 -0700, mark durrant [EMAIL PROTECTED] wrote: --MSSQL's ability to hit the index only and not having to go to the table itself results in a _big_ performance/efficiency gain. If someone who's in development wants to pass this along, it would be a nice addition to PostgreSQL sometime in the future. I'd suspect that as well as making one query faster, it would make everything else faster/more scalable as the server load is so much less. This gets brought up a lot. The problem is that the index doesn't include information about whether the current transaction can see the referenced row. Putting this information in the index will add significant overhead to every update and the opinion of the developers is that this would be a net loss overall. __ Do you Yahoo!? Yahoo! Small Business - Try our new Resources site http://smallbusiness.yahoo.com/resources/ ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] Need help to decide Mysql vs Postgres
It's common knowledge, it seems, that MySQL without transactions will be a lot faster than Postgres on Inserts. And on Updates too, that is, unless you have more than a few concurrent concurrent connections, at which point the MySQL full table lock will just kill everything. And you don't have transactions, of course, and if something goes wrong, bye bye data, or funky stuff happens, like half-commited transactions if a constraint is violated in an INSERT SELECT, or you get 0 January or 31 February, etc. I heard it said that MySQL with transactions (InnoDB) is slower than postgres. I'd believe it... and you still get 00-00- as a date for free. But from your use case postgres doesn't sound like a problem, yours sounds like a few big batched COPY's which are really really fast. And about SELECTs, this is really from an experience I had a few months ago, from a e-commerce site... well, to put it nicely, MySQL's planner don't know shit when it comes to doing anything a bit complicated. I had this query to show the also purchased products on a page, and also a few other queries, best buys in this category, related products, etc..., nothing very complicated really, at worst they were 4-table joins... and with 50K products MySQL planned it horrendously and it took half a second ! Seq scans every times... I had to split the query in two, one to get the product id's, another one to get the products. I took the sql, put it in postgres with the usual changes (typenames, etc...) but same indexes, same data... the query took half a millisecond. Well... what can I say ? Also when you sit in front of the psql or mysql command line, it's an entirely different experience. One is a pleasure to work with... the other one is just a pain. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] Need help to decide Mysql vs Postgres
Amit V Shah wrote: Hi Josh, Thanks for the prompt reply !! Actually migration is inevitable. We have a totally messed up schema, not normalized and stuff like that. So the goal of the migration is to get a new and better normalized schema. That part is done already. Now the decision point is, should we go with postgres or mysql. O.k. then I would ask myself this: Would I trust my brand new data that I have put all this effort into, that finally looks the way that I want it to look, to a database that truncates information? PostgreSQL is truly ACID compliant. Even if it is a little slower (which under normal use I don't find to be the case) wouldn't the reliability of PostgreSQL make up for say the 10% net difference in performance? Sincerely, Joshua D. Drake Thanks, Amit -Original Message- From: Joshua D. Drake [mailto:[EMAIL PROTECTED] Sent: Tuesday, May 24, 2005 1:15 PM To: Amit V Shah Cc: 'pgsql-performance@postgresql.org' Subject: Re: [PERFORM] Need help to decide Mysql vs Postgres I am not trying to start a mysql vs postgres war so please dont misunderstand me I tried to look around for mysql vs postgres articles, but most of them said mysql is better in speed. However those articles were very old so I dont know about recent stage. Please comment !!! It is my experience that MySQL is faster under smaller load scenarios. Say 5 - 10 connections only doing simple SELECTS. E.g; a dymanic website. It is also my experience that PostgreSQL is faster and more stable under consistent and heavy load. I have customers you regularly are using up to 500 connections. Note that alot of this depends on how your database is designed. Foreign keys slow things down. I think it would be important for you to look at your overall goal of migration. MySQL is really not a bad product IF you are willing to work within its limitations. PostgreSQL is a real RDMS, it is like Oracle or DB2 and comes with a comparable feature set. Only you can decide if that is what you need. Sincerely, Joshua D. Drake Command Prompt, Inc. -- Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240 PostgreSQL Replication, Consulting, Custom Programming, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/ ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] Select performance vs. mssql
Until you start worrying about MVC - we have had problems with the MSSQL implementation of read consistency because of this 'feature'. Alex Turner NetEconomistOn 5/24/05, Bruno Wolff III [EMAIL PROTECTED] wrote: On Tue, May 24, 2005 at 08:36:36 -0700,mark durrant [EMAIL PROTECTED] wrote: --MSSQL's ability to hit the index only and not having to go to the table itself results in a _big_ performance/efficiency gain. If someone who's in development wants to pass this along, it would be a nice addition to PostgreSQL sometime in the future. I'd suspect that as well as making one query faster, it would make everything else faster/more scalable as the server load is so much less.This gets brought up a lot. The problem is that the index doesn't includeinformation about whether the current transaction can see the referenced row. Putting this information in the index will add significant overheadto every update and the opinion of the developers is that this would bea net loss overall.---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Select performance vs. mssql
Folks, This gets brought up a lot. The problem is that the index doesn't include information about whether the current transaction can see the referenced row. Putting this information in the index will add significant overhead to every update and the opinion of the developers is that this would be a net loss overall. Pretty much. There has been discussion about allowing index-only access to frozen tables, i.e. archive partitions. But it all sort of hinges on someone implementing it and testing -- --Josh Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] Select performance vs. mssql
Pretty much. There has been discussion about allowing index-only access to frozen tables, i.e. archive partitions. But it all sort of hinges on someone implementing it and testing Would be interesting as a parameter to set at index creation (ie. if you know this table will have a lot of reads and few writes)... like create an index on columns X,Y keeping data on columns X,Y and Z... But in this case do you still need the table ? Or even create a table type where the table and the index are one, like an auto-clustered table... I don't know if it would be used that often, though ;) ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] Select performance vs. mssql
On Tue, May 24, 2005 at 04:35:14PM -0700, Josh Berkus wrote: Pretty much. There has been discussion about allowing index-only access to frozen tables, i.e. archive partitions. But it all sort of hinges on someone implementing it and testing Is there any way to expose the planner estimate? For some purposes it's enough to just give a rough ballpark (e.g., a google-esque results 1-10 of approximately 1000) so a user knows whether its worth even starting to page through. Mike Stone ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Select performance vs. mssql
Michael Stone wrote: On Tue, May 24, 2005 at 04:35:14PM -0700, Josh Berkus wrote: Pretty much. There has been discussion about allowing index-only access to frozen tables, i.e. archive partitions. But it all sort of hinges on someone implementing it and testing Is there any way to expose the planner estimate? For some purposes it's enough to just give a rough ballpark (e.g., a google-esque results 1-10 of approximately 1000) so a user knows whether its worth even starting to page through. Mike Stone ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster Well, you could always do: EXPLAIN SELECT ... And then parse out the rows= in the first line. John =:- signature.asc Description: OpenPGP digital signature
Re: [PERFORM] Select performance vs. mssql
--As Chris pointed out, how real-world is this test? His point is valid. The database we're planning will have a lot of rows and require a lot of summarization (hence my attempt at a test), but we shouldn't be pulling a million rows at a time. If you want to do lots of aggregate analysis, I suggest you create a sepearate summary table, and create triggers on the main table to maintain your summaries in the other table... --MSSQL's ability to hit the index only and not having to go to the table itself results in a _big_ performance/efficiency gain. If someone who's in development wants to pass this along, it would be a nice addition to PostgreSQL sometime in the future. I'd suspect that as well as making one query faster, it would make everything else faster/more scalable as the server load is so much less. This is well-known and many databases do it. However, due to MVCC considerations in PostgreSQL, it's not feasible for us to implement it... Chris ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Can anyone explain this: duplicate dbs.
If I pg_dump that database then create a new database (e.g. tempdb) and upload the dump file (thus making a duplicate) then the same query only takes 190ms !! Vacuum, vacuum analyse, and vacuum full analyse does not seem to have an impact on these times. Damn, for some reason I didn't read that you had already tried vacuum full. In that case, I can't explain it except perhaps you aren't vacuuming properly, or the right thing, or it's a disk cache thing. Chris ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Can anyone explain this: duplicate dbs.
SpaceBallOne wrote: Wondering if someone could explain a pecularity for me: We have a database which takes 1000ms to perform a certain query on. If I pg_dump that database then create a new database (e.g. tempdb) and upload the dump file (thus making a duplicate) then the same query only takes 190ms !! Vacuum, vacuum analyse, and vacuum full analyse does not seem to have an impact on these times. Can anyone explain why this may be occurring and how I might be able to keep the original database running at the same speed as tempdb? Thanks in advance, Dave. What version of postgres? There are a few possibilities. If you are having a lot of updates to the table, you can get index bloat. And vacuum doesn't fix indexes. You have to REINDEX to do that. Though REINDEX has the same lock that VACUUM FULL has, so you need to be a little careful with it. Probably better is to do CLUSTER, as it does a REINDEX and a sort, so your table ends up nicer when you are done. Also, older versions of postgres had a worse time with index bloat. One thing that caused a lot of problem is a table that you insert into over time, so that all the values are incrementing. If you are deleting older entries, that area won't be re-used because they fall at the back end. I believe newer versions have been fixed. By the way, I think doing: CREATE DATABASE tempdb WITH TEMPLATE = originaldb; Is a much faster way of doing dump and load. I *think* it would recreate indexes, etc. If it just does a copy it may not show the dump/restore improvement. John =:- signature.asc Description: OpenPGP digital signature
Re: [PERFORM] Can anyone explain this: duplicate dbs.
What version of postgres? 8.0.2 ... but I think I've seen this before on 7.3 ... There are a few possibilities. If you are having a lot of updates to the table, you can get index bloat. And vacuum doesn't fix indexes. You have to REINDEX to do that. Though REINDEX has the same lock that VACUUM FULL has, so you need to be a little careful with it. Probably better is to do CLUSTER, as it does a REINDEX and a sort, so your table ends up nicer when you are done. Thanks, will try those next time this problem crops up (i just deleted / recreated the database to speed things for its users in the office ... probably should have held off to see if I could find a solution first!). Yes, the database / table-in-question does have a lot of updates, deletes, and new rows (relatively speaking for a small business). Would CLUSTER / REINDEX still have an effect if our queries were done via sequential scan? This is a old database (as in built by me when i was just starting to learn unix / postgres) so the database design is pretty horrible (little normalisation, no indexes). Have taken Chris's advice onboard too and setup cron to do a vacuumdb hourly instead of my weekly vacuum. Cheers, Dave. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Can anyone explain this: duplicate dbs.
Would CLUSTER / REINDEX still have an effect if our queries were done via sequential scan? SELECTS don't write to the database, so they have no effect at all on vacuuming/analyzing. You only need to worry about that with writes. This is a old database (as in built by me when i was just starting to learn unix / postgres) so the database design is pretty horrible (little normalisation, no indexes). No indexes? Bloody hell :D Use EXPLAIN ANALYZE SELECT ... ; on all of your selects to see where they are slow and where you can add indexes... Chris ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] Can anyone explain this: duplicate dbs.
Christopher Kings-Lynne [EMAIL PROTECTED] writes: If I pg_dump that database then create a new database (e.g. tempdb) and upload the dump file (thus making a duplicate) then the same query only takes 190ms !! Vacuum, vacuum analyse, and vacuum full analyse does not seem to have an impact on these times. Damn, for some reason I didn't read that you had already tried vacuum full. I'm thinking index bloat, and a PG version too old for vacuum full to recover any index space. But without any information about PG version or EXPLAIN ANALYZE results, we're all just guessing. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] Can anyone explain this: duplicate dbs.
SpaceBallOne wrote: What version of postgres? 8.0.2 ... but I think I've seen this before on 7.3 ... There are a few possibilities. If you are having a lot of updates to the table, you can get index bloat. And vacuum doesn't fix indexes. You have to REINDEX to do that. Though REINDEX has the same lock that VACUUM FULL has, so you need to be a little careful with it. Probably better is to do CLUSTER, as it does a REINDEX and a sort, so your table ends up nicer when you are done. Thanks, will try those next time this problem crops up (i just deleted / recreated the database to speed things for its users in the office ... probably should have held off to see if I could find a solution first!). Yes, the database / table-in-question does have a lot of updates, deletes, and new rows (relatively speaking for a small business). Would CLUSTER / REINDEX still have an effect if our queries were done via sequential scan? This is a old database (as in built by me when i was just starting to learn unix / postgres) so the database design is pretty horrible (little normalisation, no indexes). Well, my first recommendation is to put in some indexes. :) They are relatively easy to setup and can drastically improve select performance. What version of postgres are you using? What does it say at the end of VACUUM FULL ANALYZE VERBOSE, that should tell you how many free pages were reclaimed and how big your free space map should be. If you only did 1 VACUUM FULL, you might try another, as it sounds like your tables aren't properly filled. I'm pretty sure vacuum only removes empty pages/marks locations for the free space map so they can be re-used, while vacuum full will move entries around to create free pages. It sounds like it didn't do it properly. But even so, CLUSTER is still your friend, as it allows you to presort the rows in your tables. Have taken Chris's advice onboard too and setup cron to do a vacuumdb hourly instead of my weekly vacuum. Cheers, Dave. John =:- signature.asc Description: OpenPGP digital signature
Re: [PERFORM] Can anyone explain this: duplicate dbs.
On Tue, May 24, 2005 at 21:39:15 -0500, John A Meinel [EMAIL PROTECTED] wrote: By the way, I think doing: CREATE DATABASE tempdb WITH TEMPLATE = originaldb; Is a much faster way of doing dump and load. I *think* it would recreate indexes, etc. If it just does a copy it may not show the dump/restore improvement. You need to be careful when doing this. See section 18.3 of the 8.0 docs for caveats. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster