Re: [HACKERS] Large databases, performance
On 3 Oct 2002 at 18:53, Manfred Koizar wrote:

> On Thu, 03 Oct 2002 21:47:03 +0530, Shridhar Daithankar
> [EMAIL PROTECTED] wrote:
> > I believe that was vacuum analyze only.
>
> Well, there is VACUUM [tablename]; and there is ANALYZE [tablename];
> and VACUUM ANALYZE [tablename]; is VACUUM followed by ANALYZE.

I was using vacuum analyze. Good that you pointed it out. I will now modify the PostgreSQL auto-vacuum daemon that I wrote so that it runs ANALYZE only in the case of excessive inserts. I hope that's lighter on performance than VACUUM ANALYZE.

Bye
Shridhar

--
Mix's Law: There is nothing more permanent than a temporary building.
There is nothing more permanent than a temporary tax.

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[HACKERS] Large databases, performance
Hi,

Today we concluded our database performance test. Attached are the results and the schema, for those who missed the earlier discussion.

We have (almost) decided that we will partition the data across machines. The scheme is: after every short interval, a burst of data is inserted into a new table in the database, indexed, and vacuumed. The tables are inherited, so that a query on the base table fetches results from all the children. The application has to consolidate all the data on a per-node basis. If the database is not PostgreSQL, the application has to consolidate data across partitions as well.

We still need to investigate whether selecting on the base table, so as to include the children, will use the indexes created on the child tables. We estimate that when the entire data set is gathered, the total number of child tables will be around 1K-1.1K across all machines. This follows from the average rate of data insertion, i.e. 5K records/sec, and the total data size, estimated at 9 billion rows max, i.e. an estimated database size of 900 GB. Obviously it's impossible to keep the insertion rate on a single indexed table high as the data grows, so partitioning/inheritance looks like the better approach.

PostgreSQL is not the final winner as yet; MySQL is in close range. I will keep you guys posted about the results. Let me know about any comments.

Bye
Shridhar

--
Price's Advice: It's all a game -- play it to have fun.
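The inheritance-based partitioning scheme described above can be sketched in PostgreSQL SQL roughly as follows. The table and index names are hypothetical placeholders, not from the original test; the syntax is standard PostgreSQL table inheritance:

```sql
-- Parent table: defines the shape; typically holds no data itself.
CREATE TABLE calls (
    esn      char(10)  NOT NULL,
    min      char(10)  NOT NULL,
    datetime timestamp NOT NULL
);

-- One child table per burst/interval; inherits all parent columns.
CREATE TABLE calls_20021003 () INHERITS (calls);

-- Index each child after its bulk load, as the scheme describes.
CREATE INDEX calls_20021003_idx ON calls_20021003 (esn, min, datetime);

-- A select on the base table automatically includes all children:
SELECT * FROM calls WHERE esn = '1234567890' AND min = '0987654321';

-- To query the parent table alone, use ONLY:
SELECT * FROM ONLY calls WHERE esn = '1234567890';
```

Whether the planner uses each child's index when scanning through the base table is exactly the open question raised in the post.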
Machine: Compaq Proliant Server ML 530
  Intel Xeon 2.4 GHz processor x 4, 4 GB RAM,
  5 x 72.8 GB SCSI HDD, RAID 0 (striping), Mandrake Linux 9.0
  Cost: $13,500 ($1,350 for each additional 72 GB HDD)

Performance parameter              MySQL 3.23.52       MySQL 3.23.52      PostgreSQL 7.2.2
                                   without InnoDB      with InnoDB for    with built-in
                                                       transactional      support for
                                                       support            transactions

Complete data load:
  Inserts + building a composite   3738 secs           18720 secs         20628 secs
  index (40 GB data, 432,000,000
  tuples, about 100 bytes each,
  schema on 'schema' sheet,
  composite index on 3 fields
  (esn, min, datetime))
  Load speed                       115570 tuples/sec   23076 tuples/sec   20942 tuples/sec
  Database size on disk            48 GB               87 GB              111 GB

Average per partition:
  Inserts + building a composite   28 secs             130 secs           150 secs
  index (300 MB data, 3,000,000
  tuples, about 100 bytes each,
  same composite index)
  Select query (equality match     7 secs              7 secs             6 secs
  on 2 fields (esn and min),
  4 concurrent queries running)
  Database size on disk            341 MB              619 MB             788 MB

Schema:

Field name   Field type   Nullable   Indexed
type         int          no         no
esn          char(10)     no         yes
min          char(10)     no         yes
datetime     timestamp    no         yes
opc0         char(3)      no         no
opc1         char(3)      no         no
opc2         char(3)      no         no
dpc0         char(3)      no         no
dpc1         char(3)      no         no
dpc2         char(3)      no         no
npa          char(3)      no
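The 'schema' sheet above corresponds to a table definition roughly like the following. The table and index names are hypothetical; the column names, types, and nullability are taken from the sheet, and the composite index matches the one built in the load tests:

```sql
CREATE TABLE cdr (
    type     int       NOT NULL,
    esn      char(10)  NOT NULL,
    min      char(10)  NOT NULL,
    datetime timestamp NOT NULL,
    opc0     char(3)   NOT NULL,
    opc1     char(3)   NOT NULL,
    opc2     char(3)   NOT NULL,
    dpc0     char(3)   NOT NULL,
    dpc1     char(3)   NOT NULL,
    dpc2     char(3)   NOT NULL,
    npa      char(3)   NOT NULL
);

-- Composite index on 3 fields, built after the bulk load:
CREATE INDEX cdr_esn_min_dt ON cdr (esn, min, datetime);
```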
Re: [HACKERS] Large databases, performance
NOTE: Setting follow-up to the performance list.

Funny that the status quo seems to be: if you need fast selects on data that has few inserts, pick MySQL; otherwise, if you have a lot of inserts and don't need super-fast selects, go with PostgreSQL. Yet your data seems to cut directly against this.

I'm curious, did you happen to run the select tests while also running the insert tests? IIRC the older MySQL versions have to lock the table when doing an insert, so select performance goes in the dumper in that scenario; perhaps that's not an issue with 3.23.52?

It also seems like the vacuum after each insert is unnecessary, unless you're also deleting/updating data behind it. Perhaps just running an ANALYZE on the table would suffice, while reducing overhead.

Robert Treat

On Thu, 2002-10-03 at 08:36, Shridhar Daithankar wrote:
> [hardware setup and benchmark results quoted in full; snipped]
Re: [HACKERS] Large databases, performance
On Thu, 03 Oct 2002 21:47:03 +0530, Shridhar Daithankar
[EMAIL PROTECTED] wrote:
> I believe that was vacuum analyze only.

Well, there is VACUUM [tablename]; and there is ANALYZE [tablename];
and VACUUM ANALYZE [tablename]; is VACUUM followed by ANALYZE.

Servus
Manfred
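As a minimal illustration of the distinction (the table name `mytable` is a placeholder):

```sql
-- VACUUM reclaims space from dead tuples left behind by UPDATE/DELETE:
VACUUM mytable;

-- ANALYZE only collects statistics for the query planner; it is much
-- cheaper, and sufficient after pure INSERT loads:
ANALYZE mytable;

-- VACUUM ANALYZE does both, VACUUM first:
VACUUM ANALYZE mytable;
```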
Re: [HACKERS] Large databases, performance
On 3 Oct 2002 at 11:57, Robert Treat wrote:

> NOTE: Setting follow-up to the performance list.
>
> Funny that the status quo seems to be: if you need fast selects on
> data that has few inserts, pick MySQL; otherwise, if you have a lot of
> inserts and don't need super-fast selects, go with PostgreSQL. Yet
> your data seems to cut directly against this.

Well, a couple of things. The number of inserts isn't few; it's 5,000/sec, required in the field. Secondly, I don't know really, but PostgreSQL seems to be doing pretty fine on parallel selects. If we use MySQL with transaction support, the numbers are really close. Maybe it's time to rewrite the famous myth that PostgreSQL is slow: when properly tuned, or given enough head room, it's almost as fast as MySQL.

> I'm curious, did you happen to run the select tests while also running
> the insert tests? IIRC the older MySQL versions have to lock the table
> when doing an insert, so select performance goes in the dumper in that
> scenario; perhaps that's not an issue with 3.23.52?

IMO even if it locks tables, that shouldn't affect select performance. It would be fun to watch when we insert multiple chunks of data and fire queries concurrently. I would be surprised if MySQL starts slowing down.

> It also seems like the vacuum after each insert is unnecessary, unless
> you're also deleting/updating data behind it. Perhaps just running an
> ANALYZE on the table would suffice, while reducing overhead.

I believe that was vacuum analyze only. But it still takes a lot of time. The good thing is that it's not blocking. Anyway, I don't think such frequent vacuums are going to convince the planner to choose an index scan over a sequential scan; I am sure it's already convinced.

Regards,
Shridhar

---
Shridhar Daithankar
LIMS CPE Team Member, PSPL.
mailto:shridhar_daithankar;persistent.co.in
Phone: +91-20-5678900 Extn. 270
Fax:   +91-20-5678901
---