[PERFORM] PostgreSQL and Linux 2.6 kernel.
As part of my ongoing evaluation of PostgreSQL I have been doing a little stress testing. I though I would share an interesting result here.. Machine spec: 500 MHz PIII 256MB RAM old-ish IDE HD (5400RPM) Linux 2.4.22 kernel (Madrake 9.2) I have PostgreSQL 7.4.1 installed and have managed to load up a 1.4 GB database from MS SQLServer. Vaccum analyzed it. As a test in PosgreSQL I issued a statement to update a single column of a table containing 2.8 million rows with the values of a column in a table with similar rowcount. Using the above spec I had to stop the server after 17 hours. The poor thing was thrashing the hard disk and doing more swapping than useful work. Having obtained a copy of Mandrake 10.0 with the 2.6 kernal I though I would give it a go. Same hardware. Same setup. Same database loaded up. Same postgresql.conf file to make sure all the settings were the same. Vaccum analyzed it. same update statement COMPLETED in 2 hours 50 minutes. I'm impressed. I could see from vmstat that the system was achieving much greater IO thoughput than the 2.4 kernel. Although the system was still swapping there seems to be a completely different memory management pattern that suits PostgreSQL very well. Just to see that this wasn't a coincidence I am repeating the test. It is now into the 14th hour using the old 2.4 kernel. I'm going to give up. Has anyone else done any comparative testing with the 2.6 kernel? Cheers, Gary. ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[PERFORM] single index on more than two coulumns a bad thing?
Hi, I remember reading a post ages ago, maybe from Vadim, about the fact that people creating indices on more than two columns will be the first to be put againts the wall when the revolution comes... sort of... Is it always bad to create index xx on yy (field1, field2, field3); I guess the problem is that the index might often grow bigger than the table, or at least big enough not to speed up the queries? /Palle ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] single index on more than two coulumns a bad thing?
Palle, Is it always bad to create index xx on yy (field1, field2, field3); No, it seldom bad, in fact.I have some indexes that run up to seven columns, becuase they are required for unique keys. Indexes of 3-4 columns are often *required* for many-to-many join tables. I'm afraid that you've been given some misleading advice. I guess the problem is that the index might often grow bigger than the table, or at least big enough not to speed up the queries? Well, yes ... a 4-column index on a 5-column table could be bigger than the table if allowed to bloat and not re-indexed. But that's just a reason for better maintainence. -- -Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] single index on more than two coulumns a bad thing?
Josh Berkus [EMAIL PROTECTED] writes: Is it always bad to create index xx on yy (field1, field2, field3); I'm afraid that you've been given some misleading advice. I'd say it's a matter of getting your optimizations straight. If you have a query that can make use of that index, and the query is executed often enough to make it worth maintaining the index during table updates, then by all means make the index. The standard advice is meant to warn you against creating a zillion indexes without any thought to what you'll be paying in update costs. Indexes with more than a couple of columns are usually of only narrow applicability, and so you have to be sure that they'll really pay for themselves... regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Index Performance Help
Damien Dougan [EMAIL PROTECTED] writes: Sample analyze output for an initial query: hydradb=# explain analyze select * from pvsubscriber where actorid = 'b3432-asdas-232-Subscriber793500'; I take it pvsubscriber is a view? What's the definition of your view? - Index Scan using mc_actor_key on mc_actor (cost=0.00..4.08 rows=1 width=69) (actual time=39.497..39.499 rows=1 loops=1) Is this table regularly vacuumed? Is it possible it has lots of dead records with this value for actorid? Try running vacuum full, or better vacuum full verbose and keep the output, it might explain. What version of postgres is this? You might try reindexing all your indexes (but particularly this one). Older versions of postgres were prone to index bloat problems. -- greg ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] single index on more than two coulumns a bad thing?
Palle Girgensohn [EMAIL PROTECTED] writes: Is it always bad to create index xx on yy (field1, field2, field3); All generalisations are false... Seriously, it's true that as the length of your index key gets longer the harder and harder it is to justify it. That doesn't mean they're always wrong, but you should consider whether a shorter key would perform just as well. The other problem with long index keys is that they often show up in the same place as having dozens of indexes on the same table. Usually in shops where the indexes were created after the fact looking at specific queries. -- greg ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.
Gary Doades [EMAIL PROTECTED] writes: As a test in PosgreSQL I issued a statement to update a single column of a table containing 2.8 million rows with the values of a column in a table with similar rowcount. Using the above spec I had to stop the server after 17 hours. The poor thing was thrashing the hard disk and doing more swapping than useful work. This statement is pretty much content-free, since you did not show us the table schemas, the query, or the EXPLAIN output for the query. (I'll forgive you the lack of EXPLAIN ANALYZE, but you could easily have provided all the other hard facts.) There's really no way to tell where the bottleneck is. Maybe it's a kernel-level issue, but I would not bet on that without more evidence. I'd definitely not bet on it without direct confirmation that the same query plan was used in both setups. regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.
The post was not intended to be content-rich, just my initial feedback after only just switching to 2.6. Since I had largely given up on this particular line of attack using 2.4 I didn't think to do a detailed analysis at this time. I was also hoping that others would add to the discussion. As this could become important I will be doing more analysis, but due to the nature of the issue and trying to keep as many factors constant as possible, this may take some time. Cheers, Gary. On 2 Apr 2004 at 1:32, Tom Lane wrote: Gary Doades [EMAIL PROTECTED] writes: As a test in PosgreSQL I issued a statement to update a single column of a table containing 2.8 million rows with the values of a column in a table with similar rowcount. Using the above spec I had to stop the server after 17 hours. The poor thing was thrashing the hard disk and doing more swapping than useful work. This statement is pretty much content-free, since you did not show us the table schemas, the query, or the EXPLAIN output for the query. (I'll forgive you the lack of EXPLAIN ANALYZE, but you could easily have provided all the other hard facts.) There's really no way to tell where the bottleneck is. Maybe it's a kernel-level issue, but I would not bet on that without more evidence. I'd definitely not bet on it without direct confirmation that the same query plan was used in both setups. regards, tom lane -- Incoming mail is certified Virus Free. Checked by AVG Anti-Virus (http://www.grisoft.com). Version: 7.0.230 / Virus Database: 262.6.5 - Release Date: 31/03/2004 ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match