[PERFORM] Does the column order matter in index and constraint creation?
Hi all,

I've got a doubt about how to create an index and a primary key. Let's say I have the following table:

CREATE TABLE blacklist (
    telephone    VARCHAR(15),
    customer_id  INT4
        CONSTRAINT fk_blacklist_customerid REFERENCES customers(customer_id),
    country_id   INT2
        CONSTRAINT fk_blacklist_countryid REFERENCES countries(country_id),
    CONSTRAINT pk_blacklist_cidcustidtel
        PRIMARY KEY (country_id, customer_id, telephone)
);

The country_id column can have maybe 100-250 different values. The customer_id column can have at most several hundred values (fewer than 1000). The telephone column is where all the values will differ.

So my doubt is: in terms of performance, does the order of the primary key fields make any difference? The same for the index definition? I have checked the PostgreSQL documentation but haven't been able to find anything about this.

Thanks
--
Arnau
Re: [PERFORM] Does the column order matter in index and constraint creation?
Arnau wrote:
> I've got a doubt about how to create an index and a primary key.
> The country_id column can have maybe 100-250 different values, the
> customer_id column at most several hundred (fewer than 1000), and
> telephone is where all the values will differ. In terms of
> performance, does the order of the primary key fields make any
> difference? The same for the index definition?

Well, it makes no *logical* difference, but clearly the index will have a different shape depending on how you create it. If you regularly write queries that select by country but not by customer, then use (country_id, customer_id). A more likely scenario is that you will access via customer, which in any case is more selective.

However, since both columns reference other tables, you might want to make sure the secondary column has its own index if you do lots of updating on the target FK table. But the more indexes you have, the slower updates will be on this table, since you'll need to keep the indexes up to date too. I find it's easier to spot where to put indexes during testing. It's easy to add indexes that are never used.

HTH
--
Richard Huxton
Archonet Ltd
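To make the leading-column point concrete, here is a minimal sketch against the blacklist table from the original question (the data values are hypothetical):

-- The primary key index on (country_id, customer_id, telephone) serves
-- queries that constrain a leading prefix of the column list:
SELECT * FROM blacklist WHERE country_id = 34;
SELECT * FROM blacklist WHERE country_id = 34 AND customer_id = 712;
SELECT * FROM blacklist
 WHERE country_id = 34 AND customer_id = 712 AND telephone = '5551234';

-- A query by customer alone cannot use that column order efficiently;
-- if customer-only lookups are common, either put customer_id first in
-- the key, or add a separate index:
CREATE INDEX i_blacklist_custid ON blacklist (customer_id);
SELECT * FROM blacklist WHERE customer_id = 712;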
Re: [PERFORM] table partitioning performance
On Wed, 2007-01-10 at 16:00 -0500, Steven Flatt wrote:
> On 1/9/07, Simon Riggs [EMAIL PROTECTED] wrote:
> > If you are doing date range partitioning it should be fairly simple
> > to load data into the latest table directly. That was the way I
> > originally intended for it to be used. The rules approach isn't
> > something I'd recommend as a bulk loading option and it's a lot more
> > complex anyway.
>
> The problem we have with blindly loading all data into the latest
> table is that some data (< 5%, possibly even much less) is actually
> delivered late and belongs in earlier partitions. So we still needed
> the ability to send data to an arbitrary partition.

Yes, understand the problem.

COPY is always going to be faster than INSERTs anyhow, and COPY doesn't allow views, nor utilise rules. You can set up a client-side program to pre-qualify the data and feed it to multiple simultaneous COPY commands; that is the best current way to handle this.

-- Next section aimed at pgsql-hackers, relates directly to the above:

My longer term solution looks like this:

1. Load all data into a newly created partition (optimised in a newly submitted patch for 8.3), then add the table as a new partition.

2. Use a newly created, permanent errortable into which rows that don't match constraints or have other formatting problems would be put. Following the COPY you would then run an INSERT SELECT to load the remaining rows from the errortable into their appropriate tables. The INSERT statement could target the parent table, so that rules to distribute the rows would be applied appropriately. When all of those have happened, drop the errortable.

This would allow the database to apply its constraints accurately without aborting the load when a constraint error occurs. In the use case you outline this would provide a fast path for 95% of the data load, plus a straightforward mechanism for the remaining 5%.

We discussed this on -hackers earlier, though we had difficulty with handling unique constraint errors, so the idea was shelved. The errortable part of the concept was sound, however.
http://archives.postgresql.org/pgsql-hackers/2005-11/msg01100.php

James William Pye had a similar proposal:
http://archives.postgresql.org/pgsql-hackers/2006-02/msg00120.php

The current TODO says:

  Allow COPY to report error lines and continue
  This requires the use of a savepoint before each COPY line is
  processed, with ROLLBACK on COPY failure.

If we agreed that the TODO actually has two parts to it, each of which is separately implementable:

1. load errors to a table (all errors apart from uniqueness violations)
2. do something sensible with unique violation ERRORs

IMHO part (1) can be implemented without savepoints, which testing has shown (see James' results) would not be an acceptable solution for bulk data loading. So (1) can be implemented fairly easily, whereas (2) remains an issue for which we have no acceptable solution, as yet.

Can we agree to splitting the TODO into two parts? That way we stand a chance of getting at least some functionality in this important area.

-- 
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com
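A minimal sketch of the direct-load path described above for the 95% case, using hypothetical table and column names and 8.2's ALTER TABLE ... INHERIT; the late-arriving rows still go through the parent table so the rules can route them:

-- Load the bulk of the data into a standalone table:
CREATE TABLE measurement_2007_01 (LIKE measurement);
COPY measurement_2007_01 FROM '/path/to/batch.dat';

-- Constrain it so constraint_exclusion can prune it, then attach it:
ALTER TABLE measurement_2007_01 ADD CONSTRAINT measurement_2007_01_ck
  CHECK (logdate >= DATE '2007-01-01' AND logdate < DATE '2007-02-01');
ALTER TABLE measurement_2007_01 INHERIT measurement;

-- Route the late-arriving ~5% through the parent, letting the rules fire:
INSERT INTO measurement
  SELECT * FROM late_arrivals;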
Re: [PERFORM] Partitioning
On Wed, 10 Jan 2007 15:30:16 -0600, Scott Marlowe [EMAIL PROTECTED] wrote:
> [...] And I don't think the mysql partition supports tablespaces either.

MySQL supports distributing partitions over multiple disks via the SUBPARTITION clause [1]. I leave it to you whether their syntax is cleaner, more powerful or easier ;)

Bernd

[1] http://dev.mysql.com/doc/refman/5.1/en/partitioning-subpartitions.html
Re: [PERFORM] Partitioning
On Fri, Jan 05, 2007 at 12:47:08PM +0100, Mikael Carneholm wrote:
> > Take a look at the set of partitioning functions I wrote shortly
> > after the 8.1 release:
> > http://www.studenter.hb.se/~arch/files/part_functions.sql
> > You could probably work something out using those functions (as-is,
> > or as inspiration) together with pgAgent
> > (http://www.pgadmin.org/docs/1.4/pgagent.html)
> > /Mikael
>
> Those are some great functions.

Well, they're less than optimal in one aspect: they add one rule per partition, making them unsuitable for OLTP type applications (actually: any application where insert performance is crucial). Someone with time and/or energy could probably fix that, I guess... patches are welcome :)

/Mikael
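For reference, the one-rule-per-partition pattern such functions generate looks roughly like this (hypothetical names); every INSERT against the parent has to be tested against each partition's rule qualification, which is where the per-row cost comes from:

CREATE RULE my_table_ins_2007_01 AS
  ON INSERT TO my_table
  WHERE (NEW.created >= DATE '2007-01-01' AND NEW.created < DATE '2007-02-01')
  DO INSTEAD
    INSERT INTO my_table_2007_01 VALUES (NEW.*);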
Re: [PERFORM] Partitioning
Well - whether or not MySQL's implementation of partitioning has some deficiency, it sure is a lot easier to set up than PostgreSQL's. And I don't think there is any technical reason that setting up partitioning on Postgres couldn't be very easy and still be robust.

On Thu, 11 Jan 2007 13:59:20 +0100, Mikael Carneholm [EMAIL PROTECTED] said:
> [snip - quoted message above]
Re: [PERFORM] Partitioning
Each partition can have its own disk, without using subpartitions.

CREATE TABLE th (id INT, name VARCHAR(30), adate DATE)
PARTITION BY LIST (YEAR(adate))
(
  PARTITION p1999 VALUES IN (1995, 1999, 2003)
    DATA DIRECTORY = '/var/appdata/95/data'
    INDEX DIRECTORY = '/var/appdata/95/idx',
  PARTITION p2000 VALUES IN (1996, 2000, 2004)
    DATA DIRECTORY = '/var/appdata/96/data'
    INDEX DIRECTORY = '/var/appdata/96/idx',
  PARTITION p2001 VALUES IN (1997, 2001, 2005)
    DATA DIRECTORY = '/var/appdata/97/data'
    INDEX DIRECTORY = '/var/appdata/97/idx',
  PARTITION p2002 VALUES IN (1998, 2002, 2006)
    DATA DIRECTORY = '/var/appdata/98/data'
    INDEX DIRECTORY = '/var/appdata/98/idx'
);

Subpartitions are just a way to break (parent) partitions up into smaller pieces. Those of course can be moved to other disks just like the main partitions.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Bernd Helmle
Sent: Thursday, January 11, 2007 6:51 AM
To: Scott Marlowe
Cc: Jim C. Nasby; Jeremy Haile; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Partitioning

> [snip]
Re: [PERFORM] Partitioning
On Thu, 11 Jan 2007 08:18:39 -0600, Adam Rich [EMAIL PROTECTED] wrote:
> Subpartitions are just a way to break (parent) partitions up into
> smaller pieces. Those of course can be moved to other disks just like
> the main partitions.

Ah, didn't know that (I just wondered why I'd need a subpartition to change the location of a partition). Thanks for your clarification.

Bernd
[PERFORM] unusual performance for vac following 8.2 upgrade
Hello all!

Running a vac on an 8.2 client connecting to an 8.2 database (the following example was for a 14-row static table) - vacuums will sit (for lack of a better word) for anywhere from ten to twenty minutes before taking a lock out on the table and doing work there. Once the vacuum does complete, I noticed that no timestamp is registered in pg_stat_all_tables for that relation for the last-vacuum'd timestamp (however analyze does seem to set its timestamp). I asked it to run a vacuum on an index (knowing it would fail out), and again, the vacuum sat for several minutes before finally erroring out saying that it couldn't vacuum an index. Out of curiosity I tried the vacuum on an 8.1 client connected to the 8.2 db: same delay.

In running a truss on the process while it is running, there is over five minutes where the process seems to be scanning pg_class (at least that's the only table listed in pg_locks for this process). Following this it drops into a cycle of doing the same send() command with several seconds' lag between each one, and every so often it catches the same interrupt (SIGUSR1) and then goes back into the same cycle of send() calls. Also, whatever it is doing during this stage, it isn't checking for cancel interrupts, as the process won't recognize it's been requested to cancel until it breaks out of this cycle of send()s and SIGUSR1s (which can go on for another several minutes). I'm happy to send along the gore of the truss call if you think it would be helpful...

Any ideas what the vac is prepping for that it could become bogged down in before finally taking the lock on the table? Is the lack of a timestamp set for last_vacuum in pg_stat_all_tables an indication that there may be something incomplete about our install? Since the upgrade, we've also seen unusual lag time in simple inserts into tables (atomic inserts have been seen running for several seconds), and also extreme delays in running \d on tables (I got tired of counting past 2 minutes; connecting with an 8.1 client gives immediate response on this command). We plan to upgrade to 8.2.1 as soon as possible, and also to drop into single-user mode and run a REINDEX SYSTEM, but any suggestions in the meantime as to a potential cause or a way to further debug the vacs would be greatly appreciated.

OS: Solaris 10
write transactions/hr: 1.5 million
size of pg_class: 535,226
number of relations: 108,694

Thanks to all,
Kim
Re: [PERFORM] unusual performance for vac following 8.2 upgrade
Kim wrote:
> [snip]
> OS: Solaris 10
> write transactions/hr: 1.5 million
> size of pg_class: 535,226
> number of relations: 108,694

That is a huge pg_class. I remember some discussion recently about problems with 8.2 and the way it scans pg_class. I also believe it's fixed in 8.2.1. Are you running that? If not, I suggest you upgrade and see if the fault still exists.

Regards

Russell Smith
[PERFORM] Improving SQL performance
Hi,

I know that the problem with the following SQL is the LOG.CODCEP = ENDE.CODCEP||CODLOG condition, but what can I do to improve the performance? Is there a type of index that could help, or is there another way to build this SQL?

Thank you in advance!

explain analyze
SELECT ENDE.*, DEND.DESEND, DEND.USOEND, DEND.DUPEND,
       to_char('F') as NOVO, LOG.TIPLOG
  FROM TT_END ENDE
       LEFT OUTER JOIN TD_END DEND ON DEND.CODTAB = ENDE.TIPEND
       LEFT OUTER JOIN TT_LOG LOG ON LOG.CODCEP = ENDE.CODCEP||CODLOG
 WHERE ENDE.FILCLI = '001'
   AND ENDE.CODCLI = ' 19475';

                               QUERY PLAN
--------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.00..25366.84 rows=1259 width=417) (actual time=1901.499..1901.529 rows=1 loops=1)
   Join Filter: (("inner".codcep)::text = (("outer".codcep)::text || ("outer".codlog)::text))
   ->  Nested Loop Left Join  (cost=0.00..4.91 rows=1 width=412) (actual time=0.117..0.144 rows=1 loops=1)
         Join Filter: ("inner".codtab = "outer".tipend)
         ->  Index Scan using pk_end on tt_end ende  (cost=0.00..3.87 rows=1 width=388) (actual time=0.066..0.078 rows=1 loops=1)
               Index Cond: ((filcli = '001'::bpchar) AND (codcli = ' 19475'::bpchar))
         ->  Seq Scan on td_end dend  (cost=0.00..1.02 rows=2 width=33) (actual time=0.012..0.018 rows=2 loops=1)
   ->  Seq Scan on tt_log log  (cost=0.00..12254.24 rows=582424 width=17) (actual time=0.013..582.521 rows=582424 loops=1)
 Total runtime: 1901.769 ms
(9 rows)

\d tt_log
            Table "TOTALL.tt_log"
 Column |         Type          | Modifiers
--------+-----------------------+-----------
 codbai | numeric(5,0)          | not null
 nomlog | character varying(55) | not null
 codcep | character(8)          | not null

\d tt_end
            Table "TOTALL.tt_end"
 Column |     Type     | Modifiers
--------+--------------+-----------
 ...
 codlog | character(3) |
 ...
 codcep | character(5) |
 ...

Reimer
Re: [PERFORM] unusual performance for vac following 8.2 upgrade
Kim [EMAIL PROTECTED] writes:
> Running a vac on an 8.2 client connecting to an 8.2 database (the
> following example was for a 14-row static table) - vacuums will sit
> (for lack of a better word) for anywhere from ten to twenty minutes
> before taking a lock out on the table and doing work there.

How big is this database (how many pg_class entries)? What do you get from VACUUM VERBOSE pg_class?

The truss results make it sound like the problem is pgstat_vacuum_tabstat() taking a long time, but that code has not changed since 8.1 ...

regards, tom lane
Re: [PERFORM] unusual performance for vac following 8.2 upgrade
Kim [EMAIL PROTECTED] writes:
> size of pg_class: 535,226
> number of relations: 108,694

Oh, I shoulda read all the way to the bottom of your email :-(. What version of PG were you running before? I would think that pretty much any version of pgstat_vacuum_tabstats would have had a performance issue with pg_class that large.

Also, could we see

  select relkind, count(*) from pg_class group by relkind;

regards, tom lane
Re: [PERFORM] Improving SQL performance
Carlos H. Reimer wrote:
> I know that the problem with the following SQL is the LOG.CODCEP =
> ENDE.CODCEP||CODLOG condition, but what can I do to improve the
> performance?

I wouldn't say it's the join condition. There is a nested loop join on 500k+ rows. Is it possible to put an index on LOG.CODCEP? That might give you a better plan, as you only have one row on the left side of the join, so an index scan would be preferable.

Regards

Russell Smith

> [snip - query, plan, and table definitions quoted above]
Re: [PERFORM] Improving SQL performance
Carlos H. Reimer [EMAIL PROTECTED] writes:
> I know that the problem with the following SQL is the LOG.CODCEP =
> ENDE.CODCEP||CODLOG condition, but what can I do to improve the
> performance?

Seems the problem is not using an index for tt_log. Do you have an index on tt_log.codcep? If so, maybe you need to cast the result of the concatenation to char(8) to get it to use the index.

regards, tom lane
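A sketch of how that rewrite might look, assuming the cast makes the comparison char(8) = char(8) so that an index on tt_log.codcep qualifies:

SELECT ENDE.*, DEND.DESEND, DEND.USOEND, DEND.DUPEND,
       to_char('F') as NOVO, LOG.TIPLOG
  FROM TT_END ENDE
       LEFT OUTER JOIN TD_END DEND ON DEND.CODTAB = ENDE.TIPEND
       LEFT OUTER JOIN TT_LOG LOG
         ON LOG.CODCEP = (ENDE.CODCEP || ENDE.CODLOG)::char(8)
 WHERE ENDE.FILCLI = '001'
   AND ENDE.CODCLI = ' 19475';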
Re: [PERFORM] unusual performance for vac following 8.2 upgrade
Hey Tom,

We were running on 8.1.1 previous to upgrading to 8.2, and yes, we definitely have a hefty pg_class. The inheritance model is heavily used in our schema (the results of the group by you wanted to see are down below). However, no significant problems were seen with vacs while we were on 8.1. Execution time for the vac seemed more linked to large table size and how active the table was with updates, rather than being universally over 10 minutes regardless of the vac's object. We will be doing an audit of our 8.2 install to try and make sure that it looks like a complete install; any tests you can think of that may further narrow things down for us?

 relkind | count
---------+--------
 v       |   1740
 t       |  49986
 c       |      4
 S       |     57
 r       | 108689
 i       | 374723
(6 rows)

Tom Lane wrote:
> [snip]
Re: [PERFORM] unusual performance for vac following 8.2 upgrade
Kim [EMAIL PROTECTED] writes:
> We were running on 8.1.1 previous to upgrading to 8.2, and yes, we
> definitely have a hefty pg_class. The inheritance model is heavily
> used in our schema ... However, no significant problems were seen
> with vacs while we were on 8.1.

Odd, because the 8.1 code looks about the same, and it is perfectly obvious in hindsight that its runtime is about O(N^2) in the number of relations :-(. At least that'd be the case if the stats collector output were fully populated. Did you have either stats_block_level or stats_row_level turned on in 8.1? If not, maybe the reason for the change is that in 8.2, that table *will* be pretty fully populated, because now it's got a last-vacuum-time entry that gets made even if the stats are otherwise turned off. Perhaps making that non-disablable wasn't such a hot idea :-(.

What I think we need to do about this is

(1) fix pgstat_vacuum_tabstats to have non-O(N^2) behavior; I'm thinking of using a hash table for the OIDs instead of a linear list. Should be a pretty small change; I'll work on it today.

(2) Reconsider whether last-vacuum-time should be sent to the collector unconditionally.

Comments from hackers?

regards, tom lane
Re: [HACKERS] [PERFORM] unusual performance for vac following 8.2 upgrade
Tom Lane wrote:
> What I think we need to do about this is
> (1) fix pgstat_vacuum_tabstats to have non-O(N^2) behavior; I'm
> thinking of using a hash table for the OIDs instead of a linear list.
> Should be a pretty small change; I'll work on it today.
> (2) Reconsider whether last-vacuum-time should be sent to the
> collector unconditionally.

(2) seems a perfectly reasonable answer, but ISTM (1) would be good to have anyway (at least in HEAD).

--
Alvaro Herrera    http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [PERFORM] unusual performance for vac following 8.2 upgrade
On Thu, 2007-01-11 at 14:45 -0500, Tom Lane wrote:
> What I think we need to do about this is (1) fix
> pgstat_vacuum_tabstats to have non-O(N^2) behavior ... (2) Reconsider
> whether last-vacuum-time should be sent to the collector
> unconditionally.

It's not clear to me how this fix will alter the INSERT issue Kim mentions. Are those issues connected? Or are you thinking that handling stats in a tight loop is slowing down other aspects of the system?

-- 
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] [PERFORM] table partitioning performance
On Thu, Jan 11, 2007 at 12:15:50PM +0000, Simon Riggs wrote:
> [snip]
> You can set up a client-side program to pre-qualify the data and feed
> it to multiple simultaneous COPY commands, as the best current way to
> handle this.

I'm wondering if you see any issues with COPYing into a partitioned table that's using triggers instead of rules to direct data to the appropriate tables?

BTW, I think improved COPY error handling would be great, and might perform better than triggers, once we have it...

--
Jim Nasby    [EMAIL PROTECTED]
EnterpriseDB http://enterprisedb.com    512.569.9461 (cell)
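A minimal sketch of the trigger approach mentioned above (hypothetical names; unlike rules, a row-level trigger does fire for rows loaded via COPY):

CREATE OR REPLACE FUNCTION my_table_route() RETURNS trigger AS $$
BEGIN
  IF NEW.created >= DATE '2007-01-01'
     AND NEW.created < DATE '2007-02-01' THEN
    INSERT INTO my_table_2007_01 VALUES (NEW.*);
  ELSE
    INSERT INTO my_table_default VALUES (NEW.*);
  END IF;
  RETURN NULL;  -- row goes to a child table, not the parent
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER my_table_route_trg
  BEFORE INSERT ON my_table
  FOR EACH ROW EXECUTE PROCEDURE my_table_route();

A COPY against my_table would then route each row through the trigger, at some per-row cost.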
Re: [HACKERS] [PERFORM] unusual performance for vac following 8.2 upgrade
Simon Riggs [EMAIL PROTECTED] writes:
> It's not clear to me how this fix will alter the INSERT issue Kim
> mentions.

I didn't say that it would; we have no information on the INSERT issue, so I'm just concentrating on the problem that he did provide info on. (BTW, I suppose the slow-\d issue is the regex planning problem we already knew about.)

I'm frankly not real surprised that there are performance issues with such a huge pg_class; it's not a regime that anyone's spent any time optimizing. It is interesting that 8.2 seems to have regressed, but I can think of several places that would've been bad before. One is that there are seqscans of pg_inherits ...

regards, tom lane
RES: [PERFORM] Improving SQL performance
Yes, I do have an index on tt_log.codcep.

The indexes I've got on both tables:

tt_end
Indexes:
    pk_end PRIMARY KEY, btree (filcli, codcli, codfil, numend)
    ak_end_numdoc UNIQUE, btree (numdoc)
    i_fk_end_darc btree (codarc, tiparc)
    i_fk_end_dend btree (tipend)
    i_fk_end_dfil btree (codfil)
    i_fk_end_dreg btree (regiao)
    i_fk_end_mun btree (codcid)

tt_log
Indexes:
    i_fk_log_bai btree (codbai)
    i_lc_log_codcep btree (codcep)

Any clue?

Thanks!

Reimer

-----Original Message-----
From: Tom Lane [mailto:[EMAIL PROTECTED]]
Sent: Thursday, January 11, 2007 16:31
To: [EMAIL PROTECTED]
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Improving SQL performance

> [snip]
Re: [PERFORM] unusual performance for vac following 8.2 upgrade
I wrote:
> (2) Reconsider whether last-vacuum-time should be sent to the
> collector unconditionally.

Actually, now that I look, the collector already contains this logic:

    /*
     * Don't create either the database or table entry if it doesn't already
     * exist.  This avoids bloating the stats with entries for stuff that is
     * only touched by vacuum and not by live operations.
     */

and ditto for analyze messages. So my idea that the addition of last-vac-time was causing an increase in the statistics file size compared to 8.1 seems wrong. How large is your $PGDATA/global/pgstat.stat file, anyway?

regards, tom lane
Re: [HACKERS] [PERFORM] unusual performance for vac following 8.2 upgrade
I wrote:
> What I think we need to do about this is (1) fix
> pgstat_vacuum_tabstats to have non-O(N^2) behavior; I'm thinking of
> using a hash table for the OIDs instead of a linear list. Should be a
> pretty small change; I'll work on it today.

I've applied the attached patch to 8.2 to do the above. Please give it a try and see how much it helps for you. Some limited testing here confirms a noticeable improvement in VACUUM startup time at 10000 tables, and of course it should be 100X worse with 100000 tables.

I am still confused why you didn't see the problem in 8.1, though. This code is just about exactly the same in 8.1. Maybe you changed your stats collector settings when moving to 8.2?

regards, tom lane

Index: pgstat.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/postmaster/pgstat.c,v
retrieving revision 1.140
diff -c -r1.140 pgstat.c
*** pgstat.c    21 Nov 2006 20:59:52 -0000    1.140
--- pgstat.c    11 Jan 2007 22:32:30 -0000
***************
*** 159,164 ****
--- 159,165 ----
  static void pgstat_read_statsfile(HTAB **dbhash, Oid onlydb);
  static void backend_read_statsfile(void);
  static void pgstat_read_current_status(void);
+ static HTAB *pgstat_collect_oids(Oid catalogid);

  static void pgstat_setheader(PgStat_MsgHdr *hdr, StatMsgType mtype);
  static void pgstat_send(void *msg, int len);
***************
*** 657,666 ****
  void
  pgstat_vacuum_tabstat(void)
  {
!     List       *oidlist;
!     Relation    rel;
!     HeapScanDesc scan;
!     HeapTuple   tup;
      PgStat_MsgTabpurge msg;
      HASH_SEQ_STATUS hstat;
      PgStat_StatDBEntry *dbentry;
--- 658,664 ----
  void
  pgstat_vacuum_tabstat(void)
  {
!     HTAB       *htab;
      PgStat_MsgTabpurge msg;
      HASH_SEQ_STATUS hstat;
      PgStat_StatDBEntry *dbentry;
***************
*** 679,693 ****
      /*
       * Read pg_database and make a list of OIDs of all existing databases
       */
!     oidlist = NIL;
!     rel = heap_open(DatabaseRelationId, AccessShareLock);
!     scan = heap_beginscan(rel, SnapshotNow, 0, NULL);
!     while ((tup = heap_getnext(scan, ForwardScanDirection)) != NULL)
!     {
!         oidlist = lappend_oid(oidlist, HeapTupleGetOid(tup));
!     }
!     heap_endscan(scan);
!     heap_close(rel, AccessShareLock);

      /*
       * Search the database hash table for dead databases and tell the
--- 677,683 ----
      /*
       * Read pg_database and make a list of OIDs of all existing databases
       */
!     htab = pgstat_collect_oids(DatabaseRelationId);

      /*
       * Search the database hash table for dead databases and tell the
***************
*** 698,709 ****
      {
          Oid         dbid = dbentry->databaseid;

!         if (!list_member_oid(oidlist, dbid))
              pgstat_drop_database(dbid);
      }

      /* Clean up */
!     list_free(oidlist);

      /*
       * Lookup our own database entry; if not found, nothing more to do.
--- 688,701 ----
      {
          Oid         dbid = dbentry->databaseid;

!         CHECK_FOR_INTERRUPTS();
!
!         if (hash_search(htab, (void *) &dbid, HASH_FIND, NULL) == NULL)
              pgstat_drop_database(dbid);
      }

      /* Clean up */
!     hash_destroy(htab);

      /*
       * Lookup our own database entry; if not found, nothing more to do.
***************
*** 717,731 ****
      /*
       * Similarly to above, make a list of all known relations in this DB.
       */
!     oidlist = NIL;
!     rel = heap_open(RelationRelationId, AccessShareLock);
!     scan = heap_beginscan(rel, SnapshotNow, 0, NULL);
!     while ((tup = heap_getnext(scan, ForwardScanDirection)) != NULL)
!     {
!         oidlist = lappend_oid(oidlist, HeapTupleGetOid(tup));
!     }
!     heap_endscan(scan);
!     heap_close(rel, AccessShareLock);

      /*
       * Initialize our messages table counter to zero
--- 709,715 ----
      /*
       * Similarly to above, make a list of all known relations in this DB.
       */
!     htab = pgstat_collect_oids(RelationRelationId);

      /*
       * Initialize our messages table counter to zero
***************
*** 738,750 ****
      hash_seq_init(&hstat, &dbentry->tables);
      while ((tabentry = (PgStat_StatTabEntry *) hash_seq_search(&hstat)) != NULL)
      {
!         if (list_member_oid(oidlist, tabentry->tableid))
              continue;

          /*
           * Not there, so add this table's Oid to the message
           */
!         msg.m_tableid[msg.m_nentries++] = tabentry->tableid;

          /*
           * If the message is full, send it out and reinitialize to empty
--- 722,738 ----
      hash_seq_init(&hstat, &dbentry->tables);
Re: [PERFORM] unusual performance for vac following 8.2 upgrade
Our pgstat.stat file is 40M for 8.2; on 8.1 it was 33M. Our schema size hasn't grown *that* much in the two weeks since we upgraded.

I'm not sure if this sheds any more light on the situation, but in scanning down through the process output from truss, it looks like the first section of output was a large chunk of reads on pgstat.stat, followed by a larger chunk of reads on the global directory and directories under base - this whole section probably went on for a good 6-7 minutes, though I would say the reads on pgstat likely finished within a couple of minutes or so. Following this there was a phase where it did a lot of seeks and reads on files under pg_clog, and it was while doing this (or perhaps it had finished whatever it wanted with the clogs) that it dropped into the send()/SIGUSR1 loop that goes on for another several minutes.

Kim

Tom Lane wrote:
> [snip]
> How large is your $PGDATA/global/pgstat.stat file, anyway?
Re: [HACKERS] [PERFORM] unusual performance for vac following 8.2 upgrade
On Thu, 2007-01-11 at 16:11 -0500, Tom Lane wrote:
> Simon Riggs [EMAIL PROTECTED] writes:
> > It's not clear to me how this fix will alter the INSERT issue Kim
> > mentions.
>
> I didn't say that it would; we have no information on the INSERT
> issue, so I'm just concentrating on the problem that he did provide
> info on.

OK.

> I'm frankly not real surprised that there are performance issues with
> such a huge pg_class; it's not a regime that anyone's spent any time
> optimizing.

Yeh, I saw a pg_class that big once, but it just needed a VACUUM.

Temp relations still make pg_class entries, don't they? Is that on the TODO list to change?

-- 
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com
Re: RES: [PERFORM] Improving SQL performance
Hi, Carlos,

Wouldn't it be better if you used INT for 'codcep' in both tables (as CEP/ZIP numbers are [0-9]{8})? Casting as Tom Lane suggested is also a good alternative, yet I think it'd be much better if you used int in both columns.

Regards,
Cesar

Let's see the query:

SELECT ENDE.*, DEND.DESEND, DEND.USOEND, DEND.DUPEND,
       to_char('F') as NOVO, LOG.TIPLOG
  FROM TT_END ENDE
       LEFT OUTER JOIN TD_END DEND ON DEND.CODTAB = ENDE.TIPEND
       LEFT OUTER JOIN TT_LOG LOG ON LOG.CODCEP = ENDE.CODCEP||CODLOG
 WHERE ENDE.FILCLI = '001'
   AND ENDE.CODCLI = ' 19475';

Carlos H. Reimer wrote:
> Yes, I do have an index on tt_log.codcep.
> [snip]
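A sketch of what that conversion might look like, assuming all stored values really are purely numeric; since tt_end keeps the code split across codcep (5 digits) and codlog (3 digits), its join key becomes arithmetic rather than concatenation:

ALTER TABLE tt_log ALTER COLUMN codcep TYPE integer
  USING to_number(codcep, '99999999')::integer;
ALTER TABLE tt_end ALTER COLUMN codcep TYPE integer
  USING to_number(codcep, '99999')::integer;
ALTER TABLE tt_end ALTER COLUMN codlog TYPE integer
  USING to_number(codlog, '999')::integer;

-- The join condition then compares integers and can use the index:
--   ON LOG.CODCEP = ENDE.CODCEP * 1000 + ENDE.CODLOG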
Re: [HACKERS] [PERFORM] unusual performance for vac following 8.2 upgrade
Simon Riggs wrote:
> Temp relations still make pg_class entries, don't they? Is that on the
> TODO list to change?

Yeah, and pg_attribute entries as well, which may be more problematic because there are a lot of them. Did we get rid of pg_attribute entries for system attributes already?

Can we actually get rid of pg_class entries for temp tables? Maybe by creating a temp pg_class which would be local to each session? Heck, it doesn't even have to be an actual table -- it just needs to be somewhere from which we can load entries into the relcache.

--
Alvaro Herrera    http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] [PERFORM] unusual performance for vac following 8.2 upgrade
Alvaro Herrera [EMAIL PROTECTED] writes:
> Can we actually get rid of pg_class entries for temp tables? Maybe by
> creating a temp pg_class which would be local to each session? Heck,
> it doesn't even have to be an actual table -- it just needs to be
> somewhere from which we can load entries into the relcache.

A few things to think about:

1. You'll break a whole lotta client-side code if temp tables disappear from pg_class. This is probably solvable --- one thought is to give pg_class an inheritance child that is a view on a SRF that reads out the stored-in-memory rows for temp pg_class entries. Likewise for pg_attribute and everything else related to a table definition.

2. How do you keep the OIDs for temp tables (and their associated rowtypes) from conflicting with OIDs for real tables? Given the way that OID generation works, there wouldn't be any real problem unless a temp table survived for as long as it takes the OID counter to wrap all the way around --- but in a database that has WITH OIDS user tables, that might not be impossibly long ...

3. What about dependencies on user-defined types, functions, etc? How will you get things to behave sanely if one backend tries to drop a type that some other backend is using in a column of a temp table? Even if you put entries into pg_depend, which would kind of defeat the point of not having on-disk catalog entries for temp tables, I don't see how the other backend figures out what the referencing object is.

I don't really see any solution to that last point :-(

regards, tom lane