[PERFORM] insert
Hi, is there anything I can do to speed up inserts? One of my tables gets about 100 new rows every five minutes, and somehow the inserts tend to take more and more time. Any suggestions welcome.

TIA
Ulrich

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] insert
As I see it, it's 100 inserts every 5 minutes, not only 100 inserts total. Sure, it's extreme for only 100 inserts.

Cheers,
Guido

> G u i d o  B a r o s i o [EMAIL PROTECTED] wrote:
> > [speeding up 100 inserts every 5 minutes]
> > Tips! * Delete indexes and recreate them after the insert.
>
> sounds a bit extreme, for only 100 inserts
>
> gnari
Re: [PERFORM] insert
From: G u i d o  B a r o s i o [EMAIL PROTECTED]:

> As I see it's 100 inserts every 5 minutes, not only 100 inserts. Sure it's extreme for only 100 inserts.

I am sorry, I do not quite grasp what you are saying. My understanding was that there are constantly new inserts, coming in bursts of 100, every 5 minutes. I imagined that the indexes were needed in between.

If this is the case, the bunches of 100 inserts should be done inside a transaction (or by one COPY statement).

If, on the other hand, the inserts happen independently, at a rate of 100 inserts / 5 minutes, then this will not help.

gnari
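To illustrate gnari's suggestion, a minimal sketch (hypothetical table and column names — not from the original poster's schema):

```sql
-- One transaction per burst: a single WAL flush at COMMIT
-- instead of one per row.
BEGIN;
INSERT INTO log_table (ts, value) VALUES (now(), 1);
INSERT INTO log_table (ts, value) VALUES (now(), 2);
-- ... remaining rows of the burst ...
COMMIT;

-- Or a single COPY, which is faster still for bulk loads:
COPY log_table (ts, value) FROM STDIN;
```

Either form avoids paying the per-commit disk sync once per row.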
Re: [PERFORM] insert
On 13/08/2004 13:10 Ulrich Wisser wrote:

> Hi, my inserts are done in one transaction, but due to some foreign key constraints and five indexes, sometimes the 100 inserts will take more than 5 minutes.

Two possibilities come to mind:

a) you need an index on the referenced FK field
b) you have an index but a type mismatch (e.g., an int4 field referencing an int8 field)

Either of these will cause a sequential table scan and poor performance.

--
Paul Thomas
Thomas Micro Systems Limited | Software Solutions for Business
Computer Consultants | http://www.thomas-micro-systems-ltd.co.uk
Re: [PERFORM] insert
On Fri, 2004-08-13 at 08:10, Ulrich Wisser wrote:

> Hi, my inserts are done in one transaction, but due to some foreign key constraints and five indexes, sometimes the 100 inserts will take more than 5 minutes.

It is likely that you are missing an index on one of those foreign key'd items. Do an

    EXPLAIN ANALYZE SELECT * FROM foreign_table WHERE foreign_col = 'insert value';

Fix them until they're quick.
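A sketch of what that diagnosis looks like in practice (hypothetical table and index names):

```sql
-- Mimic the lookup the FK trigger performs on each insert.
-- If the plan shows "Seq Scan" rather than "Index Scan", every insert
-- is scanning the whole referenced table.
EXPLAIN ANALYZE SELECT * FROM parent_table WHERE parent_id = 12345;

-- Remedy: a unique index on the referenced column (the FK target must
-- be unique anyway), with the same type on both sides of the FK.
CREATE UNIQUE INDEX parent_table_id_idx ON parent_table (parent_id);
```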
[PERFORM] Weird Database Performance problem!
Hi,

We are having a performance problem with our database. The problem exists when we include a constraint in GCTBALLOT. The constraint is as follows:

    alter table GCTBALLOT
       add constraint FK_GCTBALLOT_GCTWEBU foreign key (GCTWEBU_SRL)
       references GCTWEBU (SRL)
       on delete restrict on update restrict;

The two tables that we insert into are the following:

GCTBALLOT:

                            Table "cbcca.gctballot"
          Column      |            Type             | Modifiers
    ------------------+-----------------------------+-----------------------------------------------------------
     srl              | integer                     | not null default nextval('cbcca.gctballot_srl_seq'::text)
     gctbwindow_srl   | numeric(12,0)               | not null
     gctcandidate_srl | numeric(12,0)               | not null
     gctwebu_srl      | numeric(12,0)               |
     gctphoneu_srl    | numeric(12,0)               |
     ballot_time      | timestamp without time zone | not null
     ip_addr          | character varying(15)       |
    Indexes:
        pk_gctballot primary key, btree (srl)
        i1_gctballot_webusrl btree (gctwebu_srl)
    Foreign-key constraints:
        fk_gctbwindow_gctballot FOREIGN KEY (gctbwindow_srl) REFERENCES gctbwindow(srl) ON UPDATE RESTRICT ON DELETE RESTRICT
        fk_gctcandidate_gctballot FOREIGN KEY (gctcandidate_srl) REFERENCES gctcandidate(srl) ON UPDATE RESTRICT ON DELETE RESTRICT
        fk_gctphoneu_gctballot FOREIGN KEY (gctphoneu_srl) REFERENCES gctphoneu(srl) ON UPDATE RESTRICT ON DELETE RESTRICT
    with the extra constraint:
        fk_gctballot_gctwebu FOREIGN KEY (gctwebu_srl) REFERENCES gctwebu(srl) ON UPDATE RESTRICT ON DELETE RESTRICT

GCTWEBU:

                            Table "cbcca.gctwebu"
          Column      |            Type             | Modifiers
    ------------------+-----------------------------+-----------------------------------------------------------
     srl              | integer                     | not null default nextval('cbcca.gctwebu_srl_seq'::text)
     gctlocation_srl  | numeric(12,0)               | not null
     gctagerange_srl  | numeric(12,0)               | not null
     email            | character varying(255)      | not null
     uhash            | character varying(255)      | not null
     sex              | character varying(1)        | not null
     created_time     | timestamp without time zone | not null
    Indexes:
        pk_gctwebu primary key, btree (srl)
        i1_gctwebu_email unique, btree (email)
    Foreign-key constraints:
        fk_gctagerang_gctwebu FOREIGN KEY (gctagerange_srl) REFERENCES gctagerange(srl) ON UPDATE RESTRICT ON DELETE RESTRICT
        fk_gctwebu_gctlocation FOREIGN KEY (gctlocation_srl) REFERENCES gctlocation(srl) ON UPDATE RESTRICT ON DELETE RESTRICT

To begin, GCTBALLOT has 6122546 rows and GCTWEBU has 231444 rows.

Now when we try to insert 100 entries into GCTBALLOT with the extra constraint, it takes: 37981 milliseconds.

Also, when we try to insert 100 entries into GCTBALLOT with the extra constraint, but insert 'null' into the column gctwebu_srl, it takes: 286 milliseconds.

However, when we try to insert 100 entries into GCTBALLOT without the extra constraint (no foreign key between GCTBALLOT and GCTWEBU), it takes: 471 milliseconds.

In summary, inserting into GCTBALLOT without the constraint, or inserting null for gctwebu_srl, gives us good performance. However, inserting into GCTBALLOT with the constraint and valid gctwebu_srl values gives us poor performance.

Also, the insert we use is as follows:

    INSERT INTO GCTBALLOT (gctbwindow_srl, gctcandidate_srl, gctwebu_srl,
                           gctphoneu_srl, ballot_time, ip_addr)
    VALUES (CBCCA.gcf_getlocation(?), ?, CBCCA.gcf_validvoter(?,?), null, ?, ?);

NOTE: gcf_validvoter finds the 'gctwebu_srl' value:

    CREATE OR REPLACE FUNCTION gcf_validvoter (VARCHAR, VARCHAR)
    RETURNS NUMERIC AS '
    DECLARE
       arg1 ALIAS FOR $1;
       arg2 ALIAS FOR $2;
       return_val NUMERIC;
    BEGIN
       SELECT SRL INTO return_val
         FROM gctwebu
        WHERE EMAIL = arg1
          AND UHASH = arg2;
       RETURN return_val;
    END;
    ' LANGUAGE plpgsql;

where the question marks are filled in with values in our Java code.

We are puzzled as to why there is this difference in performance when inserting, because we believe that we have indexed all columns used by this constraint. We realize that inserting 'null' into GCTBALLOT doesn't use this constraint, because no look-up is necessary, so that gives good performance. Why is it that when we use this constraint the performance is affected so much?

Thanks

P.S. We even added an index on the 'gctwebu_srl' column and did

    1- ANALYZE on all tables
    2- analyze GCTBALLOT(gctwebu_srl);

but still have the same problem!
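[Editor's note: in the schema above, gctballot.gctwebu_srl is numeric(12,0) while the referenced gctwebu.srl is integer — the datatype mismatch that later replies in this thread diagnose. A sketch of one way to align the two types (try on a copy first; rewriting a 6-million-row table and revalidating the FK takes a while):

```sql
-- Drop the FK, retype the referencing column, re-add the FK.
ALTER TABLE gctballot DROP CONSTRAINT fk_gctballot_gctwebu;
-- ALTER COLUMN ... TYPE is 8.0 syntax; on 7.4 this requires the
-- add-column / UPDATE / drop / rename dance instead.
ALTER TABLE gctballot ALTER COLUMN gctwebu_srl TYPE integer;
ALTER TABLE gctballot
  ADD CONSTRAINT fk_gctballot_gctwebu FOREIGN KEY (gctwebu_srl)
  REFERENCES gctwebu (srl) ON DELETE RESTRICT ON UPDATE RESTRICT;
```

With matching types, the per-row FK lookup can use gctwebu's primary key index.]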
Re: [PERFORM] insert
On Fri, Aug 13, 2004 at 08:57:56 -0400, Rod Taylor [EMAIL PROTECTED] wrote:

> On Fri, 2004-08-13 at 08:10, Ulrich Wisser wrote:
> > Hi, my inserts are done in one transaction, but due to some foreign key constraints and five indexes, sometimes the 100 inserts will take more than 5 minutes.
>
> It is likely that you are missing an index on one of those foreign key'd items.

I don't think that is too likely, as a foreign key reference must be to a unique key, which would have an index. I think the type mismatch suggestion is probably what the problem is. The current solution is to make the types match. In 8.0.0 it would probably work efficiently as is, though it isn't normal for foreign keys to have a type mismatch and he may want to change that anyway.
Re: [PERFORM] Help specifying new machine
William Yu wrote:

> Rory Campbell-Lange wrote:
> > The present server is a 2GHz Pentium 4/512KB cache with 2 software-RAIDed IDE disks (Maxtors) and 1GB of RAM. I have been offered the following 1U server which I can just about afford:
> >
> >     1U server:
> >     Intel Xeon 2.8GHz 512K cache       x 1
> >     512MB PC2100 DDR ECC Registered    x 2
> >     80Gb SATA HDD                      x 4
> >     4 port SATA card, 3ware 8506-4     x 1
> >     3 year next-day hardware warranty  x 1
>
> You're not getting much of a bump with this server. The CPU is incrementally faster -- in the absolutely best case scenario where your queries are 100% CPU-bound, that's about ~25%-30% faster.

What about using a dual Athlon MP instead of a Xeon? It would be much less expensive, but have higher performance (I think).

> If you could use that money instead to upgrade your current server, you'd get a much bigger impact. Go for more memory and SCSI (RAID controllers w/ battery-backed cache).
Re: [PERFORM] [HACKERS] fsync vs open_sync
> Guys, just so you know: OSDL did some testing and found Ext3 to be perhaps the worst FS for PostgreSQL -- although this testing was with the default options. Ext3 involved an almost 40% write performance penalty compared with Ext2, whereas the penalty for ReiserFS and JFS was less than 10%. This concurs with my personal experience.

I had exactly the same experience.
Re: [PERFORM] fsync vs open_sync
> OSDL did some testing and found Ext3 to be perhaps the worst FS for PostgreSQL -- although this testing was with the default options. Ext3 involved an almost 40% write performance penalty compared with Ext2, whereas the penalty for ReiserFS and JFS was less than 10%. This concurs with my personal experience.

I'm really curious to see if you guys have compared insert performance results between 7.4 and 8.0. As you probably know, the system sync() call was replaced with a looping fsync on open file handles. This may have some interesting interactions with the WAL sync method.

What caught my attention initially was the 300+/sec insert performance. On 8.0/NTFS/fsync=on, I can't break 100/sec on a 10k rpm ATA disk. My hardware seems to be more or less in the same league, so I was naturally curious whether this was an NT/Unix issue, a 7.4/8.0 issue, or a combination of both.

The system on which I can get 300 inserts per second is a battery-backed-up Xeon system with 512M RAM, a Promise PDC DMA ATA card, and some fast disks with write caching enabled. (We are not worried about write caching because we have a UPS. Since all non-redundant systems are evaluated on probability of error, we decided that the probability of power failure plus UPS failure was sufficiently more rare than a system crash with file system corruption or a hard disk failure.)

A 5ms seek time disk would be limited to 200 transaction commits/sec if each transaction commit has at least 1 seek. Are there some circumstances where a transaction commit does not generate a physical seek?

Maybe ext3 is not the worst filesystem after all!

Merlin
Re: [PERFORM] insert
On Fri, Aug 13, 2004 at 17:17:10 +0100, Matt Clark [EMAIL PROTECTED] wrote:

> > > It is likely that you are missing an index on one of those foreign key'd items.
> >
> > I don't think that is too likely, as a foreign key reference must be a unique key which would have an index.
>
> I think you must be thinking of primary keys, not foreign keys. All one-to-many relationships have non-unique foreign keys.

The target of the reference needs to have at least a unique index. I am not sure if it needs to actually be declared as either a unique or primary key, though that is the intention. The records doing the referencing don't need to be (and normally aren't) unique.
Re: [PERFORM] insert
> > It is likely that you are missing an index on one of those foreign key'd items.
>
> I don't think that is too likely, as a foreign key reference must be a unique key which would have an index.

I think you must be thinking of primary keys, not foreign keys. All one-to-many relationships have non-unique foreign keys.
[PERFORM] Reiser4
ReiserFS 4 is (will be) a filesystem that implements transactions. Are there any plans for a future PostgreSQL version to support a special fsync method for Reiser4, which would use the filesystem's transaction engine instead of an old kludge like fsync(), with a possibility of vastly enhanced performance?

Is there also a possibility to tell Postgres: "I don't care if I lose 30 seconds of transactions on this table if the power goes out; I just want to be sure it's still ACID et al. compliant, but you can fsync less often and thus be faster" (with a possibility of setting that on a per-table basis)?

Thanks.
Re: [PERFORM] insert
Bruno Wolff III [EMAIL PROTECTED] writes:

> > Rod Taylor [EMAIL PROTECTED] wrote:
> > > It is likely that you are missing an index on one of those foreign key'd items.
> >
> > I don't think that is too likely, as a foreign key reference must be a unique key which would have an index.
>
> I think the type mismatch suggestion is probably what the problem is.

I agree. It is possible to have a lack-of-index problem on the referencing column (as opposed to the referenced column), but that normally only hurts you for deletes from the referenced table.

> The current solution is to make the types match. In 8.0.0 it would probably work efficiently as is, though it isn't normal for foreign keys to have a type mismatch and he may want to change that anyway.

8.0 will not fix this particular issue, as I did not add any numeric-vs-int comparison operators. If we see a lot of complaints we could think about adding such, but for 8.0 only the more common cases such as int-vs-bigint are covered.

			regards, tom lane
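A quick way to see the numeric-vs-int effect Tom describes is to mimic the lookup the FK trigger performs against the thread's gctwebu table (a sketch; the seq-scan vs. index-scan outcome depends on the server version):

```sql
-- A numeric value compared against an integer column cannot use the
-- integer index here, so the planner falls back to a sequential scan:
EXPLAIN SELECT 1 FROM gctwebu WHERE srl = 42::numeric(12,0);

-- The same lookup with matching types can use the primary key index:
EXPLAIN SELECT 1 FROM gctwebu WHERE srl = 42;
```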
Re: [PERFORM] Reiser4
Pierre,

> Are there any plans in a future Postgresql version to support a special fsync method for Reiser4 which will use the filesystem's transaction engine, instead of an old kludge like fsync(), with a possibility of vastly enhanced performance?

I don't know of any such in progress right now. Why don't you start it? It would have to be an add-in, since we support 28 operating systems and Reiser is AFAIK Linux-only, but it sounds like an interesting experiment.

> Is there also a possibility to tell Postgres: I don't care if I lose 30 seconds of transactions on this table if the power goes out, I just want to be sure it's still ACID et al. compliant, but you can fsync less often and thus be faster (with a possibility of setting that on a per-table basis)?

Not per-table, no, but otherwise take a look at the Background Writer feature of 8.0.

--
-Josh Berkus
 A developer of Very Little Brain
 Aglio Database Solutions
 San Francisco
Re: [PERFORM] Performance Problem With Postgresql!
Arash,

> We are having a performance problem with our database. The problem exists when we include a constraint in GCTBALLOT. The constraint is as follows:

You posted twice, to three different mailing lists each time. This is discourteous. Please do not do so again, as people may not help you if they feel you are being rude.

Richard H has posted the solution to your problem.

--
Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] fsync vs open_sync
> There is also the fact that NTFS is a very slow filesystem, and Linux is a lot better than Windows for everything disk, caching and I/O related. Try to copy some files in NTFS and in ReiserFS...

I'm not so sure I would agree with such a blanket generalization. I find NTFS to be very fast; my main complaint is fragmentation issues. I bet NTFS is better than ext3 at most things (I do agree with you about the cache, though). I think in a very general sense the open source stuff is higher quality, but Microsoft benefits from a very tight vertical integration of the system. They added ReadFileScatter and WriteFileScatter to the Win32 API specifically to make SQL Server run faster, and SQL Server is indeed very, very good at I/O. SQL Server keeps a one-file database with blocks collected and written asynchronously. It's a very tight system because they have control over every layer of the system. Know your enemy.

That said, I think transaction-based file I/O is 'the way', and if implemented on Reiser4 it could be faster than the I/O methodology offered on Windows/NTFS.

Merlin
Re: [PERFORM] Weird Database Performance problem!
Arash Zaryoun wrote:

> Hi Richard, Thanks for your prompt reply. It fixed the problem. Just one more question: Do I need to create an index for FKs?

You don't _need_ to, but on the referring side (e.g. table GCTBALLOT in your example) PostgreSQL won't create one automatically. Of course, the primary-key side will already have an index being used as part of the constraint.

I've cc:ed the list on this, as the question pops up quite commonly.

--
Richard Huxton
Archonet Ltd
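A minimal sketch of the distinction Richard describes, with hypothetical table names:

```sql
-- Referenced side: PRIMARY KEY (or UNIQUE) creates a unique index
-- automatically, and the FK check uses it on every insert.
CREATE TABLE account (id integer PRIMARY KEY);

-- Referring side: declaring the FK creates NO index on account_id.
CREATE TABLE payment (
    account_id integer REFERENCES account (id)
);

-- Optional, but it speeds up DELETEs/UPDATEs on account (which must
-- check payment for referencing rows) and joins between the two:
CREATE INDEX payment_account_idx ON payment (account_id);
```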
[PERFORM] REINDEX needed because of index corruption need help ASAP
Here are two runs of the same search using a GiST index. The first run uses an index which was not properly created, although its creation did not throw an error:

    aers=# select count(*) FROM items_2004_07_29 as items
           WHERE true AND index @@ to_tsquery('default', '( audiovox)');
     count
    -------
         4
    (1 row)

    aers=# reindex index idx_title_2004_07_29;
    REINDEX

    aers=# select count(*) FROM items_2004_07_29 as items
           WHERE true AND index @@ to_tsquery('default', '( audiovox)');
     count
    -------
      2342
    (1 row)

After doing a few searches I noticed the numbers were way off, so I reindexed. The problem was fixed and the search works properly.

Is there a problem with indexing? Do you know of this problem? Is there a way to fix it?

We are getting this error for other indexes, including primary key indexes, where we get "OID page not found" errors and the index corruption as posted above. Sometimes when the index is broken it will cause runaway searches which eat up all the memory, and the server needs to be restarted.

Anyway, this all looks to be an index problem; VACUUM/ANALYZE does not fix it, only reindexing does.

Thank you,
Aaron
Re: [PERFORM] REINDEX needed because of index corruption need help ASAP
borajetta [EMAIL PROTECTED] writes:

> We are getting this error for other indexes, including primary key indexes, where we get "OID page not found" errors and the index corruption as posted above. Sometimes when the index is broken it will cause runaway searches which eat up all the memory and the server needs to be restarted.

It sounds to me like you have got hardware problems. Get out your memory and disk tests ...

			regards, tom lane
Re: [PERFORM] REINDEX needed because of index corruption need help ASAP
Aaron,

> We are getting this error for other indexes, including primary key indexes, where we get "OID page not found" errors and the index corruption as posted above. Sometimes when the index is broken it will cause runaway searches which eat up all the memory and the server needs to be restarted.

What version of PostgreSQL are you using? Do you have frequent power-outs on the machine? Have you tested for a bad hard drive, controller, or memory?

Corruption of any kind on PostgreSQL is not normal. It's usually indicative of a hardware problem.

--
-Josh Berkus
 A developer of Very Little Brain
 Aglio Database Solutions
 San Francisco
Re: [PERFORM] Reiser4
Pierre-Frédéric Caillaud wrote:

> Is there also a possibility to tell Postgres: I don't care if I lose 30 seconds of transactions on this table if the power goes out, I just want to be sure it's still ACID et al. compliant, but you can fsync less often and thus be faster (with a possibility of setting that on a per-table basis)?

I have been thinking about this. Informix calls it "buffered logging" and it would be a good feature. Added to TODO:

    * Allow buffered WAL writes and fsync

      Instead of guaranteeing recovery of all committed transactions, this
      would provide improved performance by delaying WAL writes and fsync
      so an abrupt operating system restart might lose a few seconds of
      committed transactions but still be consistent. We could perhaps
      remove the 'fsync' parameter (which results in an inconsistent
      database) in favor of this capability.

--
  Bruce Momjian                  |  http://candle.pha.pa.us
  [EMAIL PROTECTED]              |  (610) 359-1001
  + If your life is a hard drive,|  13 Roberts Road
  + Christ can be your backup.   |  Newtown Square, Pennsylvania 19073
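[Editor's note: for context, a sketch of the related postgresql.conf settings that existed at the time; the buffered-WAL mode proposed in the TODO item above is not among them:

```
# Setting fsync = false trades away crash consistency entirely, which is
# exactly what the proposed buffered-WAL mode would avoid.
fsync = true
wal_sync_method = fsync   # alternatives: fdatasync, open_sync, open_datasync
wal_buffers = 8           # WAL pages buffered in shared memory before flush
```
]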
Re: [PERFORM] Reiser4
Josh Berkus wrote:

> Pierre,
>
> > Are there any plans in a future Postgresql version to support a special fsync method for Reiser4 which will use the filesystem's transaction engine, instead of an old kludge like fsync(), with a possibility of vastly enhanced performance?
>
> I don't know of any such in progress right now. Why don't you start it? It would have to be an add-in, since we support 28 operating systems and Reiser is AFAIK Linux-only, but it sounds like an interesting experiment.
>
> > Is there also a possibility to tell Postgres: I don't care if I lose 30 seconds of transactions on this table if the power goes out, I just want to be sure it's still ACID et al. compliant, but you can fsync less often and thus be faster (with a possibility of setting that on a per-table basis)?
>
> Not per-table, no, but otherwise take a look at the Background Writer feature of 8.0.

Actually, the fsync of WAL is the big performance issue here. I added a TODO item about it.

--
  Bruce Momjian                  |  http://candle.pha.pa.us
  [EMAIL PROTECTED]              |  (610) 359-1001
  + If your life is a hard drive,|  13 Roberts Road
  + Christ can be your backup.   |  Newtown Square, Pennsylvania 19073
Re: [PERFORM] Replication: Slony-I vs. Mammoth Replicator vs. ?
Chris Cheston wrote:

> Hi all, I'm trying to implement a highly-scalable, high-performance, real-time database replication system to back up my Postgres database as data gets written. So far, Mammoth Replicator is looking pretty good but it costs $1000+.

Yes, but it includes 30 days of support and 12 months of upgrades/updates :)

> Has anyone tried Slony-I and other replication systems? Slony-I is pretty new, so I'm a little unsure if it's ready for a prime-time commercial system yet.

It really depends on your needs. They are both good systems. Slony-I is a bit more of a beast to get up and running, and it is a batch replication system that uses triggers. Once it is up and running it works well, though.

Mammoth Replicator is easy to set up and is integrated into PostgreSQL. However, Replicator is $1000+ and doesn't support promoting slaves automatically (you can do it by hand) like Slony does. Replicator is also live replication.

Sincerely,

Joshua D. Drake

> So... wanted to put this out to the experts. Has anyone got any recommendations or had experiences with real-time database replication solutions that don't rely on RAID? The reason why I don't want to rely on a hardware solution is because we are renting dedicated servers and we don't have access to the boxes, only to software that gets installed on the boxes.
>
> Thanks, Chris

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
PostgreSQL support, programming, shared hosting and dedicated hosting.
+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL