Re: [HACKERS] insert performance for win32
Tom Lane wrote:
> "Merlin Moncure" <[EMAIL PROTECTED]> writes:
> > That works, I tried ctrl-c various ways including from within \i copy.
> > Problem solved!
>
> Good. I've applied the patch in both HEAD and 8.0 branches.
>
> Since we're very nearly ready to wrap 8.1, would someone with access to
> a Windows machine please double-check that CVS tip still works?

Worked for me. See buildfarm. Or are there more tests you want run?

cheers

andrew

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your message
can get through to the mailing list cleanly
Re: [HACKERS] insert performance for win32
"Merlin Moncure" <[EMAIL PROTECTED]> writes:
> That works, I tried ctrl-c various ways including from within \i copy.
> Problem solved!

Good. I've applied the patch in both HEAD and 8.0 branches.

Since we're very nearly ready to wrap 8.1, would someone with access to
a Windows machine please double-check that CVS tip still works?

regards, tom lane
Re: [HACKERS] insert performance for win32
> void
> setup_cancel_handler(void)
> {
> +	static bool done = false;
> +
> +	if (!done)
> 		SetConsoleCtrlHandler(consoleHandler, TRUE);
> +	done = true;
> }

That works, I tried ctrl-c various ways including from within \i copy.
Problem solved!

Merlin
Re: [HACKERS] Insert Performance
Zeugswetter Andreas SB SD <[EMAIL PROTECTED]> wrote:
> > What could you recommend? Locking the table and selecting
> > max(invoice_id) wouldn't really be much faster, with max(invoice_id)
> > not using an index...
>
> select invoice_id from table order by invoice_id desc limit 1;
>
> should get you the maximum fast if you have a unique index on invoice_id.
>
> Andreas

I've figured that out after reading the TODO about max()/min() using
indexes. Thank you anyway!

The second problem I had was that I have invoices here that have not
been sent into accounting. An actual invoice_id is something like
210309 at the moment. So I used invoice_ids > 3000 for "pre"
invoice_ids. Having many of those "pre" invoices makes
select ... desc limit 1 too slow.

I figured out that I can use a partial index as a solution:

CREATE INDEX idx_real_invoice_id ON invoice (invoice_id)
    WHERE invoice_id < 3;

Now it works great. I have a function getNextInvoiceId():

CREATE OR REPLACE FUNCTION getNextInvoiceId() RETURNS bigint AS '
DECLARE
    ret bigint;
BEGIN
    LOCK TABLE invoice IN SHARE ROW EXCLUSIVE MODE;
    SELECT INTO ret invoice_id FROM invoice
        WHERE invoice_id < \'30\'
        ORDER BY invoice_id DESC LIMIT 1;
    RETURN ret + 1;
END;
' LANGUAGE 'plpgsql';

Using that is nearly as fast as a regular sequence. Thanks to all of
you for your help.

Best Regards,
Michael Paesold
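A note on the partial-index approach above: the planner will only use a
partial index when the query's WHERE clause implies the index predicate,
so the constant in the query must match the one in the index definition.
A quick way to check is EXPLAIN; this sketch reuses the table and index
names from the message (the threshold value is whatever separates real
from "pre" invoice_ids, which the message does not spell out exactly):

```sql
-- Sketch: verify the planner actually picks the partial index.
-- The constant here must imply the predicate of idx_real_invoice_id.
EXPLAIN
SELECT invoice_id
FROM invoice
WHERE invoice_id < 3
ORDER BY invoice_id DESC
LIMIT 1;
-- The plan should mention an index scan using idx_real_invoice_id;
-- a sequential scan means the predicate did not match the index.
```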
Re: [HACKERS] Insert Performance
"Michael Paesold" <[EMAIL PROTECTED]> writes:
> So inserting an invoice would also do an update on a single row
> of the cs_sequence table, which caused the problems.
> Now, with a normal sequence, it works like a charm.
> 17 sec. for 10000 rows and 2-3 sec. for commit.
> But why is performance so much degrading? After 10000 updates
> on a row, the row seems to be unusable without vacuum!

Probably because the table contains 10000 dead tuples and one live one.
The system is scanning all 10001 tuples looking for the one to UPDATE.
In 7.3 it might help a little to create an index on the table.

But really this is one of the reasons that SEQUENCEs were invented ---
you have no alternative but to do frequent vacuums if you repeatedly
update the same row of a table. You might consider issuing a selective
"VACUUM cs_sequence" command every so often (ideally every few hundred
updates).

> I hope the currently discussed autovacuum daemon will help in such a
> situation.

Probably, if we can teach it to recognize that such frequent vacuums
are needed. In the meantime, cron is your friend ...

regards, tom lane
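The selective-vacuum advice above can be sketched as follows; the table
name cs_sequence comes from the thread, while the database name and cron
schedule are placeholders:

```sql
-- Reclaim the dead row versions left behind by repeated UPDATEs on the
-- single cs_sequence row. Cheap, since only this small table is scanned:
VACUUM cs_sequence;

-- Run it periodically from cron, e.g. every five minutes
-- (mydb and the schedule are illustrative, not from the thread):
--   */5 * * * *  psql -d mydb -c "VACUUM cs_sequence;"
```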
Re: [HACKERS] Insert Performance
> What could you recommend? Locking the table and selecting
> max(invoice_id) wouldn't really be much faster, with max(invoice_id)
> not using an index...

select invoice_id from table order by invoice_id desc limit 1;

should get you the maximum fast if you have a unique index on invoice_id.

Andreas
Re: [HACKERS] Insert Performance
On 26 Sep 2002 at 12:28, Michael Paesold wrote:
> But why is performance so much degrading? After 10000 updates
> on a row, the row seems to be unusable without vacuum! I hope
> the currently discussed autovacuum daemon will help in such a
> situation.

Let me know if it works. Use CVS BTW... I am eager to know any bug
reports. Didn't have a chance to test it the way I would have liked.
Maybe this weekend...

Bye
Shridhar

--
QOTD: The forest may be quiet, but that doesn't mean the snakes have
gone away.
Re: [HACKERS] Insert Performance
Tom Lane wrote:
> "Michael Paesold" <[EMAIL PROTECTED]> writes:
> > Only vacuum will reset the insert times to the lowest possible!
> > What does the vacuum code do?? :-]
>
> It removes dead tuples. Dead tuples can only arise from update or
> delete operations ... so you have not been telling us the whole
> truth. An insert-only test would not have this sort of behavior.
>
> regards, tom lane

Sleeping is good. When I woke up this morning I had an idea of what is
causing these problems; and you are right. I had used a self-written
sequence system for the invoice_ids -- I can't use a sequence because
sequence values can skip. So inserting an invoice would also do an
update on a single row of the cs_sequence table, which caused the
problems.

Now, with a normal sequence, it works like a charm. 17 sec. for 10000
rows and 2-3 sec. for commit. But why is performance so much degrading?
After 10000 updates on a row, the row seems to be unusable without
vacuum! I hope the currently discussed autovacuum daemon will help in
such a situation.

So I think I will have to look for another solution. It would be nice
if one could lock a sequence! That would solve all my troubles...

BEGIN;
LOCK SEQUENCE invoice_id_seq;
-- now only this connection can get nextval(), all others will block
INSERT INTO invoice VALUES (nextval('invoice_id_seq'), ...);
INSERT INTO invoice VALUES (nextval('invoice_id_seq'), ...);
...
COMMIT;
-- now this only helps if sequences could be rolled back -- wake up!

What could you recommend? Locking the table and selecting
max(invoice_id) wouldn't really be much faster, with max(invoice_id)
not using an index...

Best Regards,
Michael Paesold
Re: [HACKERS] Insert Performance
"Michael Paesold" <[EMAIL PROTECTED]> writes:
> Only vacuum will reset the insert times to the lowest possible!
> What does the vacuum code do?? :-]

It removes dead tuples. Dead tuples can only arise from update or
delete operations ... so you have not been telling us the whole truth.
An insert-only test would not have this sort of behavior.

regards, tom lane
Re: [HACKERS] Insert Performance
> Only vacuum will reset the insert times to the lowest possible!
> What does the vacuum code do?? :-]

Please see the manual and the extensive discussions on this point in
the archives. This behaviour is well known -- though undesirable. It is
an effect of the multi-version concurrency control system.

Gavin
Re: [HACKERS] Insert Performance
Update:

> vacuum full; vacuum analyze;
> select bench_invoice(1000); select bench_invoice(1000); ... (10 times)
>
> It seems performance is degrading with every insert!
> Here is the result (time in seconds in bench_invoice(), commit between
> selects just under a second)
>
> 13, 24, 36, 47, 58, 70, 84, 94, 105, 117, ...
> (seconds per 1000 rows inserted)
>
> Isn't that odd?
> I have tried again. vacuum analyze alone (without full) is enough to
> lower times again. They will start again with 13 seconds.

Tested further what exactly will reset insert times to lowest possible:

vacuum full;                                              helps
vacuum analyze;                                           helps
analyze on the table that I insert into                   doesn't help!
analyze on tables referenced in foreign key constraints   doesn't help!

Only vacuum will reset the insert times to the lowest possible!
What does the vacuum code do?? :-]

Regards,
Michael Paesold
Re: [HACKERS] Insert Performance
Tom Lane wrote:
> "Michael Paesold" <[EMAIL PROTECTED]> writes:
> > To insert another 10562 rows takes about 12 minutes now!!!
>
> > As I said I wrote a function to insert the rows (PL/pgSQL). All
> > values were inserted inside a single function call; I always thought
> > that a function call would be executed inside a transaction block.
> > Experience says it does.
>
> Well, there's something fishy about your results. Using CVS tip I see
> about a 4-to-1 difference between COPYing 10000 rows and INSERT'ing
> 10000 rows (as one transaction). That's annoyingly high, but it's
> still way lower than what you're reporting ...
>
> I used the contents of table tenk1 in the regression database for test
> data, and dumped it out with "pg_dump -a" with and without -d. I then
> just timed feeding the scripts to psql ...
>
> regards, tom lane

I have further played around with the test here. I now realized that
insert performance is much better right after a

vacuum full; vacuum analyze;

I have this function bench_invoice(integer) that will insert $1 records
into the invoice table; select bench_invoice(10000) took about 10
minutes on average. Now I executed this with psql:

vacuum full; vacuum analyze;
select bench_invoice(1000); select bench_invoice(1000); ... (10 times)

It seems performance is degrading with every insert! Here is the result
(time in seconds in bench_invoice(), commit between selects just under
a second):

13, 24, 36, 47, 58, 70, 84, 94, 105, 117, ...
(seconds per 1000 rows inserted)

Isn't that odd? I have tried again. vacuum analyze alone (without full)
is enough to lower times again. They will start again with 13 seconds.

I did not delete from the table by now; the table now has about 50000
rows. The disk is not swapping, there are no other users using
postgres, postmaster takes about 100% cpu time during the whole
operation. There are no special messages in the error log.

Can you explain? Should I enable some debug logging? Disable some
optimizer? Do something else?

This is a development server, I have no problem with playing around.

Best Regards,
Michael Paesold
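The bench_invoice() function is referenced throughout the thread but
never shown. A minimal hypothetical sketch of what such a function might
look like follows; the column list and sequence name are illustrative
only (the real table has about 30 fields, constraints, and foreign
keys), using the 7.2-era quoted-body function syntax:

```sql
-- Hypothetical sketch of a bench_invoice()-style helper: insert $1
-- dummy rows in a FOR loop, all inside one function call (and thus
-- one transaction). Table columns and sequence name are assumptions.
CREATE OR REPLACE FUNCTION bench_invoice(integer) RETURNS integer AS '
DECLARE
    n ALIAS FOR $1;
    i integer;
BEGIN
    FOR i IN 1..n LOOP
        INSERT INTO invoice (invoice_id, amount)
            VALUES (nextval(''invoice_id_seq''), i);
    END LOOP;
    RETURN n;
END;
' LANGUAGE 'plpgsql';

-- Timed from psql, e.g.:
-- select bench_invoice(1000);
```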
Re: [HACKERS] Insert Performance
"Michael Paesold" <[EMAIL PROTECTED]> writes:
> To insert another 10562 rows takes about 12 minutes now!!!

> As I said I wrote a function to insert the rows (PL/pgSQL). All values
> were inserted inside a single function call; I always thought that a
> function call would be executed inside a transaction block. Experience
> says it does.

Well, there's something fishy about your results. Using CVS tip I see
about a 4-to-1 difference between COPYing 10000 rows and INSERT'ing
10000 rows (as one transaction). That's annoyingly high, but it's still
way lower than what you're reporting ...

I used the contents of table tenk1 in the regression database for test
data, and dumped it out with "pg_dump -a" with and without -d. I then
just timed feeding the scripts to psql ...

regards, tom lane
Re: [HACKERS] Insert Performance
Tom Lane wrote:
> "Michael Paesold" <[EMAIL PROTECTED]> writes:
> > To insert another 10562 rows takes about 12 minutes now!!!
>
> See
> http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/populate.html
> particularly the point about not committing each INSERT as a separate
> transaction.
>
> regards, tom lane

As I said I wrote a function to insert the rows (PL/pgSQL). All values
were inserted inside a single function call; I always thought that a
function call would be executed inside a transaction block. Experience
says it does.

About the other points in the docs:

> Use COPY FROM:
Well, I am currently comparing INSERT to COPY ... ;)

> Remove Indexes:
Doesn't COPY also have to update indexes?

> ANALYZE Afterwards:
I have done a VACUUM FULL; VACUUM ANALYZE; just before running the test.

So is it just the planner/optimizer/etc. costs? Would a PREPARE in 7.3
help?

Best Regards,
Michael Paesold
Re: [HACKERS] Insert Performance
"Michael Paesold" <[EMAIL PROTECTED]> writes:
> To insert another 10562 rows takes about 12 minutes now!!!

See
http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/populate.html
particularly the point about not committing each INSERT as a separate
transaction.

regards, tom lane
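The "one transaction" advice boils down to wrapping the whole batch in
an explicit BEGIN/COMMIT instead of letting every INSERT autocommit
(which forces a log flush per row). A sketch, reusing the invoice table
from the thread with illustrative values:

```sql
-- Slow: each INSERT autocommits and is flushed to WAL individually.
-- Fast: one transaction around the whole batch, one flush at COMMIT.
BEGIN;
INSERT INTO invoice VALUES (nextval('invoice_id_seq'), ...);
INSERT INTO invoice VALUES (nextval('invoice_id_seq'), ...);
-- ... thousands more rows ...
COMMIT;
```

A PL/pgSQL function call achieves the same effect, since the entire
call runs inside a single transaction.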
[HACKERS] Insert Performance
Hi,

I am wondering about bad INSERT performance compared against the speed
of COPY. (I use 7.2.2 on RedHat 7.2.)

I have a table with about 30 fields, some constraints, some indexes,
some foreign key constraints. I use COPY to import old data. Copying
about 10562 rows takes about 19 seconds. For testing I have written a
simple function in PL/pgSQL that inserts dummy records into the same
table (just a FOR loop and an INSERT INTO ...). To insert another 10562
rows takes about 12 minutes now!!!

What is the problem with INSERT in postgresql? I usually don't compare
mysql and postgresql because mysql is just playing stuff, but I have to
think that the insert performance of mysql (even with innodb tables) is
about 10 times better than the insert performance of postgresql. What
is the reason and what can be done about it?

Best Regards,
Michael

P.S.: Perhaps you want to know about my postgresql.conf

#
# Shared Memory Size
#
shared_buffers = 12288          # 2*max_connections, min 16
max_fsm_relations = 100         # min 10, fsm is free space map
max_fsm_pages = 20000           # min 1000, fsm is free space map
max_locks_per_transaction = 64  # min 10
wal_buffers = 8                 # min 4

#
# Non-shared Memory Sizes
#
sort_mem = 4096                 # min 32 (in Kb)
vacuum_mem = 16384              # min 1024

#
# Write-ahead log (WAL)
#
wal_files = 8                   # range 0-64, default 0
wal_sync_method = fdatasync     # the default varies across platforms:
                                # fsync, fdatasync, open_sync, or open_datasync
fsync = true