Re: [HACKERS] insert performance for win32

2005-11-04 Thread Andrew Dunstan



Tom Lane wrote:

> "Merlin Moncure" <[EMAIL PROTECTED]> writes:
>> That works, I tried ctrl-c various ways including from within \i copy.
>> Problem solved!
>
> Good.  I've applied the patch in both HEAD and 8.0 branches.
>
> Since we're very nearly ready to wrap 8.1, would someone with access to
> a Windows machine please double-check that CVS tip still works?

Worked for me. See buildfarm. Or are there more tests you want run?

cheers

andrew

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] insert performance for win32

2005-11-04 Thread Tom Lane
"Merlin Moncure" <[EMAIL PROTECTED]> writes:
> That works, I tried ctrl-c various ways including from within \i copy.
> Problem solved!

Good.  I've applied the patch in both HEAD and 8.0 branches.

Since we're very nearly ready to wrap 8.1, would someone with access to
a Windows machine please double-check that CVS tip still works?

regards, tom lane



Re: [HACKERS] insert performance for win32

2005-11-04 Thread Merlin Moncure
 
>   void
>   setup_cancel_handler(void)
>   {
> + static bool done = false;
> +
> + if (!done)
>   SetConsoleCtrlHandler(consoleHandler, TRUE);
> + done = true;
>   }
> 

That works, I tried ctrl-c various ways including from within \i copy.
Problem solved!

Merlin



Re: [HACKERS] Insert Performance

2002-09-26 Thread Michael Paesold

Zeugswetter Andreas SB SD <[EMAIL PROTECTED]> wrote:

> > What could you recommend? Locking the table and selecting
> > max(invoice_id) wouldn't really be much faster, with max(invoice_id)
> > not using an index...
>
> select invoice_id from table order by invoice_id desc limit 1;
>
> should get you the maximum fast if you have a unique index on invoice_id.
>
> Andreas

I've figured that out after reading the TODO about max()/min() using
indexes.
Thank you anyway!

The second problem I had was that I have invoices here that have not been
sent into accounting. An actual invoice_id is something like 210309 at the
moment. So I used invoice_ids > 300000000 for "pre" invoice_ids. Having many
of those "pre" invoices makes select ... desc limit 1 too slow.

I figured out that I can use a partial index as a solution:

CREATE INDEX idx_real_invoice_id ON invoice (invoice_id) WHERE invoice_id <
300000000;

Now it works great.
I have a function getNextInvoiceID():

CREATE OR REPLACE FUNCTION getNextInvoiceId() RETURNS bigint AS '
DECLARE
  ret bigint;
BEGIN
  LOCK TABLE invoice IN SHARE ROW EXCLUSIVE MODE;
  SELECT INTO ret invoice_id FROM invoice WHERE invoice_id < \'300000000\'
ORDER BY invoice_id DESC LIMIT 1;
  RETURN ret + 1;
END;
' LANGUAGE 'plpgsql';

Using that is nearly as fast as a regular sequence.
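For reference, a caller would use the function inside the same transaction as the insert, so the SHARE ROW EXCLUSIVE lock is held until commit (the column list below is invented for illustration):

```sql
BEGIN;
-- getNextInvoiceId() takes the table lock, so concurrent
-- callers serialize here until this transaction commits
INSERT INTO invoice (invoice_id, amount)
VALUES (getNextInvoiceId(), 100.00);
COMMIT;
```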

Thanks to all of you for your help.

Best Regards,
Michael Paesold





Re: [HACKERS] Insert Performance

2002-09-26 Thread Tom Lane

"Michael Paesold" <[EMAIL PROTECTED]> writes:
> So inserting an invoice would also do an update on a single row
> of the cs_sequence table, which caused the problems.

> Now, with a normal sequence, it works like a charm.
> 17 sec. for 10000 rows and 2-3 sec. for commit.

> But why is performance so much degrading? After 10000 updates
> on a row, the row seems to be unusable without vacuum!

Probably, because the table contains 10000 dead tuples and one live one.
The system is scanning all 10001 tuples looking for the one to UPDATE.

In 7.3 it might help a little to create an index on the table.  But
really this is one of the reasons that SEQUENCEs were invented ---
you have no alternative but to do frequent vacuums, if you repeatedly
update the same row of a table.  You might consider issuing a selective
"VACUUM cs_sequence" command every so often (ideally every few hundred
updates).
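That advice translates directly into SQL; a sketch, assuming the counter table is named cs_sequence as above:

```sql
-- Reclaim the dead row versions left behind by repeated UPDATEs.
-- Run this every few hundred updates (e.g. from cron); it touches
-- only the one small table, not the whole database.
VACUUM cs_sequence;
```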

> I hope the currently discussed autovacuum daemon will help in such a
> situation.

Probably, if we can teach it to recognize that such frequent vacuums are
needed.  In the meantime, cron is your friend ...

regards, tom lane




Re: [HACKERS] Insert Performance

2002-09-26 Thread Zeugswetter Andreas SB SD


> What could you recommend? Locking the table and selecting
> max(invoice_id) wouldn't really be much faster, with max(invoice_id)
> not using an index...

select invoice_id from table order by invoice_id desc limit 1;

should get you the maximum fast if you have a unique index on invoice_id.

Andreas




Re: [HACKERS] Insert Performance

2002-09-26 Thread Shridhar Daithankar

On 26 Sep 2002 at 12:28, Michael Paesold wrote:
> But why is performance so much degrading? After 10000 updates
> on a row, the row seems to be unusable without vacuum! I hope
> the currently discussed autovacuum daemon will help in such a
> situation.

Let me know if it works. Use CVS, BTW. I am eager to see any bug reports.
I didn't have a chance to test it the way I would have liked. Maybe this
weekend.


Bye
 Shridhar

--
QOTD:   The forest may be quiet, but that doesn't mean  the snakes have gone away.





Re: [HACKERS] Insert Performance

2002-09-26 Thread Michael Paesold

Tom Lane wrote:

> "Michael Paesold" <[EMAIL PROTECTED]> writes:
> > Only vacuum will reset the insert times to the lowest possible!
> > What does the vacuum code do?? :-]
> 
> It removes dead tuples.  Dead tuples can only arise from update or
> delete operations ... so you have not been telling us the whole
> truth.  An insert-only test would not have this sort of behavior.
> 
> regards, tom lane

Sleeping is good. When I woke up this morning I had an idea of
what is causing these problems; and you are right. I had used a
self-written sequence system for the invoice_ids -- I can't use a
sequence because sequence values can skip.

So inserting an invoice would also do an update on a single row
of the cs_sequence table, which caused the problems.

Now, with a normal sequence, it works like a charm.
17 sec. for 10000 rows and 2-3 sec. for commit.

But why is performance so much degrading? After 10000 updates
on a row, the row seems to be unusable without vacuum! I hope
the currently discussed autovacuum daemon will help in such a
situation.

So I think I will have to look for another solution. It would be
nice if one could lock a sequence! That would solve all my
troubles...


BEGIN;
LOCK SEQUENCE invoice_id_seq;
-- now only this connection can get nextval(), all others will block
INSERT INTO invoice VALUES (nextval('invoice_id_seq'), ...);
INSERT INTO invoice VALUES (nextval('invoice_id_seq'), ...);
...
COMMIT;
-- now this only helps if sequences could be rolled back -- wake up!


What could you recommend? Locking the table and selecting
max(invoice_id) wouldn't really be much faster, with max(invoice_id)
not using an index...

Best Regards,
Michael Paesold







Re: [HACKERS] Insert Performance

2002-09-25 Thread Tom Lane

"Michael Paesold" <[EMAIL PROTECTED]> writes:
> Only vacuum will reset the insert times to the lowest possible!
> What does the vacuum code do?? :-]

It removes dead tuples.  Dead tuples can only arise from update or
delete operations ... so you have not been telling us the whole
truth.  An insert-only test would not have this sort of behavior.

regards, tom lane




Re: [HACKERS] Insert Performance

2002-09-25 Thread Gavin Sherry

> Only vacuum will reset the insert times to the lowest possible!
> What does the vacuum code do?? :-]

Please see the manual and the extensive discussions on this point in the
archives. This behaviour is well known -- though undesirable. It is an
effect of the multi-version concurrency control system.

Gavin





Re: [HACKERS] Insert Performance

2002-09-25 Thread Michael Paesold

Update:

> vacuum full; vacuum analyze;
> select bench_invoice(1000); select bench_invoice(1000); ... (10 times)
>
> It seems performance is degrading with every insert!
> Here is the result (time in seconds in bench_invoice(), commit between
> selects just under a second)
>
> 13, 24, 36, 47, 58, 70, 84, 94, 105, 117, ... (seconds per 1000 rows
> inserted)
>
> Isn't that odd?
> I have tried again. vacuum analyze alone (without full) is enough to lower
> times again. They will start again with 13 seconds.

Tested further what exactly will reset insert times to the lowest possible:

vacuum full; helps
vacuum analyze; helps
analyze; on the table that I insert into doesn't help!
analyze; on any table referenced in the foreign key constraints
doesn't help!

Only vacuum will reset the insert times to the lowest possible!
What does the vacuum code do?? :-]

Regards,
Michael Paesold





Re: [HACKERS] Insert Performance

2002-09-25 Thread Michael Paesold

Tom Lane wrote:


> "Michael Paesold" <[EMAIL PROTECTED]> writes:
> > To insert another 10562 rows takes about 12 minutes now!!!
>
> > As I said I wrote a function to insert the rows (PL/pgSQL). All values were
> > inserted inside a single function call; I always thought that a function call
> > would be executed inside a transaction block. Experience says it does.
>
> Well, there's something fishy about your results.  Using CVS tip I see
> about a 4-to-1 difference between COPYing 10000 rows and INSERT'ing
> 10000 rows (as one transaction).  That's annoyingly high, but it's still
> way lower than what you're reporting ...
>
> I used the contents of table tenk1 in the regression database for test
> data, and dumped it out with "pg_dump -a" with and without -d.  I then
> just timed feeding the scripts to psql ...
>
> regards, tom lane

I have further played around with the test here. I now realized that insert
performance is much better right after a vacuum full; vacuum analyze;

I have this function bench_invoice(integer) that will insert $1 records into
the invoice table;
select bench_invoice(10000) took about 10 minutes on average. Now I executed
this with psql:

vacuum full; vacuum analyze;
select bench_invoice(1000); select bench_invoice(1000); ... (10 times)

It seems performance is degrading with every insert!
Here is the result (time in seconds in bench_invoice(), commit between
selects just under a second)

13, 24, 36, 47, 58, 70, 84, 94, 105, 117, ... (seconds per 1000 rows
inserted)

Isn't that odd?
I have tried again. vacuum analyze alone (without full) is enough to lower
times again. They will start again with 13 seconds.

I have not deleted from the table so far; the table now has about 50000 rows.
The disk is not swapping, there are no other users using postgres,
postmaster takes about 100% cpu time during the whole operation. There are
no special messages in error log.

Can you explain?
Should I enable some debug logging? Disable some optimization? Do something
else?
This is a development server, I have no problem with playing around.

Best Regards,
Michael Paesold







Re: [HACKERS] Insert Performance

2002-09-25 Thread Tom Lane

"Michael Paesold" <[EMAIL PROTECTED]> writes:
> To insert another 10562 rows takes about 12 minutes now!!!

> As I said I wrote a function to insert the rows (PL/pgSQL). All values were
> inserted inside a single function call; I always thought that a function call
> would be executed inside a transaction block. Experience says it does.

Well, there's something fishy about your results.  Using CVS tip I see
about a 4-to-1 difference between COPYing 10000 rows and INSERT'ing
10000 rows (as one transaction).  That's annoyingly high, but it's still
way lower than what you're reporting ...

I used the contents of table tenk1 in the regression database for test
data, and dumped it out with "pg_dump -a" with and without -d.  I then
just timed feeding the scripts to psql ...

regards, tom lane




Re: [HACKERS] Insert Performance

2002-09-25 Thread Michael Paesold

Tom Lane wrote:

> "Michael Paesold" <[EMAIL PROTECTED]> writes:
> > To insert another 10562 rows takes about 12 minutes now!!!
>
> See
> http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/populate.html
> particularly the point about not committing each INSERT as a separate
> transaction.
>
> regards, tom lane

As I said I wrote a function to insert the rows (PL/pgSQL). All values were
inserted inside a single function call; I always thought that a function call
would be executed inside a transaction block. Experience says it does.

About the other points in the docs:

> Use COPY FROM:
Well, I am currently comparing INSERT to COPY ... ;)

> Remove Indexes:
Doesn't COPY also have to update indexes?

> ANALYZE Afterwards:
I have done a VACUUM FULL; VACUUM ANALYZE; just before running the test.

So is it just the planner/optimizer/etc. costs? Would a PREPARE in 7.3 help?
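(For reference: 7.3's PREPARE does let the parse/plan work happen once per session. A hypothetical sketch, with an invented column list:)

```sql
-- Plan the INSERT once ...
PREPARE ins_invoice (bigint, numeric) AS
  INSERT INTO invoice (invoice_id, amount) VALUES ($1, $2);

-- ... then each EXECUTE skips the parse/plan overhead.
EXECUTE ins_invoice (210310, 100.00);
EXECUTE ins_invoice (210311, 250.00);
```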

Best Regards,
Michael Paesold





Re: [HACKERS] Insert Performance

2002-09-25 Thread Tom Lane

"Michael Paesold" <[EMAIL PROTECTED]> writes:
> To insert another 10562 rows takes about 12 minutes now!!!

See
http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/populate.html
particularly the point about not committing each INSERT as a separate
transaction.

regards, tom lane




[HACKERS] Insert Performance

2002-09-25 Thread Michael Paesold

Hi,

I am wondering about bad INSERT performance compared against the speed of
COPY. (I use 7.2.2 on RedHat 7.2)

I have a table with about 30 fields, some constraints, some indexes, some
foreign key constraints. I use COPY to import old data. Copying about
10562 rows takes about 19 seconds.

For testing I have written a simple function in PL/pgSQL that inserts dummy
records into the same table (just a FOR loop and an INSERT INTO ...).
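The function itself isn't shown in the thread; a minimal sketch of what such a benchmark function might look like (the column list and sequence name are invented):

```sql
CREATE OR REPLACE FUNCTION bench_invoice(integer) RETURNS integer AS '
DECLARE
  n ALIAS FOR $1;
  i integer;
BEGIN
  FOR i IN 1 .. n LOOP
    -- dummy row; the real table has ~30 columns, several indexes,
    -- and foreign key constraints, all maintained per INSERT
    INSERT INTO invoice (invoice_id, amount)
    VALUES (nextval(''invoice_id_seq''), i);
  END LOOP;
  RETURN n;
END;
' LANGUAGE 'plpgsql';
```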

To insert another 10562 rows takes about 12 minutes now!!!

What is the problem with INSERT in postgresql? I usually don't compare mysql
and postgresql because mysql is just toy stuff, but I have to think that
the insert performance of mysql (even with innodb tables) is about 10 times
better than the insert performance of postgresql.

What is the reason and what can be done about it?

Best Regards,
Michael

P.S: Perhaps you want to know about my postgresql.conf

#
#   Shared Memory Size
#
shared_buffers = 12288          # 2*max_connections, min 16
max_fsm_relations = 100         # min 10, fsm is free space map
max_fsm_pages = 20000           # min 1000, fsm is free space map
max_locks_per_transaction = 64  # min 10
wal_buffers = 8                 # min 4

#
#   Non-shared Memory Sizes
#
sort_mem = 4096                 # min 32 (in KB)
vacuum_mem = 16384              # min 1024

#
#   Write-ahead log (WAL)
#
wal_files = 8                   # range 0-64, default 0
wal_sync_method = fdatasync     # the default varies across platforms:
                                # fsync, fdatasync, open_sync, or open_datasync
fsync = true


