Re: [PERFORM] atrocious update performance

2004-04-05 Thread Rosser Schwarz
After deinstalling and scrubbing PostgreSQL from my server and doing
a clean build using a vanilla 7.4.2 tree, I'm rather more confident
that foreign key validation is the cause of my performance problems.

I recreated my schemas and ran the original update, with foreign
keys referring to the identity column of the target table.  The
update took roughly two days, as I'd predicted based on my analysis
of the previous installation.  (I can't say how long with certainty,
beyond that it finished some time between when I left work one night
and came in the next morning, the second day after starting the
query.)  I'm not sure what was wrong with the previous install, such
that the update took several days; two-ish days is long enough.

Just this morning, however, I created a copy of the target table (all
4.7M rows), with absolutely no foreign keys referring to it, and ran
the update against the copy.  That update took 2300 seconds.  The
join columns were indexed in both cases.

I'm in the process of migrating the machine to run kernel 2.6.4,
following the thread started by Gary, though I suspect that the
kernel revision is moot with respect to whether or not foreign keys
are being incorrectly validated.  I can keep the 2.4 kernel and
modules around to run using the current versions for testing
purposes, though any such work would necessarily be off-hours.

Please advise of anything I can do to help narrow down the specific
cause of the issue; I know just enough C to be mildly dangerous.

/rls

--
Rosser Schwarz
Total Card, Inc.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] atrocious update performance

2004-04-05 Thread Kevin Barnard
On 5 Apr 2004 at 12:05, Rosser Schwarz wrote:

 Just this morning, however, I created a copy of the target table (all
 4.7M rows), with absolutely no foreign keys referring to it, and ran
 the update against the copy.  That update took 2300 seconds.  The
 join columns were indexed in both cases.

Have you added indexes on the custid column for tables account.acct,
account.origacct, and account.note?

I haven't followed the entire thread, but if you have cascading FKs on those tables
without an index on that column, that could cause your delay.
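
For what it's worth, indexing the referencing columns would look something like
this (the index names here are only illustrative):

create index ix_acct_custid on account.acct (custid);
create index ix_origacct_custid on account.origacct (custid);
create index ix_note_custid on account.note (custid);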

Kevin Barnard
SpeedFC



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] atrocious update performance

2004-04-05 Thread Rosser Schwarz
while you weren't looking, Kevin Barnard wrote:

 Have you added indexes on the custid column for tables
 account.acct, account.origacct, and account.note?

They were indexed in the original case, yes.  There was no
need to index them in today's test case, as that was done
purely in an attempt to rule in or out foreign key validation
as the cause of the performance hit.  No foreign keys that
might be validated, no need to index the foreign key columns.

 I haven't followed the entire thread, but if you have
 cascading FKs on those tables without an index on that
 column, that could cause your delay.

The issue is that the foreign keys are being validated at
all, when the column being referenced by those foreign keys
(account.cust.custid) is never touched.

Regardless of whether or not the referencing columns are
indexed, validating them at all--in this specific case--is
broken.  The column they refer to is never touched; they
should remain utterly ignorant of whatever happens to other
columns in the same row.

/rls

--
Rosser Schwarz
Total Card, Inc.


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] atrocious update performance

2004-04-05 Thread scott.marlowe
On Mon, 5 Apr 2004, Kevin Barnard wrote:

 On 5 Apr 2004 at 12:05, Rosser Schwarz wrote:
 
  Just this morning, however, I created a copy of the target table (all
  4.7M rows), with absolutely no foreign keys referring to it, and ran
  the update against the copy.  That update took 2300 seconds.  The
  join columns were indexed in both cases.
 
 Have you added indexes on the custid column for tables account.acct,
 account.origacct, and account.note?
 
 I haven't followed the entire thread, but if you have cascading FKs on those
 tables without an index on that column, that could cause your delay.

also make sure the fk/pk types match, or the index likely won't get used 
anyway.
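
A quick way to eyeball the declared types (a sketch; run it once per table,
swapping in the relation and column you want to check):

select a.attname, format_type(a.atttypid, a.atttypmod) as declared_type
from pg_attribute a
where a.attrelid = 'account.acct'::regclass
  and a.attname = 'custid';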


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] atrocious update performance

2004-04-05 Thread Stephan Szabo
On Mon, 5 Apr 2004, Rosser Schwarz wrote:

 while you weren't looking, Kevin Barnard wrote:

  Have you added indexes on the custid column for tables
  account.acct, account.origacct, and account.note?

 They were indexed in the original case, yes.  There was no
 need to index them in today's test case, as that was done
  purely in an attempt to rule in or out foreign key validation
 as the cause of the performance hit.  No foreign keys that
 might be validated, no need to index the foreign key columns.

  I haven't followed the entire thread, but if you have
  cascading FKs on those tables without an index on that
  column, that could cause your delay.

 The issue is that the foreign keys are being validated at
 all, when the column being referenced by those foreign keys
 (account.cust.custid) is never touched.

 Regardless of whether or not the referencing columns are
 indexed, validating them at all--in this specific case--is
 broken.  The column they refer to is never touched; they
 should remain utterly ignorant of whatever happens to other
 columns in the same row.

It shouldn't be checking the other table if the values of the key column
hadn't changed. The ri_KeysEqual check should be causing it to return just
before actually doing the check on the other table (it still does a few
things before then but nothing that should be particularly expensive). In
some simple tests on my 7.4.2 machine, this appears to work for me on pk
cascade updates. It would be interesting to know if it's actually doing
any checks for you, you might be able to poke around the triggers
(backend/utils/adt/ri_triggers.c).
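
A minimal sketch of that test, small enough to run anywhere; updating a non-key
column of the referenced table should never touch the referencing table:

create table pktab (id integer primary key, val integer);
create table fktab (custid integer references pktab (id)
                    on update cascade on delete restrict);
insert into pktab values (1, 0);
insert into fktab values (1);
-- the key value is unchanged, so ri_KeysEqual should short-circuit the
-- cascade trigger before it ever scans fktab
update pktab set val = val + 1;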

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] atrocious update performance

2004-03-24 Thread Rosser Schwarz
Greg Spiegelberg wrote:

  Will advise.

After creating 100, 1K, 10K, 100K and 1M-row subsets of account.cust and
the corresponding rows/tables with foreign key constraints referring to
the table, I'm unable to reproduce the behavior at issue.

explain analyze looks like the following, showing the query run with the
join column indexed and not, respectively:

# explain analyze update test.cust100 set prodid = tempprod.prodid,
subprodid = tempprod.subprodid where origid = tempprod.debtid;
-- with index
QUERY PLAN
---
 Merge Join  (cost=0.00..25.64 rows=500 width=220) (actual
 time=0.241..13.091 rows=100 loops=1)
   Merge Cond: ((outer.origid)::text = (inner.debtid)::text)
   -  Index Scan using ix_origid_cust100 on cust100  (cost=0.00..11.50
   rows=500 width=204) (actual time=0.125..6.465 rows=100 loops=1)
   -  Index Scan using ix_debtid on tempprod  (cost=0.00..66916.71
   rows=4731410 width=26) (actual time=0.057..1.497 rows=101 loops=1)
 Total runtime: 34.067 ms
(5 rows)

-- without index
QUERY PLAN
--
 Merge Join  (cost=7.32..16.71 rows=100 width=220) (actual
 time=4.415..10.918 rows=100 loops=1)
   Merge Cond: ((outer.debtid)::text = inner.?column22?)
   -  Index Scan using ix_debtid on tempprod  (cost=0.00..66916.71
   rows=4731410 width=26) (actual time=0.051..1.291 rows=101 loops=1)
   -  Sort  (cost=7.32..7.57 rows=100 width=204) (actual
   time=4.311..4.450 rows=100 loops=1)
 Sort Key: (cust100.origid)::text
 -  Seq Scan on cust100  (cost=0.00..4.00 rows=100 width=204)
 (actual time=0.235..2.615 rows=100 loops=1)
 Total runtime: 25.031 ms
(7 rows)

With the join column indexed, it takes roughly .32ms/row on the first
four tests (100.. 100K), and about .48ms/row on 1M rows.  Without the
index, it runs 100 rows @ .25ms/row, 1,000 @ .26, 10,000 @ .27, 100,000 @
.48 and .5 @ 1M rows.

In no case does the query plan reflect foreign key validation.  Failing
any other suggestions for diagnosis soon, I'm going to nuke the
PostgreSQL install, scour it from the machine and start from scratch.
Failing that, I'm going to come in some weekend and re-do the machine.

 Problem is when I recreate the indexes and add the constraints back
 on ORIG I end up with the same long running process.  The original
 UPDATE runs for about 30 minutes on a table of 400,000 with the
 WHERE matching about 70% of the rows.  The above runs for about 2
 minutes without adding the constraints or indexes however adding the
 constraints and creating the dropped indexes negates any gain.

Is this a frequently-run update?

In my experience, with my seemingly mutant install, dropping indices
and constraints to shave 14/15 off the update time would be worth the
effort.  Just script dropping, updating and recreating into one large
transaction.  It's a symptom-level fix, but re-creating the fifteen
indices on one of our 5M row tables doesn't take 28 minutes, and your
hardware looks to be rather less IO and CPU bound than ours.  I'd also
second Tom's suggestion of moving to 7.4.
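
A sketch of such a script, using names from this thread; the constraint and
index names are illustrative and would have to match your own schema:

begin;
-- drop the foreign keys that reference the table being updated
alter table account.acct drop constraint "$1";
alter table account.note drop constraint "$1";
alter table account.origacct drop constraint "$1";
-- drop indexes the update would otherwise have to maintain
-- (keep ix_origid, which the merge join needs)
drop index account.ix_lname;
update account.cust
   set prodid = tempprod.prodid, subprodid = tempprod.subprodid
  from public.tempprod
 where tempprod.debtid = cust.origid;
create index ix_lname on account.cust (lname) where lname is not null;
alter table account.acct add constraint "$1"
   foreign key (custid) references account.cust (custid)
   on update cascade on delete restrict;
-- ... re-add the remaining indexes and constraints the same way
commit;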

/rls


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] atrocious update performance

2004-03-23 Thread Rosser Schwarz
Greg Spiegelberg wrote:

 I've been following this thread closely as I have the same problem
 with an UPDATE.  Everything is identical here right down to the
 strace output.

 Has anyone found a workaround or resolved the problem?  If not,
 I have test systems here which I can use to help us test and explore.

I'm still gathering data.  The explain analyze I'd expected to finish
Thursday afternoon hasn't yet.  I'm going to kill it and try a few
smaller runs, increasing in size, until the behavior manifests.

Will advise.

/rls


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] atrocious update performance

2004-03-23 Thread Greg Spiegelberg
Rosser Schwarz wrote:
Greg Spiegelberg wrote:


I've been following this thread closely as I have the same problem
with an UPDATE.  Everything is identical here right down to the
strace output.


Has anyone found a workaround or resolved the problem?  If not,
I have test systems here which I can use to help us test and explore.


I'm still gathering data.  The explain analyze I'd expected to finish
Thursday afternoon hasn't yet.  I'm going to kill it and try a few
smaller runs, increasing in size, until the behavior manifests.
Will advise.
I've replaced my atrocious UPDATE with the following.

begin;
-- Drop all constraints
alter table ORIG drop constraint ...;
-- Drop all indexes
drop index ...;
-- Update
update ORIG set column=... where...;
commit;
Problem is when I recreate the indexes and add the constraints back
on ORIG I end up with the same long running process.  The original
UPDATE runs for about 30 minutes on a table of 400,000 with the
WHERE matching about 70% of the rows.  The above runs for about 2
minutes without adding the constraints or indexes however adding the
constraints and creating the dropped indexes negates any gain.
RedHat 7.3 + Kernel 2.4.24 + ext3 + PostgreSQL 7.3.5
Dual PIII 1.3'ishGHz, 2GB Memory
U160 OS drives and a 1Gbps test SAN on a Hitachi 9910
Greg

--
Greg Spiegelberg
 Sr. Product Development Engineer
 Cranel, Incorporated.
 Phone: 614.318.4314
 Fax:   614.431.8388
 Email: [EMAIL PROTECTED]
Cranel. Technology. Integrity. Focus.


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] atrocious update performance

2004-03-23 Thread Tom Lane
Greg Spiegelberg [EMAIL PROTECTED] writes:
 RedHat 7.3 + Kernel 2.4.24 + ext3 + PostgreSQL 7.3.5
  

Please try 7.4.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] atrocious update performance

2004-03-22 Thread Greg Spiegelberg
I've been following this thread closely as I have the same problem
with an UPDATE.  Everything is identical here right down to the
strace output.
Has anyone found a workaround or resolved the problem?  If not,
I have test systems here which I can use to help us test and explore.
Greg

--
Greg Spiegelberg
 Sr. Product Development Engineer
 Cranel, Incorporated.
 Phone: 614.318.4314
 Fax:   614.431.8388
 Email: [EMAIL PROTECTED]
Cranel. Technology. Integrity. Focus.


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] atrocious update performance

2004-03-17 Thread Richard Huxton
On Tuesday 16 March 2004 00:08, Tom Lane wrote:

 I'm inclined to suspect an issue with foreign-key checking.  You didn't
 give us any details about foreign key relationships your cust table is
 involved in --- could we see those?  And the schemas of the other tables
 involved?

Two questions Tom:
1. Do the stats tables record FK checks, or just explicit table accesses?
2. If not, should they?

If the only real activity is this update then simple before/after views of the 
stats might be revealing.
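
For instance, something like this run before and after the UPDATE (assuming the
row-level stats collector is enabled) would show whether the referencing tables
are being read at all:

select relname, seq_scan, seq_tup_read, idx_scan, idx_tup_fetch, n_tup_upd
from pg_stat_user_tables
where schemaname = 'account';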

-- 
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] atrocious update performance

2004-03-17 Thread Rosser Schwarz
while you weren't looking, Tom Lane wrote:

 Hm.  It looks like you mistakenly traced psql rather than the backend,
 but since the delay went away we wouldn't have learned 
 anything anyhow.
 Have you got any idea what conditions may have changed between seeing
 delay and not seeing delay?

None, offhand.  I have noticed that when a large query is running,
the machine can sporadically just freeze--or at least take inordinately
long for some other process, be it top or ls, another query, or whatever
to start.  Nothing looks saturated when it happens, and, while you can
count on it to happen, it's not consistent enough to reproduce.

 This is pretty odd too.  It looks like it's doing checkpoints every so
 often (look for the writes to pg_control), which a backend engaged in
 a long-running query surely ought not be doing.  Need to think about
 why that might be...

Does the fact that all the reads and writes are 32K mean anything out
of the ordinary?  $PGSRC/src/include/pg_config_manual.h has BLCKSZ
#defined to 16384.  I was running previously with a 32K BLCKSZ, but
that turned out to be rather sub-optimal for as heavily indexed as our
tables are.  I've dumped and rebuilt several times since then.

/rls

--
Rosser Schwarz
Total Card, Inc.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] atrocious update performance

2004-03-17 Thread Tom Lane
Rosser Schwarz [EMAIL PROTECTED] writes:
 while you weren't looking, Tom Lane wrote:
 Have you got any idea what conditions may have changed between seeing
 delay and not seeing delay?

 None, offhand.  I have noticed that when a large query is running,
 the machine can sporadically just freeze--or at least take inordinately
 long for some other process, be it top or ls, another query, or whatever
 to start.  Nothing looks saturated when it happens, and, while you can
 count on it to happen, it's not consistent enough to reproduce.

Interesting.  You should leave vmstat 1 running in the background and
see if you can correlate these freezes with bursts of disk I/O or swap.
I saw a couple of delays in your big strace that seemed odd --- a couple
of one-second-plus intervals, and a four-second-plus interval, with no
obvious reason for them.  Perhaps the same issue?

 Does the fact that all the reads and writes are 32K mean anything out
 of the ordinary?  $PGSRC/src/include/pg_config_manual.h has BLCKSZ
 #defined to 16384.  I was running previously with a 32K BLCKSZ, but
 that turned out to be rather sub-optimal for as heavily indexed as our
 tables are.  I've dumped and rebuilt several times since then.

I hate to break it to you, but that most definitely means you are
running with BLCKSZ = 32K.  Whatever you thought you were rebuilding
didn't take effect.

I agree that the larger blocksize is of dubious value.  People used to
do that back when the blocksize limited your row width, but these days
I think you're probably best off with the standard 8K.

Another thing that's fairly striking is the huge bursts of WAL activity
--- your trace shows that the thing is writing entire WAL segments (16
MB) at one go, rather than dribbling it out a page or two at a time as
the code is intended to do.  I think what is happening is that you have
wal_buffers = 1024 (correct?) yielding 32MB of WAL buffers, and since
there are no other transactions happening, nothing gets written until
you hit the "write when the buffers are half full" heuristic.  I would
suggest knocking wal_buffers down to something closer to the default
(maybe 4 or 8 buffers) to reduce these I/O storms.  (Memo to hackers:
we need to see to it that the new background writer process takes some
responsibility for pushing out filled WAL buffers, not only data
buffers.)
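
(For reference, wal_buffers only takes effect at postmaster start, so it has to
be changed in postgresql.conf and the server restarted; checking the running
value is just

show wal_buffers;

with an illustrative target of something like wal_buffers = 8 after the edit.)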

If the big EXPLAIN ANALYZE is still running, would you get a dump of its
open files (see lsof -p) and correlate those with the tables being
used in the query?  I'm trying to figure out what the different writes
and reads represent.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] atrocious update performance

2004-03-17 Thread Rosser Schwarz
I wrote:

 Regardless, something thinks it's still there.  Is there any way that
 it is, and that I've somehow been running 7.3.2 all along?  `which
 psql`, &c. show the bindir from my configure, but I'm not sure that's
 sufficient.

The weird thing is that I know I never built 7.3.anything with 32K
BLCKSZ, never built 7.3.anything at all.  If 7.3 were installed, would
it have any problem reading a 7.4 cluster?

/rls

--
Rosser Schwarz
Total Card, Inc.


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] atrocious update performance

2004-03-16 Thread Tom Lane
Shridhar Daithankar [EMAIL PROTECTED] writes:
 Rosser Schwarz wrote:
 shared_buffers = 4096
 sort_mem = 32768
 vacuum_mem = 32768
 wal_buffers = 16384
 checkpoint_segments = 64
 checkpoint_timeout = 1800
 checkpoint_warning = 30
 commit_delay = 5
 effective_cache_size = 131072

 First of all, your shared buffers are low. 4096 is 64MB with 16K block
 size. I would say at least push them to 150-200MB.

Check.  Much more than that isn't necessarily better though.
shared_buffers = 10000 is frequently mentioned as a sweet spot.

 Secondly your sort mem is too high. Note that it is per sort per query. You 
 could build a massive swap storm with such a setting.

Agreed, but I doubt that has anything to do with the immediate
problem, since he's not testing parallel queries.

 Similarly pull down vacuum and WAL buffers to around 512-1024 each.

The vacuum_mem setting here is 32Mb, which seems okay to me, if not on
the low side.  Again though it's not his immediate problem.

I agree that the wal_buffers setting is outlandishly large; I can't see
any plausible reason for it to be more than a few dozen.  I don't know
whether oversized wal_buffers can directly cause any performance issues,
but it's certainly not a well-tested scenario.

The other setting I was going to comment on is checkpoint_warning;
it seems mighty low in comparison to checkpoint_timeout.  If you are
targeting a checkpoint every half hour, I'd think you'd want the system
to complain about checkpoints spaced more closely than several minutes.

But with the possible exception of wal_buffers, I can't see anything in
these settings that explains the originally complained-of performance
problem.  I'm still wondering about foreign key checks.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] atrocious update performance

2004-03-16 Thread Rosser Schwarz
while you weren't looking, Tom Lane wrote:

 But with the possible exception of wal_buffers, I can't see 
 anything in
 these settings that explains the originally complained-of performance
 problem.  I'm still wondering about foreign key checks.

Many of the configs I posted were fairly wild values, set to gather
data points for further tweaking.  Unfortunately, with this query
there hasn't been time for many passes, and I've too much else on my
plate to try concocting demonstration cases.  The postmaster's been
hupped with more sane values, but I experienced this same issue with
the defaults.

As for foreign keys, three tables refer to account.cust; all of them
refer to account.cust.custid, the pk.  One of those tables has several
hundred thousand rows, many more to come; the others are empty.  Unless
I've woefully misunderstood, the presence or absence of a foreign key
referring to one column should be moot for updates writing another
column, shouldn't it?

To answer your (and others') question, Tom, 7.4.1 on 2.4.20-18.9smp.
Red Hat, I believe.  I was handed the machine, which is also in use
for lightweight production stuff:  intranet webserver, rinky-dink
MySQL doo-dads, &c.  I'm sure that has an impact, usurping the disk
heads and such--maybe even more than I'd expect--but I can't imagine
that'd cause an update to one 4.7M row table, from another 4.7M row
table, both clustered on a join column that maps one-to-one between
them, to take days.  I'm baffled; everything else is perfectly snappy,
given the hardware.  Anything requiring a sequential scan over one of
the big tables is a slog, but that's to be expected and hence all the
indices.

Watching iostat, I've observed a moderately cyclic read-big, write-
big pattern, wavelengths generally out of phase, interspersed with
smaller, almost epicycles--from the machine's other tasks, I'm sure.
top has postmaster's cpu usage rarely breaking 25% over the course
of the query's execution, and spending most of its time much lower;
memory usage hovers somewhere north of 500MB.

In what little time I had to stare at a disturbingly matrix-esque
array of terminals scrolling sundry metrics, I didn't notice a
correlation between cpu usage spikes and peaks in the IO cycle's
waveforms.  For whatever that's worth.

The other tables involved are:

# \d account.acct
                                 Table account.acct
   Column   |            Type             |                      Modifiers
------------+-----------------------------+-------------------------------------------------------
 acctid     | bigint                      | not null default nextval('account.acctid_seq'::text)
 custid     | bigint                      |
 acctstatid | integer                     | not null
 acctno     | character varying(50)       |
 bal        | money                       |
 begdt      | timestamp without time zone | not null
 enddt      | timestamp without time zone |
 debtid     | character varying(50)       |
Indexes:
acct_pkey primary key, btree (acctid)
ix_acctno btree (acctno) WHERE (acctno IS NOT NULL)
Foreign-key constraints:
$1 FOREIGN KEY (custid) REFERENCES account.cust(custid)
 ON UPDATE CASCADE ON DELETE RESTRICT
$2 FOREIGN KEY (acctstatid) REFERENCES account.acctstat(acctstatid)
 ON UPDATE CASCADE ON DELETE RESTRICT

# \d account.note
                                 Table account.note
  Column    |            Type             |                      Modifiers
------------+-----------------------------+-------------------------------------------------------
 noteid     | bigint                      | not null default nextval('account.noteid_seq'::text)
 custid     | bigint                      | not null
 note       | text                        | not null
 createddt  | timestamp without time zone | not null default now()
Indexes:
note_pkey primary key, btree (noteid)
Foreign-key constraints:
$1 FOREIGN KEY (custid) REFERENCES account.cust(custid)
 ON UPDATE CASCADE ON DELETE RESTRICT

# \d account.origacct
                  Table account.origacct
   Column    |            Type             | Modifiers
-------------+-----------------------------+-----------
 custid      | bigint                      |
 lender      | character varying(50)       |
 chgoffdt    | timestamp without time zone |
 opendt      | timestamp without time zone |
 offbureaudt | timestamp without time zone |
 princbal    | money                       |
 intbal      | money                       |
 totbal      | money                       |
 lastpayamt  | money                       |
 lastpaydt   | timestamp without time zone |
 debttype    | integer                     |
 debtid      | character varying(10)       |
 acctno      | character varying(50)       |
Foreign-key constraints:
$1 FOREIGN KEY (custid) REFERENCES account.cust(custid)
 ON UPDATE CASCADE ON DELETE RESTRICT

And the table we were 

Re: [PERFORM] atrocious update performance

2004-03-16 Thread Tom Lane
Rosser Schwarz [EMAIL PROTECTED] writes:
 But if I'm not touching the column referenced from account.acct, why
 would it be looking there at all?  I've got an explain analyze of the
 update running now, but until it finishes, I can't say for certain
 what it's doing.  explain, alone, says:

EXPLAIN won't tell you anything about triggers that might get fired
during the UPDATE, so it's not much help for investigating possible
FK performance problems.  EXPLAIN ANALYZE will give you some indirect
evidence: the difference between the total query time and the total time
reported for the topmost plan node represents the time spent running
triggers and physically updating the tuples.  I suspect we are going
to see a big difference.
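
To illustrate with made-up numbers (purely hypothetical, just to show how to
read the output):

explain analyze update account.cust
   set prodid = tempprod.prodid, subprodid = tempprod.subprodid
  from public.tempprod
 where tempprod.debtid = cust.origid;
--  Merge Join (...) (actual time=0.2..35000.0 rows=4731410 loops=1)
--    ...
--  Total runtime: 2200000.0 ms
-- the topmost node accounts for roughly 35 seconds; the other ~36 minutes
-- would be per-row trigger execution (FK checks) plus the work of writing
-- the new tuples and their index entries.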

 which shows it not hitting account.acct at all.  (And why did it take
 the planner 20-some seconds to come up with that query plan?)

It took 20 seconds to EXPLAIN?  That's pretty darn odd in itself.  I'm
starting to think there must be something quite whacked-out about your
installation, but I haven't got any real good ideas about what.

(I'm assuming of course that there weren't a ton of other jobs eating
CPU while you tried to do the EXPLAIN.)

[ thinks for awhile... ]  The only theory that comes to mind for making
the planner so slow is oodles of dead tuples in pg_statistic.  Could I
trouble you to run
vacuum full verbose pg_statistic;
and send along the output?

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] atrocious update performance

2004-03-16 Thread Rosser Schwarz
while you weren't looking, Tom Lane wrote:

 EXPLAIN won't tell you anything about triggers that might get fired
 during the UPDATE, so it's not much help for investigating possible
 FK performance problems.  EXPLAIN ANALYZE will give you some indirect
 evidence: the difference between the total query time and the total time
 reported for the topmost plan node represents the time spent running
 triggers and physically updating the tuples.  I suspect we are going
 to see a big difference.

It's still running.

 It took 20 seconds to EXPLAIN?  That's pretty darn odd in itself.

It struck me, too.

 I'm starting to think there must be something quite whacked-out about
 your installation, but I haven't got any real good ideas about what.

Built from source.  configure arguments:

./configure --prefix=/var/postgresql --bindir=/usr/bin
--enable-thread-safety --with-perl --with-python --with-openssl
--with-krb5=/usr/kerberos

I can answer more specific questions; otherwise, I'm not sure what to
look for, either.  If we could take the machine out of production (oh,
hell; I think I just volunteered myself for weekend work) long enough
to reinstall everything to get a fair comparison...

So far as I know, though, it's a more or less stock Red Hat.  2.4.20-
something.

 (I'm assuming of course that there weren't a ton of other jobs eating
 CPU while you tried to do the EXPLAIN.)

CPUs spiked sporadically, which throttled everything else, but it never
stays high.  top shows the current explain analyze running between 50-
ish% and negligible.  iostat -k 3 shows an average of 3K/sec written, for
a hundred-odd tps.

I can't get any finer-grained than that, unfortunately; the machine was
handed to me with a single, contiguous filesystem, in production use.

 [ thinks for awhile... ]  The only theory that comes to mind
 for making
 the planner so slow is oodles of dead tuples in pg_statistic.  Could I
 trouble you to run
   vacuum full verbose pg_statistic;
 and send along the output?

INFO:  vacuuming pg_catalog.pg_statistic
INFO:  pg_statistic: found 215 removable, 349 nonremovable row versions
in 7 pages
DETAIL:  0 dead row versions cannot be removed yet.
Nonremovable row versions range from 72 to 8132 bytes long.
There were 3 unused item pointers.
Total free space (including removable row versions) is 91572 bytes.
0 pages are or will become empty, including 0 at the end of the table.
7 pages containing 91572 free bytes are potential move destinations.
CPU 0.00s/0.00u sec elapsed 0.71 sec.
INFO:  index pg_statistic_relid_att_index now contains 349 row versions
in 2 pages
DETAIL:  215 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  pg_statistic: moved 120 row versions, truncated 7 to 5 pages
DETAIL:  CPU 0.03s/0.01u sec elapsed 0.17 sec.
INFO:  index pg_statistic_relid_att_index now contains 349 row versions
in 2 pages
DETAIL:  120 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  vacuuming pg_toast.pg_toast_16408
INFO:  pg_toast_16408: found 12 removable, 12 nonremovable row versions
in 5 pages
DETAIL:  0 dead row versions cannot be removed yet.
Nonremovable row versions range from 660 to 8178 bytes long.
There were 0 unused item pointers.
Total free space (including removable row versions) is 91576 bytes.
2 pages are or will become empty, including 0 at the end of the table.
5 pages containing 91576 free bytes are potential move destinations.
CPU 0.00s/0.00u sec elapsed 0.27 sec.
INFO:  index pg_toast_16408_index now contains 12 row versions in 2 pages
DETAIL:  12 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.05 sec.
INFO:  pg_toast_16408: moved 10 row versions, truncated 5 to 3 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  index pg_toast_16408_index now contains 12 row versions in 2 pages
DETAIL:  10 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.

Having never more than glanced at the output of vacuum verbose, I
can't say whether that makes the cut for oodles.  My suspicion is no.

/rls

--
Rosser Schwarz
Total Card, Inc.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] atrocious update performance

2004-03-16 Thread Tom Lane
Rosser Schwarz [EMAIL PROTECTED] writes:
 Having never more than glanced at the output of vacuum verbose, I
 can't say whether that makes the cut for oodles.  My suspicion is no.

Nope, it sure doesn't.  We occasionally see people who don't know they
need to vacuum regularly and have accumulated hundreds or thousands of
dead tuples for every live one :-(.  That's clearly not the issue here.

I'm fresh out of ideas, and the fact that this is a live server kinda
limits what we can do experimentally ... but clearly, *something* is
very wrong.

Well, when you don't know what to look for, you still have to look.
One possibly useful idea is to trace the kernel calls of the backend
process while it does that ridiculously long EXPLAIN --- think you could
try that?

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] atrocious update performance

2004-03-16 Thread Rosser Schwarz
while you weren't looking, Tom Lane wrote:

[trace]

`strace -p 21882` was run behind the query below; the plan follows, and the
trace output below that.

# explain update account.cust set prodid = tempprod.prodid, subprodid =
tempprod.subprodid where origid = tempprod.debtid;
  QUERY PLAN
-
 Merge Join  (cost=0.00..232764.69 rows=4731410 width=252)
   Merge Cond: ((outer.origid)::text = (inner.debtid)::text)
   -  Index Scan using ix_origid on cust  (cost=0.00..94876.83
   rows=4731410 width=236)
   -  Index Scan using ix_debtid on tempprod  (cost=0.00..66916.71
   rows=4731410 width=26)
(4 rows)

--

recv(9, Q\0\0\0}explain update account.cust..., 8192, 0) = 126
gettimeofday({1079482151, 106228}, NULL) = 0
brk(0)  = 0x82d9000
brk(0x82db000)  = 0x82db000
open(/var/lib/pgsql/data/base/495616/6834170, O_RDWR|O_LARGEFILE) = 8
_llseek(8, 212402176, [212402176], SEEK_SET) = 0
write(8, \342\1\0\0\0\314\374\6\24\0\0\0\214\7pG\360\177\1\200\320...,
32768) = 32768
close(8)= 0
open(/var/lib/pgsql/data/base/495616/16635, O_RDWR|O_LARGEFILE) = 8
read(8, \0\0\0\0\20\0\0\0\1\0\0\0\24\0\360\177\360\177\1\200b1..., 32768)
=
32768
open(/var/lib/pgsql/data/base/495616/6834168, O_RDWR|O_LARGEFILE) = 10
_llseek(10, 60817408, [60817408], SEEK_SET) = 0
write(10, \342\1\0\0`\334\5\7\24\0\0\0t\0010x\360\177\1\200\330\377...,
32768) = 32768
close(10)   = 0
read(8, \334\1\0\0h\217\270n\24\0\0\0H\0H|[EMAIL PROTECTED]...,
32768)
= 32768
open(/var/lib/pgsql/data/base/495616/6834165, O_RDWR|O_LARGEFILE) = 10
_llseek(10, 130777088, [130777088], SEEK_SET) = 0
write(10, \342\1\0\0\341\7\7\24\0\0\0004\t0I\360\177\1\200\330\377...,
32768) = 32768
close(10)   = 0
open(/var/lib/pgsql/data/base/495616/16595, O_RDWR|O_LARGEFILE) = 10
read(10, [EMAIL PROTECTED]...,
32768) = 32768
open(/var/lib/pgsql/data/base/495616/6834168, O_RDWR|O_LARGEFILE) = 11
_llseek(11, 145915904, [145915904], SEEK_SET) = 0
write(11, \342\1\0\0\300\350\n\7\24\0\0\0\224\6\310Z\360\177\1\200...,
32768) = 32768
close(11)   = 0
open(/var/lib/pgsql/data/base/495616/16614, O_RDWR|O_LARGEFILE) = 11
read(11, \0\0\0\0\24\231P\306\16\0\0\0\24\0\360\177\360\177\1\200...,
32768)
= 32768
open(/var/lib/pgsql/data/base/495616/6834166, O_RDWR|O_LARGEFILE) = 12
_llseek(12, 148570112, [148570112], SEEK_SET) = 0
write(12, \342\1\0\0\274\365\22\7\24\0\0\0X\3\234o\360\177\1\200...,
32768)
= 32768
close(12)   = 0
_llseek(11, 98304, [98304], SEEK_SET)   = 0
read(11, \0\0\0\0\24\231P\306\16\0\0\0\34\0\234\177\360\177\1\200...,
32768)
= 32768
open(/var/lib/pgsql/data/base/495616/6834163, O_RDWR|O_LARGEFILE) = 12
_llseek(12, 251789312, [251789312], SEEK_SET) = 0
write(12, \342\1\0\0l\366\23\7\24\0\0\0\364\10\260J\360\177\1\200...,
32768)
= 32768
close(12)   = 0
_llseek(11, 32768, [32768], SEEK_SET)   = 0
read(11, \340\1\0\0\324\231\273\241\24\0\0\0\234\5\330\26\360\177...,
32768)
= 32768
open(/var/lib/pgsql/data/base/495616/6834165, O_RDWR|O_LARGEFILE) = 12
_llseek(12, 117309440, [117309440], SEEK_SET) = 0
write(12, \342\1\0\0d\36)\7\24\0\0\\tHI\360\177\1\200\330\377...,
32768)
= 32768
close(12)   = 0
open(/var/lib/pgsql/data/base/495616/1259, O_RDWR|O_LARGEFILE) = 12
_llseek(12, 32768, [32768], SEEK_SET)   = 0
read(12, \334\1\0\0\324v-p\24\0\0\\3\304\3\0\200\1\200\377..., 32768)
=
32768
open(/var/lib/pgsql/data/base/495616/6834173, O_RDWR|O_LARGEFILE) = 13
_llseek(13, 247824384, [247824384], SEEK_SET) = 0
write(13, \342\1\0\0h *\7\24\0\0\0\204\4dm\360\177\1\200\340\377...,
32768)
= 32768
close(13)   = 0
open(/var/lib/pgsql/data/base/495616/16613, O_RDWR|O_LARGEFILE) = 13
read(13, \0\0\0\0\20\0\0\0\1\0\0\0\24\0\360\177\360\177\1\200b1..., 32768)
=
32768
open(/var/lib/pgsql/data/base/495616/6834168, O_RDWR|O_LARGEFILE) = 14
_llseek(14, 204472320, [204472320], SEEK_SET) = 0
write(14, \342\1\0\0\314\272:\7\24\0\0\0\324\t\354K\360\177\1\200...,
32768)
= 32768
close(14)   = 0
read(13, \340\1\0\0X\231\273\241\24\0\0\0\370\6Dk\360\177\1\200..., 32768)
=
32768
open(/var/lib/pgsql/data/base/495616/6834166, O_RDWR|O_LARGEFILE) = 14
_llseek(14, 152010752, [152010752], SEEK_SET) = 0
write(14, \342\1\0\0p\277\7\24\0\0\0\364\n\220I\360\177\1\200\334...,
32768) = 32768
close(14)   = 0
open(/var/lib/pgsql/data/base/495616/16610, O_RDWR|O_LARGEFILE) = 14
read(14, \0\0\0\0\10\317\27\t\16\0\0\0\24\0\360\177\360\177\1\200...,
32768)
= 32768
open(/var/lib/pgsql/data/base/495616/6834170, O_RDWR|O_LARGEFILE) = 15
_llseek(15, 86441984, [86441984], SEEK_SET) = 0
write(15, \342\1\0\0\330B?\7\24\0\0\0\370\6 N\360\177\1\200\310\377...,
32768) = 32768
close(15)   = 0
_llseek(14, 

Re: [PERFORM] atrocious update performance

2004-03-16 Thread Tom Lane
Rosser Schwarz [EMAIL PROTECTED] writes:
 `strace -p 21882` run behind the below query and plan ... below that.

Hmm ... that took 20 seconds eh?

It is a fairly interesting trace.  It shows that the backend needed to
read 63 system catalog pages (that weren't already in shared memory),
which is not too unreasonable I think ... though I wonder if more of
them shouldn't have been in memory already.  The odd thing is that for
*every single read* it was necessary to first dump out a dirty page
in order to make a buffer free.  That says you are running with the
entire contents of shared buffer space dirty at all times.  That's
probably not the regime you want to be operating in.  I think we already
suggested increasing shared_buffers.  You might also want to think about
not using such a large checkpoint interval.  (The background-writing
logic already committed for 7.5 should help this problem, but it's not
there in 7.4.)

Another interesting fact is that the bulk of the writes were blind
writes, involving an open()/write()/close() sequence instead of keeping
the open file descriptor around for re-use.  This is not too surprising
in a freshly started backend, I guess; it's unlikely to have had reason
to create a relation descriptor for the relations it may have to dump
pages for.  In some Unixen, particularly Solaris, open() is fairly
expensive and so blind writes are bad news.  I didn't think it was a big
problem in Linux though.  (This is another area we've improved for 7.5:
there are no more blind writes.  But that won't help you today.)

What's not immediately evident is whether the excess I/O accounted for
all of the slowdown.  Could you retry the strace with -r and -T options
so we can see how much time is being spent inside and outside the
syscalls?

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] atrocious update performance

2004-03-15 Thread Rod Taylor
 # explain update account.cust set prodid = tempprod.prodid
   where tempprod.did = origid;
 
  Merge Join  (cost=0.00..232764.69 rows=4731410 width=252)
Merge Cond: ((outer.origid)::text = (inner.did)::text)
-  Index Scan using ix_origid on cust  (cost=0.00..94876.83
rows=4731410 width=244)
-  Index Scan using ix_did on tempprod  (cost=0.00..66916.71
rows=4731410 width=18)

I'm going to hazard a guess and say you have a number of foreign keys
that refer to account.cust.prodid? This is probably the time consuming
part -- perhaps even a missing index on one of those keys that refers to
this field.

Going the other way should be just as good for your purposes, and much
faster since you're not updating several foreign key'd fields bound to
account.cust.prodid.

UPDATE tempprod
   SET prodid = cust.prodid
  FROM account.cust
 WHERE tempprod.did = cust.origid;



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] atrocious update performance

2004-03-15 Thread Rosser Schwarz
  # explain update account.cust set prodid = tempprod.prodid
  where tempprod.did = origid;

   Merge Join  (cost=0.00..232764.69 rows=4731410 width=252)
 Merge Cond: ((outer.origid)::text = (inner.did)::text)
 -  Index Scan using ix_origid on cust  (cost=0.00..94876.83
 rows=4731410 width=244)
 -  Index Scan using ix_did on tempprod  (cost=0.00..66916.71
 rows=4731410 width=18)
 
 I'm going to hazard a guess and say you have a number of foreign keys
 that refer to account.cust.prodid? This is probably the time consuming
 part -- perhaps even a missing index on one of those keys 
 that refers to
 this field.

Actually, there are no foreign keys to those columns.  Once they're
populated, I'll apply a foreign key constraint and they'll refer to the
appropriate row in the prod and subprod tables, but nothing will 
reference account.cust.[sub]prodid.  There are, of course, several foreign
keys referencing account.cust.custid.

 Going the other way should be just as good for your purposes, and much
 faster since you're not updating several foreign key'd fields bound to
 account.cust.prodid.

 UPDATE tempprod
    SET prodid = cust.prodid
   FROM account.cust
  WHERE tempprod.did = cust.origid;

Not quite.  Without this update, account.cust.[sub]prodid are null.  The
data was strewn across multiple tables in MS SQL; we're normalizing it
into one, hence the need to populate the two columns independently.

/rls

--
Rosser Schwarz
Total Card, Inc. 


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] atrocious update performance

2004-03-15 Thread Aaron Werman
Bulk updates are generally dogs (not just in pg), so I avoid doing them by
doing faster selects and inserts. You can create a new table using 'create
table as' to produce your target results. This is real fast - avoiding the
row iteration in insert, allowing the select optimizer to run and no index
overhead. Then alter/rename, add indexes and whatever else hangs off the
table (or if you're lazy do an insert/select into the original target
table). I often see 2 orders of magnitude improvement doing this, and no
need to vacuum.
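
A sketch of that approach against the tables in this thread; the select list is
abbreviated and would need to carry every column of account.cust:

begin;
create table account.cust_new as
select c.custid, c.ownerid, c.origid,   -- ...plus the rest of cust's columns
       t.prodid, t.subprodid
  from account.cust c
  join public.tempprod t on t.debtid = c.origid;
alter table account.cust rename to cust_old;
alter table account.cust_new rename to cust;
-- recreate indexes, defaults and constraints on the new table here; note that
-- foreign keys in other tables still point at cust_old and must be dropped
-- and re-added against the new table
commit;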

/Aaron

- Original Message - 
From: Rosser Schwarz [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Monday, March 15, 2004 3:28 PM
Subject: [PERFORM] atrocious update performance


We're in the throes of an MS SQL to PostgreSQL migration; our databases
include a number of ~5M row tables.  We decided to take this opportunity
to clean up and slightly re-normalize our schemas, given what we've
learned about the data over its lifetime and such, else we wouldn't be
experiencing any of the following (we could instead just dump and `copy
from`).

We have a temporary table, public.tempprod, containing 4.7M rows, one
for each row in account.cust.  account.cust has, among others, two
columns, prod and subprod, which we're trying to update from tempprod
joined against prod.  The update tends to take unnecessarily long--
rather, we've had to finally kill it after its taking obscenely too
long.

The table:

# \d account.cust
                                 Table account.cust
  Column    |            Type             |                      Modifiers
------------+-----------------------------+-------------------------------------------------------
 custid     | bigint                      | not null default nextval('account.custid_seq'::text)
 ownerid    | integer                     | not null
 origid     | text                        | not null
 pname      | text                        |
 fname      | text                        |
 mname      | text                        |
 lname      | text                        |
 suffix     | text                        |
 addr1      | text                        |
 addr2      | text                        |
 addr3      | text                        |
 city       | text                        |
 state      | text                        |
 zip        | text                        |
 zipplus    | text                        |
 homeph     | text                        |
 workph     | text                        |
 otherph    | text                        |
 ssn        | text                        |
 isactive   | boolean                     | default true
 createddt  | timestamp without time zone | default now()
 prodid     | bigint                      |
 subprodid  | bigint                      |
Indexes:
cust_pkey primary key, btree (custid)
ix_addr1 btree (addr1) WHERE (addr1 IS NOT NULL)
ix_addr2 btree (addr2) WHERE (addr2 IS NOT NULL)
ix_city btree (city) WHERE (city IS NOT NULL)
ix_fname btree (fname) WHERE (fname IS NOT NULL)
ix_homeph btree (homeph) WHERE (homeph IS NOT NULL)
ix_lname btree (lname) WHERE (lname IS NOT NULL)
ix_mname btree (mname) WHERE (mname IS NOT NULL)
ix_origid btree (origid)
ix_ssn btree (ssn) WHERE (ssn IS NOT NULL)
ix_state btree (state) WHERE (state IS NOT NULL)
ix_workph btree (workph) WHERE (workph IS NOT NULL)
ix_zip btree (zip) WHERE (zip IS NOT NULL)

We're currently running on a dual Xeon 700 (I know, I know; it's what
we've got) with 2.5GB RAM and 4x36GB SCSI in hardware RAID 5 (Dell
Perc3 something-or-other controller).  If we can demonstrate that
PostgreSQL will meet our needs, we'll be going production on a dual
Opteron, maxed memory, with a 12-disk Fibre Channel array.

The query is:

update account.cust set prodid =
(select p.prodid from account.prod p
join public.tempprod t on t.pool = p.origid
where custid = t.did)

And then, upon its completion, s/prod/subprod/.

That shouldn't run overnight, should it, let alone for -days-?

In experimenting with ways of making the updates take less time, we tried
adding product and subproduct columns to tempprod, and updating those.
That seemed to work marginally better:

explain analyze update public.tempprod set prodid =
(select account.prod.prodid::bigint
from account.prod
where public.tempprod.pool::text = account.prod.origid::text)

Seq Scan on tempprod (cost=0.00..9637101.35 rows=4731410 width=56) (actual
time=24273.467..16090470.438 rows=4731410 loops=1)
  SubPlan
-  Limit (cost=0.00..2.02 rows=2 width=8) (actual time=0.134..0.315
rows=1 loops=4731410)
  -  Seq Scan on prod (cost=0.00..2.02 rows=2 width=8) (actual
  time=0.126..0.305 rows=1 loops=4731410)
  Filter: (($0)::text = (origid)::text)
Total runtime: 2284551.962 ms

But then going from public.tempprod to account.cust again takes days.  I
just cancelled an update that's been running since last Thursday.
Alas, 

Re: [PERFORM] atrocious update performance

2004-03-15 Thread Rod Taylor
On Mon, 2004-03-15 at 16:15, Rosser Schwarz wrote:
   # explain update account.cust set prodid = tempprod.prodid
 where tempprod.did = origid;
 
Merge Join  (cost=0.00..232764.69 rows=4731410 width=252)
  Merge Cond: ((outer.origid)::text = (inner.did)::text)
  -  Index Scan using ix_origid on cust  (cost=0.00..94876.83
  rows=4731410 width=244)
  -  Index Scan using ix_did on tempprod  (cost=0.00..66916.71
  rows=4731410 width=18)
  
  I'm going to hazard a guess and say you have a number of foreign keys
  that refer to account.cust.prodid? This is probably the time consuming
  part -- perhaps even a missing index on one of those keys 
  that refers to
  this field.
 
 Actually, there are no foreign keys to those columns.  Once they're
 populated, I'll apply a foreign key constraint and they'll refer to the
 appropriate row in the prod and subprod tables, but nothing will 
 reference account.cust.[sub]prodid.  There are, of course, several foreign
 keys referencing account.cust.custid.

If there are no foreign keys to it, I wouldn't expect it to take more than
10 minutes on slow hardware.

Fresh out of ideas here.



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] atrocious update performance

2004-03-15 Thread Rosser Schwarz
 You can create a new table using 'create table as' to produce your
 target results. This is real fast ...
 I often see 2 orders of magnitude improvement doing this, and no
 need to vacuum.

Indeed:

Query returned successfully with no result in 582761 ms.

Though I must say, ten minutes is nominally more than two orders of
magnitude performance improvement, versus several days.

Many thanks, Aaron.

/rls

--
Rosser Schwarz
Total Card, Inc.


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] atrocious update performance

2004-03-15 Thread Tom Lane
Rosser Schwarz [EMAIL PROTECTED] writes:
 You can create a new table using 'create table as' to produce your
 target results. This is real fast ...
 I often see 2 orders of magnitude improvement doing this, and no
 need to vacuum.

 Indeed:
 Query returned successfully with no result in 582761 ms.
 Though I must say, ten minutes is nominally more than two orders of
  magnitude performance improvement, versus several days.

Hm.  There is no way that inserting a row is two orders of magnitude
faster than updating a row --- they both require storing a new row and
making whatever index entries are needed.  The only additional cost of
the update is finding the old row (not a very big deal AFAICS in the
examples you gave) and marking it deleted (definitely cheap).  So
there's something awfully fishy going on here.

I'm inclined to suspect an issue with foreign-key checking.  You didn't
give us any details about foreign key relationships your cust table is
involved in --- could we see those?  And the schemas of the other tables
involved?

Also, exactly which PG version is this?

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] atrocious update performance

2004-03-15 Thread Aaron Werman
The original point was about a very slow update of an entire table with a
plan that looped, and over a dozen conditional indices - vs. a 'create as'
in a CPU starved environment. I stand by my statement about observing the
orders of magnitude difference. In theory I agree that the update should be
in the same order of magnitude as the create as, but in practice I disagree.
I also think something is wrong on the logical side (besides FKs, are there
any triggers?) but was responding to the Gordian knot issue of bailing out
of pg.

Can you post a sample extract, Rosser? Otherwise, I'll try to put together a
sample of a slow mass join update.

/Aaron

- Original Message - 
From: Tom Lane [EMAIL PROTECTED]
To: Rosser Schwarz [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, March 15, 2004 7:08 PM
Subject: Re: [PERFORM] atrocious update performance


 Rosser Schwarz [EMAIL PROTECTED] writes:
  You can create a new table using 'create table as' to produce your
  target results. This is real fast ...
  I often see 2 orders of magnitude improvement doing this, and no
  need to vacuum.

  Indeed:
  Query returned successfully with no result in 582761 ms.
  Though I must say, ten minutes is nominally more than two orders of
   magnitude performance improvement, versus several days.

 Hm.  There is no way that inserting a row is two orders of magnitude
 faster than updating a row --- they both require storing a new row and
 making whatever index entries are needed.  The only additional cost of
 the update is finding the old row (not a very big deal AFAICS in the
 examples you gave) and marking it deleted (definitely cheap).  So
 there's something awfully fishy going on here.

 I'm inclined to suspect an issue with foreign-key checking.  You didn't
 give us any details about foreign key relationships your cust table is
 involved in --- could we see those?  And the schemas of the other tables
 involved?

 Also, exactly which PG version is this?

 regards, tom lane

 ---(end of broadcast)---
 TIP 3: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] atrocious update performance

2004-03-15 Thread Shridhar Daithankar
Rosser Schwarz wrote:

  shared_buffers = 4096
sort_mem = 32768
vacuum_mem = 32768
wal_buffers = 16384
checkpoint_segments = 64
checkpoint_timeout = 1800
checkpoint_warning = 30
commit_delay = 5
effective_cache_size = 131072
You didn't mention the OS, so I'll take it as either Linux or FreeBSD.

First of all, your shared buffers are low. 4096 is 64MB with 16K block size. I 
would say at least push them to 150-200MB.

Secondly your sort mem is too high. Note that it is per sort per query. You 
could build a massive swap storm with such a setting.

Similarly pull down vacuum and WAL buffers to around 512-1024 each.

I know that your problem is solved by using inserts rather than updates. But I
just want to point out that you still need to analyze the table to update the
statistics, or further queries will not plan well.
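
For example, against the table in this thread:

analyze account.cust;   -- refresh planner statistics after the bulk load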

And lastly, you can bundle the entire thing, including creating the duplicate
table, populating it, renaming the original table, etc., in a single transaction,
and nobody will notice it. I am almost sure MS SQL can not do that. Not many
databases have transaction-safe DDL out there..

 HTH

 Shridhar

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings