Re: [PERFORM] atrocious update performance
After deinstalling and scrubbing PostgreSQL from my server and doing a clean build from a vanilla 7.4.2 tree, I'm rather more confident that foreign key validation is the cause of my performance problems.

I recreated my schemas and ran the original update, with foreign keys referring to the identity column of the target table. The update took roughly two days, as I'd predicted from my analysis of the previous installation. (I can't say how long with certainty, beyond that it finished some time between when I left work one night and came in the next morning, the second day after starting the query.) I'm not sure what was wrong with the previous install, such that the update took several days; two-ish days is long enough.

Just this morning, however, I created a copy of the target table (all 4.7M rows), with absolutely no foreign keys referring to it, and ran the update against the copy. That update took 2300 seconds. The join columns were indexed in both cases.

I'm in the process of migrating the machine to kernel 2.6.4, following the thread started by Gary, though I suspect the kernel revision is moot with respect to whether or not foreign keys are being incorrectly validated. I can keep the 2.4 kernel and modules around to run the current versions for testing purposes, though any such work would necessarily be off-hours.

Please advise of anything I can do to help narrow down the specific cause of the issue; I know just enough C to be mildly dangerous.

/rls

--
Rosser Schwarz
Total Card, Inc.

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
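The control test described above--an FK-free copy of the target table--can be sketched roughly as follows. The column names (origid, prodid, subprodid, debtid) and the tempprod table come from the thread, but the copy-table and index names here are placeholders, not the poster's actual script:

```sql
-- CREATE TABLE ... AS copies the rows but carries over no foreign keys,
-- so no FK triggers can possibly fire during the update.
CREATE TABLE test.cust_copy AS
    SELECT * FROM account.cust;

-- Index the join column, as in the original test, and refresh stats.
CREATE INDEX ix_origid_copy ON test.cust_copy (origid);
ANALYZE test.cust_copy;

-- Run the same update against the constraint-free copy; comparing its
-- runtime with the original's isolates the cost of FK validation.
UPDATE test.cust_copy
   SET prodid    = tempprod.prodid,
       subprodid = tempprod.subprodid
  FROM tempprod
 WHERE test.cust_copy.origid = tempprod.debtid;
```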
Re: [PERFORM] atrocious update performance
On 5 Apr 2004 at 12:05, Rosser Schwarz wrote:

> Just this morning, however, I created a copy of the target table (all
> 4.7M rows), with absolutely no foreign keys referring to it, and ran
> the update against the copy. That update took 2300 seconds. The join
> columns were indexed in both cases.

Have you added indexes for the custid column of tables account.acct, account.orgacct and note? I haven't followed the entire thread, but if you have cascading FKs on those tables without an index on the referencing column, that could cause your delay.

Kevin Barnard
SpeedFC
Re: [PERFORM] atrocious update performance
while you weren't looking, Kevin Barnard wrote:

> Have you added indexes for the custid column of tables account.acct,
> account.orgacct and note?

They were indexed in the original case, yes. There was no need to index them in today's test case, as that test was done purely in an attempt to rule foreign key validation in or out as the cause of the performance hit. With no foreign keys that might be validated, there's no need to index the foreign key columns.

> I haven't followed the entire thread, but if you have cascading FKs on
> those tables without an index on the referencing column, that could
> cause your delay.

The issue is that the foreign keys are being validated at all, when the column being referenced by those foreign keys (account.cust.custid) is never touched. Regardless of whether or not the referencing columns are indexed, validating them at all--in this specific case--is broken. The column they refer to is never touched; they should remain utterly ignorant of whatever happens to other columns in the same row.

/rls

--
Rosser Schwarz
Total Card, Inc.
Re: [PERFORM] atrocious update performance
On Mon, 5 Apr 2004, Kevin Barnard wrote:

> On 5 Apr 2004 at 12:05, Rosser Schwarz wrote:
>> Just this morning, however, I created a copy of the target table (all
>> 4.7M rows), with absolutely no foreign keys referring to it, and ran
>> the update against the copy. That update took 2300 seconds. The join
>> columns were indexed in both cases.
>
> Have you added indexes for the custid column of tables account.acct,
> account.orgacct and note? I haven't followed the entire thread, but if
> you have cascading FKs on those tables without an index on the
> referencing column, that could cause your delay.

Also make sure the fk/pk types match, or the index likely won't get used anyway.
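The two pieces of advice above--index each referencing column, and keep its type identical to the referenced key--can be sketched as follows. The table and column names match the thread; the constraint and index names are illustrative assumptions:

```sql
-- The referenced key, account.cust.custid, is a bigint primary key.
-- Declare each referencing column with the SAME type (bigint, not
-- integer): a type mismatch can keep the planner from using the index
-- when the FK check runs.
ALTER TABLE account.acct
    ADD CONSTRAINT acct_custid_fkey
    FOREIGN KEY (custid) REFERENCES account.cust (custid)
    ON UPDATE CASCADE ON DELETE RESTRICT;

-- PostgreSQL indexes the referenced (PK) side automatically, but NOT
-- the referencing side; without these, cascaded updates and deletes
-- seq-scan every referencing table.
CREATE INDEX ix_acct_custid     ON account.acct (custid);
CREATE INDEX ix_origacct_custid ON account.origacct (custid);
CREATE INDEX ix_note_custid     ON account.note (custid);
```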
Re: [PERFORM] atrocious update performance
On Mon, 5 Apr 2004, Rosser Schwarz wrote:

> while you weren't looking, Kevin Barnard wrote:
>> Have you added indexes for the custid column of tables account.acct,
>> account.orgacct and note?
>
> They were indexed in the original case, yes. [...]
>
> The issue is that the foreign keys are being validated at all, when the
> column being referenced by those foreign keys (account.cust.custid) is
> never touched. Regardless of whether or not the referencing columns are
> indexed, validating them at all--in this specific case--is broken. The
> column they refer to is never touched; they should remain utterly
> ignorant of whatever happens to other columns in the same row.

It shouldn't be checking the other table if the values of the key column haven't changed. The ri_KeysEqual check should cause it to return just before actually doing the check on the other table (it still does a few things before then, but nothing that should be particularly expensive). In some simple tests on my 7.4.2 machine, this appears to work for me on pk cascade updates. It would be interesting to know if it's actually doing any checks for you; you might be able to poke around the triggers (backend/utils/adt/ri_triggers.c).
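A minimal reproduction of the "simple tests" described above might look like the following. The table names are invented, and generate_series and the psql \timing meta-command (both from later releases than the thread's 7.4) are used purely for brevity:

```sql
-- Parent/child pair with a cascading FK, mirroring the thread's setup.
CREATE TABLE parent (id bigint PRIMARY KEY, other text);
CREATE TABLE child  (id bigint REFERENCES parent (id)
                         ON UPDATE CASCADE ON DELETE RESTRICT,
                     payload text);

INSERT INTO parent SELECT g, 'x' FROM generate_series(1, 100000) g;
INSERT INTO child  SELECT g, 'y' FROM generate_series(1, 100000) g;

-- Updating only a non-key column: if ri_KeysEqual short-circuits the
-- trigger as intended, this should run at roughly the same speed as
-- against an unconstrained copy of parent.
\timing on
UPDATE parent SET other = 'z';
```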
Re: [PERFORM] atrocious update performance
Greg Spiegelberg wrote:

> Will advise.

After creating 100, 1K, 10K, 100K and 1M-row subsets of account.cust and the corresponding rows/tables with foreign key constraints referring to the table, I'm unable to reproduce the behavior at issue. explain analyze looks like the following, showing the query run with the join column indexed and not, respectively:

# explain analyze update test.cust100 set prodid = tempprod.prodid,
  subprodid = tempprod.subprodid where origid = tempprod.debtid;

-- with index
                                QUERY PLAN
---------------------------------------------------------------------------
 Merge Join  (cost=0.00..25.64 rows=500 width=220)
             (actual time=0.241..13.091 rows=100 loops=1)
   Merge Cond: (("outer".origid)::text = ("inner".debtid)::text)
   ->  Index Scan using ix_origid_cust100 on cust100
         (cost=0.00..11.50 rows=500 width=204)
         (actual time=0.125..6.465 rows=100 loops=1)
   ->  Index Scan using ix_debtid on tempprod
         (cost=0.00..66916.71 rows=4731410 width=26)
         (actual time=0.057..1.497 rows=101 loops=1)
 Total runtime: 34.067 ms
(5 rows)

-- without index
                                QUERY PLAN
---------------------------------------------------------------------------
 Merge Join  (cost=7.32..16.71 rows=100 width=220)
             (actual time=4.415..10.918 rows=100 loops=1)
   Merge Cond: (("outer".debtid)::text = "inner"."?column22?")
   ->  Index Scan using ix_debtid on tempprod
         (cost=0.00..66916.71 rows=4731410 width=26)
         (actual time=0.051..1.291 rows=101 loops=1)
   ->  Sort  (cost=7.32..7.57 rows=100 width=204)
             (actual time=4.311..4.450 rows=100 loops=1)
         Sort Key: (cust100.origid)::text
         ->  Seq Scan on cust100  (cost=0.00..4.00 rows=100 width=204)
               (actual time=0.235..2.615 rows=100 loops=1)
 Total runtime: 25.031 ms
(7 rows)

With the join column indexed, it takes roughly .32ms/row on the first four tests (100..100K), and about .48ms/row on 1M rows. Without the index, it runs 100 rows @ .25/row, 1K @ .26, 10K @ .27, 100K @ .48, and 1M @ .5. In no case does the query plan reflect foreign key validation.

Failing any other suggestions for diagnosis soon, I'm going to nuke the PostgreSQL install, scour it from the machine and start from scratch.
Failing that, I'm going to come in some weekend and re-do the machine.

> Problem is when I recreate the indexes and add the constraints back on
> ORIG I end up with the same long-running process. The original UPDATE
> runs for about 30 minutes on a table of 400,000 with the WHERE matching
> about 70% of the rows. The above runs for about 2 minutes without adding
> the constraints or indexes; however, adding the constraints and creating
> the dropped indexes negates any gain.

Is this a frequently-run update? In my experience, with my seemingly mutant install, dropping indices and constraints to shave 14/15 off the update time would be worth the effort. Just script the dropping, updating and recreating into one large transaction. It's a symptom-level fix, but re-creating the fifteen indices on one of our 5M-row tables doesn't take 28 minutes, and your hardware looks to be rather less IO- and CPU-bound than ours.

I'd also second Tom's suggestion of moving to 7.4.

/rls
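The "one large transaction" suggested above might be sketched as follows. The constraint, index, and table names here are placeholders standing in for the thread's actual schema:

```sql
BEGIN;

-- Drop the FK constraints and secondary indexes up front, so the UPDATE
-- pays neither trigger costs nor per-row index maintenance.
ALTER TABLE child_table DROP CONSTRAINT child_custid_fkey;
DROP INDEX ix_orig_origid;

-- The bulk update itself.
UPDATE orig
   SET prodid    = tempprod.prodid,
       subprodid = tempprod.subprodid
  FROM tempprod
 WHERE orig.origid = tempprod.debtid;

-- Recreate everything inside the same transaction, so other sessions
-- never observe the table unconstrained. One sequential index build is
-- usually far cheaper than millions of incremental index updates.
CREATE INDEX ix_orig_origid ON orig (origid);
ALTER TABLE child_table
    ADD CONSTRAINT child_custid_fkey FOREIGN KEY (custid)
    REFERENCES orig (custid) ON UPDATE CASCADE ON DELETE RESTRICT;

COMMIT;
```

Note that re-adding the FK constraint revalidates every child row, which is part of why recreating constraints can eat the gain the poster describes.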
Re: [PERFORM] atrocious update performance
Greg Spiegelberg wrote:

> I've been following this thread closely as I have the same problem with
> an UPDATE. Everything is identical here right down to the strace output.
> Has anyone found a workaround or resolved the problem? If not, I have
> test systems here which I can use to help us test and explore.

I'm still gathering data. The explain analyze I'd expected to finish Thursday afternoon hasn't yet. I'm going to kill it and try a few smaller runs, increasing in size, until the behavior manifests. Will advise.

/rls
Re: [PERFORM] atrocious update performance
Rosser Schwarz wrote:

> I'm still gathering data. The explain analyze I'd expected to finish
> Thursday afternoon hasn't yet. I'm going to kill it and try a few
> smaller runs, increasing in size, until the behavior manifests. Will
> advise.

I've replaced my atrocious UPDATE with the following:

begin;
-- Drop all constraints
alter table ORIG drop constraint ...;
-- Drop all indexes
drop index ...;
-- Update
update ORIG set column=... where ...;
commit;

Problem is when I recreate the indexes and add the constraints back on ORIG I end up with the same long-running process. The original UPDATE runs for about 30 minutes on a table of 400,000 rows, with the WHERE matching about 70% of them. The above runs for about 2 minutes without adding the constraints or indexes; however, adding the constraints and creating the dropped indexes negates any gain.

RedHat 7.3 + Kernel 2.4.24 + ext3 + PostgreSQL 7.3.5
Dual PIII 1.3-ish GHz, 2GB memory
U160 OS drives and a 1Gbps test SAN on a Hitachi 9910

Greg

--
Greg Spiegelberg
Sr. Product Development Engineer
Cranel, Incorporated.
Phone: 614.318.4314
Fax: 614.431.8388
Email: [EMAIL PROTECTED]
Cranel. Technology. Integrity. Focus.
Re: [PERFORM] atrocious update performance
Greg Spiegelberg [EMAIL PROTECTED] writes:

> RedHat 7.3 + Kernel 2.4.24 + ext3 + PostgreSQL 7.3.5

Please try 7.4.

regards, tom lane
Re: [PERFORM] atrocious update performance
I've been following this thread closely as I have the same problem with an UPDATE. Everything is identical here right down to the strace output.

Has anyone found a workaround or resolved the problem? If not, I have test systems here which I can use to help us test and explore.

Greg

--
Greg Spiegelberg
Sr. Product Development Engineer
Cranel, Incorporated.
Phone: 614.318.4314
Fax: 614.431.8388
Email: [EMAIL PROTECTED]
Cranel. Technology. Integrity. Focus.
Re: [PERFORM] atrocious update performance
On Tuesday 16 March 2004 00:08, Tom Lane wrote:

> I'm inclined to suspect an issue with foreign-key checking. You didn't
> give us any details about foreign key relationships your cust table is
> involved in --- could we see those? And the schemas of the other tables
> involved?

Two questions, Tom:
1. Do the stats tables record FK checks, or just explicit table accesses?
2. If not, should they?

If the only real activity is this update, then simple before/after views of the stats might be revealing.

--
Richard Huxton
Archonet Ltd
Re: [PERFORM] atrocious update performance
while you weren't looking, Tom Lane wrote:

> Hm. It looks like you mistakenly traced psql rather than the backend,
> but since the delay went away we wouldn't have learned anything anyhow.
> Have you got any idea what conditions may have changed between seeing
> delay and not seeing delay?

None, offhand. I have noticed that when a large query is running, the machine can sporadically just freeze--or at least take inordinately long for some other process, be it top or ls, another query, or whatever, to start. Nothing looks saturated when it happens, and, while you can count on it to happen, it's not consistent enough to reproduce.

> This is pretty odd too. It looks like it's doing checkpoints every so
> often (look for the writes to pg_control), which a backend engaged in a
> long-running query surely ought not be doing. Need to think about why
> that might be...

Does the fact that all the reads and writes are 32K mean anything out of the ordinary? $PGSRC/src/include/pg_config_manual.h has BLCKSZ #defined to 16384. I was running previously with a 32K BLCKSZ, but that turned out to be rather sub-optimal for tables as heavily indexed as ours. I've dumped and rebuilt several times since then.

/rls

--
Rosser Schwarz
Total Card, Inc.
Re: [PERFORM] atrocious update performance
Rosser Schwarz [EMAIL PROTECTED] writes:

> while you weren't looking, Tom Lane wrote:
>> Have you got any idea what conditions may have changed between seeing
>> delay and not seeing delay?
>
> None, offhand. I have noticed that when a large query is running, the
> machine can sporadically just freeze--or at least take inordinately
> long for some other process, be it top or ls, another query, or
> whatever, to start. Nothing looks saturated when it happens, and, while
> you can count on it to happen, it's not consistent enough to reproduce.

Interesting. You should leave "vmstat 1" running in the background and see if you can correlate these freezes with bursts of disk I/O or swap. I saw a couple of delays in your big strace that seemed odd --- a couple of one-second-plus intervals, and a four-second-plus interval, with no obvious reason for them. Perhaps the same issue?

> Does the fact that all the reads and writes are 32K mean anything out
> of the ordinary? $PGSRC/src/include/pg_config_manual.h has BLCKSZ
> #defined to 16384. I was running previously with a 32K BLCKSZ, but that
> turned out to be rather sub-optimal for tables as heavily indexed as
> ours. I've dumped and rebuilt several times since then.

I hate to break it to you, but that most definitely means you are running with BLCKSZ = 32K. Whatever you thought you were rebuilding didn't take effect. I agree that the larger blocksize is of dubious value. People used to do that back when the blocksize limited your row width, but these days I think you're probably best off with the standard 8K.

Another thing that's fairly striking is the huge bursts of WAL activity --- your trace shows that the thing is writing entire WAL segments (16MB) at one go, rather than dribbling it out a page or two at a time as the code is intended to do. I think what is happening is that you have wal_buffers = 1024 (correct?), yielding 32MB of WAL buffers, and since there are no other transactions happening, nothing gets written until you hit the "write when the buffers are half full" heuristic. I would suggest knocking wal_buffers down to something closer to the default (maybe 4 or 8 buffers) to reduce these I/O storms.

(Memo to hackers: we need to see to it that the new background writer process takes some responsibility for pushing out filled WAL buffers, not only data buffers.)

If the big EXPLAIN ANALYZE is still running, would you get a dump of its open files (see lsof -p) and correlate those with the tables being used in the query? I'm trying to figure out what the different writes and reads represent.

regards, tom lane
Re: [PERFORM] atrocious update performance
I wrote:

> Regardless, something thinks it's still there.

Is there any way that it is, and that I've somehow been running 7.3.2 all along? `which psql`, &c. show the bindir from my configure, but I'm not sure that's sufficient. The weird thing is that I know I never built 7.3.anything with 32K BLCKSZ; never built 7.3.anything at all. If 7.3 were installed, would it have any problem reading a 7.4 cluster?

/rls

--
Rosser Schwarz
Total Card, Inc.
Re: [PERFORM] atrocious update performance
Shridhar Daithankar [EMAIL PROTECTED] writes:

> Rosser Schwarz wrote:
>> shared_buffers = 4096
>> sort_mem = 32768
>> vacuum_mem = 32768
>> wal_buffers = 16384
>> checkpoint_segments = 64
>> checkpoint_timeout = 1800
>> checkpoint_warning = 30
>> commit_delay = 5
>> effective_cache_size = 131072
>
> First of all, your shared buffers are low. 4096 is 64MB with 16K block
> size. I would say at least push them to 150-200MB.

Check. Much more than that isn't necessarily better, though. shared_buffers = 10000 is frequently mentioned as a sweet spot.

> Secondly your sort mem is too high. Note that it is per sort per query.
> You could build a massive swap storm with such a setting.

Agreed, but I doubt that has anything to do with the immediate problem, since he's not testing parallel queries.

> Similarly pull down vacuum and WAL buffers to around 512-1024 each.

The vacuum_mem setting here is 32MB, which seems okay to me, if not on the low side. Again, though, it's not his immediate problem. I agree that the wal_buffers setting is outlandishly large; I can't see any plausible reason for it to be more than a few dozen. I don't know whether oversized wal_buffers can directly cause any performance issues, but it's certainly not a well-tested scenario.

The other setting I was going to comment on is checkpoint_warning; it seems mighty low in comparison to checkpoint_timeout. If you are targeting a checkpoint every half hour, I'd think you'd want the system to complain about checkpoints spaced more closely than several minutes.

But with the possible exception of wal_buffers, I can't see anything in these settings that explains the originally complained-of performance problem. I'm still wondering about foreign key checks.

regards, tom lane
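Pulling the suggestions in this message together, a revised postgresql.conf fragment for this box might look roughly like the following. These are the values discussed above plus hedged guesses where no exact number was given; it is a sketch, not a tested configuration:

```
shared_buffers = 10000        # ~160MB at BLCKSZ 16K; the "sweet spot" above
sort_mem = 8192               # per sort, per query; 32768 risks a swap storm
vacuum_mem = 32768            # fine as-is, per Tom's comment
wal_buffers = 8               # near the default, avoiding the WAL I/O storms
checkpoint_segments = 64
checkpoint_timeout = 1800
checkpoint_warning = 300      # complain if checkpoints come < 5 min apart
effective_cache_size = 131072
```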
Re: [PERFORM] atrocious update performance
while you weren't looking, Tom Lane wrote:

> But with the possible exception of wal_buffers, I can't see anything in
> these settings that explains the originally complained-of performance
> problem. I'm still wondering about foreign key checks.

Many of the configs I posted were fairly wild values, set to gather data points for further tweaking. Unfortunately, with this query there hasn't been time for many passes, and I've too much else on my plate to try concocting demonstration cases. The postmaster's been HUPped with more sane values, but I experienced this same issue with the defaults.

As for foreign keys, three tables refer to account.cust; all of them refer to account.cust.custid, the pk. One of those tables has several hundred thousand rows, many more to come; the others are empty. Unless I've woefully misunderstood, the presence or absence of a foreign key referring to one column should be moot for updates writing another column, shouldn't it?

To answer your (and others') question, Tom: 7.4.1 on 2.4.20-18.9smp. Red Hat, I believe. I was handed the machine, which is also in use for lightweight production stuff: intranet webserver, rinky-dink MySQL doo-dads, &c. I'm sure that has an impact, usurping the disk heads and such--maybe even more than I'd expect--but I can't imagine that'd cause an update to one 4.7M-row table, from another 4.7M-row table, both clustered on a join column that maps one-to-one between them, to take days. I'm baffled; everything else is perfectly snappy, given the hardware. Anything requiring a sequential scan over one of the big tables is a slog, but that's to be expected and hence all the indices.

Watching iostat, I've observed a moderately cyclic read-big, write-big pattern, wavelengths generally out of phase, interspersed with smaller, almost epicycles--from the machine's other tasks, I'm sure. top has postmaster's cpu usage rarely breaking 25% over the course of the query's execution, and spending most of its time much lower; memory usage hovers somewhere north of 500MB. In what little time I had to stare at a disturbingly matrix-esque array of terminals scrolling sundry metrics, I didn't notice a correlation between cpu usage spikes and peaks in the IO cycle's waveforms. For whatever that's worth.

The other tables involved are:

# \d account.acct
                             Table "account.acct"
   Column   |            Type             |              Modifiers
------------+-----------------------------+-------------------------------------
 acctid     | bigint                      | not null default
            |                             | nextval('account.acctid_seq'::text)
 custid     | bigint                      |
 acctstatid | integer                     | not null
 acctno     | character varying(50)       |
 bal        | money                       |
 begdt      | timestamp without time zone | not null
 enddt      | timestamp without time zone |
 debtid     | character varying(50)       |
Indexes:
    "acct_pkey" primary key, btree (acctid)
    "ix_acctno" btree (acctno) WHERE (acctno IS NOT NULL)
Foreign-key constraints:
    "$1" FOREIGN KEY (custid) REFERENCES account.cust(custid)
         ON UPDATE CASCADE ON DELETE RESTRICT
    "$2" FOREIGN KEY (acctstatid) REFERENCES account.acctstat(acctstatid)
         ON UPDATE CASCADE ON DELETE RESTRICT

# \d account.note
                             Table "account.note"
  Column   |            Type             |              Modifiers
-----------+-----------------------------+-------------------------------------
 noteid    | bigint                      | not null default
           |                             | nextval('account.noteid_seq'::text)
 custid    | bigint                      | not null
 note      | text                        | not null
 createddt | timestamp without time zone | not null default now()
Indexes:
    "note_pkey" primary key, btree (noteid)
Foreign-key constraints:
    "$1" FOREIGN KEY (custid) REFERENCES account.cust(custid)
         ON UPDATE CASCADE ON DELETE RESTRICT

# \d account.origacct
                        Table "account.origacct"
   Column    |            Type             | Modifiers
-------------+-----------------------------+-----------
 custid      | bigint                      |
 lender      | character varying(50)       |
 chgoffdt    | timestamp without time zone |
 opendt      | timestamp without time zone |
 offbureaudt | timestamp without time zone |
 princbal    | money                       |
 intbal      | money                       |
 totbal      | money                       |
 lastpayamt  | money                       |
 lastpaydt   | timestamp without time zone |
 debttype    | integer                     |
 debtid      | character varying(10)       |
 acctno      | character varying(50)       |
Foreign-key constraints:
    "$1" FOREIGN KEY (custid) REFERENCES account.cust(custid)
         ON UPDATE CASCADE ON DELETE RESTRICT

And the table we were
Re: [PERFORM] atrocious update performance
Rosser Schwarz [EMAIL PROTECTED] writes:

> But if I'm not touching the column referenced from account.acct, why
> would it be looking there at all? I've got an explain analyze of the
> update running now, but until it finishes, I can't say for certain what
> it's doing. explain, alone, says:

EXPLAIN won't tell you anything about triggers that might get fired during the UPDATE, so it's not much help for investigating possible FK performance problems. EXPLAIN ANALYZE will give you some indirect evidence: the difference between the total query time and the total time reported for the topmost plan node represents the time spent running triggers and physically updating the tuples. I suspect we are going to see a big difference.

> which shows it not hitting account.acct at all. (And why did it take
> the planner 20-some seconds to come up with that query plan?)

It took 20 seconds to EXPLAIN? That's pretty darn odd in itself. I'm starting to think there must be something quite whacked-out about your installation, but I haven't got any real good ideas about what. (I'm assuming of course that there weren't a ton of other jobs eating CPU while you tried to do the EXPLAIN.)

[ thinks for awhile... ] The only theory that comes to mind for making the planner so slow is oodles of dead tuples in pg_statistic. Could I trouble you to run

	vacuum full verbose pg_statistic;

and send along the output?

regards, tom lane
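Tom's arithmetic can be illustrated with a made-up example; the timings in the comment below are invented for illustration, not taken from the thread:

```sql
-- EXPLAIN ANALYZE on the UPDATE might report something like:
--
--   Merge Join  (cost=...) (actual time=0.2..13000.0 rows=4731410 loops=1)
--     ...
--   Total runtime: 184000.00 ms
--
-- The topmost plan node finished in ~13 s, yet total runtime is ~184 s.
-- The gap is time spent outside the plan proper: firing FK triggers and
-- physically writing the updated tuples. A large gap points at
-- trigger/FK overhead rather than a bad plan.
EXPLAIN ANALYZE
UPDATE account.cust
   SET prodid    = tempprod.prodid,
       subprodid = tempprod.subprodid
  FROM tempprod
 WHERE cust.origid = tempprod.debtid;
```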
Re: [PERFORM] atrocious update performance
while you weren't looking, Tom Lane wrote:

> EXPLAIN won't tell you anything about triggers that might get fired
> during the UPDATE, so it's not much help for investigating possible FK
> performance problems. EXPLAIN ANALYZE will give you some indirect
> evidence: the difference between the total query time and the total
> time reported for the topmost plan node represents the time spent
> running triggers and physically updating the tuples. I suspect we are
> going to see a big difference.

It's still running.

> It took 20 seconds to EXPLAIN? That's pretty darn odd in itself.

It struck me, too.

> I'm starting to think there must be something quite whacked-out about
> your installation, but I haven't got any real good ideas about what.

Built from source. configure arguments: ./configure --prefix=/var/postgresql --bindir=/usr/bin --enable-thread-safety --with-perl --with-python --with-openssl --with-krb5=/usr/kerberos

I can answer more specific questions; otherwise, I'm not sure what to look for, either. If we could take the machine out of production (oh, hell; I think I just volunteered myself for weekend work) long enough to reinstall everything to get a fair comparison... So far as I know, though, it's a more or less stock Red Hat. 2.4.20-something.

> (I'm assuming of course that there weren't a ton of other jobs eating
> CPU while you tried to do the EXPLAIN.)

CPU spiked sporadically, which throttled everything else, but it never stays high. top shows the current explain analyze running between 50-ish% and negligible. iostat -k 3 shows an average of 3K/sec written, for a hundred-odd tps. I can't get any finer-grained than that, unfortunately; the machine was handed to me with a single, contiguous filesystem, in production use.

> [ thinks for awhile... ] The only theory that comes to mind for making
> the planner so slow is oodles of dead tuples in pg_statistic. Could I
> trouble you to run vacuum full verbose pg_statistic; and send along
> the output?
INFO:  vacuuming "pg_catalog.pg_statistic"
INFO:  "pg_statistic": found 215 removable, 349 nonremovable row versions in 7 pages
DETAIL:  0 dead row versions cannot be removed yet.
Nonremovable row versions range from 72 to 8132 bytes long.
There were 3 unused item pointers.
Total free space (including removable row versions) is 91572 bytes.
0 pages are or will become empty, including 0 at the end of the table.
7 pages containing 91572 free bytes are potential move destinations.
CPU 0.00s/0.00u sec elapsed 0.71 sec.
INFO:  index "pg_statistic_relid_att_index" now contains 349 row versions in 2 pages
DETAIL:  215 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_statistic": moved 120 row versions, truncated 7 to 5 pages
DETAIL:  CPU 0.03s/0.01u sec elapsed 0.17 sec.
INFO:  index "pg_statistic_relid_att_index" now contains 349 row versions in 2 pages
DETAIL:  120 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  vacuuming "pg_toast.pg_toast_16408"
INFO:  "pg_toast_16408": found 12 removable, 12 nonremovable row versions in 5 pages
DETAIL:  0 dead row versions cannot be removed yet.
Nonremovable row versions range from 660 to 8178 bytes long.
There were 0 unused item pointers.
Total free space (including removable row versions) is 91576 bytes.
2 pages are or will become empty, including 0 at the end of the table.
5 pages containing 91576 free bytes are potential move destinations.
CPU 0.00s/0.00u sec elapsed 0.27 sec.
INFO:  index "pg_toast_16408_index" now contains 12 row versions in 2 pages
DETAIL:  12 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.05 sec.
INFO:  "pg_toast_16408": moved 10 row versions, truncated 5 to 3 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  index "pg_toast_16408_index" now contains 12 row versions in 2 pages
DETAIL:  10 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.

Having never more than glanced at the output of vacuum verbose, I can't say whether that makes the cut for "oodles." My suspicion is no.

/rls

--
Rosser Schwarz
Total Card, Inc.
Re: [PERFORM] atrocious update performance
Rosser Schwarz [EMAIL PROTECTED] writes:

> Having never more than glanced at the output of vacuum verbose, I can't
> say whether that makes the cut for "oodles." My suspicion is no.

Nope, it sure doesn't. We occasionally see people who don't know they need to vacuum regularly and have accumulated hundreds or thousands of dead tuples for every live one :-(. That's clearly not the issue here.

I'm fresh out of ideas, and the fact that this is a live server kinda limits what we can do experimentally ... but clearly, *something* is very wrong. Well, when you don't know what to look for, you still have to look. One possibly useful idea is to trace the kernel calls of the backend process while it does that ridiculously long EXPLAIN --- think you could try that?

regards, tom lane
Re: [PERFORM] atrocious update performance
while you weren't looking, Tom Lane wrote:

> [trace]

`strace -p 21882` run behind the below query; the plan is below that.

# explain update account.cust set prodid = tempprod.prodid,
  subprodid = tempprod.subprodid where origid = tempprod.debtid;

                            QUERY PLAN
----------------------------------------------------------------------
 Merge Join  (cost=0.00..232764.69 rows=4731410 width=252)
   Merge Cond: (("outer".origid)::text = ("inner".debtid)::text)
   ->  Index Scan using ix_origid on cust
         (cost=0.00..94876.83 rows=4731410 width=236)
   ->  Index Scan using ix_debtid on tempprod
         (cost=0.00..66916.71 rows=4731410 width=26)
(4 rows)

--

recv(9, "Q\0\0\0}explain update account.cust"..., 8192, 0) = 126
gettimeofday({1079482151, 106228}, NULL) = 0
brk(0) = 0x82d9000
brk(0x82db000) = 0x82db000
open("/var/lib/pgsql/data/base/495616/6834170", O_RDWR|O_LARGEFILE) = 8
_llseek(8, 212402176, [212402176], SEEK_SET) = 0
write(8, "\342\1\0\0\0\314\374\6\24\0\0\0\214\7pG\360\177\1\200\320"..., 32768) = 32768
close(8) = 0
open("/var/lib/pgsql/data/base/495616/16635", O_RDWR|O_LARGEFILE) = 8
read(8, "\0\0\0\0\20\0\0\0\1\0\0\0\24\0\360\177\360\177\1\200b1"..., 32768) = 32768
open("/var/lib/pgsql/data/base/495616/6834168", O_RDWR|O_LARGEFILE) = 10
_llseek(10, 60817408, [60817408], SEEK_SET) = 0
write(10, "\342\1\0\0`\334\5\7\24\0\0\0t\0010x\360\177\1\200\330\377"..., 32768) = 32768
close(10) = 0
read(8, "\334\1\0\0h\217\270n\24\0\0\0H\0H|[EMAIL PROTECTED]"..., 32768) = 32768
open("/var/lib/pgsql/data/base/495616/6834165", O_RDWR|O_LARGEFILE) = 10
_llseek(10, 130777088, [130777088], SEEK_SET) = 0
write(10, "\342\1\0\0\341\7\7\24\0\0\0004\t0I\360\177\1\200\330\377"..., 32768) = 32768
close(10) = 0
open("/var/lib/pgsql/data/base/495616/16595", O_RDWR|O_LARGEFILE) = 10
read(10, "[EMAIL PROTECTED]"..., 32768) = 32768
open("/var/lib/pgsql/data/base/495616/6834168", O_RDWR|O_LARGEFILE) = 11
_llseek(11, 145915904, [145915904], SEEK_SET) = 0
write(11, "\342\1\0\0\300\350\n\7\24\0\0\0\224\6\310Z\360\177\1\200"..., 32768) = 32768
close(11) = 0
open("/var/lib/pgsql/data/base/495616/16614", O_RDWR|O_LARGEFILE) = 11
read(11, \0\0\0\0\24\231P\306\16\0\0\0\24\0\360\177\360\177\1\200..., 32768) = 32768 open(/var/lib/pgsql/data/base/495616/6834166, O_RDWR|O_LARGEFILE) = 12 _llseek(12, 148570112, [148570112], SEEK_SET) = 0 write(12, \342\1\0\0\274\365\22\7\24\0\0\0X\3\234o\360\177\1\200..., 32768) = 32768 close(12) = 0 _llseek(11, 98304, [98304], SEEK_SET) = 0 read(11, \0\0\0\0\24\231P\306\16\0\0\0\34\0\234\177\360\177\1\200..., 32768) = 32768 open(/var/lib/pgsql/data/base/495616/6834163, O_RDWR|O_LARGEFILE) = 12 _llseek(12, 251789312, [251789312], SEEK_SET) = 0 write(12, \342\1\0\0l\366\23\7\24\0\0\0\364\10\260J\360\177\1\200..., 32768) = 32768 close(12) = 0 _llseek(11, 32768, [32768], SEEK_SET) = 0 read(11, \340\1\0\0\324\231\273\241\24\0\0\0\234\5\330\26\360\177..., 32768) = 32768 open(/var/lib/pgsql/data/base/495616/6834165, O_RDWR|O_LARGEFILE) = 12 _llseek(12, 117309440, [117309440], SEEK_SET) = 0 write(12, \342\1\0\0d\36)\7\24\0\0\\tHI\360\177\1\200\330\377..., 32768) = 32768 close(12) = 0 open(/var/lib/pgsql/data/base/495616/1259, O_RDWR|O_LARGEFILE) = 12 _llseek(12, 32768, [32768], SEEK_SET) = 0 read(12, \334\1\0\0\324v-p\24\0\0\\3\304\3\0\200\1\200\377..., 32768) = 32768 open(/var/lib/pgsql/data/base/495616/6834173, O_RDWR|O_LARGEFILE) = 13 _llseek(13, 247824384, [247824384], SEEK_SET) = 0 write(13, \342\1\0\0h *\7\24\0\0\0\204\4dm\360\177\1\200\340\377..., 32768) = 32768 close(13) = 0 open(/var/lib/pgsql/data/base/495616/16613, O_RDWR|O_LARGEFILE) = 13 read(13, \0\0\0\0\20\0\0\0\1\0\0\0\24\0\360\177\360\177\1\200b1..., 32768) = 32768 open(/var/lib/pgsql/data/base/495616/6834168, O_RDWR|O_LARGEFILE) = 14 _llseek(14, 204472320, [204472320], SEEK_SET) = 0 write(14, \342\1\0\0\314\272:\7\24\0\0\0\324\t\354K\360\177\1\200..., 32768) = 32768 close(14) = 0 read(13, \340\1\0\0X\231\273\241\24\0\0\0\370\6Dk\360\177\1\200..., 32768) = 32768 open(/var/lib/pgsql/data/base/495616/6834166, O_RDWR|O_LARGEFILE) = 14 _llseek(14, 152010752, [152010752], SEEK_SET) = 0 write(14, 
\342\1\0\0p\277\7\24\0\0\0\364\n\220I\360\177\1\200\334..., 32768) = 32768 close(14) = 0 open(/var/lib/pgsql/data/base/495616/16610, O_RDWR|O_LARGEFILE) = 14 read(14, \0\0\0\0\10\317\27\t\16\0\0\0\24\0\360\177\360\177\1\200..., 32768) = 32768 open(/var/lib/pgsql/data/base/495616/6834170, O_RDWR|O_LARGEFILE) = 15 _llseek(15, 86441984, [86441984], SEEK_SET) = 0 write(15, \342\1\0\0\330B?\7\24\0\0\0\370\6 N\360\177\1\200\310\377..., 32768) = 32768 close(15) = 0 _llseek(14,
Re: [PERFORM] atrocious update performance
Rosser Schwarz <[EMAIL PROTECTED]> writes:
> `strace -p 21882` run behind the below query and plan ... below that.

Hmm ... that took 20 seconds, eh?

It is a fairly interesting trace.  It shows that the backend needed to
read 63 system catalog pages (that weren't already in shared memory),
which is not too unreasonable I think ... though I wonder if more of
them shouldn't have been in memory already.  The odd thing is that for
*every single read* it was necessary to first dump out a dirty page in
order to make a buffer free.  That says you are running with the entire
contents of shared buffer space dirty at all times.  That's probably
not the regime you want to be operating in.  I think we already
suggested increasing shared_buffers.  You might also want to think
about not using such a large checkpoint interval.  (The
background-writing logic already committed for 7.5 should help this
problem, but it's not there in 7.4.)

Another interesting fact is that the bulk of the writes were blind
writes, involving an open()/write()/close() sequence instead of keeping
the open file descriptor around for re-use.  This is not too surprising
in a freshly started backend, I guess; it's unlikely to have had reason
to create a relation descriptor for the relations it may have to dump
pages for.  In some Unixen, particularly Solaris, open() is fairly
expensive and so blind writes are bad news.  I didn't think it was a
big problem in Linux though.  (This is another area we've improved for
7.5: there are no more blind writes.  But that won't help you today.)

What's not immediately evident is whether the excess I/O accounted for
all of the slowdown.  Could you retry the strace with -r and -T options
so we can see how much time is being spent inside and outside the
syscalls?

			regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org
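Both knobs Tom mentions live in postgresql.conf. A sketch of the direction he is pointing, with values that are illustrative assumptions rather than recommendations from the thread:

```
# postgresql.conf (7.4) -- illustrative values only
shared_buffers = 16384        # up from 4096: more clean buffers available,
                              # so reads need not evict dirty pages first
checkpoint_timeout = 300      # down from 1800: dirty buffers get written
                              # out more steadily between checkpoints
```

The re-run he asks for would be along the lines of `strace -r -T -p <backend pid>`: -r prints a relative timestamp before each call and -T appends the time spent inside each syscall, which separates kernel I/O time from backend CPU time.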
Re: [PERFORM] atrocious update performance
> # explain update account.cust set prodid = tempprod.prodid
>   where tempprod.did = origid;
>
>  Merge Join  (cost=0.00..232764.69 rows=4731410 width=252)
>    Merge Cond: (("outer".origid)::text = ("inner".did)::text)
>    ->  Index Scan using ix_origid on cust  (cost=0.00..94876.83
>        rows=4731410 width=244)
>    ->  Index Scan using ix_did on tempprod  (cost=0.00..66916.71
>        rows=4731410 width=18)

I'm going to hazard a guess and say you have a number of foreign keys
that refer to account.cust.prodid?  This is probably the time-consuming
part -- perhaps even a missing index on one of those keys that refers
to this field.

Going the other way should be just as good for your purposes, and much
faster, since you're not updating several foreign-key'd fields bound to
account.cust.prodid:

UPDATE tempprod
   SET prodid = cust.prodid
  FROM account.cust
 WHERE tempprod.did = cust.origid;

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
      joining column's datatypes do not match
Re: [PERFORM] atrocious update performance
> > # explain update account.cust set prodid = tempprod.prodid
> >   where tempprod.did = origid;
> >
> >  Merge Join  (cost=0.00..232764.69 rows=4731410 width=252)
> >    Merge Cond: (("outer".origid)::text = ("inner".did)::text)
> >    ->  Index Scan using ix_origid on cust  (cost=0.00..94876.83
> >        rows=4731410 width=244)
> >    ->  Index Scan using ix_did on tempprod  (cost=0.00..66916.71
> >        rows=4731410 width=18)

> I'm going to hazard a guess and say you have a number of foreign keys
> that refer to account.cust.prodid?  This is probably the
> time-consuming part -- perhaps even a missing index on one of those
> keys that refers to this field.

Actually, there are no foreign keys to those columns.  Once they're
populated, I'll apply a foreign key constraint and they'll refer to the
appropriate row in the prod and subprod tables, but nothing will
reference account.cust.[sub]prodid.  There are, of course, several
foreign keys referencing account.cust.custid.

> Going the other way should be just as good for your purposes, and
> much faster, since you're not updating several foreign-key'd fields
> bound to account.cust.prodid:
>
> UPDATE tempprod
>    SET prodid = cust.prodid
>   FROM account.cust
>  WHERE tempprod.did = cust.origid;

Not quite.  Without this update, account.cust.[sub]prodid are null.
The data was strewn across multiple tables in MS SQL; we're normalizing
it into one, hence the need to populate the two columns independently.

/rls

--
Rosser Schwarz
Total Card, Inc.

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
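Those foreign keys referencing account.cust.custid are cheap to enforce only if the *referencing* columns are indexed; otherwise each check can seq-scan the referencing table. A sketch, assuming the referencing tables named elsewhere in this thread (account.acct, account.orgacct, and note; their exact schemas are assumptions):

```sql
-- Index the referencing side of each FK on account.cust.custid,
-- so FK enforcement does an index probe instead of a sequential scan.
CREATE INDEX ix_acct_custid    ON account.acct (custid);
CREATE INDEX ix_orgacct_custid ON account.orgacct (custid);
CREATE INDEX ix_note_custid    ON note (custid);
```

PostgreSQL indexes the referenced (primary key) side automatically, but never the referencing side; that has to be done by hand.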
Re: [PERFORM] atrocious update performance
Bulk updates are generally dogs (not just in pg), so I avoid doing them
by doing faster selects and inserts.  You can create a new table using
'create table as' to produce your target results.  This is real fast --
it avoids the row iteration of insert, lets the select optimizer run,
and has no index overhead.  Then alter/rename, add indexes and whatever
else hangs off the table (or, if you're lazy, do an insert/select into
the original target table).  I often see 2 orders of magnitude
improvement doing this, and no need to vacuum.

/Aaron

- Original Message -
From: "Rosser Schwarz" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, March 15, 2004 3:28 PM
Subject: [PERFORM] atrocious update performance

We're in the throes of an MS SQL to PostgreSQL migration; our databases
include a number of ~5M row tables.  We decided to take this
opportunity to clean up and slightly re-normalize our schemas, given
what we've learned about the data over its lifetime and such; otherwise
we wouldn't be experiencing any of the following (we could instead just
dump and `copy from`).

We have a temporary table, public.tempprod, containing 4.7M rows, one
for each row in account.cust.  account.cust has, among others, two
columns, prod and subprod, which we're trying to update from tempprod
joined against prod.  The update tends to take unnecessarily long --
rather, we've finally had to kill it after it ran obscenely long.
The table:

# \d account.cust
                        Table "account.cust"
  Column   |            Type             |           Modifiers
-----------+-----------------------------+--------------------------------
 custid    | bigint                      | not null default
           |                             | nextval('account.custid_seq'::text)
 ownerid   | integer                     | not null
 origid    | text                        | not null
 pname     | text                        |
 fname     | text                        |
 mname     | text                        |
 lname     | text                        |
 suffix    | text                        |
 addr1     | text                        |
 addr2     | text                        |
 addr3     | text                        |
 city      | text                        |
 state     | text                        |
 zip       | text                        |
 zipplus   | text                        |
 homeph    | text                        |
 workph    | text                        |
 otherph   | text                        |
 ssn       | text                        |
 isactive  | boolean                     | default true
 createddt | timestamp without time zone | default now()
 prodid    | bigint                      |
 subprodid | bigint                      |
Indexes:
    "cust_pkey" primary key, btree (custid)
    "ix_addr1" btree (addr1) WHERE (addr1 IS NOT NULL)
    "ix_addr2" btree (addr2) WHERE (addr2 IS NOT NULL)
    "ix_city" btree (city) WHERE (city IS NOT NULL)
    "ix_fname" btree (fname) WHERE (fname IS NOT NULL)
    "ix_homeph" btree (homeph) WHERE (homeph IS NOT NULL)
    "ix_lname" btree (lname) WHERE (lname IS NOT NULL)
    "ix_mname" btree (mname) WHERE (mname IS NOT NULL)
    "ix_origid" btree (origid)
    "ix_ssn" btree (ssn) WHERE (ssn IS NOT NULL)
    "ix_state" btree (state) WHERE (state IS NOT NULL)
    "ix_workph" btree (workph) WHERE (workph IS NOT NULL)
    "ix_zip" btree (zip) WHERE (zip IS NOT NULL)

We're currently running on a dual Xeon 700 (I know, I know; it's what
we've got) with 2.5GB RAM and 4x36GB SCSI in hardware RAID 5 (Dell
Perc3 something-or-other controller).  If we can demonstrate that
PostgreSQL will meet our needs, we'll be going production on a dual
Opteron, maxed memory, with a 12-disk Fibre Channel array.

The query is:

update account.cust set prodid =
    (select p.prodid from account.prod p
       join public.tempprod t on t.pool = p.origid
      where custid = t.did)

And then, upon its completion, s/prod/subprod/.

That shouldn't run overnight, should it, let alone for -days-?

In experimenting with ways of making the updates take less time, we
tried adding product and subproduct columns to tempprod, and updating
those.
That seemed to work marginally better:

explain analyze update public.tempprod set prodid =
    (select account.prod.prodid::bigint
       from account.prod
      where public.tempprod.pool::text = account.prod.origid::text)

 Seq Scan on tempprod  (cost=0.00..9637101.35 rows=4731410 width=56)
   (actual time=24273.467..16090470.438 rows=4731410 loops=1)
   SubPlan
     ->  Limit  (cost=0.00..2.02 rows=2 width=8)
         (actual time=0.134..0.315 rows=1 loops=4731410)
           ->  Seq Scan on prod  (cost=0.00..2.02 rows=2 width=8)
               (actual time=0.126..0.305 rows=1 loops=4731410)
                 Filter: (($0)::text = (origid)::text)
 Total runtime: 2284551.962 ms

But then going from public.tempprod to account.cust again takes days.
I just cancelled an update that's been running since last Thursday.
Alas,
Re: [PERFORM] atrocious update performance
On Mon, 2004-03-15 at 16:15, Rosser Schwarz wrote:
> > # explain update account.cust set prodid = tempprod.prodid
> >   where tempprod.did = origid;
> >
> >  Merge Join  (cost=0.00..232764.69 rows=4731410 width=252)
> >    Merge Cond: (("outer".origid)::text = ("inner".did)::text)
> >    ->  Index Scan using ix_origid on cust  (cost=0.00..94876.83
> >        rows=4731410 width=244)
> >    ->  Index Scan using ix_did on tempprod  (cost=0.00..66916.71
> >        rows=4731410 width=18)
>
> > I'm going to hazard a guess and say you have a number of foreign
> > keys that refer to account.cust.prodid?  This is probably the
> > time-consuming part -- perhaps even a missing index on one of those
> > keys that refers to this field.
>
> Actually, there are no foreign keys to those columns.  Once they're
> populated, I'll apply a foreign key constraint and they'll refer to
> the appropriate row in the prod and subprod tables, but nothing will
> reference account.cust.[sub]prodid.  There are, of course, several
> foreign keys referencing account.cust.custid.

If there are no foreign keys to it, I wouldn't expect it to take more
than 10 minutes on slow hardware.

Fresh out of ideas here.

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] atrocious update performance
> You can create a new table using 'create table as' to produce your
> target results.  This is real fast ...  I often see 2 orders of
> magnitude improvement doing this, and no need to vacuum.

Indeed:

"Query returned successfully with no result in 582761 ms."

Though I must say, ten minutes is nominally more than two orders of
magnitude performance improvement, versus several days.

Many thanks, Aaron.

/rls

--
Rosser Schwarz
Total Card, Inc.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
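The create-as-and-swap recipe Aaron describes might look like the following sketch; the column list, join condition, and index DDL here are assumptions pieced together from earlier messages in the thread, not the poster's actual statements:

```sql
-- Build the fully-populated table in one pass: no per-row UPDATE,
-- no dead tuples to vacuum, no index maintenance during the load.
CREATE TABLE account.cust_new AS
SELECT c.custid, c.ownerid, c.origid,    -- remaining cust columns elided
       t.prodid, t.subprodid
  FROM account.cust c
  JOIN public.tempprod t ON t.debtid = c.origid;

-- Swap it in.  Any foreign keys referencing account.cust would have to
-- be dropped before the DROP and recreated afterward.
DROP TABLE account.cust;
ALTER TABLE account.cust_new RENAME TO cust;

-- Rebuild indexes and constraints on the new table.
ALTER TABLE account.cust ADD PRIMARY KEY (custid);
CREATE INDEX ix_origid ON account.cust (origid);
```

The select runs once over both tables with the planner free to pick a merge or hash join, which is where the order-of-magnitude win comes from.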
Re: [PERFORM] atrocious update performance
Rosser Schwarz <[EMAIL PROTECTED]> writes:
> > You can create a new table using 'create table as' to produce your
> > target results.  This is real fast ...  I often see 2 orders of
> > magnitude improvement doing this, and no need to vacuum.
>
> Indeed:
>
> "Query returned successfully with no result in 582761 ms."
>
> Though I must say, ten minutes is nominally more than two orders of
> magnitude performance improvement, versus several days.

Hm.  There is no way that inserting a row is two orders of magnitude
faster than updating a row --- they both require storing a new row and
making whatever index entries are needed.  The only additional cost of
the update is finding the old row (not a very big deal AFAICS in the
examples you gave) and marking it deleted (definitely cheap).  So
there's something awfully fishy going on here.

I'm inclined to suspect an issue with foreign-key checking.  You didn't
give us any details about foreign key relationships your cust table is
involved in --- could we see those?  And the schemas of the other
tables involved?

Also, exactly which PG version is this?

			regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
      subscribe-nomail command to [EMAIL PROTECTED] so that your
      message can get through to the mailing list cleanly
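The foreign-key relationships Tom asks about can be listed straight from the system catalogs rather than by memory. A sketch of such a query (the table name is the one under discussion; pg_constraint has this shape in 7.3 and later):

```sql
-- Every foreign-key constraint whose referenced table is account.cust.
SELECT conname,
       conrelid::regclass AS referencing_table
  FROM pg_constraint
 WHERE contype = 'f'
   AND confrelid = 'account.cust'::regclass;
```

Each row names a referencing table whose FK trigger fires during the big update, and whose referencing column is worth checking for an index.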
Re: [PERFORM] atrocious update performance
The original point was about a very slow update of an entire table,
with a plan that looped, and over a dozen conditional indices -- vs. a
'create as' in a CPU-starved environment.  I stand by my statement
about observing the orders-of-magnitude difference.  In theory I agree
that the update should be in the same order of magnitude as the create
as, but in practice I disagree.  I also think something is wrong on the
logical side (besides FKs, are there any triggers?), but I was
responding to the Gordian-knot issue of bailing out of pg.

Can you post a sample extract, Rosser?  Otherwise, I'll try to put
together a sample of a slow mass-join update.

/Aaron

- Original Message -
From: "Tom Lane" <[EMAIL PROTECTED]>
To: "Rosser Schwarz" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Monday, March 15, 2004 7:08 PM
Subject: Re: [PERFORM] atrocious update performance

[snip -- quoted in full above]

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org
Re: [PERFORM] atrocious update performance
Rosser Schwarz wrote:

> shared_buffers = 4096
> sort_mem = 32768
> vacuum_mem = 32768
> wal_buffers = 16384
> checkpoint_segments = 64
> checkpoint_timeout = 1800
> checkpoint_warning = 30
> commit_delay = 5
> effective_cache_size = 131072

You didn't mention the OS, so I would take it as either Linux or
FreeBSD.

First of all, your shared buffers are low.  4096 is 64MB with 16K block
size.  I would say at least push them to 150-200MB.

Secondly, your sort_mem is too high.  Note that it is per sort, per
query.  You could build a massive swap storm with such a setting.
Similarly, pull down vacuum and WAL buffers to around 512-1024 each.

I know that your problem is solved by using inserts rather than
updates, but I just want to point out that you still need to analyze
the table to update the statistics, or further queries will not be
planned well.

And lastly, you can bundle the entire thing -- creating the duplicate
table, populating it, renaming the original table, etc. -- in a single
transaction, and nobody will notice it.  I am almost sure MS-SQL can
not do that.  Not many databases out there have transaction-safe DDL.

HTH

Shridhar

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
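Shridhar's last two points can be sketched together. The table names follow the thread, but the exact DDL is an assumption (and in practice any foreign keys and views pointing at the old table would need recreating as well):

```sql
-- Because DDL is transactional in PostgreSQL, the swap is atomic:
-- both renames become visible together at COMMIT, so other sessions
-- never see a missing or half-populated table.
BEGIN;
ALTER TABLE account.cust RENAME TO cust_old;   -- keep the old data around
ALTER TABLE account.cust_new RENAME TO cust;
COMMIT;

-- Refresh planner statistics after the bulk load, per Shridhar's note.
ANALYZE account.cust;
```

The renames take exclusive locks, so concurrent queries briefly block until COMMIT, but none of them fail.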