Added to TODO:

	* Sort large UPDATE/DELETEs so they are done in heap order

	  http://archives.postgresql.org/pgsql-hackers/2008-01/msg01119.php

---------------------------------------------------------------------------

Tom Lane wrote:
> We've had a couple of discussions recently revolving around the
> inefficiency of using hashjoin/hashaggregation output to update a target
> table, because of the resulting very random access pattern.  I believe
> this same mechanism is underlying the slowness of Stephen Denne's
> "alternate query" described here:
> http://archives.postgresql.org/pgsql-performance/2008-01/msg00227.php
>
> I made up the attached doubtless-oversimplified test case to model what
> he was seeing.  It's cut down about 4x from the table size he describes,
> but the UPDATE still takes forever --- I gave up waiting after 2 hours,
> when it had deleted about a fifth of its hashjoin temp files, suggesting
> that the total runtime would be about 10 hours.
>
> A brute-force idea for fixing this is to sort the intended update or
> delete operations of an UPDATE/DELETE command according to the target
> table's ctid, which is available for free anyway since the executor top
> level must have it to perform the operation.  I made up an even more
> brute-force patch (also attached) that forces that to happen for every
> UPDATE or DELETE --- obviously we'd not want that for real; it's just
> for crude performance testing.
> With that patch, I got these results:
>
>                                                                    QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------------
>  Sort  (cost=6075623.03..6085623.05 rows=4000008 width=618) (actual time=2078726.637..3371944.124 rows=4000000 loops=1)
>    Sort Key: df.ctid
>    Sort Method:  external merge  Disk: 2478992kB
>    ->  Hash Join  (cost=123330.50..1207292.72 rows=4000008 width=618) (actual time=20186.510..721120.455 rows=4000000 loops=1)
>          Hash Cond: (df.document_id = d.id)
>          ->  Seq Scan on document_file df  (cost=0.00..373334.08 rows=4000008 width=614) (actual time=11.775..439993.807 rows=4000000 loops=1)
>          ->  Hash  (cost=57702.00..57702.00 rows=4000200 width=8) (actual time=19575.885..19575.885 rows=4000000 loops=1)
>                ->  Seq Scan on document d  (cost=0.00..57702.00 rows=4000200 width=8) (actual time=0.039..14335.615 rows=4000000 loops=1)
>  Total runtime: 3684037.097 ms
>
> or just over an hour of runtime --- still not exactly speedy, but it
> certainly compares favorably to the estimated 10 hours for unsorted
> updates.
>
> This is with default shared_buffers (32MB) and work_mem (1MB);
> a more aggressive work_mem would have meant fewer hash batches and fewer
> sort runs, and hence better performance in both cases, but with the
> majority of the runtime going into the sort step here, I think that the
> sorted update would benefit much more.
>
> Nowhere near a workable patch, of course, but it seems like food for
> thought.
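[The work_mem point above can be tried directly against the attached test
case.  A minimal sketch follows; the 256MB value is purely illustrative
and was not benchmarked in this thread.]

```sql
-- Illustrative only: a larger work_mem means fewer hash batches for the
-- join and fewer merge runs for the ctid sort.  256MB is an example
-- value, not one measured in this thread.
SET work_mem = '256MB';

EXPLAIN ANALYZE
UPDATE ONLY document_file AS df
   SET document_type_id = d.document_type_id
  FROM document AS d
 WHERE d.id = df.document_id;

RESET work_mem;
```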
>
> 			regards, tom lane

> Content-Description: bighash.sql

> drop table if exists document;
> drop table if exists document_file;
>
> create table document (document_type_id int, id int primary key);
> create table document_file (document_type_id int, document_id int primary key,
>     filler char(600));
>
> insert into document_file select x, x, 'z' from generate_series(1,4000000) x;
> insert into document select x, x from generate_series(1,4000000) x;
>
> analyze document_file;
> analyze document;
>
> set enable_mergejoin = false;
>
> explain analyze UPDATE ONLY document_file AS df SET document_type_id =
>     d.document_type_id FROM document AS d WHERE d.id = document_id;

> Content-Description: ctid-sort.patch

> Index: src/backend/optimizer/prep/preptlist.c
> ===================================================================
> RCS file: /cvsroot/pgsql/src/backend/optimizer/prep/preptlist.c,v
> retrieving revision 1.88
> diff -c -r1.88 preptlist.c
> *** src/backend/optimizer/prep/preptlist.c	1 Jan 2008 19:45:50 -0000	1.88
> --- src/backend/optimizer/prep/preptlist.c	30 Jan 2008 03:06:30 -0000
> ***************
> *** 32,37 ****
> --- 32,38 ----
>   #include "optimizer/var.h"
>   #include "parser/analyze.h"
>   #include "parser/parsetree.h"
> + #include "parser/parse_clause.h"
>   #include "parser/parse_coerce.h"
>
>
> ***************
> *** 103,108 ****
> --- 104,120 ----
>   		tlist = list_copy(tlist);
>
>   	tlist = lappend(tlist, tle);
> +
> + 	/*
> + 	 * Force the query result to be sorted by CTID, for better update
> + 	 * speed.  (Note: we expect parse->sortClause to be NIL here,
> + 	 * but this code will do no harm if it's not.)
> + 	 */
> + 	parse->sortClause = addTargetToSortList(NULL, tle,
> + 											parse->sortClause, tlist,
> + 											SORTBY_DEFAULT,
> + 											SORTBY_NULLS_DEFAULT,
> + 											NIL, false);
>   }
>
>   /*

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
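[For anyone who wants to approximate the effect of the patch above without
rebuilding the server, a rough SQL sketch follows.  It is not from the
original mail and comes with caveats: the "upd" temp table name is
arbitrary, ctids are only stable while concurrent updates are locked out,
and PostgreSQL does not guarantee that the UPDATE visits rows in the
subquery's ORDER BY order, though simple plans often preserve it.]

```sql
-- Sketch only: stage the join output keyed by the target table's ctid,
-- then apply it sorted, so the heap is (hopefully) touched in order.
BEGIN;

CREATE TEMP TABLE upd AS
    SELECT df.ctid AS target_ctid, d.document_type_id
      FROM document_file df
      JOIN document d ON d.id = df.document_id;

UPDATE document_file AS df
   SET document_type_id = u.document_type_id
  FROM (SELECT * FROM upd ORDER BY target_ctid) AS u
 WHERE df.ctid = u.target_ctid;

COMMIT;
```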