On Sun, 2003-10-05 at 19:58, Tom Lane wrote:
> That's a fairly useless place to put it, though, since someone would
> only think to look at sort_mem if they already had a clue. It should
> be mentioned under bulk data load (in performance tips chapter)
Attached is a doc patch that does this. The way I've worded it may not
be the best, though.
> and perhaps also in dump/restore procedures.
It's already mentioned there.
Should we also suggest turning off fsync when doing restores?
(BTW, is there a reason the docs consistently call them "B-tree
indexes", not "B+-tree indexes"?)
-Neil
Index: doc/src/sgml/perform.sgml
===
RCS file: /var/lib/cvs/pgsql-server/doc/src/sgml/perform.sgml,v
retrieving revision 1.33
diff -c -r1.33 perform.sgml
*** doc/src/sgml/perform.sgml 11 Sep 2003 18:30:38 - 1.33
--- doc/src/sgml/perform.sgml 6 Oct 2003 00:21:48 -
***
*** 751,761
Use COPY FROM STDIN to load all the rows in one
! command, instead of using
! a series of INSERT commands. This reduces parsing,
! planning, etc.
! overhead a great deal. If you do this then it is not necessary to turn
! off autocommit, since it is only one command anyway.
--- 751,760
Use COPY FROM STDIN to load all the rows in one
! command, instead of using a series of INSERT
! commands. This reduces parsing, planning, etc. overhead a great
! deal. If you do this then it is not necessary to turn off
! autocommit, since it is only one command anyway.
***
*** 764,772
If you are loading a freshly created table, the fastest way is to
! create the table, bulk-load with COPY, then create any
! indexes needed
! for the table. Creating an index on pre-existing data is quicker than
updating it incrementally as each row is loaded.
--- 763,771
If you are loading a freshly created table, the fastest way is to
! create the table, bulk load the table's data using
! COPY, then create any indexes needed for the
! table. Creating an index on pre-existing data is quicker than
updating it incrementally as each row is loaded.
***
*** 780,785
--- 779,797
+
+Increase sort_mem
+
+
+ Temporarily increasing the sort_mem
+ configuration variable when restoring large amounts of data can
+ lead to improved performance. This is because when a B-tree index
+ is created from scratch, the existing content of the table needs
+ to be sorted. Allowing the merge sort to use more buffer pages
+ means that fewer merge passes will be required.
+
+
+
Run ANALYZE Afterwards
Index: doc/src/sgml/runtime.sgml
===
RCS file: /var/lib/cvs/pgsql-server/doc/src/sgml/runtime.sgml,v
retrieving revision 1.210
diff -c -r1.210 runtime.sgml
*** doc/src/sgml/runtime.sgml 3 Oct 2003 19:26:49 - 1.210
--- doc/src/sgml/runtime.sgml 6 Oct 2003 00:06:45 -
***
*** 928,935
by ORDER BY, merge joins, and CREATE INDEX.
Hash tables are used in hash joins, hash-based aggregation, and
hash-based processing of IN subqueries. Because
! CREATE INDEX is used when restoring a database, it might
! be good to temporarily increase this value during a restore.
--- 928,936
by ORDER BY, merge joins, and CREATE INDEX.
Hash tables are used in hash joins, hash-based aggregation, and
hash-based processing of IN subqueries. Because
! CREATE INDEX is used when restoring a database,
! increasing sort_mem before doing a large
! restore operation can improve performance.
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match