On Sun, 2003-10-05 at 19:58, Tom Lane wrote:
> That's a fairly useless place to put it, though, since someone would
> only think to look at sort_mem if they already had a clue. It should
> be mentioned under bulk data load (in performance tips chapter)
Attached is a doc patch that does this. The way I've worded it may not
be the best, though.
> and perhaps also in dump/restore procedures.
It's already mentioned there.
Should we also suggest turning off fsync when doing restores?
(BTW, is there a reason the docs consistently call them "B-tree
indexes", not "B+-tree indexes"?)
-Neil
Index: doc/src/sgml/perform.sgml
===================================================================
RCS file: /var/lib/cvs/pgsql-server/doc/src/sgml/perform.sgml,v
retrieving revision 1.33
diff -c -r1.33 perform.sgml
*** doc/src/sgml/perform.sgml 11 Sep 2003 18:30:38 -0000 1.33
--- doc/src/sgml/perform.sgml 6 Oct 2003 00:21:48 -0000
***************
*** 751,761 ****
<para>
Use <command>COPY FROM STDIN</command> to load all the rows in one
! command, instead of using
! a series of <command>INSERT</command> commands. This reduces parsing,
! planning, etc.
! overhead a great deal. If you do this then it is not necessary to turn
! off autocommit, since it is only one command anyway.
</para>
</sect2>
--- 751,760 ----
<para>
Use <command>COPY FROM STDIN</command> to load all the rows in one
! command, instead of using a series of <command>INSERT</command>
! commands. This reduces parsing, planning, etc. overhead a great
! deal. If you do this then it is not necessary to turn off
! autocommit, since it is only one command anyway.
</para>
</sect2>
***************
*** 764,772 ****
<para>
If you are loading a freshly created table, the fastest way is to
! create the table, bulk-load with <command>COPY</command>, then create any
! indexes needed
! for the table. Creating an index on pre-existing data is quicker than
updating it incrementally as each row is loaded.
</para>
--- 763,771 ----
<para>
If you are loading a freshly created table, the fastest way is to
! create the table, bulk load the table's data using
! <command>COPY</command>, then create any indexes needed for the
! table. Creating an index on pre-existing data is quicker than
updating it incrementally as each row is loaded.
</para>
***************
*** 780,785 ****
--- 779,797 ----
</para>
</sect2>
+ <sect2 id="populate-sort-mem">
+ <title>Increase <varname>sort_mem</varname></title>
+
+ <para>
+ Temporarily increasing the <varname>sort_mem</varname>
+ configuration variable when restoring large amounts of data can
+ lead to improved performance. This is because when a B-tree index
+ is created from scratch, the existing content of the table needs
+ to be sorted. Allowing the merge sort to use more buffer pages
+ means that fewer merge passes will be required.
+ </para>
+ </sect2>
+
<sect2 id="populate-analyze">
<title>Run <command>ANALYZE</command> Afterwards</title>
Index: doc/src/sgml/runtime.sgml
===================================================================
RCS file: /var/lib/cvs/pgsql-server/doc/src/sgml/runtime.sgml,v
retrieving revision 1.210
diff -c -r1.210 runtime.sgml
*** doc/src/sgml/runtime.sgml 3 Oct 2003 19:26:49 -0000 1.210
--- doc/src/sgml/runtime.sgml 6 Oct 2003 00:06:45 -0000
***************
*** 928,935 ****
by <literal>ORDER BY</>, merge joins, and <command>CREATE INDEX</>.
Hash tables are used in hash joins, hash-based aggregation, and
hash-based processing of <literal>IN</> subqueries. Because
! <command>CREATE INDEX</> is used when restoring a database, it might
! be good to temporarily increase this value during a restore.
</para>
</listitem>
</varlistentry>
--- 928,936 ----
by <literal>ORDER BY</>, merge joins, and <command>CREATE INDEX</>.
Hash tables are used in hash joins, hash-based aggregation, and
hash-based processing of <literal>IN</> subqueries. Because
! <command>CREATE INDEX</> is used when restoring a database,
! increasing <varname>sort_mem</varname> before doing a large
! restore operation can improve performance.
</para>
</listitem>
</varlistentry>
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faqs/FAQ.html