nbtsort.c has a comment block from the Berkeley days that reads: * This code is moderately slow (~10% slower) compared to the regular * btree (insertion) build code on sorted or well-clustered data. On * random data, however, the insertion build code is unusable -- the * difference on a 60MB heap is a factor of 15 because the random * probes into the btree thrash the buffer pool. (NOTE: the above * "10%" estimate is probably obsolete, since it refers to an old and * not very good external sort implementation that used to exist in * this module. tuplesort.c is almost certainly faster.)
I propose removing this whole comment block (patch attached), because: * The "NOTE" sentence in parenthesis was actually written by Tom in 1999, as part of the original tuplesort commit. If tuplesort.c was "almost certainly faster" in its first incarnation, what are the chances of it actually still being slower than incremental insertions with presorted input at this point? There were numerous large improvements to tuplesort in the years since 1999. * Even if the original claim was still true all these years later, the considerations for nbtsort.c have changed so much that it couldn't possibly matter. The fact that we're not using shared_buffers to build indexes anymore is a significant advantage for nbtsort.c, independent of sort performance. These days, CREATE INDEX spends most of the time bottlenecked on WAL-logging when building a index against presorted SERIAL-like inputs, especially when parallelism is used. Back in 1999, there was no WAL-logging. -- Peter Geoghegan
From 2a77b947e07dadef953019334062580cad560836 Mon Sep 17 00:00:00 2001 From: Peter Geoghegan <p...@bowt.ie> Date: Sun, 24 Jun 2018 11:41:16 -0700 Subject: [PATCH] Remove obsolete comment block in nbtsort.c. Building a new index through incremental nbtree insertions would always be significantly slower than our actual approach of sorting using tuplesort, and then building and WAL-logging whole pages. Remove a comment block from the Berkeley days claiming that incremental insertions might be slightly faster with presorted input. --- src/backend/access/nbtree/nbtsort.c | 9 --------- 1 file changed, 9 deletions(-) diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c index e012df596e..16f5755777 100644 --- a/src/backend/access/nbtree/nbtsort.c +++ b/src/backend/access/nbtree/nbtsort.c @@ -14,15 +14,6 @@ * its parent level. When we have only one page on a level, it must be * the root -- it can be attached to the btree metapage and we are done. * - * This code is moderately slow (~10% slower) compared to the regular - * btree (insertion) build code on sorted or well-clustered data. On - * random data, however, the insertion build code is unusable -- the - * difference on a 60MB heap is a factor of 15 because the random - * probes into the btree thrash the buffer pool. (NOTE: the above - * "10%" estimate is probably obsolete, since it refers to an old and - * not very good external sort implementation that used to exist in - * this module. tuplesort.c is almost certainly faster.) - * * It is not wise to pack the pages entirely full, since then *any* * insertion would cause a split (and not only of the leaf page; the need * for a split would cascade right up the tree). The steady-state load -- 2.17.1