Re: some aspects of our qsort might not be ideal

2022-06-23 Thread Matthias van de Meent
On Thu, 23 Jun 2022 at 17:04, Peter Geoghegan  wrote:
>
> On Thu, Jun 23, 2022 at 7:51 AM Matthias van de Meent
>  wrote:
> > I think that mostly has to do with reliability / stability of ORDER BY
> > in combination with LIMIT and OFFSET, even though right now we cannot
> > fully guarantee such stability due to unstable results from underlying
> > plan nodes.
>
> The current qsort isn't stable.

Then I misunderstood Robert's comment, thanks for correcting me.

- Matthias




Re: some aspects of our qsort might not be ideal

2022-06-23 Thread Peter Geoghegan
On Thu, Jun 23, 2022 at 7:51 AM Matthias van de Meent
 wrote:
> I think that mostly has to do with reliability / stability of ORDER BY
> in combination with LIMIT and OFFSET, even though right now we cannot
> fully guarantee such stability due to unstable results from underlying
> plan nodes.

The current qsort isn't stable. While quicksort is never stable, our
particular implementation has fairly significant optimizations that
strongly rely on not using a stable sort. In particular, the B
optimizations for duplicate elements.

These optimizations make things like datum tuplesorts for
'SELECT(DISTINCT mycol) ...' style queries on low cardinality columns
extremely fast. We're not really sorting so much as bucketing. This is
based on Dijkstra's Dutch national flag problem.

-- 
Peter Geoghegan




Re: some aspects of our qsort might not be ideal

2022-06-23 Thread Matthias van de Meent
On Thu, 23 Jun 2022 at 15:52, Robert Haas  wrote:
>
> On Thu, Jun 23, 2022 at 6:13 AM John Naylor
>  wrote:
> > Here is a *rough* first pass at dual-pivot quicksort. I haven't
> > changed the regression tests to adjust for qsort being an unstable
> > sort, ...
>
> Hmm. I thought we had some reasons for preferring a stable sort algorithm.

I think that mostly has to do with reliability / stability of ORDER BY
in combination with LIMIT and OFFSET, even though right now we cannot
fully guarantee such stability due to unstable results from underlying
plan nodes.

As an example, a table scan node under a sort node can start its scan
at an arbitrary point in the table (using synchronize_seqscans), and
because Sort nodes only sort MinimalTuple-s, each set of tuples that
have an equal sort value will be ordered by TID + y (mod tablesize),
with arbitrary values for y.

- Matthias




Re: some aspects of our qsort might not be ideal

2022-06-23 Thread Robert Haas
On Thu, Jun 23, 2022 at 6:13 AM John Naylor
 wrote:
> Here is a *rough* first pass at dual-pivot quicksort. I haven't
> changed the regression tests to adjust for qsort being an unstable
> sort, ...

Hmm. I thought we had some reasons for preferring a stable sort algorithm.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: some aspects of our qsort might not be ideal

2022-06-23 Thread John Naylor
On Fri, Jun 3, 2022 at 10:44 AM Peter Geoghegan  wrote:

> What about dual-pivot quicksort, which is used in Java 7+? That is the
> defacto successor to Bentley & McIlroy. In fact, Jon Bentley himself
> collaborated with its author, and provided benchmarking input. The
> underlying philosophy is essentially the same as the original -- it
> is supposed to be an "industrial strength" quicksort, with various
> adversarial cases considered directly.

Here is a *rough* first pass at dual-pivot quicksort. I haven't
changed the regression tests to adjust for qsort being an unstable
sort, and there is some dead code. I looked to a couple JDKs for
examples of design decisions as a first approximation. It includes a
module with a few debugging printouts, run like so: select
debug_qsort_int(array[7,6,5,4,3,2,1]);

Pivot selection: A common way is to pick five candidates (here: mid,
+/- 1/7, +/- 2/7), sort them in place, then pick the 2nd and 4th of
them, or "tertile of five". If they are all evenly spaced, we can do
insertion sort.

Fall back to single-pivot: If any of the pivot candidates are equal,
JDK assumes there could be a large number of duplicates and falls back
to single-pivot, using the median of the five. I've done the same
here. Despite having two code paths in all of our template instances,
the VM binary is only about 5kb bigger, since MED3 is no longer built.

Performance testing: Not started yet. I'm thinking an apples-to-apples
comparison is not insightful enough, since the current insertion sort
threshold 7 is already a bit constrained for single-pivot, and would
be even more so for dual pivot, especially since 5 of the elements are
pre-sorted to choose the pivots. My plan is to retest the threshold on
HEAD using my latest tests, then use that as a starting point to test
thresholds with dual-pivot.

-- 
John Naylor
EDB: http://www.enterprisedb.com
From 5e920d9d3e8d2a2a75e63ade8bc73c8322c1934b Mon Sep 17 00:00:00 2001
From: John Naylor 
Date: Mon, 30 May 2022 10:09:17 +0700
Subject: [PATCH v1 1/2] Create internal workhorse for ST_SORT

No functional changes.
---
 src/include/lib/sort_template.h | 36 -
 1 file changed, 31 insertions(+), 5 deletions(-)

diff --git a/src/include/lib/sort_template.h b/src/include/lib/sort_template.h
index 3122a93009..54921de568 100644
--- a/src/include/lib/sort_template.h
+++ b/src/include/lib/sort_template.h
@@ -191,6 +191,7 @@ ST_SCOPE void ST_SORT(ST_ELEMENT_TYPE * first, size_t n
 
 /* sort private helper functions */
 #define ST_MED3 ST_MAKE_NAME(ST_SORT, med3)
+#define ST_SORT_INTERNAL ST_MAKE_NAME(ST_SORT, internal)
 #define ST_SWAP ST_MAKE_NAME(ST_SORT, swap)
 #define ST_SWAPN ST_MAKE_NAME(ST_SORT, swapn)
 
@@ -217,7 +218,7 @@ ST_SCOPE void ST_SORT(ST_ELEMENT_TYPE * first, size_t n
 			ST_SORT_INVOKE_COMPARE		\
 			ST_SORT_INVOKE_ARG)
 #define DO_SORT(a_, n_)	\
-	ST_SORT((a_), (n_)	\
+	ST_SORT_INTERNAL((a_), (n_)	\
 			ST_SORT_INVOKE_ELEMENT_SIZE	\
 			ST_SORT_INVOKE_COMPARE		\
 			ST_SORT_INVOKE_ARG)
@@ -273,15 +274,15 @@ ST_SWAPN(ST_POINTER_TYPE * a, ST_POINTER_TYPE * b, size_t n)
 }
 
 /*
- * Sort an array.
+ * Workhorse for ST_SORT
  */
-ST_SCOPE void
-ST_SORT(ST_ELEMENT_TYPE * data, size_t n
+static void
+ST_SORT_INTERNAL(ST_POINTER_TYPE * data, size_t n
 		ST_SORT_PROTO_ELEMENT_SIZE
 		ST_SORT_PROTO_COMPARE
 		ST_SORT_PROTO_ARG)
 {
-	ST_POINTER_TYPE *a = (ST_POINTER_TYPE *) data,
+	ST_POINTER_TYPE *a = data,
 			   *pa,
 			   *pb,
 			   *pc,
@@ -399,6 +400,30 @@ loop:
 		}
 	}
 }
+
+/*
+ * Sort an array.
+ */
+ST_SCOPE void
+ST_SORT(ST_ELEMENT_TYPE * data, size_t n
+		ST_SORT_PROTO_ELEMENT_SIZE
+		ST_SORT_PROTO_COMPARE
+		ST_SORT_PROTO_ARG)
+{
+	ST_POINTER_TYPE *begin = (ST_POINTER_TYPE *) data;
+
+	DO_SORT(begin, n);
+
+#ifdef USE_ASSERT_CHECKING
+	/* WIP: verify the sorting worked */
+	for (ST_POINTER_TYPE *pm = begin + ST_POINTER_STEP; pm < begin + n * ST_POINTER_STEP;
+		 pm += ST_POINTER_STEP)
+	{
+		if (DO_COMPARE(pm, pm - ST_POINTER_STEP) < 0)
+			Assert(false);
+	}
+#endif			/* USE_ASSERT_CHECKING */
+}
 #endif
 
 #undef DO_CHECK_FOR_INTERRUPTS
@@ -422,6 +447,7 @@ loop:
 #undef ST_POINTER_TYPE
 #undef ST_SCOPE
 #undef ST_SORT
+#undef ST_SORT_INTERNAL
 #undef ST_SORT_INVOKE_ARG
 #undef ST_SORT_INVOKE_COMPARE
 #undef ST_SORT_INVOKE_ELEMENT_SIZE
-- 
2.36.1

From ac7f8aaa5b7257fb84f819b27525e00a5a6ced84 Mon Sep 17 00:00:00 2001
From: John Naylor 
Date: Thu, 23 Jun 2022 16:07:10 +0700
Subject: [PATCH v1 2/2] Implement dual-pivot quicksort

Choose pivots by running insertion sort on five candidates and choosing
the 2nd and 4th, ("tertile of five"). If any of the five are equal, we
assume the input has many duplicates and fall back to B since it's
optimized for that case.
---
 src/include/lib/sort_template.h   | 191 +-
 src/test/modules/debug_qsort/Makefile |  18 ++
 .../modules/debug_qsort/debug_qsort--1.0.sql  |   1 +
 

Re: some aspects of our qsort might not be ideal

2022-06-02 Thread Peter Geoghegan
On Thu, Jun 2, 2022 at 9:24 PM John Naylor  wrote:
> I had heard of it but not looked into it deeply. I did read that Java
> 7 uses dual pivot quicksort for primitives and timsort for objects. I
> wasn't sure if dual pivot was not good for objects (which could have
> possibly-complex comparators) or if timsort was just simply good for
> Java's use cases. It seems accessible to try doing, so I'll look into
> that.

I think that Timsort is most useful with cases where individual
comparisons are inherently very expensive (perhaps even implemented in
Python), which might be common with objects due to the indirection
that Python (and even Java) impose on objects in general.

I would imagine that this is a lot less relevant in
Postgres/tuplesort, because we have optimizations like abbreviated
keys. And because we're mostly just sorting tuples on one or more
attributes of scalar types.

The abbreviated keys optimization is very much something that comes
from the world of databases, not the world of sorting. It's pretty much a
domain-specific technique. That seems relevant to me.

--
Peter Geoghegan




Re: some aspects of our qsort might not be ideal

2022-06-02 Thread John Naylor
On Fri, Jun 3, 2022 at 10:44 AM Peter Geoghegan  wrote:
>
> What about dual-pivot quicksort, which is used in Java 7+? That is the
> defacto successor to Bentley & McIlroy. In fact, Jon Bentley himself
> collaborated with its author, and provided benchmarking input. The
> underlying philosophy is essentially the same as the original -- it
> is supposed to be an "industrial strength" quicksort, with various
> adversarial cases considered directly.

I had heard of it but not looked into it deeply. I did read that Java
7 uses dual pivot quicksort for primitives and timsort for objects. I
wasn't sure if dual pivot was not good for objects (which could have
possibly-complex comparators) or if timsort was just simply good for
Java's use cases. It seems accessible to try doing, so I'll look into
that.

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: some aspects of our qsort might not be ideal

2022-06-02 Thread Peter Geoghegan
On Thu, Jun 2, 2022 at 8:33 PM John Naylor  wrote:
> Attached is a draft series that implements some but not all features
> of pattern-defeating quicksort, namely the ones I thought were
> interesting for us. Recently this quicksort variant got committed for
> the next release of the Go language 1.19 [1] (which in turn was based
> on that of Rust [2]), and that implementation was a helpful additional
> example. The bottom line is that replacing the partitioning scheme
> this way is likely not worth doing because of our particular use
> cases, but along the way I found some other things that might be worth
> doing, so some good may come out of this.

What about dual-pivot quicksort, which is used in Java 7+? That is the
defacto successor to Bentley & McIlroy. In fact, Jon Bentley himself
collaborated with its author, and provided benchmarking input. The
underlying philosophy is essentially the same as the original -- it
is supposed to be an "industrial strength" quicksort, with various
adversarial cases considered directly.

See:

https://www.wild-inter.net/publications/wild-2018b.pdf

https://codeblab.com/wp-content/uploads/2009/09/DualPivotQuicksort.pdf

At one point quite a few years back I planned on investigating it
myself, but never followed through.

--
Peter Geoghegan




Re: some aspects of our qsort might not be ideal

2022-02-16 Thread Robert Haas
On Wed, Feb 16, 2022 at 2:29 AM John Naylor
 wrote:
> Does anyone see a reason not to put in the necessary work to try it out?

Seems reasonable to me. It's always a bit difficult, I feel, to know
what test cases to use - almost any idea is going to have some case
where it's worse than what we do today, and there can always be some
user who does that exact thing 100% of the time. Moreover, it's hard
to be certain that test cases we construct - say, ordered data,
reverse ordered data, randomly ordered data, almost ordered data with
a single element out of place, etc. - are actually covering all of the
interesting cases. At the same time, I don't think anyone would
seriously disagree with what you say in the subject line, and we won't
make any progress by NOT trying things that are recommended in the
academic literature.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: some aspects of our qsort might not be ideal

2022-02-15 Thread John Naylor
> This way, we *dynamically*
> decide to optimistically start an insertion sort, in the hopes that it
> will occasionally prevent a recursion, and worst case the input is
> slightly more sorted for the next recursion.

I should mention one detail -- we put a limit on how many iterations
we attempt before we give up and partition/recurse. The author's
implementation chooses 8 for this limit. The paper mentions this
technique in section 5.2, but is not the origin of it.

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: some aspects of our qsort might not be ideal

2022-02-15 Thread John Naylor
> [3] 
> https://www.postgresql.org/message-id/flat/87F42982BF2B434F831FCEF4C45FC33E5BD369DF%40EXCHANGE.corporate.connx.com#e69718293c45d89555020bd0922ad055

The top of that thread is where I meant to point to:

https://www.postgresql.org/message-id/flat/CAEYLb_Xn4-6f1ofsf2qduf24dDCVHbQidt7JPpdL_RiT1zBJ6A%40mail.gmail.com

-- 
John Naylor
EDB: http://www.enterprisedb.com