Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-20 Thread Robert Haas
On Sun, Oct 18, 2015 at 2:52 PM, Peter Geoghegan  wrote:
> On Thu, Oct 15, 2015 at 9:07 AM, Robert Haas  wrote:
>> Would you be willing to try coding it up that way so we can see how it looks?
>
> I attach a patch that does it that way.

That looks good to me.  I've committed this, and the other patch to
remove the obsolete comment.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-18 Thread Peter Geoghegan
On Thu, Oct 15, 2015 at 9:07 AM, Robert Haas  wrote:
> Would you be willing to try coding it up that way so we can see how it looks?

I attach a patch that does it that way.

Note that I will be away for until late this month. Do not expect a
response from me before then, unless it's truly urgent.

Thanks

-- 
Peter Geoghegan
From 22936e4d69e74e8e744053024d7feea63d7cea56 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Mon, 12 Oct 2015 17:13:28 -0700
Subject: [PATCH] Correctly distinguish the contents of text buffers

Avoid confusing an original string with a strxfrm() buffer, or
vice-versa.  This could previously occur in the event of interleaving of
authoritative comparisons with conversions to abbreviated keys.  Such
interleaving is possible with external sorts.
---
 src/backend/utils/adt/varlena.c | 29 +
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index d545c34..83523f7 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -65,6 +65,7 @@ typedef struct
 	int			last_len1;		/* Length of last buf1 string/strxfrm() blob */
 	int			last_len2;		/* Length of last buf2 string/strxfrm() blob */
 	int			last_returned;	/* Last comparison result (cache) */
+	bool		cache_blob;		/* Does buf2 contain strxfrm() blob, etc? */
 	bool		collate_c;
 	hyperLogLogState abbr_card; /* Abbreviated key cardinality state */
 	hyperLogLogState full_card; /* Full key cardinality state */
@@ -1838,17 +1839,26 @@ btsortsupport_worker(SortSupport ssup, Oid collid)
 		/* Start with invalid values */
 		tss->last_len1 = -1;
 		tss->last_len2 = -1;
+		/* Initialize */
+		tss->last_returned = 0;
 #ifdef HAVE_LOCALE_T
 		tss->locale = locale;
 #endif
 		/*
-		 * To avoid somehow confusing a strxfrm() blob and an original string
-		 * within bttextfastcmp_locale(), initialize last returned cache to a
-		 * sentinel value.  A platform-specific actual strcoll() return value
-		 * of INT_MIN seems unlikely, but if that occurs it will not cause
-		 * wrong answers.
+		 * To avoid somehow confusing a strxfrm() blob and an original string,
+		 * constantly keep track of the variety of data that buf1 and buf2
+		 * currently contain.
+		 *
+		 * Comparisons may be interleaved with conversion calls.  Frequently,
+		 * conversions and comparisons are batched into two distinct phases,
+		 * but the correctness of caching cannot hinge upon this.  For
+		 * comparison caching, buffer state is only trusted if cache_blob is
+		 * found set to false, whereas strxfrm() caching only trusts the state
+		 * when cache_blob is found set to true.
+		 *
+		 * Arbitrarily initialize cache_blob to true.
 		 */
-		tss->last_returned = INT_MIN;
+		tss->cache_blob = true;
 		tss->collate_c = collate_c;
 		ssup->ssup_extra = tss;
 
@@ -1983,7 +1993,7 @@ bttextfastcmp_locale(Datum x, Datum y, SortSupport ssup)
 		tss->buf2[len2] = '\0';
 		tss->last_len2 = len2;
 	}
-	else if (arg1_match && tss->last_returned != INT_MIN)
+	else if (arg1_match && !tss->cache_blob)
 	{
 		/* Use result cached following last actual strcoll() call */
 		result = tss->last_returned;
@@ -2006,6 +2016,7 @@ bttextfastcmp_locale(Datum x, Datum y, SortSupport ssup)
 		result = strcmp(tss->buf1, tss->buf2);
 
 	/* Cache result, perhaps saving an expensive strcoll() call next time */
+	tss->cache_blob = false;
 	tss->last_returned = result;
 done:
 	/* We can't afford to leak memory here. */
@@ -2090,7 +2101,7 @@ bttext_abbrev_convert(Datum original, SortSupport ssup)
 		}
 
 		/* Might be able to reuse strxfrm() blob from last call */
-		if (tss->last_len1 == len &&
+		if (tss->last_len1 == len && tss->cache_blob &&
 			memcmp(tss->buf1, authoritative_data, len) == 0)
 		{
 			memcpy(pres, tss->buf2, Min(sizeof(Datum), tss->last_len2));
@@ -2171,6 +2182,8 @@ bttext_abbrev_convert(Datum original, SortSupport ssup)
 
 	addHyperLogLog(&tss->abbr_card, hash);
 
+	/* Cache result, perhaps saving an expensive strxfrm() call next time */
+	tss->cache_blob = true;
 done:
 	/*
 	 * Byteswap on little-endian machines.
-- 
1.9.1


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-15 Thread Robert Haas
On Wed, Oct 14, 2015 at 7:11 PM, Peter Geoghegan  wrote:
> On Wed, Oct 14, 2015 at 4:09 PM, Robert Haas  wrote:
>> I wonder if it wouldn't be better to just add a separate Boolean
>> indicating exactly the thing we care about.  This doesn't seem
>> particularly easy to understand and verify.
>
> I'm not really sure that that's an improvement. But I defer to you.

Would you be willing to try coding it up that way so we can see how it looks?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-14 Thread Peter Geoghegan
On Wed, Oct 14, 2015 at 4:09 PM, Robert Haas  wrote:
> I wonder if it wouldn't be better to just add a separate Boolean
> indicating exactly the thing we care about.  This doesn't seem
> particularly easy to understand and verify.

I'm not really sure that that's an improvement. But I defer to you.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-14 Thread Robert Haas
On Wed, Oct 14, 2015 at 7:05 PM, Peter Geoghegan  wrote:
> On Mon, Oct 12, 2015 at 12:31 PM, Peter Geoghegan  wrote:
>> I'll consider a more comprehensive fix.
>
> I attach a revised fix that considers the problem of misinterpreting
> the contents of the buffers in both directions.

I wonder if it wouldn't be better to just add a separate Boolean
indicating exactly the thing we care about.  This doesn't seem
particularly easy to understand and verify.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-14 Thread Peter Geoghegan
On Mon, Oct 12, 2015 at 12:31 PM, Peter Geoghegan  wrote:
> I'll consider a more comprehensive fix.

I attach a revised fix that considers the problem of misinterpreting
the contents of the buffers in both directions.

Thanks
-- 
Peter Geoghegan
From c2d4558df3ce93b939fa319dd6e0735400833df0 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Mon, 12 Oct 2015 17:13:28 -0700
Subject: [PATCH] Correctly distinguish the contents of text buffers

Avoid confusing an original string with a strxfrm() buffer, or
vice-versa.  This could previously occur in the event of interleaving of
authoritative comparisons with conversions to abbreviated keys.  Such
interleaving is possible with external sorts.
---
 src/backend/utils/adt/varlena.c | 35 +++
 1 file changed, 27 insertions(+), 8 deletions(-)

diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index d545c34..c8574f0 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -64,7 +64,7 @@ typedef struct
 	int			buflen2;
 	int			last_len1;		/* Length of last buf1 string/strxfrm() blob */
 	int			last_len2;		/* Length of last buf2 string/strxfrm() blob */
-	int			last_returned;	/* Last comparison result (cache) */
+	int			last_returned;	/* Last comparison result/caching sentinel */
 	bool		collate_c;
 	hyperLogLogState abbr_card; /* Abbreviated key cardinality state */
 	hyperLogLogState full_card; /* Full key cardinality state */
@@ -1843,10 +1843,17 @@ btsortsupport_worker(SortSupport ssup, Oid collid)
 #endif
 		/*
 		 * To avoid somehow confusing a strxfrm() blob and an original string
-		 * within bttextfastcmp_locale(), initialize last returned cache to a
+		 * within bttextfastcmp_locale(), initialize last_returned cache to a
 		 * sentinel value.  A platform-specific actual strcoll() return value
-		 * of INT_MIN seems unlikely, but if that occurs it will not cause
-		 * wrong answers.
+		 * of INT_MIN seems unlikely, but will be immediately changed to -1 to
+		 * remove the possibility of wrong answers.
+		 *
+		 * Comparisons that attempt to reuse last_returned may be interleaved
+		 * with conversion calls.  Frequently, conversions and comparisons are
+		 * batched into two distinct phases, but the correctness of caching
+		 * cannot hinge upon this.  For comparison caching we only trust the
+		 * cache when last_returned != INT_MIN, while for strxfrm() caching we
+		 * only trust the cache when last_returned == INT_MIN.
 		 */
 		tss->last_returned = INT_MIN;
 		tss->collate_c = collate_c;
@@ -1931,9 +1938,9 @@ bttextfastcmp_locale(Datum x, Datum y, SortSupport ssup)
 	if (len1 == len2 && memcmp(a1p, a2p, len1) == 0)
 	{
 		/*
-		 * No change in buf1 or buf2 contents, so avoid changing last_len1 or
-		 * last_len2.  Existing contents of buffers might still be used by next
-		 * call.
+		 * No change in buf1 or buf2 contents, so avoid changing last_len1,
+		 * last_len2, or last_returned.  Existing contents of buffers might
+		 * still be used by next call.
 		 */
 		result = 0;
 		goto done;
@@ -2005,7 +2012,12 @@ bttextfastcmp_locale(Datum x, Datum y, SortSupport ssup)
 	if (result == 0)
 		result = strcmp(tss->buf1, tss->buf2);
 
-	/* Cache result, perhaps saving an expensive strcoll() call next time */
+	/*
+	 * Cache result, perhaps saving an expensive strcoll() call next time.
+	 * Interpret a strcoll() return value that happens to be the sentinel value
+	 * INT_MIN as -1.
+	 */
+	result = Max(-1, result);
 	tss->last_returned = result;
 done:
 	/* We can't afford to leak memory here. */
@@ -2091,6 +2103,7 @@ bttext_abbrev_convert(Datum original, SortSupport ssup)
 
 		/* Might be able to reuse strxfrm() blob from last call */
 		if (tss->last_len1 == len &&
+			tss->last_returned == INT_MIN &&
 			memcmp(tss->buf1, authoritative_data, len) == 0)
 		{
 			memcpy(pres, tss->buf2, Min(sizeof(Datum), tss->last_len2));
@@ -2173,6 +2186,12 @@ bttext_abbrev_convert(Datum original, SortSupport ssup)
 
 done:
 	/*
+	 * Set last_returned to sentinel value to indicate that buffers store
+	 * strxfrm() original string and blob
+	 */
+	tss->last_returned = INT_MIN;
+
+	/*
 	 * Byteswap on little-endian machines.
 	 *
 	 * This is needed so that bttextcmp_abbrev() (an unsigned integer 3-way
-- 
1.9.1


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-12 Thread Peter Geoghegan
On Mon, Oct 12, 2015 at 4:15 PM, Robert Haas  wrote:
> In this case, I think
> the best thing for me to do right now is wait to commit anything
> further until you have had a chance to go over this and come up with a
> fix or set of fixes that you think are completely, 100% ready to go;
> or else until you get to the point of wanting feedback before
> proceeding further.  In general, I would appreciate if this sort of
> post-commit self-review could be done pre-commit.

This came to me while I was making dinner last night - I was not
actually poring over the code on my computer. I don't know why I
thought of it then and not at any earlier point, but it seems
reasonable to suppose it had something to do with perspective, or
being able to see a bigger picture that is difficult to keep in mind
during initial development and self review. I was mostly thinking
about external sorting at the time, and not this patch.

I cannot do that on demand. It isn't a matter of effort. When I come
back from vacation at the end of the month, I may be able to do
somewhat better for a few months.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-12 Thread Robert Haas
On Mon, Oct 12, 2015 at 3:31 PM, Peter Geoghegan  wrote:
> On Mon, Oct 12, 2015 at 12:47 AM, Peter Geoghegan  wrote:
>> I also noticed that I failed to reset the last_returned strcoll()
>> cache variable as part of an abbreviation call, despite the fact that
>> tapesort may freely interleave conversions with comparisons, while
>> reusing buf1 and buf2 both as scratch space for strxfrm() blobs, as
>> well as for storing strings to be compared with strcoll(). I suggest
>> that the attached patch also be applied to fix this issue.
>
> I think that I jumped the gun with this fix, because theoretically you
> can still get the same problem in the opposite direction -- an
> original string treated as a strxfrm() blob when the cache is
> consulted.
>
> I'll consider a more comprehensive fix.

This is not the first time I've committed one of your patches and
promptly received a whole series of emails with fixes for what went
into that commit, despite the fact that I have not and did not change
the relevant parts of the patch while committing.  Since that makes
more work for me, I am not a huge fan, and the practical effect is to
subtract from the amount of time that could otherwise have been spent
reviewing your next patch (or someone else's).  In this case, I think
the best thing for me to do right now is wait to commit anything
further until you have had a chance to go over this and come up with a
fix or set of fixes that you think are completely, 100% ready to go;
or else until you get to the point of wanting feedback before
proceeding further.  In general, I would appreciate if this sort of
post-commit self-review could be done pre-commit.

I apologize for the fact that this email probably sounds grouchy.
Please try to read it in the most positive light possible (and don't
shoot me).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-12 Thread Peter Geoghegan
On Mon, Oct 12, 2015 at 12:47 AM, Peter Geoghegan  wrote:
> I also noticed that I failed to reset the last_returned strcoll()
> cache variable as part of an abbreviation call, despite the fact that
> tapesort may freely interleave conversions with comparisons, while
> reusing buf1 and buf2 both as scratch space for strxfrm() blobs, as
> well as for storing strings to be compared with strcoll(). I suggest
> that the attached patch also be applied to fix this issue.

I think that I jumped the gun with this fix, because theoretically you
can still get the same problem in the opposite direction -- an
original string treated as a strxfrm() blob when the cache is
consulted.

I'll consider a more comprehensive fix.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-12 Thread Peter Geoghegan
On Sat, Oct 10, 2015 at 12:32 PM, Peter Geoghegan  wrote:
> I noticed that there is still one comment that I really should have
> removed as part of this work.

I also noticed that I failed to reset the last_returned strcoll()
cache variable as part of an abbreviation call, despite the fact that
tapesort may freely interleave conversions with comparisons, while
reusing buf1 and buf2 both as scratch space for strxfrm() blobs, as
well as for storing strings to be compared with strcoll(). I suggest
that the attached patch also be applied to fix this issue.

-- 
Peter Geoghegan
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index d545c34..9725a5f 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -2173,6 +2173,12 @@ bttext_abbrev_convert(Datum original, SortSupport ssup)
 
 done:
 	/*
+	 * Set last_returned to sentinel value.  Comparisons that attempt to reuse
+	 * last_returned may be interleaved with calls here.
+	 */
+	tss->last_returned = INT_MIN;
+
+	/*
 	 * Byteswap on little-endian machines.
 	 *
 	 * This is needed so that bttextcmp_abbrev() (an unsigned integer 3-way

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-10 Thread Peter Geoghegan
On Fri, Oct 9, 2015 at 5:54 PM, Robert Haas  wrote:
> I think that is true.  I spent some time thinking about whether the
> way you used INT_MIN as a sentinel value should be changed around
> somehow, but ultimately I decided that it wasn't too bad and that
> suggesting something else would be pointless kibitzing.  I also tried
> to think of scenarios in which this would lose, and I'm not totally
> convinced that there aren't any, but I'm convinced that, if they
> exist, I don't know what they are.  Since the patch did deliver a
> small improvement on my test cases and on yours, I think we might as
> well have it in the tree.  If some pathological scenario shows up
> where it turns out to hurt, we can always fix it then, or revert if it
> need be.

That seems very reasonable.

I noticed that there is still one comment that I really should have
removed as part of this work. The comment didn't actually add any new
information for 9.5, but is now obsolete. Attached patch removes it
entirely.

-- 
Peter Geoghegan
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index d545c34..3978b1e 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -2056,10 +2056,6 @@ bttext_abbrev_convert(Datum original, SortSupport ssup)
 	int			len;
 	uint32		hash;
 
-	/*
-	 * Abbreviated key representation is a pass-by-value Datum that is treated
-	 * as a char array by the specialized comparator bttextcmp_abbrev().
-	 */
 	pres = (char *) &res;
 	/* memset(), so any non-overwritten bytes are NUL */
 	memset(pres, 0, sizeof(Datum));

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-09 Thread Robert Haas
On Fri, Oct 9, 2015 at 7:49 PM, Peter Geoghegan  wrote:
> On Fri, Oct 9, 2015 at 4:39 PM, Robert Haas  wrote:
>> You're welcome.  After some study and experimentation, I've committed
>> the other part also.
>
> Fantastic. I guess the precedent of the 9.5 text equality fast path
> made this discussion way smoother than last time, since essentially
> the same principle applies.

I think that is true.  I spent some time thinking about whether the
way you used INT_MIN as a sentinel value should be changed around
somehow, but ultimately I decided that it wasn't too bad and that
suggesting something else would be pointless kibitzing.  I also tried
to think of scenarios in which this would lose, and I'm not totally
convinced that there aren't any, but I'm convinced that, if they
exist, I don't know what they are.  Since the patch did deliver a
small improvement on my test cases and on yours, I think we might as
well have it in the tree.  If some pathological scenario shows up
where it turns out to hurt, we can always fix it then, or revert if it
need be.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-09 Thread Peter Geoghegan
On Fri, Oct 9, 2015 at 4:39 PM, Robert Haas  wrote:
> You're welcome.  After some study and experimentation, I've committed
> the other part also.

Fantastic. I guess the precedent of the 9.5 text equality fast path
made this discussion way smoother than last time, since essentially
the same principle applies.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-09 Thread Robert Haas
On Fri, Oct 9, 2015 at 3:33 PM, Peter Geoghegan  wrote:
> On Fri, Oct 9, 2015 at 12:11 PM, Robert Haas  wrote:
>> OK, committed that way.
>
> Thank you.

You're welcome.  After some study and experimentation, I've committed
the other part also.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-09 Thread Peter Geoghegan
On Fri, Oct 9, 2015 at 12:11 PM, Robert Haas  wrote:
> OK, committed that way.

Just for the record, with the same "cities" table as my original post
to this thread, this query:

select count(distinct(city)) from cities;

Goes from taking about 296ms (once it stabilizes), to about 265ms
(once it stabilizes) following today's commit of just the unsigned
integer comparison patch. I've shaved just over 10% off the duration
of this representative sort-heavy query (against a 9.5 baseline),
which is nice.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-09 Thread Peter Geoghegan
On Fri, Oct 9, 2015 at 12:11 PM, Robert Haas  wrote:
> OK, committed that way.

Thank you.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-09 Thread Robert Haas
On Fri, Oct 9, 2015 at 2:48 PM, Peter Geoghegan  wrote:
> On Fri, Oct 9, 2015 at 11:44 AM, Robert Haas  wrote:
>> Hmm.  But then this doesn't seem to make much sense:
>>
>> + * Rearrange the bytes of a Datum into little-endian order from big-endian
>> + * order.  On big-endian machines, this does nothing at all.
>>
>> Rearranging bytes into little-endian order ought to be a no-op on a
>> little-endian machine; and rearranging them into big-endian order
>> ought to be a no-op on a big-endian machine.
>
> I think that that's very clearly implied anyway.
>
>> Thinking about this a bit more, it seems like the situation we're in
>> here is that the input datum is always going to be big-endian.
>> Regardless of what the machine's integer format is, the sortsupport
>> abbreviator is going to output a Datum where the most significant byte
>> is the first one stored in the datum.  We want to convert that Datum
>> to one that has *native* endianness.  So maybe we should call this
>> DatumBigEndianToNative or something like that.
>
> I'd be fine with DatumBigEndianToNative() -- I agree that that's
> slightly better.

OK, committed that way.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-09 Thread Peter Geoghegan
On Fri, Oct 9, 2015 at 11:44 AM, Robert Haas  wrote:
> Hmm.  But then this doesn't seem to make much sense:
>
> + * Rearrange the bytes of a Datum into little-endian order from big-endian
> + * order.  On big-endian machines, this does nothing at all.
>
> Rearranging bytes into little-endian order ought to be a no-op on a
> little-endian machine; and rearranging them into big-endian order
> ought to be a no-op on a big-endian machine.

I think that that's very clearly implied anyway.

> Thinking about this a bit more, it seems like the situation we're in
> here is that the input datum is always going to be big-endian.
> Regardless of what the machine's integer format is, the sortsupport
> abbreviator is going to output a Datum where the most significant byte
> is the first one stored in the datum.  We want to convert that Datum
> to one that has *native* endianness.  So maybe we should call this
> DatumBigEndianToNative or something like that.

I'd be fine with DatumBigEndianToNative() -- I agree that that's
slightly better.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-09 Thread Robert Haas
On Thu, Oct 8, 2015 at 8:20 PM, Peter Geoghegan  wrote:
> I should point out that I did not call the macro DatumToBigEndian(),
> because it's actually the other way around. I called it
> DatumToLittleEndian(), since the unsigned integer comparator would
> work correctly on big-endian systems without calling any new macro
> (which is of course why the new macro does nothing on big-endian
> systems). We start off with a big endian Datum/unsigned integer on all
> platforms, and then we byteswap it to make it a little-endian unsigned
> integer if and when that's required (i.e. only on little-endian
> systems).

Hmm.  But then this doesn't seem to make much sense:

+ * Rearrange the bytes of a Datum into little-endian order from big-endian
+ * order.  On big-endian machines, this does nothing at all.

Rearranging bytes into little-endian order ought to be a no-op on a
little-endian machine; and rearranging them into big-endian order
ought to be a no-op on a big-endian machine.

Thinking about this a bit more, it seems like the situation we're in
here is that the input datum is always going to be big-endian.
Regardless of what the machine's integer format is, the sortsupport
abbreviator is going to output a Datum where the most significant byte
is the first one stored in the datum.  We want to convert that Datum
to one that has *native* endianness.  So maybe we should call this
DatumBigEndianToNative or something like that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-08 Thread Peter Geoghegan
On Thu, Oct 8, 2015 at 11:37 AM, Robert Haas  wrote:
> I'm not convinced.  Doesn't this exact same concept get used for
> over-the-wire communication between BE and LE machines?  There, this
> operation is spelled htonl/ntohl.  Some systems even have htonll, but
> I'm sure there are still a bunch that don't.

I continue to disagree with that. The spelling of the macro that you
propose suggests that this process occurs at a relatively high level
of abstraction, which is misleading. Datums that have abbreviated key
bytes packed into them are in general kind of special. All the same,
here is a revision of the patch series along those lines. I'll also
have to update the UUID patch to independently note the same issues.

I should point out that I did not call the macro DatumToBigEndian(),
because it's actually the other way around. I called it
DatumToLittleEndian(), since the unsigned integer comparator would
work correctly on big-endian systems without calling any new macro
(which is of course why the new macro does nothing on big-endian
systems). We start off with a big endian Datum/unsigned integer on all
platforms, and then we byteswap it to make it a little-endian unsigned
integer if and when that's required (i.e. only on little-endian
systems).

-- 
Peter Geoghegan
From ddc2bce56f8d375a22d7a635dd402ad1e507b85c Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Sat, 3 Oct 2015 16:11:02 -0700
Subject: [PATCH 2/3] Add two text sort caching optimizations

Firstly, cache strxfrm() blobs across calls made to the text SortSupport
abbreviation routine.  Secondly, cache strcoll() results across calls to
the text comparison routine (SortSupport authoritative comparator only).

The cost of doing this should be immeasurably small in cases where the
optimization does not help, while the benefits in cases where the
optimization is effective should be quite significant.
---
 src/backend/utils/adt/varlena.c | 75 ++---
 1 file changed, 71 insertions(+), 4 deletions(-)

diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index fadd827..d814502 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -62,6 +62,9 @@ typedef struct
 	char	   *buf2;			/* 2nd string, or abbreviation strxfrm() buf */
 	int			buflen1;
 	int			buflen2;
+	int			last_len1;		/* Length of last buf1 string/strxfrm() blob */
+	int			last_len2;		/* Length of last buf2 string/strxfrm() blob */
+	int			last_returned;	/* Last comparison result (cache) */
 	bool		collate_c;
 	hyperLogLogState abbr_card; /* Abbreviated key cardinality state */
 	hyperLogLogState full_card; /* Full key cardinality state */
@@ -1832,9 +1835,20 @@ btsortsupport_worker(SortSupport ssup, Oid collid)
 		tss->buflen1 = TEXTBUFLEN;
 		tss->buf2 = palloc(TEXTBUFLEN);
 		tss->buflen2 = TEXTBUFLEN;
+		/* Start with invalid values */
+		tss->last_len1 = -1;
+		tss->last_len2 = -1;
 #ifdef HAVE_LOCALE_T
 		tss->locale = locale;
 #endif
+		/*
+		 * To avoid somehow confusing a strxfrm() blob and an original string
+		 * within bttextfastcmp_locale(), initialize last returned cache to a
+		 * sentinel value.  A platform-specific actual strcoll() return value
+		 * of INT_MIN seems unlikely, but if that occurs it will not cause
+		 * wrong answers.
+		 */
+		tss->last_returned = INT_MIN;
 		tss->collate_c = collate_c;
 		ssup->ssup_extra = tss;
 
@@ -1897,6 +1911,7 @@ bttextfastcmp_locale(Datum x, Datum y, SortSupport ssup)
 {
 	text	   *arg1 = DatumGetTextPP(x);
 	text	   *arg2 = DatumGetTextPP(y);
+	bool		arg1_match;
 	TextSortSupport *tss = (TextSortSupport *) ssup->ssup_extra;
 
 	/* working state */
@@ -1915,6 +1930,11 @@ bttextfastcmp_locale(Datum x, Datum y, SortSupport ssup)
 	/* Fast pre-check for equality, as discussed in varstr_cmp() */
 	if (len1 == len2 && memcmp(a1p, a2p, len1) == 0)
 	{
+		/*
+		 * No change in buf1 or buf2 contents, so avoid changing last_len1 or
+		 * last_len2.  Existing contents of buffers might still be used by next
+		 * call.
+		 */
 		result = 0;
 		goto done;
 	}
@@ -1932,10 +1952,43 @@ bttextfastcmp_locale(Datum x, Datum y, SortSupport ssup)
 		tss->buf2 = MemoryContextAlloc(ssup->ssup_cxt, tss->buflen2);
 	}
 
-	memcpy(tss->buf1, a1p, len1);
-	tss->buf1[len1] = '\0';
-	memcpy(tss->buf2, a2p, len2);
-	tss->buf2[len2] = '\0';
+	/*
+	 * We're likely to be asked to compare the same strings repeatedly, and
+	 * memcmp() is so much cheaper than strcoll() that it pays to try to cache
+	 * comparisons, even though in general there is no reason to think that
+	 * that will work out (every text datum may be unique).  Caching does not
+	 * slow things down measurably when it doesn't work out, and can speed
+	 * things up by rather a lot when it does.  In part, this is because the
+	 * memcmp() compares data from cachelines that are needed in L1 cache even
+	 * when the last comparison's result cannot be reused.
+	 */
+	arg1_match = true;
+	if (len1 != tss->last_len1 

Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-08 Thread Robert Haas
On Thu, Oct 8, 2015 at 2:07 PM, Peter Geoghegan  wrote:
> On Thu, Oct 8, 2015 at 10:13 AM, Robert Haas  wrote:
>> It seems to me that (1) ABBREV_STRING_UINT isn't
>> a great name for this and (2) the comment is awfully long for the
>> thing to which it refers.  I suggest that we instead call it
>> DatumToBigEndian(), put it pg_bswap.h, and change the comments to
>> something like this:
>>
>> /*
>>  * Rearrange the bytes of a Datum into big-endian order.
>>  *
>>  * One possible application of this macro is to make comparisons
>> cheaper.  An integer
>>  * comparison of the new Datums will return the same result as a memcmp() on 
>> the
>>  * original Datums, but the integer comparison should be much cheaper.
>>  */
>>
>> The specific way that this is used by various sortsupport routines can
>> be adequately explained in the comments for those routines.
>
> This is pretty clearly something specific to SortSupport. I'm not
> opposed to changing the name and making the comments more terse along
> those lines, but I think it should live in sortsupport.h. The macro
> byteswaps datums on little-endian platforms only, which seems very
> specific.
>
> I think that we're going to have SortSupport with abbreviation for
> UUIDs and bytea at some point, and maybe character(n). Centralizing
> information about this to sortsupport.h makes sense to me.

I'm not convinced.  Doesn't this exact same concept get used for
over-the-wire communication between BE and LE machines?  There, this
operation is spelled htonl/ntohl.  Some systems even have htonll, but
I'm sure there are still a bunch that don't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-08 Thread Peter Geoghegan
On Thu, Oct 8, 2015 at 10:13 AM, Robert Haas  wrote:
> It seems to me that (1) ABBREV_STRING_UINT isn't
> a great name for this and (2) the comment is awfully long for the
> thing to which it refers.  I suggest that we instead call it
> DatumToBigEndian(), put it pg_bswap.h, and change the comments to
> something like this:
>
> /*
>  * Rearrange the bytes of a Datum into big-endian order.
>  *
>  * One possible application of this macro is to make comparisons
> cheaper.  An integer
>  * comparison of the new Datums will return the same result as a memcmp() on 
> the
>  * original Datums, but the integer comparison should be much cheaper.
>  */
>
> The specific way that this is used by various sortsupport routines can
> be adequately explained in the comments for those routines.

This is pretty clearly something specific to SortSupport. I'm not
opposed to changing the name and making the comments more terse along
those lines, but I think it should live in sortsupport.h. The macro
byteswaps datums on little-endian platforms only, which seems very
specific.

I think that we're going to have SortSupport with abbreviation for
UUIDs and bytea at some point, and maybe character(n). Centralizing
information about this to sortsupport.h makes sense to me.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-08 Thread Robert Haas
On Wed, Oct 7, 2015 at 8:09 PM, Peter Geoghegan  wrote:
> On Tue, Oct 6, 2015 at 1:16 PM, Robert Haas  wrote:
>> If you would care to revise the patch accordingly, I will commit it
>> (barring objections from others, of course).
>
> Here is a revision of 0001-*, with both BSWAP32() and BSWAP64() in a
> new header, src/port/pg_bswap.h.
>
> No revisions were required to any other patch in the patch series to
> make this work, and so I only include a revised 0001-*.

Great.  I've committed that, minus the sortsupport.h changes which I
think should be part of 0002, and which in any case I'd like to
discuss a bit more.  It seems to me that (1) ABBREV_STRING_UINT isn't
a great name for this and (2) the comment is awfully long for the
thing to which it refers.  I suggest that we instead call it
DatumToBigEndian(), put it pg_bswap.h, and change the comments to
something like this:

/*
 * Rearrange the bytes of a Datum into big-endian order.
 *
 * One possible application of this macro is to make comparisons
cheaper.  An integer
 * comparison of the new Datums will return the same result as a memcmp() on the
 * original Datums, but the integer comparison should be much cheaper.
 */

The specific way that this is used by various sortsupport routines can
be adequately explained in the comments for those routines.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-07 Thread Peter Geoghegan
On Tue, Oct 6, 2015 at 1:16 PM, Robert Haas  wrote:
> If you would care to revise the patch accordingly, I will commit it
> (barring objections from others, of course).

Here is a revision of 0001-*, with both BSWAP32() and BSWAP64() in a
new header, src/port/pg_bswap.h.

No revisions were required to any other patch in the patch series to
make this work, and so I only include a revised 0001-*.

-- 
Peter Geoghegan
From f671bbccc1eb2a7ecb5c64925630a5593f888a9d Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Mon, 29 Jun 2015 23:53:05 -0400
Subject: [PATCH 1/4] Provide for unsigned comparisons of abbreviated keys

Add ABBREV_STRING_UINT() macro to allow string-like types to represent
abbreviated keys as unsigned integers.  ABBREV_STRING_UINT() will use
the new BSWAP64() byte swapping macro (also introduced in this commit)
on 64-bit little-endian platforms, or the existing BSWAP32() macro on
32-bit platforms.

In future commits, certain types with abbreviation support will call
ABBREV_STRING_UINT() to perform a final conversion process on their
string-like abbreviated keys.  This makes it safe and portable to
implement the abbreviated key comparator as a simple 3-way unsigned
integer comparator, which must also be changed.  By using an
exceptionally cheap abbreviated comparator, a significant boost in sort
performance can be seen for these types on at least some platforms.
---
 config/c-compiler.m4| 18 ++
 configure   | 24 +++
 configure.in|  1 +
 src/include/pg_config.h.in  |  3 +++
 src/include/pg_config.h.win32   |  3 +++
 src/include/port/pg_bswap.h | 46 +++
 src/include/port/pg_crc32c.h| 12 ++
 src/include/utils/sortsupport.h | 53 +
 8 files changed, 150 insertions(+), 10 deletions(-)
 create mode 100644 src/include/port/pg_bswap.h

diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 9feec0c..550d034 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -214,6 +214,24 @@ fi])# PGAC_C_BUILTIN_BSWAP32
 
 
 
+# PGAC_C_BUILTIN_BSWAP64
+# -
+# Check if the C compiler understands __builtin_bswap64(),
+# and define HAVE__BUILTIN_BSWAP64 if so.
+AC_DEFUN([PGAC_C_BUILTIN_BSWAP64],
+[AC_CACHE_CHECK(for __builtin_bswap64, pgac_cv__builtin_bswap64,
+[AC_COMPILE_IFELSE([AC_LANG_SOURCE(
+[static unsigned long int x = __builtin_bswap64(0xaabbccddeeff0011);]
+)],
+[pgac_cv__builtin_bswap64=yes],
+[pgac_cv__builtin_bswap64=no])])
+if test x"$pgac_cv__builtin_bswap64" = xyes ; then
+AC_DEFINE(HAVE__BUILTIN_BSWAP64, 1,
+  [Define to 1 if your compiler understands __builtin_bswap64.])
+fi])# PGAC_C_BUILTIN_BSWAP64
+
+
+
 # PGAC_C_BUILTIN_CONSTANT_P
 # -
 # Check if the C compiler understands __builtin_constant_p(),
diff --git a/configure b/configure
index 99ef10f..b771a83 100755
--- a/configure
+++ b/configure
@@ -11259,6 +11259,30 @@ if test x"$pgac_cv__builtin_bswap32" = xyes ; then
 $as_echo "#define HAVE__BUILTIN_BSWAP32 1" >>confdefs.h
 
 fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_bswap64" >&5
+$as_echo_n "checking for __builtin_bswap64... " >&6; }
+if ${pgac_cv__builtin_bswap64+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+static unsigned long int x = __builtin_bswap64(0xaabbccddeeff0011);
+
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  pgac_cv__builtin_bswap64=yes
+else
+  pgac_cv__builtin_bswap64=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv__builtin_bswap64" >&5
+$as_echo "$pgac_cv__builtin_bswap64" >&6; }
+if test x"$pgac_cv__builtin_bswap64" = xyes ; then
+
+$as_echo "#define HAVE__BUILTIN_BSWAP64 1" >>confdefs.h
+
+fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_constant_p" >&5
 $as_echo_n "checking for __builtin_constant_p... " >&6; }
 if ${pgac_cv__builtin_constant_p+:} false; then :
diff --git a/configure.in b/configure.in
index 4143451..b5868b0 100644
--- a/configure.in
+++ b/configure.in
@@ -1317,6 +1317,7 @@ PGAC_C_FUNCNAME_SUPPORT
 PGAC_C_STATIC_ASSERT
 PGAC_C_TYPES_COMPATIBLE
 PGAC_C_BUILTIN_BSWAP32
+PGAC_C_BUILTIN_BSWAP64
 PGAC_C_BUILTIN_CONSTANT_P
 PGAC_C_BUILTIN_UNREACHABLE
 PGAC_C_VA_ARGS
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index dda73a8..a20e0cd 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -660,6 +660,9 @@
 /* Define to 1 if your compiler understands __builtin_bswap32. */
 #undef HAVE__BUILTIN_BSWAP32
 
+/* Define to 1 if your compiler understands __builtin_bswap64. */
+#undef HAVE__BUILTIN_BSWAP64
+
 /* Define to 1 if your compiler understands __builtin_constant_p. */
 #undef HAVE__BUILTIN_CONSTANT_P
 
diff --git a/src/include/pg_config.h.win32 b/src/include/pg_config.h.wi

Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-06 Thread Robert Haas
On Tue, Oct 6, 2015 at 4:26 PM, Peter Geoghegan  wrote:
> On Tue, Oct 6, 2015 at 1:16 PM, Robert Haas  wrote:
>>> I guess I imagined that bswap64() was fundamental infrastructure, but
>>> on second thought that's not actually in evidence -- it is not already
>>> needed in plenty of other places. So yeah, works for me.
>>
>> If you would care to revise the patch accordingly, I will commit it
>> (barring objections from others, of course).
>
> Sure. It might take me a couple of days to get around to it, though --
> things are a bit hectic here.

I know the feeling.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-06 Thread Peter Geoghegan
On Tue, Oct 6, 2015 at 1:16 PM, Robert Haas  wrote:
>> I guess I imagined that bswap64() was fundamental infrastructure, but
>> on second thought that's not actually in evidence -- it is not already
>> needed in plenty of other places. So yeah, works for me.
>
> If you would care to revise the patch accordingly, I will commit it
> (barring objections from others, of course).

Sure. It might take me a couple of days to get around to it, though --
things are a bit hectic here.

Thanks
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-06 Thread Robert Haas
On Tue, Oct 6, 2015 at 4:09 PM, Peter Geoghegan  wrote:
> On Tue, Oct 6, 2015 at 1:07 PM, Robert Haas  wrote:
>> Reviewing 0001, I'm happy to see us add bswap64, but I'm not sure we
>> should put it in c.h, because that's included by absolutely
>> everything.  How about putting it in a new #include inside src/port,
>> like src/port/pg_bswap.h?  Then pg_crc.h can include that, but other
>> things can, too.
>
> I guess I imagined that bswap64() was fundamental infrastructure, but
> on second thought that's not actually in evidence -- it is not already
> needed in plenty of other places. So yeah, works for me.

If you would care to revise the patch accordingly, I will commit it
(barring objections from others, of course).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-06 Thread Peter Geoghegan
On Tue, Oct 6, 2015 at 1:07 PM, Robert Haas  wrote:
> Reviewing 0001, I'm happy to see us add bswap64, but I'm not sure we
> should put it in c.h, because that's included by absolutely
> everything.  How about putting it in a new #include inside src/port,
> like src/port/pg_bswap.h?  Then pg_crc.h can include that, but other
> things can, too.

I guess I imagined that bswap64() was fundamental infrastructure, but
on second thought that's not actually in evidence -- it is not already
needed in plenty of other places. So yeah, works for me.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-06 Thread Robert Haas
On Sun, Oct 4, 2015 at 2:17 AM, Peter Geoghegan  wrote:
> On Tue, Aug 4, 2015 at 12:41 PM, Robert Haas  wrote:
>> Some comments:
>
> I attach a new version of the patch series that incorporates all your
> feedback. The patch series is now made cumulative in a way that makes
> it easy for someone to independently commit the unsigned integer
> comparison optimization for text, and nothing else. The macro that
> uses is in a dedicated header now, because I have another patch
> (SortSupport for the UUID type) that uses the same optimization for
> the same reason. It seems like something that will probably end up
> with a third or forth client before too long, so I think the byte swap
> macro wrapper belongs in sortsupport.h.

Reviewing 0001, I'm happy to see us add bswap64, but I'm not sure we
should put it in c.h, because that's included by absolutely
everything.  How about putting it in a new #include inside src/port,
like src/port/pg_bswap.h?  Then pg_crc.h can include that, but other
things can, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-10-03 Thread Peter Geoghegan
On Tue, Aug 4, 2015 at 12:41 PM, Robert Haas  wrote:
> Some comments:

I attach a new version of the patch series that incorporates all your
feedback. The patch series is now made cumulative in a way that makes
it easy for someone to independently commit the unsigned integer
comparison optimization for text, and nothing else. The macro that
uses is in a dedicated header now, because I have another patch
(SortSupport for the UUID type) that uses the same optimization for
the same reason. It seems like something that will probably end up
with a third or forth client before too long, so I think the byte swap
macro wrapper belongs in sortsupport.h.

BTW, I think that in practice the merge phase of a tape sort isn't
much helped by comparison caching, contrary to comments appearing in
the original version. The heap data structure used by polyphase merge
has bad properties around locality (both temporal and spatial). I'm
thinking about independently addressing that problem. I now make no
claims about it in this patch.

-- 
Peter Geoghegan
From f9a95e0b930c43a3d5c424df5cd5b3ddb2302763 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Sat, 3 Oct 2015 16:11:02 -0700
Subject: [PATCH 3/4] Add two text sort caching optimizations

Firstly, cache strxfrm() blobs across calls made to the text SortSupport
abbreviation routine.  Secondly, cache strcoll() results across calls to
the text comparison routine (SortSupport authoritative comparator only).

The cost of doing this should be immeasurably small in cases where the
optimization does not help, while the benefits in cases where the
optimization is effective should be quite significant.
---
 src/backend/utils/adt/varlena.c | 75 ++---
 1 file changed, 71 insertions(+), 4 deletions(-)

diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 63ca437..da55a84 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -61,6 +61,9 @@ typedef struct
 	char	   *buf2;			/* 2nd string, or abbreviation strxfrm() buf */
 	int			buflen1;
 	int			buflen2;
+	int			last_len1;		/* Length of last buf1 string/strxfrm() blob */
+	int			last_len2;		/* Length of last buf2 string/strxfrm() blob */
+	int			last_returned;	/* Last comparison result (cache) */
 	bool		collate_c;
 	hyperLogLogState abbr_card; /* Abbreviated key cardinality state */
 	hyperLogLogState full_card; /* Full key cardinality state */
@@ -1831,9 +1834,20 @@ btsortsupport_worker(SortSupport ssup, Oid collid)
 		tss->buflen1 = TEXTBUFLEN;
 		tss->buf2 = palloc(TEXTBUFLEN);
 		tss->buflen2 = TEXTBUFLEN;
+		/* Start with invalid values */
+		tss->last_len1 = -1;
+		tss->last_len2 = -1;
 #ifdef HAVE_LOCALE_T
 		tss->locale = locale;
 #endif
+		/*
+		 * To avoid somehow confusing a strxfrm() blob and an original string
+		 * within bttextfastcmp_locale(), initialize last returned cache to a
+		 * sentinel value.  A platform-specific actual strcoll() return value
+		 * of INT_MIN seems unlikely, but if that occurs it will not cause
+		 * wrong answers.
+		 */
+		tss->last_returned = INT_MIN;
 		tss->collate_c = collate_c;
 		ssup->ssup_extra = tss;
 
@@ -1896,6 +1910,7 @@ bttextfastcmp_locale(Datum x, Datum y, SortSupport ssup)
 {
 	text	   *arg1 = DatumGetTextPP(x);
 	text	   *arg2 = DatumGetTextPP(y);
+	bool		arg1_match;
 	TextSortSupport *tss = (TextSortSupport *) ssup->ssup_extra;
 
 	/* working state */
@@ -1914,6 +1929,11 @@ bttextfastcmp_locale(Datum x, Datum y, SortSupport ssup)
 	/* Fast pre-check for equality, as discussed in varstr_cmp() */
 	if (len1 == len2 && memcmp(a1p, a2p, len1) == 0)
 	{
+		/*
+		 * No change in buf1 or buf2 contents, so avoid changing last_len1 or
+		 * last_len2.  Existing contents of buffers might still be used by next
+		 * call.
+		 */
 		result = 0;
 		goto done;
 	}
@@ -1931,10 +1951,43 @@ bttextfastcmp_locale(Datum x, Datum y, SortSupport ssup)
 		tss->buf2 = MemoryContextAlloc(ssup->ssup_cxt, tss->buflen2);
 	}
 
-	memcpy(tss->buf1, a1p, len1);
-	tss->buf1[len1] = '\0';
-	memcpy(tss->buf2, a2p, len2);
-	tss->buf2[len2] = '\0';
+	/*
+	 * We're likely to be asked to compare the same strings repeatedly, and
+	 * memcmp() is so much cheaper than strcoll() that it pays to try to cache
+	 * comparisons, even though in general there is no reason to think that
+	 * that will work out (every text datum may be unique).  Caching does not
+	 * slow things down measurably when it doesn't work out, and can speed
+	 * things up by rather a lot when it does.  In part, this is because the
+	 * memcmp() compares data from cachelines that are needed in L1 cache even
+	 * when the last comparison's result cannot be reused.
+	 */
+	arg1_match = true;
+	if (len1 != tss->last_len1 || memcmp(tss->buf1, a1p, len1) != 0)
+	{
+		arg1_match = false;
+		memcpy(tss->buf1, a1p, len1);
+		tss->buf1[len1] = '\0';
+		tss->last_len1 = len1;
+	}
+
+	/*
+	 * If we're comparing the same two strings as last time, we can return the

Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-08-04 Thread Peter Geoghegan
On Tue, Aug 4, 2015 at 1:30 PM, Peter Geoghegan  wrote:
>> 2. I believe the change to bttextcmp_abbrev() should be pulled out
>> into a separate patch and committed separately.  That part  seems like
>> a slam dunk.
>
> Makes sense.

BTW, I want to put the string_uint() macro in a common header now. It
can be used for other types. I've written a SortSupport + abbreviated
keys patch for the UUID type which will use it, too, so that it too
can use simple unsigned integer comparisons within its abbreviated
comparator. I haven't posted the UUID patch yet only because everyone
is up to their ears in my sorting patches.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-08-04 Thread Peter Geoghegan
On Tue, Aug 4, 2015 at 12:41 PM, Robert Haas  wrote:
> Interesting work.

Thanks.

> 1. My biggest gripe with this patch is that the comments are not easy
> to understand.

> Of course everybody may prefer something different here; I'm just
> telling you what I think.

I have struggled with trying to put just the right amount of
exposition on the theory behind a particular optimization in source
code comments, and things like that. Since no one is telling me that I
need to write more, clearly I don't have the balance right yet. To a
certain extent, it is a matter of personal style, but I'll try and be
more terse.

> 2. I believe the change to bttextcmp_abbrev() should be pulled out
> into a separate patch and committed separately.  That part  seems like
> a slam dunk.

Makes sense.

> 3. What is the worst case for the strxfrm()-reuse stuff?  I suppose
> it's the case where we have many strings that are long, all
> equal-length, and all different, but only in the last few characters.
> Then the memcmp() is as expensive as possible but never works out.
> How does the patch hold up in that case?

I haven't tested it. I'll get around to it at some point in the next
couple of weeks. I imagine that it's exactly the same as the memcmp()
equality thing because of factors like speculative execution, and the
fact that we need both strings in cache anyway. It's almost exactly
the same story, although unlike the memcmp() opportunistic equality
pre-check thing, this check happens only n times, not n log n times.

I'm quite sure that the cost needs to be virtually zero to go ahead
with the idea. I think it probably is. Note that like the memcmp()
thing, we check string length first, before a memcmp().

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More work on SortSupport for text - strcoll() and strxfrm() caching

2015-08-04 Thread Robert Haas
On Fri, Jul 3, 2015 at 8:33 PM, Peter Geoghegan  wrote:
> Since apparently we're back to development work, I thought it was time
> to share a patch implementing a few additional simple tricks to make
> sorting text under a non-C locale even faster than in 9.5. These
> techniques are mostly effective when values are physically clustered
> together. This might be because there is a physical/logical
> correlation, but cases involving any kind of clustering of values are
> helped significantly.

Interesting work.

Some comments:

1. My biggest gripe with this patch is that the comments are not easy
to understand.  For example:

+ * Attempt to re-use buffers across calls.  Also, avoid work in the event
+ * of clustered together identical items by exploiting temporal locality.
+ * This works well with divide-and-conquer, comparison-based sorts like
+ * quicksort and mergesort.
+ *
+ * With quicksort, there is, in general, a pretty strong chance that the
+ * same buffer contents can be used repeatedly for pivot items -- early
+ * pivot items will account for large number of total comparisons, since
+ * they must be compared against many (possibly all other) items.

Well, what I would have written is something like: "We're likely to be
asked to compare the same strings repeatedly, and memcmp() is so much
cheaper than memcpy() that it pays to attempt a memcmp() in the hopes
of avoiding a memcpy().  This doesn't seem to slow things down
measurably even if it doesn't work out very often."

+ * While it is worth going to the trouble of trying to re-use buffer
+ * contents across calls, ideally that will lead to entirely avoiding a
+ * strcoll() call by using a cached return value.
+ *
+ * This optimization can work well again and again for the same set of
+ * clustered together identical attributes;  as they're relocated to new
+ * subpartitions, only one strcoll() is required for each pivot (in respect
+ * of that clump of identical values, at least).  Similarly, the final
+ * N-way merge of a mergesort can be effectively accelerated if each run
+ * has its own locally clustered values.

And here I would have written something like: "If we're comparing the
same two strings that we compared last time, we can return the same
answer without calling strcoll() again.  This is more likely than it
seems, because quicksort compares the same pivot against many values,
and some of those values might be duplicates."

Of course everybody may prefer something different here; I'm just
telling you what I think.

2. I believe the change to bttextcmp_abbrev() should be pulled out
into a separate patch and committed separately.  That part  seems like
a slam dunk.

3. What is the worst case for the strxfrm()-reuse stuff?  I suppose
it's the case where we have many strings that are long, all
equal-length, and all different, but only in the last few characters.
Then the memcmp() is as expensive as possible but never works out.
How does the patch hold up in that case?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers