Re: Optimization for lower(), upper(), casefold() functions.

Jeff Davis Wed, 12 Mar 2025 12:58:49 -0700

On Wed, 2025-03-12 at 19:55 +0300, Alexander Borisov wrote:
> 1. Added static for casemap() function. Otherwise the compiler could
> not optimize the code and the performance dropped significantly.


Oops, it was static, but I made it external just to see what code it
generated. I didn't intend to publish it as an external function --
thank you for catching that!

> 2. Added a fast path for codepoint < 0x80.
> 
> v3j-0002:
> In the fast path for codepoints < 0x80, I added a premature return.
> This avoided additional insertions, which increased performance.

What do you mean by "additional insertions"?

Also, should we just compute the results in the fast path? We don't
even need a table. Rough patch attached to go on top of v4-0001.

Should we properly return CASEMAP_SELF when *simple == u1, or is it ok
to return CASEMAP_SIMPLE? It probably doesn't matter performance-wise,
but it feels more correct to return CASEMAP_SELF.
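
To make the SELF-versus-SIMPLE question concrete, here is a minimal
standalone sketch of the same idea that compiles outside the tree. All
demo_* and Demo* names are hypothetical stand-ins, not the real
definitions from unicode_case.c, and the caller shown is only an
assumption about how a result code might be consumed:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for pg_wchar, CaseKind and the CASEMAP_* codes. */
typedef uint32_t demo_wchar;

typedef enum
{
	DemoCaseLower,
	DemoCaseTitle,
	DemoCaseUpper,
	DemoCaseFold
} DemoCaseKind;

typedef enum
{
	DEMO_CASEMAP_SELF,			/* codepoint maps to itself */
	DEMO_CASEMAP_SIMPLE			/* *simple holds the mapped codepoint */
} DemoCaseMapResult;

/*
 * ASCII-only fast path, computed without a table. Only [a-zA-Z] change;
 * everything else below 0x80 maps to itself and *simple is left untouched.
 */
static DemoCaseMapResult
demo_casemap_ascii(demo_wchar u1, DemoCaseKind casekind, demo_wchar *simple)
{
	assert(u1 < 0x80);

	switch (casekind)
	{
		case DemoCaseLower:
		case DemoCaseFold:
			if (u1 >= 'A' && u1 <= 'Z')
			{
				*simple = u1 + 0x20;
				return DEMO_CASEMAP_SIMPLE;
			}
			break;
		case DemoCaseTitle:
		case DemoCaseUpper:
			if (u1 >= 'a' && u1 <= 'z')
			{
				*simple = u1 - 0x20;
				return DEMO_CASEMAP_SIMPLE;
			}
			break;
	}

	return DEMO_CASEMAP_SELF;
}

/* Hypothetical caller: converts an ASCII string in place. */
static void
demo_convert_ascii(char *s, DemoCaseKind casekind)
{
	for (; *s != '\0'; s++)
	{
		demo_wchar	u1 = (unsigned char) *s;
		demo_wchar	mapped;

		if (u1 >= 0x80)
			continue;			/* this sketch only handles ASCII */

		if (demo_casemap_ascii(u1, casekind, &mapped) == DEMO_CASEMAP_SIMPLE)
			*s = (char) mapped;
		/* DEMO_CASEMAP_SELF: nothing to do, the byte stays as it is */
	}
}
```

In this sketch a caller that sees the SELF code never reads *simple, so
returning SELF for unchanged characters lets the fast path skip the
store entirely.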

> Perhaps for general beauty it should be made static inline, I don't
> have a rigid position here.

We ordinarily use "static inline" if it's in a header file, and
"static" if it's in a .c file, so I'll do it that way.

> I was purely based on existing approaches in Postgres, the
> Normalization Forms have them separated into different headers. Just
> trying to be consistent with existing approaches.

I think that was done for normalization primarily because it's not used
#ifndef FRONTEND (see unicode_norm.c), and perhaps also because it's
just a more complex function worthy of its own file.

I looked into the history, and commit 783f0cc64d explains why perfect
hashing is not used in the frontend:

"The decomposition table remains the same, getting used for the binary
search in the frontend code, where we care more about the size of the
libraries like libpq over performance..."
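
For illustration, the size-over-speed trade-off in that commit message
amounts to keeping only one sorted table and searching it, rather than
carrying extra hash data. A rough sketch follows; the demo_* names and
the four-entry table are hypothetical and this is not the code in
unicode_norm.c:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: a codepoint and its simple lowercase mapping. */
typedef struct
{
	uint32_t	codepoint;
	uint32_t	lower;
} demo_entry;

/* Must stay sorted by codepoint for the binary search below to work. */
static const demo_entry demo_table[] = {
	{0x00C0, 0x00E0},			/* LATIN CAPITAL LETTER A WITH GRAVE */
	{0x00C9, 0x00E9},			/* LATIN CAPITAL LETTER E WITH ACUTE */
	{0x0391, 0x03B1},			/* GREEK CAPITAL LETTER ALPHA */
	{0x0410, 0x0430},			/* CYRILLIC CAPITAL LETTER A */
};

/*
 * O(log n) lookup with no data beyond the sorted table itself.
 * Returns NULL when the codepoint has no entry.
 */
static const demo_entry *
demo_find(uint32_t cp)
{
	size_t		lo = 0;
	size_t		hi = sizeof(demo_table) / sizeof(demo_table[0]);

	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (demo_table[mid].codepoint == cp)
			return &demo_table[mid];
		if (demo_table[mid].codepoint < cp)
			lo = mid + 1;
		else
			hi = mid;
	}

	return NULL;
}
```

A perfect hash would answer the same query in constant time, at the cost
of the generated hash function and its auxiliary data being linked into
every frontend binary.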

Regards,
        Jeff Davis

From ed4d2803aa32add7c05726286b94e78e49bb1257 Mon Sep 17 00:00:00 2001
From: Jeff Davis <[email protected]>
Date: Wed, 12 Mar 2025 11:56:59 -0700
Subject: [PATCH vtmp] fastpath

---
 src/common/unicode_case.c | 34 ++++++++++++++++++++++++++++++++--
 1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/src/common/unicode_case.c b/src/common/unicode_case.c
index 2b3b4cdc2e7..b1fc1651043 100644
--- a/src/common/unicode_case.c
+++ b/src/common/unicode_case.c
@@ -390,11 +390,41 @@ casemap(pg_wchar u1, CaseKind casekind, bool full,
 {
 	const pg_case_map *map;
 
+	/*
+	 * Fast path for early codepoints. The only interesting characters are
+	 * [a-zA-Z].
+	 */
 	if (u1 < 0x80)
 	{
-		*simple = case_map[u1].simplemap[casekind];
+		/* fast-path codepoints do not have special casing */
+		Assert(find_case_map(u1)->special_case == NULL);
 
-		return CASEMAP_SIMPLE;
+		switch (casekind)
+		{
+			case CaseLower:
+			case CaseFold:
+				if (u1 >= 'A' && u1 <= 'Z')
+				{
+					*simple = u1 + 0x20;
+					Assert(case_map[u1].simplemap[casekind] == *simple);
+					return CASEMAP_SIMPLE;
+				}
+				break;
+			case CaseTitle:
+			case CaseUpper:
+				if (u1 >= 'a' && u1 <= 'z')
+				{
+					*simple = u1 - 0x20;
+					Assert(case_map[u1].simplemap[casekind] == *simple);
+					return CASEMAP_SIMPLE;
+				}
+				break;
+			default:
+				Assert(false);
+		}
+
+		Assert(case_map[u1].simplemap[casekind] == u1);
+		return CASEMAP_SELF;
 	}
 
 	map = find_case_map(u1);
-- 
2.34.1
