Re: [hackers] [dmenu][PATCH] minor improvement to the nomatches cache

NRK Fri, 07 Jul 2023 04:00:57 -0700

Hi Hiltjo,

On Thu, Jul 06, 2023 at 07:08:57PM +0200, Hiltjo Posthuma wrote:
> What hash algorithm is this exactly, it looks interesting? I've found it at:
>
>       https://github.com/skeeto/hash-prospector
>
> Is that the one?


The "xor-shift multiply" construct itself is fairly popular and can be
found on many hash algorithms (e.g murmur, fnv etc). I *think* it
originates from LGCs, but not sure about that:
https://en.wikipedia.org/wiki/Linear_congruential_generator

The specific multiplier and shift constants are indeed taken from
hash-prospector README, I've updated v2 commit message to include it.

> What's the measured performance improvement and wouldn't it increase 
> collisions
> vs a simple linear search?

In case the cache gets filled up, the false-negative rate of either a
linear search or the hash-table will be unpredictable since both of them
use a "dumb" eviction policy.

The old method overwrites whatever came in first and the new one
overwrites whatever was in `h1`. Something like an LRU *might* help in
those cases, but I think it's not worth the trouble.

As for collisions, the new 2x size made collisions rarer. Testing with a
list of ~1.6k emojies I saw only 5 collisions. (The result will
obviously vary depending on your installed fonts).

But this made me realize that it was dumb to throw away the high bits of
the hash. It can be used to probe another slot. With that, I now only
get 1 collision. I've updated the v2 patch with the new logic.

> I'm not sure we should change the assumption for ints to POSIX in dmenu in all
> cases and assume 32-bit int (although a lot of software in practise does).  
> But
> this is very pedantic :)
> 
> Bit off-topic: recently I tested some software on MS-DOS with the OpenWatcom
> compiler. It had 16-bit int and caught a bug in my code (lower-level parser
> stuff).

Since dmenu already requires POSIX, I didn't think it'd be a problem.
`uint_least32_t` or `uint32_t` can probably be used, but suckless code
base typically seems to avoid <stdint.h>.

---

Avoiding wastage and thus being able to 2x the cache size is the *main*
improvement here, IMO.

The hash-table is measurably faster than linear search but it won't be
noticeable by the user (~30ns vs ~130ns, in theory it also avoids
polluting the cache but I doubt that'll matter much either) so I don't
mind dropping the hash-table if there's concerns about it.

But I think avoiding waste and 2x cache size is worthwhile since it
means we can avoid even more unnecessary and expensive XftFontMatch()
calls.

- NRK

>From 7359c32b9612f8b5fe8c4e6ebf2ab1d903eda989 Mon Sep 17 00:00:00 2001
From: NRK <[email protected]>
Date: Fri, 9 Jun 2023 21:11:52 +0600
Subject: [PATCH] minor improvement to the nomatches cache

1. use `unsigned int` to store the codepoints, this avoids waste on
   common case where `long` is 64bits. and POSIX guarantees `int` to be
   at least 32bits so there's no risk of truncation.
2. since switching to `unsigned int` cuts down the memory requirement by
   half, double the cache size from 64 to 128.
3. instead of a linear search, use a simple hash-table for O(1) lookups.
   to reduce collusions, the both the lower and the higher bits of the
   hash are used to probe 2 different slots.

the hash constants are taken form:
https://github.com/skeeto/hash-prospector
---
 dmenu.c |  1 -
 drw.c   | 23 ++++++++++++-----------
 util.h  |  1 +
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/dmenu.c b/dmenu.c
index 62f1089..40f93e0 100644
--- a/dmenu.c
+++ b/dmenu.c
@@ -22,7 +22,6 @@
 /* macros */
 #define INTERSECT(x,y,w,h,r)  (MAX(0, MIN((x)+(w),(r).x_org+(r).width)  - 
MAX((x),(r).x_org)) \
                              * MAX(0, MIN((y)+(h),(r).y_org+(r).height) - 
MAX((y),(r).y_org)))
-#define LENGTH(X)             (sizeof X / sizeof X[0])
 #define TEXTW(X)              (drw_fontset_getwidth(drw, (X)) + lrpad)
 
 /* enums */
diff --git a/drw.c b/drw.c
index a58a2b4..78a2b27 100644
--- a/drw.c
+++ b/drw.c
@@ -238,8 +238,8 @@ drw_rect(Drw *drw, int x, int y, unsigned int w, unsigned 
int h, int filled, int
 int
 drw_text(Drw *drw, int x, int y, unsigned int w, unsigned int h, unsigned int 
lpad, const char *text, int invert)
 {
-       int i, ty, ellipsis_x = 0;
-       unsigned int tmpw, ew, ellipsis_w = 0, ellipsis_len;
+       int ty, ellipsis_x = 0;
+       unsigned int tmpw, ew, ellipsis_w = 0, ellipsis_len, hash, h0, h1;
        XftDraw *d = NULL;
        Fnt *usedfont, *curfont, *nextfont;
        int utf8strlen, utf8charlen, render = x || y || w || h;
@@ -251,9 +251,7 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned 
int h, unsigned int lp
        XftResult result;
        int charexists = 0, overflow = 0;
        /* keep track of a couple codepoints for which we have no match. */
-       enum { nomatches_len = 64 };
-       static struct { long codepoint[nomatches_len]; unsigned int idx; } 
nomatches;
-       static unsigned int ellipsis_width = 0;
+       static unsigned int nomatches[128], ellipsis_width;
 
        if (!drw || (render && (!drw->scheme || !w)) || !text || !drw->fonts)
                return 0;
@@ -338,11 +336,14 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned 
int h, unsigned int lp
                         * character must be drawn. */
                        charexists = 1;
 
-                       for (i = 0; i < nomatches_len; ++i) {
-                               /* avoid calling XftFontMatch if we know we 
won't find a match */
-                               if (utf8codepoint == nomatches.codepoint[i])
-                                       goto no_match;
-                       }
+                       hash = (unsigned int)utf8codepoint;
+                       hash = ((hash >> 16) ^ hash) * 0x21F0AAAD;
+                       hash = ((hash >> 15) ^ hash) * 0xD35A2D97;
+                       h0 = ((hash >> 15) ^ hash) % LENGTH(nomatches);
+                       h1 = (hash >> 17) % LENGTH(nomatches);
+                       /* avoid expensive XftFontMatch call when we know we 
won't find a match */
+                       if (nomatches[h0] == utf8codepoint || nomatches[h1] == 
utf8codepoint)
+                               goto no_match;
 
                        fccharset = FcCharSetCreate();
                        FcCharSetAddChar(fccharset, utf8codepoint);
@@ -371,7 +372,7 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned 
int h, unsigned int lp
                                        curfont->next = usedfont;
                                } else {
                                        xfont_free(usedfont);
-                                       nomatches.codepoint[++nomatches.idx % 
nomatches_len] = utf8codepoint;
+                                       nomatches[nomatches[h0] ? h1 : h0] = 
utf8codepoint;
 no_match:
                                        usedfont = drw->fonts;
                                }
diff --git a/util.h b/util.h
index f633b51..c0a50d4 100644
--- a/util.h
+++ b/util.h
@@ -3,6 +3,7 @@
 #define MAX(A, B)               ((A) > (B) ? (A) : (B))
 #define MIN(A, B)               ((A) < (B) ? (A) : (B))
 #define BETWEEN(X, A, B)        ((A) <= (X) && (X) <= (B))
+#define LENGTH(X)               (sizeof (X) / sizeof (X)[0])
 
 void die(const char *fmt, ...);
 void *ecalloc(size_t nmemb, size_t size);
-- 
2.41.0

Re: [hackers] [dmenu][PATCH] minor improvement to the nomatches cache

Reply via email to