Hi Hiltjo, On Thu, Jul 06, 2023 at 07:08:57PM +0200, Hiltjo Posthuma wrote: > What hash algorithm is this exactly, it looks interesting? I've found it at: > > https://github.com/skeeto/hash-prospector > > Is that the one?
The "xor-shift multiply" construct itself is fairly popular and can be found on many hash algorithms (e.g murmur, fnv etc). I *think* it originates from LGCs, but not sure about that: https://en.wikipedia.org/wiki/Linear_congruential_generator The specific multiplier and shift constants are indeed taken from hash-prospector README, I've updated v2 commit message to include it. > What's the measured performance improvement and wouldn't it increase > collisions > vs a simple linear search? In case the cache gets filled up, the false-negative rate of either a linear search or the hash-table will be unpredictable since both of them use a "dumb" eviction policy. The old method overwrites whatever came in first and the new one overwrites whatever was in `h1`. Something like an LRU *might* help in those cases, but I think it's not worth the trouble. As for collisions, the new 2x size made collisions rarer. Testing with a list of ~1.6k emojies I saw only 5 collisions. (The result will obviously vary depending on your installed fonts). But this made me realize that it was dumb to throw away the high bits of the hash. It can be used to probe another slot. With that, I now only get 1 collision. I've updated the v2 patch with the new logic. > I'm not sure we should change the assumption for ints to POSIX in dmenu in all > cases and assume 32-bit int (although a lot of software in practise does). > But > this is very pedantic :) > > Bit off-topic: recently I tested some software on MS-DOS with the OpenWatcom > compiler. It had 16-bit int and caught a bug in my code (lower-level parser > stuff). Since dmenu already requires POSIX, I didn't think it'd be a problem. `uint_least32_t` or `uint32_t` can probably be used, but suckless code base typically seems to avoid <stdint.h>. --- Avoiding wastage and thus being able to 2x the cache size is the *main* improvement here, IMO. The hash-table is measurably faster than linear search but it won't be noticeable by the user (~30ns vs ~130ns, in theory it also avoids polluting the cache but I doubt that'll matter much either) so I don't mind dropping the hash-table if there's concerns about it. But I think avoiding waste and 2x cache size is worthwhile since it means we can avoid even more unnecessary and expensive XftFontMatch() calls. - NRK
>From 7359c32b9612f8b5fe8c4e6ebf2ab1d903eda989 Mon Sep 17 00:00:00 2001 From: NRK <[email protected]> Date: Fri, 9 Jun 2023 21:11:52 +0600 Subject: [PATCH] minor improvement to the nomatches cache 1. use `unsigned int` to store the codepoints, this avoids waste on common case where `long` is 64bits. and POSIX guarantees `int` to be at least 32bits so there's no risk of truncation. 2. since switching to `unsigned int` cuts down the memory requirement by half, double the cache size from 64 to 128. 3. instead of a linear search, use a simple hash-table for O(1) lookups. to reduce collusions, the both the lower and the higher bits of the hash are used to probe 2 different slots. the hash constants are taken form: https://github.com/skeeto/hash-prospector --- dmenu.c | 1 - drw.c | 23 ++++++++++++----------- util.h | 1 + 3 files changed, 13 insertions(+), 12 deletions(-) diff --git a/dmenu.c b/dmenu.c index 62f1089..40f93e0 100644 --- a/dmenu.c +++ b/dmenu.c @@ -22,7 +22,6 @@ /* macros */ #define INTERSECT(x,y,w,h,r) (MAX(0, MIN((x)+(w),(r).x_org+(r).width) - MAX((x),(r).x_org)) \ * MAX(0, MIN((y)+(h),(r).y_org+(r).height) - MAX((y),(r).y_org))) -#define LENGTH(X) (sizeof X / sizeof X[0]) #define TEXTW(X) (drw_fontset_getwidth(drw, (X)) + lrpad) /* enums */ diff --git a/drw.c b/drw.c index a58a2b4..78a2b27 100644 --- a/drw.c +++ b/drw.c @@ -238,8 +238,8 @@ drw_rect(Drw *drw, int x, int y, unsigned int w, unsigned int h, int filled, int int drw_text(Drw *drw, int x, int y, unsigned int w, unsigned int h, unsigned int lpad, const char *text, int invert) { - int i, ty, ellipsis_x = 0; - unsigned int tmpw, ew, ellipsis_w = 0, ellipsis_len; + int ty, ellipsis_x = 0; + unsigned int tmpw, ew, ellipsis_w = 0, ellipsis_len, hash, h0, h1; XftDraw *d = NULL; Fnt *usedfont, *curfont, *nextfont; int utf8strlen, utf8charlen, render = x || y || w || h; @@ -251,9 +251,7 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned int h, unsigned int lp XftResult result; int charexists = 0, overflow = 0; /* keep track of a couple codepoints for which we have no match. */ - enum { nomatches_len = 64 }; - static struct { long codepoint[nomatches_len]; unsigned int idx; } nomatches; - static unsigned int ellipsis_width = 0; + static unsigned int nomatches[128], ellipsis_width; if (!drw || (render && (!drw->scheme || !w)) || !text || !drw->fonts) return 0; @@ -338,11 +336,14 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned int h, unsigned int lp * character must be drawn. */ charexists = 1; - for (i = 0; i < nomatches_len; ++i) { - /* avoid calling XftFontMatch if we know we won't find a match */ - if (utf8codepoint == nomatches.codepoint[i]) - goto no_match; - } + hash = (unsigned int)utf8codepoint; + hash = ((hash >> 16) ^ hash) * 0x21F0AAAD; + hash = ((hash >> 15) ^ hash) * 0xD35A2D97; + h0 = ((hash >> 15) ^ hash) % LENGTH(nomatches); + h1 = (hash >> 17) % LENGTH(nomatches); + /* avoid expensive XftFontMatch call when we know we won't find a match */ + if (nomatches[h0] == utf8codepoint || nomatches[h1] == utf8codepoint) + goto no_match; fccharset = FcCharSetCreate(); FcCharSetAddChar(fccharset, utf8codepoint); @@ -371,7 +372,7 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned int h, unsigned int lp curfont->next = usedfont; } else { xfont_free(usedfont); - nomatches.codepoint[++nomatches.idx % nomatches_len] = utf8codepoint; + nomatches[nomatches[h0] ? h1 : h0] = utf8codepoint; no_match: usedfont = drw->fonts; } diff --git a/util.h b/util.h index f633b51..c0a50d4 100644 --- a/util.h +++ b/util.h @@ -3,6 +3,7 @@ #define MAX(A, B) ((A) > (B) ? (A) : (B)) #define MIN(A, B) ((A) < (B) ? (A) : (B)) #define BETWEEN(X, A, B) ((A) <= (X) && (X) <= (B)) +#define LENGTH(X) (sizeof (X) / sizeof (X)[0]) void die(const char *fmt, ...); void *ecalloc(size_t nmemb, size_t size); -- 2.41.0
