Re: [hackers] [libsl|dmenu][PATCH v2] Fix truncation issues and improve performance
On Fri, Mar 25, 2022 at 03:34:29PM +0600, NRK wrote: > Hi, > > Fixed a couple bugs/issues with the previous patches and attached the > amended ones. I'm confident that these patches shouldn't have any > remaining logic issues, but feel free to point them out in case there is. > > Some notable changes from the previous patches: > > * The drw related patches have separated. This is so that they can be > cleanly applied to libsl (and dwm). Should additionally help with the > review process. > > * The truncation condition as well as remaining width to override has been > corrected. > > * The TEXTW_CLAMP macro has been converted to a function to avoid > calculating width twice. > > * Since the `invert` parameter in drw_text() does not get used when not > rendering, it's now being internally reused for specifying a minimum width. > > Additionally, there's one *new* patch attached, which fixes the > performance issue in case we have no font match for a given codepoint. > > - NRK > From 1425e94d0eb3cfbb4a9b73d433b9133bcb1b824c Mon Sep 17 00:00:00 2001 > From: NRK > Date: Thu, 24 Mar 2022 00:37:55 +0600 > Subject: [PATCH 1/6] drw_text: improve both performance and correctness > > this patch makes some non-trivial changes, which significantly improves > the performance of drawing large strings as well as fixes any issues > regarding the printing of the ellipsis when string gets truncated. > > * performance: > > before there were two O(n) loops, one which finds how long we can go > without changing font, and the second loop would (incorrectly) truncate > the string if it's too big. > > this patch merges the overflow calculation into the first loop and exits > out when overflow is detected. when dumping lots of emojies into dmenu, > i see some noticeable startup time improvement: > > before -> after > 460ms -> 360ms > > input latency when scrolling up/down is also noticeably better and can > be tested with the following: > > for _ in $(seq 20); do > cat /dev/urandom | base64 | tr -d '\n' | head -c 100 > echo > done | ./dmenu -l 10 > > * correctness: > > the previous version would incorrectly assumed single byte chars and > would overwrite them with '.' , this caused a whole bunch of obvious > problems, including the ellipsis not getting rendered if then font > changed. > > in addition to exiting out when we detect overflow, this patch also > keeps track of the last x-position where the ellipsis would fit. if we > detect overflow, we simply make a recursing call to drw_text() at the > ellipsis_x position and overwrite what was there. > > so now the ellipsis will always be printed properly, regardless of > weather the font changes or if the string is single byte char or not. > > the idea of rendering the ellipsis on top incase of overflow was > from Bakkeby , thanks! however the original patch had > some issues incorrectly truncating the prompt (-p flag) and cutting off > emojies. those have been fixed in here. > --- > drw.c | 56 > 1 file changed, 28 insertions(+), 28 deletions(-) > > diff --git a/drw.c b/drw.c > index 4cdbcbe..e65d069 100644 > --- a/drw.c > +++ b/drw.c > @@ -251,12 +251,10 @@ drw_rect(Drw *drw, int x, int y, unsigned int w, > unsigned int h, int filled, int > int > drw_text(Drw *drw, int x, int y, unsigned int w, unsigned int h, unsigned > int lpad, const char *text, int invert) > { > - char buf[1024]; > - int ty; > - unsigned int ew; > + int ty, ellipsis_x = 0; > + unsigned int tmpw, ew, ellipsis_w = 0, ellipsis_len, ellipsis_width; > XftDraw *d = NULL; > Fnt *usedfont, *curfont, *nextfont; > - size_t i, len; > int utf8strlen, utf8charlen, render = x || y || w || h; > long utf8codepoint = 0; > const char *utf8str; > @@ -264,7 +262,7 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned > int h, unsigned int lp > FcPattern *fcpattern; > FcPattern *match; > XftResult result; > - int charexists = 0; > + int charexists = 0, overflow = 0; > > if (!drw || (render && !drw->scheme) || !text || !drw->fonts) > return 0; > @@ -282,8 +280,9 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned > int h, unsigned int lp > } > > usedfont = drw->fonts; > + drw_font_getexts(usedfont, "...", 3, &ellipsis_width, NULL); > while (1) { > - utf8strlen = 0; > + ew = ellipsis_len = utf8strlen = 0; > utf8str = text; > nextfont = NULL; > while (*text) { > @@ -291,9 +290,21 @@ drw_text(Drw *drw, int x, int y, unsigned int w, > unsigned int h, unsigned int lp > for (curfont = drw->fonts; curfont; curfont = > curfont->next) { > charexists = charexists || > XftCharExists(drw->dpy, curfont->xfont, utf8codepoint); >
[hackers] [dmenu] free all allocated items, use %zu for size_t || NRK
commit b43ec0577f2ad8ad33a0b893fe5360d966036786 Author: NRK AuthorDate: Fri Mar 25 22:51:09 2022 +0100 Commit: Hiltjo Posthuma CommitDate: Fri Mar 25 22:53:50 2022 +0100 free all allocated items, use %zu for size_t `items` itself is not checked for NULL as calling free on NULL is defined to be a no-op. diff --git a/dmenu.c b/dmenu.c index d989d39..085dc29 100644 --- a/dmenu.c +++ b/dmenu.c @@ -104,6 +104,9 @@ cleanup(void) XUngrabKey(dpy, AnyKey, AnyModifier, root); for (i = 0; i < SchemeLast; i++) free(scheme[i]); + for (i = 0; items && items[i].text; ++i) + free(items[i].text); + free(items); drw_free(drw); XSync(dpy, False); XCloseDisplay(dpy); @@ -239,7 +242,7 @@ match(void) /* separate input text into tokens to be matched individually */ for (s = strtok(buf, " "); s; tokv[tokc - 1] = s, s = strtok(NULL, " ")) if (++tokc > tokn && !(tokv = realloc(tokv, ++tokn * sizeof *tokv))) - die("cannot realloc %u bytes:", tokn * sizeof *tokv); + die("cannot realloc %zu bytes:", tokn * sizeof *tokv); len = tokc ? strlen(tokv[0]) : 0; matches = lprefix = lsubstr = matchend = prefixend = substrend = NULL; @@ -553,11 +556,11 @@ readstdin(void) for (i = 0; fgets(buf, sizeof buf, stdin); i++) { if (i + 1 >= size / sizeof *items) if (!(items = realloc(items, (size += BUFSIZ - die("cannot realloc %u bytes:", size); + die("cannot realloc %zu bytes:", size); if ((p = strchr(buf, '\n'))) *p = '\0'; if (!(items[i].text = strdup(buf))) - die("cannot strdup %u bytes:", strlen(buf) + 1); + die("cannot strdup %zu bytes:", strlen(buf) + 1); items[i].out = 0; } if (items)
[hackers] [dmenu] drw_text: improve performance when there's no match || NRK
commit 22511c41d55a38a770541ae617a09383d5e6ad1c Author: NRK AuthorDate: Thu Mar 24 00:37:55 2022 +0600 Commit: Hiltjo Posthuma CommitDate: Fri Mar 25 22:49:07 2022 +0100 drw_text: improve performance when there's no match this was the last piece of the puzzle, the case where we can't find any font to draw the codepoint. in such cases, we use XftFontMatch() which is INSANELY slow. but that's not the real problem. the real problem was we were continuously trying to match the same thing over and over again. this patch introduces a small cache, which keeps track a couple codepoints for which we know we won't find any matches. with this, i can dump lots of emojies into dmenu where some of them don't have any matching font, and still not have dmenu lag insanely or FREEZE completely when scrolling up and down. this also improves startup time, which will of course depend on the system and all installed fonts; but on my system and test case i see the following startup time drop: before -> after 60ms -> 34ms diff --git a/drw.c b/drw.c index 7d985b1..a50c9ee 100644 --- a/drw.c +++ b/drw.c @@ -251,7 +251,7 @@ drw_rect(Drw *drw, int x, int y, unsigned int w, unsigned int h, int filled, int int drw_text(Drw *drw, int x, int y, unsigned int w, unsigned int h, unsigned int lpad, const char *text, int invert) { - int ty, ellipsis_x = 0; + int i, ty, ellipsis_x = 0; unsigned int tmpw, ew, ellipsis_w = 0, ellipsis_len, ellipsis_width; XftDraw *d = NULL; Fnt *usedfont, *curfont, *nextfont; @@ -263,6 +263,9 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned int h, unsigned int lp FcPattern *match; XftResult result; int charexists = 0, overflow = 0; + /* keep track of a couple codepoints for which we have no match. */ + enum { nomatches_len = 64 }; + static struct { long codepoint[nomatches_len]; unsigned int idx; } nomatches; if (!drw || (render && !drw->scheme) || !text || !drw->fonts) return 0; @@ -346,6 +349,12 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned int h, unsigned int lp * character must be drawn. */ charexists = 1; + for (i = 0; i < nomatches_len; ++i) { + /* avoid calling XftFontMatch if we know we won't find a match */ + if (utf8codepoint == nomatches.codepoint[i]) + goto no_match; + } + fccharset = FcCharSetCreate(); FcCharSetAddChar(fccharset, utf8codepoint); @@ -374,6 +383,8 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned int h, unsigned int lp curfont->next = usedfont; } else { xfont_free(usedfont); + nomatches.codepoint[++nomatches.idx % nomatches_len] = utf8codepoint; +no_match: usedfont = drw->fonts; } }
[hackers] [dmenu] avoid redraw when there's no change || NRK
commit 6818e07291f3b2913e687c8ec3d3fe4711724050 Author: NRK AuthorDate: Fri Mar 25 22:51:45 2022 +0100 Commit: Hiltjo Posthuma CommitDate: Fri Mar 25 22:53:50 2022 +0100 avoid redraw when there's no change while i was timing the performance issue, i noticed that there was lots of random redrawing going on. turns out there were coming from here; if someone presses CTRL/ALT etc without pressing anything else, nothing will be inserted, so nothing will change. but the code will `break`, go down and do a needless redraw. this patch changes it to simply return if the keypress iscntrl() also avoid potential UB by casting *buf into an unsigned char. diff --git a/dmenu.c b/dmenu.c index 085dc29..19f6385 100644 --- a/dmenu.c +++ b/dmenu.c @@ -415,8 +415,9 @@ keypress(XKeyEvent *ev) switch(ksym) { default: insert: - if (!iscntrl(*buf)) - insert(buf, len); + if (iscntrl((unsigned char)*buf)) + return; + insert(buf, len); break; case XK_Delete: case XK_KP_Delete:
[hackers] [dmenu] significantly improve performance on large strings || NRK
commit 7269c5355d257dd2ad2c53f15dc9c1cf6796aea5 Author: NRK AuthorDate: Thu Mar 24 02:04:04 2022 +0600 Commit: Hiltjo Posthuma CommitDate: Fri Mar 25 22:49:07 2022 +0100 significantly improve performance on large strings this replaces inefficient pattern of `MIN(TEXTW(..), n)` with drw_fontset_getwidth_clamp() instead, which is far more efficient when we only want up to a certain width. dumping a decently sized (unicode) emoji file into dmenu, I see the startup time drop significantly with this patch. before -> after 360ms -> 160ms this should also noticeably improve input latency (responsiveness) given that calcoffsets() and drawmenu() are pretty hot functions. diff --git a/dmenu.c b/dmenu.c index eca67ac..cde394b 100644 --- a/dmenu.c +++ b/dmenu.c @@ -58,6 +58,13 @@ static Clr *scheme[SchemeLast]; static int (*fstrncmp)(const char *, const char *, size_t) = strncmp; static char *(*fstrstr)(const char *, const char *) = strstr; +static unsigned int +textw_clamp(const char *str, unsigned int n) +{ + unsigned int w = drw_fontset_getwidth_clamp(drw, str, n) + lrpad; + return MIN(w, n); +} + static void appenditem(struct item *item, struct item **list, struct item **last) { @@ -82,10 +89,10 @@ calcoffsets(void) n = mw - (promptw + inputw + TEXTW("<") + TEXTW(">")); /* calculate which items will begin the next page and previous page */ for (i = 0, next = curr; next; next = next->right) - if ((i += (lines > 0) ? bh : MIN(TEXTW(next->text), n)) > n) + if ((i += (lines > 0) ? bh : textw_clamp(next->text, n)) > n) break; for (i = 0, prev = curr; prev && prev->left; prev = prev->left) - if ((i += (lines > 0) ? bh : MIN(TEXTW(prev->left->text), n)) > n) + if ((i += (lines > 0) ? bh : textw_clamp(prev->left->text, n)) > n) break; } @@ -172,7 +179,7 @@ drawmenu(void) } x += w; for (item = curr; item != next; item = item->right) - x = drawitem(item, x, 0, MIN(TEXTW(item->text), mw - x - TEXTW(">"))); + x = drawitem(item, x, 0, textw_clamp(item->text, mw - x - TEXTW(">"))); if (next) { w = TEXTW(">"); drw_setscheme(drw, scheme[SchemeNorm]);
[hackers] [dmenu] inputw: improve correctness and startup performance || NRK
commit 77526f756e23e362081ac807521f901f2e5cd5e6 Author: NRK AuthorDate: Thu Mar 24 00:37:55 2022 +0600 Commit: Hiltjo Posthuma CommitDate: Fri Mar 25 22:49:07 2022 +0100 inputw: improve correctness and startup performance a massive amount of time inside readstdin() is spent trying to get the max input width and then put it into inputw, only for it to get clamped down to mw/3 inside setup(). it makes more sense to calculate inputw inside setup() once we have mw available. similar to the last patch, i see noticeable startup performance improvement: before -> after 160ms -> 60ms additionally this will take fallback fonts into account compared to the previous version, so it's not only more performant but also more correct. diff --git a/dmenu.c b/dmenu.c index cde394b..d989d39 100644 --- a/dmenu.c +++ b/dmenu.c @@ -547,8 +547,7 @@ static void readstdin(void) { char buf[sizeof text], *p; - size_t i, imax = 0, size = 0; - unsigned int tmpmax = 0; + size_t i, size = 0; /* read each line from stdin and add it to the item list */ for (i = 0; fgets(buf, sizeof buf, stdin); i++) { @@ -560,15 +559,9 @@ readstdin(void) if (!(items[i].text = strdup(buf))) die("cannot strdup %u bytes:", strlen(buf) + 1); items[i].out = 0; - drw_font_getexts(drw->fonts, buf, strlen(buf), &tmpmax, NULL); - if (tmpmax > inputw) { - inputw = tmpmax; - imax = i; - } } if (items) items[i].text = NULL; - inputw = items ? TEXTW(items[imax].text) : 0; lines = MIN(lines, i); } @@ -614,12 +607,13 @@ static void setup(void) { int x, y, i, j; - unsigned int du; + unsigned int du, tmp; XSetWindowAttributes swa; XIM xim; Window w, dw, *dws; XWindowAttributes wa; XClassHint ch = {"dmenu", "dmenu"}; + struct item *item; #ifdef XINERAMA XineramaScreenInfo *info; Window pw; @@ -677,7 +671,12 @@ setup(void) mw = wa.width; } promptw = (prompt && *prompt) ? TEXTW(prompt) - lrpad / 4 : 0; - inputw = MIN(inputw, mw/3); + for (item = items; item && item->text; ++item) { + if ((tmp = textw_clamp(item->text, mw/3)) > inputw) { + if ((inputw = tmp) == mw/3) + break; + } + } match(); /* create menu window */
[hackers] [dmenu] introduce drw_fontset_getwidth_clamp() || NRK
commit 6be057f060543bb0f3ed9423904263617cde Author: NRK AuthorDate: Thu Mar 24 02:00:00 2022 +0600 Commit: Hiltjo Posthuma CommitDate: Fri Mar 25 22:49:07 2022 +0100 introduce drw_fontset_getwidth_clamp() getting the width of a string is an O(n) operation, and in many cases users only care about getting the width upto a certain number. instead of calling drw_fontset_getwidth() and *then* clamping the result, this patch introduces drw_fontset_getwidth_clamp() function, similar to strnlen(), which will stop once we reach n. the `invert` parameter was overloaded internally to preserve the API, however library users should be calling drw_fontset_getwidth_clamp() and not depend upon internal behavior of drw_text(). diff --git a/drw.c b/drw.c index e65d069..7d985b1 100644 --- a/drw.c +++ b/drw.c @@ -268,7 +268,7 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned int h, unsigned int lp return 0; if (!render) { - w = ~w; + w = invert ? invert : ~invert; } else { XSetForeground(drw->dpy, drw->gc, drw->scheme[invert ? ColFg : ColBg].pixel); XFillRectangle(drw->dpy, drw->drawable, drw->gc, x, y, w, h); @@ -300,7 +300,13 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned int h, unsigned int lp if (ew + tmpw > w) { overflow = 1; - utf8strlen = ellipsis_len; + /* called from drw_fontset_getwidth_clamp(): +* it wants the width AFTER the overflow +*/ + if (!render) + x += tmpw; + else + utf8strlen = ellipsis_len; } else if (curfont == usedfont) { utf8strlen += utf8charlen; text += utf8charlen; @@ -397,6 +403,15 @@ drw_fontset_getwidth(Drw *drw, const char *text) return drw_text(drw, 0, 0, 0, 0, 0, text, 0); } +unsigned int +drw_fontset_getwidth_clamp(Drw *drw, const char *text, unsigned int n) +{ + unsigned int tmp = 0; + if (drw && drw->fonts && text && n) + tmp = drw_text(drw, 0, 0, 0, 0, 0, text, n); + return MIN(n, tmp); +} + void drw_font_getexts(Fnt *font, const char *text, unsigned int len, unsigned int *w, unsigned int *h) { diff --git a/drw.h b/drw.h index 4c67419..fd7631b 100644 --- a/drw.h +++ b/drw.h @@ -35,6 +35,7 @@ void drw_free(Drw *drw); Fnt *drw_fontset_create(Drw* drw, const char *fonts[], size_t fontcount); void drw_fontset_free(Fnt* set); unsigned int drw_fontset_getwidth(Drw *drw, const char *text); +unsigned int drw_fontset_getwidth_clamp(Drw *drw, const char *text, unsigned int n); void drw_font_getexts(Fnt *font, const char *text, unsigned int len, unsigned int *w, unsigned int *h); /* Colorscheme abstraction */
[hackers] [dmenu] drw_text: improve both performance and correctness || NRK
commit 41fdabbf7c517f8d524b70cbd78238cc319ccef3 Author: NRK AuthorDate: Thu Mar 24 00:37:55 2022 +0600 Commit: Hiltjo Posthuma CommitDate: Fri Mar 25 22:49:07 2022 +0100 drw_text: improve both performance and correctness this patch makes some non-trivial changes, which significantly improves the performance of drawing large strings as well as fixes any issues regarding the printing of the ellipsis when string gets truncated. * performance: before there were two O(n) loops, one which finds how long we can go without changing font, and the second loop would (incorrectly) truncate the string if it's too big. this patch merges the overflow calculation into the first loop and exits out when overflow is detected. when dumping lots of emojies into dmenu, i see some noticeable startup time improvement: before -> after 460ms -> 360ms input latency when scrolling up/down is also noticeably better and can be tested with the following: for _ in $(seq 20); do cat /dev/urandom | base64 | tr -d '\n' | head -c 100 echo done | ./dmenu -l 10 * correctness: the previous version would incorrectly assumed single byte chars and would overwrite them with '.' , this caused a whole bunch of obvious problems, including the ellipsis not getting rendered if then font changed. in addition to exiting out when we detect overflow, this patch also keeps track of the last x-position where the ellipsis would fit. if we detect overflow, we simply make a recursing call to drw_text() at the ellipsis_x position and overwrite what was there. so now the ellipsis will always be printed properly, regardless of weather the font changes or if the string is single byte char or not. the idea of rendering the ellipsis on top incase of overflow was from Bakkeby , thanks! however the original patch had some issues incorrectly truncating the prompt (-p flag) and cutting off emojies. those have been fixed in here. diff --git a/drw.c b/drw.c index 4cdbcbe..e65d069 100644 --- a/drw.c +++ b/drw.c @@ -251,12 +251,10 @@ drw_rect(Drw *drw, int x, int y, unsigned int w, unsigned int h, int filled, int int drw_text(Drw *drw, int x, int y, unsigned int w, unsigned int h, unsigned int lpad, const char *text, int invert) { - char buf[1024]; - int ty; - unsigned int ew; + int ty, ellipsis_x = 0; + unsigned int tmpw, ew, ellipsis_w = 0, ellipsis_len, ellipsis_width; XftDraw *d = NULL; Fnt *usedfont, *curfont, *nextfont; - size_t i, len; int utf8strlen, utf8charlen, render = x || y || w || h; long utf8codepoint = 0; const char *utf8str; @@ -264,7 +262,7 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned int h, unsigned int lp FcPattern *fcpattern; FcPattern *match; XftResult result; - int charexists = 0; + int charexists = 0, overflow = 0; if (!drw || (render && !drw->scheme) || !text || !drw->fonts) return 0; @@ -282,8 +280,9 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned int h, unsigned int lp } usedfont = drw->fonts; + drw_font_getexts(usedfont, "...", 3, &ellipsis_width, NULL); while (1) { - utf8strlen = 0; + ew = ellipsis_len = utf8strlen = 0; utf8str = text; nextfont = NULL; while (*text) { @@ -291,9 +290,21 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned int h, unsigned int lp for (curfont = drw->fonts; curfont; curfont = curfont->next) { charexists = charexists || XftCharExists(drw->dpy, curfont->xfont, utf8codepoint); if (charexists) { - if (curfont == usedfont) { + drw_font_getexts(curfont, text, utf8charlen, &tmpw, NULL); + if (ew + ellipsis_width <= w) { + /* keep track where the ellipsis still fits */ + ellipsis_x = x + ew; + ellipsis_w = w - ew; + ellipsis_len = utf8strlen; + } + + if (ew + tmpw > w) { + overflow = 1; + utf8strlen = ellipsis_len; + } else if (curfont == usedfont) { utf8strlen += utf8charlen; text += utf8charlen; +
Re: [hackers] [libsl|dmenu][PATCH v2] Fix truncation issues and improve performance
> Hi, Hi NRK, > Fixed a couple bugs/issues with the previous patches and attached the > amended ones. I'm confident that these patches shouldn't have any > remaining logic issues, but feel free to point them out in case there is. > Additionally, there's one *new* patch attached, which fixes the > performance issue in case we have no font match for a given codepoint. I haven't reviewed those (yet), but thanks for your work on this!