On Fri, Mar 25, 2022 at 03:34:29PM +0600, NRK wrote: > Hi, > > Fixed a couple bugs/issues with the previous patches and attached the > amended ones. I'm confident that these patches shouldn't have any > remaining logic issues, but feel free to point them out in case there is. > > Some notable changes from the previous patches: > > * The drw related patches have separated. This is so that they can be > cleanly applied to libsl (and dwm). Should additionally help with the > review process. > > * The truncation condition as well as remaining width to override has been > corrected. > > * The TEXTW_CLAMP macro has been converted to a function to avoid > calculating width twice. > > * Since the `invert` parameter in drw_text() does not get used when not > rendering, it's now being internally reused for specifying a minimum width. > > Additionally, there's one *new* patch attached, which fixes the > performance issue in case we have no font match for a given codepoint. > > - NRK
> From 1425e94d0eb3cfbb4a9b73d433b9133bcb1b824c Mon Sep 17 00:00:00 2001 > From: NRK <[email protected]> > Date: Thu, 24 Mar 2022 00:37:55 +0600 > Subject: [PATCH 1/6] drw_text: improve both performance and correctness > > this patch makes some non-trivial changes, which significantly improves > the performance of drawing large strings as well as fixes any issues > regarding the printing of the ellipsis when string gets truncated. > > * performance: > > before there were two O(n) loops, one which finds how long we can go > without changing font, and the second loop would (incorrectly) truncate > the string if it's too big. > > this patch merges the overflow calculation into the first loop and exits > out when overflow is detected. when dumping lots of emojies into dmenu, > i see some noticeable startup time improvement: > > before -> after > 460ms -> 360ms > > input latency when scrolling up/down is also noticeably better and can > be tested with the following: > > for _ in $(seq 20); do > cat /dev/urandom | base64 | tr -d '\n' | head -c 1000000 > echo > done | ./dmenu -l 10 > > * correctness: > > the previous version would incorrectly assumed single byte chars and > would overwrite them with '.' , this caused a whole bunch of obvious > problems, including the ellipsis not getting rendered if then font > changed. > > in addition to exiting out when we detect overflow, this patch also > keeps track of the last x-position where the ellipsis would fit. if we > detect overflow, we simply make a recursing call to drw_text() at the > ellipsis_x position and overwrite what was there. > > so now the ellipsis will always be printed properly, regardless of > weather the font changes or if the string is single byte char or not. > > the idea of rendering the ellipsis on top incase of overflow was > from Bakkeby <[email protected]>, thanks! however the original patch had > some issues incorrectly truncating the prompt (-p flag) and cutting off > emojies. those have been fixed in here. > --- > drw.c | 56 ++++++++++++++++++++++++++++---------------------------- > 1 file changed, 28 insertions(+), 28 deletions(-) > > diff --git a/drw.c b/drw.c > index 4cdbcbe..e65d069 100644 > --- a/drw.c > +++ b/drw.c > @@ -251,12 +251,10 @@ drw_rect(Drw *drw, int x, int y, unsigned int w, > unsigned int h, int filled, int > int > drw_text(Drw *drw, int x, int y, unsigned int w, unsigned int h, unsigned > int lpad, const char *text, int invert) > { > - char buf[1024]; > - int ty; > - unsigned int ew; > + int ty, ellipsis_x = 0; > + unsigned int tmpw, ew, ellipsis_w = 0, ellipsis_len, ellipsis_width; > XftDraw *d = NULL; > Fnt *usedfont, *curfont, *nextfont; > - size_t i, len; > int utf8strlen, utf8charlen, render = x || y || w || h; > long utf8codepoint = 0; > const char *utf8str; > @@ -264,7 +262,7 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned > int h, unsigned int lp > FcPattern *fcpattern; > FcPattern *match; > XftResult result; > - int charexists = 0; > + int charexists = 0, overflow = 0; > > if (!drw || (render && !drw->scheme) || !text || !drw->fonts) > return 0; > @@ -282,8 +280,9 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned > int h, unsigned int lp > } > > usedfont = drw->fonts; > + drw_font_getexts(usedfont, "...", 3, &ellipsis_width, NULL); > while (1) { > - utf8strlen = 0; > + ew = ellipsis_len = utf8strlen = 0; > utf8str = text; > nextfont = NULL; > while (*text) { > @@ -291,9 +290,21 @@ drw_text(Drw *drw, int x, int y, unsigned int w, > unsigned int h, unsigned int lp > for (curfont = drw->fonts; curfont; curfont = > curfont->next) { > charexists = charexists || > XftCharExists(drw->dpy, curfont->xfont, utf8codepoint); > if (charexists) { > - if (curfont == usedfont) { > + drw_font_getexts(curfont, text, > utf8charlen, &tmpw, NULL); > + if (ew + ellipsis_width <= w) { > + /* keep track where the > ellipsis still fits */ > + ellipsis_x = x + ew; > + ellipsis_w = w - ew; > + ellipsis_len = utf8strlen; > + } > + > + if (ew + tmpw > w) { > + overflow = 1; > + utf8strlen = ellipsis_len; > + } else if (curfont == usedfont) { > utf8strlen += utf8charlen; > text += utf8charlen; > + ew += tmpw; > } else { > nextfont = curfont; > } > @@ -301,36 +312,25 @@ drw_text(Drw *drw, int x, int y, unsigned int w, > unsigned int h, unsigned int lp > } > } > > - if (!charexists || nextfont) > + if (overflow || !charexists || nextfont) > break; > else > charexists = 0; > } > > if (utf8strlen) { > - drw_font_getexts(usedfont, utf8str, utf8strlen, &ew, > NULL); > - /* shorten text if necessary */ > - for (len = MIN(utf8strlen, sizeof(buf) - 1); len && ew > > w; len--) > - drw_font_getexts(usedfont, utf8str, len, &ew, > NULL); > - > - if (len) { > - memcpy(buf, utf8str, len); > - buf[len] = '\0'; > - if (len < utf8strlen) > - for (i = len; i && i > len - 3; > buf[--i] = '.') > - ; /* NOP */ > - > - if (render) { > - ty = y + (h - usedfont->h) / 2 + > usedfont->xfont->ascent; > - XftDrawStringUtf8(d, > &drw->scheme[invert ? ColBg : ColFg], > - usedfont->xfont, x, > ty, (XftChar8 *)buf, len); > - } > - x += ew; > - w -= ew; > + if (render) { > + ty = y + (h - usedfont->h) / 2 + > usedfont->xfont->ascent; > + XftDrawStringUtf8(d, &drw->scheme[invert ? > ColBg : ColFg], > + usedfont->xfont, x, ty, > (XftChar8 *)utf8str, utf8strlen); > } > + x += ew; > + w -= ew; > } > + if (render && overflow) > + drw_text(drw, ellipsis_x, y, ellipsis_w, h, 0, "...", > invert); > > - if (!*text) { > + if (!*text || overflow) { > break; > } else if (nextfont) { > charexists = 0; > -- > 2.34.1 > > From cebbfe70f25d52c290172640614cb592955ec67f Mon Sep 17 00:00:00 2001 > From: NRK <[email protected]> > Date: Thu, 24 Mar 2022 02:00:00 +0600 > Subject: [PATCH 2/6] introduce drw_fontset_getwidth_clamp() > > getting the width of a string is an O(n) operation, and in many cases > users only care about getting the width upto a certain number. > > instead of calling drw_fontset_getwidth() and *then* clamping the > result, this patch introduces drw_fontset_getwidth_clamp() function, > similar to strnlen(), which will stop once we reach n. > > the `invert` parameter was overloaded internally to preserve the API, > however library users should be calling drw_fontset_getwidth_clamp() and > not depend upon internal behavior of drw_text(). > --- > drw.c | 19 +++++++++++++++++-- > drw.h | 1 + > 2 files changed, 18 insertions(+), 2 deletions(-) > > diff --git a/drw.c b/drw.c > index e65d069..7d985b1 100644 > --- a/drw.c > +++ b/drw.c > @@ -268,7 +268,7 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned > int h, unsigned int lp > return 0; > > if (!render) { > - w = ~w; > + w = invert ? invert : ~invert; > } else { > XSetForeground(drw->dpy, drw->gc, drw->scheme[invert ? ColFg : > ColBg].pixel); > XFillRectangle(drw->dpy, drw->drawable, drw->gc, x, y, w, h); > @@ -300,7 +300,13 @@ drw_text(Drw *drw, int x, int y, unsigned int w, > unsigned int h, unsigned int lp > > if (ew + tmpw > w) { > overflow = 1; > - utf8strlen = ellipsis_len; > + /* called from > drw_fontset_getwidth_clamp(): > + * it wants the width AFTER the > overflow > + */ > + if (!render) > + x += tmpw; > + else > + utf8strlen = > ellipsis_len; > } else if (curfont == usedfont) { > utf8strlen += utf8charlen; > text += utf8charlen; > @@ -397,6 +403,15 @@ drw_fontset_getwidth(Drw *drw, const char *text) > return drw_text(drw, 0, 0, 0, 0, 0, text, 0); > } > > +unsigned int > +drw_fontset_getwidth_clamp(Drw *drw, const char *text, unsigned int n) > +{ > + unsigned int tmp = 0; > + if (drw && drw->fonts && text && n) > + tmp = drw_text(drw, 0, 0, 0, 0, 0, text, n); > + return MIN(n, tmp); > +} > + > void > drw_font_getexts(Fnt *font, const char *text, unsigned int len, unsigned int > *w, unsigned int *h) > { > diff --git a/drw.h b/drw.h > index 4c67419..fd7631b 100644 > --- a/drw.h > +++ b/drw.h > @@ -35,6 +35,7 @@ void drw_free(Drw *drw); > Fnt *drw_fontset_create(Drw* drw, const char *fonts[], size_t fontcount); > void drw_fontset_free(Fnt* set); > unsigned int drw_fontset_getwidth(Drw *drw, const char *text); > +unsigned int drw_fontset_getwidth_clamp(Drw *drw, const char *text, unsigned > int n); > void drw_font_getexts(Fnt *font, const char *text, unsigned int len, > unsigned int *w, unsigned int *h); > > /* Colorscheme abstraction */ > -- > 2.34.1 > > From 557a6ecd4aae5b8baf4f1e8f324ad17efce72df0 Mon Sep 17 00:00:00 2001 > From: NRK <[email protected]> > Date: Thu, 24 Mar 2022 02:04:04 +0600 > Subject: [PATCH 3/6] significantly improve performance on large strings > > this replaces inefficient pattern of `MIN(TEXTW(..), n)` with > drw_fontset_getwidth_clamp() instead, which is far more efficient when > we only want up to a certain width. > > dumping a decently sized (unicode) emoji file into dmenu, I see the > startup time drop significantly with this patch. > > before -> after > 360ms -> 160ms > > this should also noticeably improve input latency (responsiveness) given > that calcoffsets() and drawmenu() are pretty hot functions. > --- > dmenu.c | 13 ++++++++++--- > 1 file changed, 10 insertions(+), 3 deletions(-) > > diff --git a/dmenu.c b/dmenu.c > index eca67ac..cde394b 100644 > --- a/dmenu.c > +++ b/dmenu.c > @@ -58,6 +58,13 @@ static Clr *scheme[SchemeLast]; > static int (*fstrncmp)(const char *, const char *, size_t) = strncmp; > static char *(*fstrstr)(const char *, const char *) = strstr; > > +static unsigned int > +textw_clamp(const char *str, unsigned int n) > +{ > + unsigned int w = drw_fontset_getwidth_clamp(drw, str, n) + lrpad; > + return MIN(w, n); > +} > + > static void > appenditem(struct item *item, struct item **list, struct item **last) > { > @@ -82,10 +89,10 @@ calcoffsets(void) > n = mw - (promptw + inputw + TEXTW("<") + TEXTW(">")); > /* calculate which items will begin the next page and previous page */ > for (i = 0, next = curr; next; next = next->right) > - if ((i += (lines > 0) ? bh : MIN(TEXTW(next->text), n)) > n) > + if ((i += (lines > 0) ? bh : textw_clamp(next->text, n)) > n) > break; > for (i = 0, prev = curr; prev && prev->left; prev = prev->left) > - if ((i += (lines > 0) ? bh : MIN(TEXTW(prev->left->text), n)) > > n) > + if ((i += (lines > 0) ? bh : textw_clamp(prev->left->text, n)) > > n) > break; > } > > @@ -172,7 +179,7 @@ drawmenu(void) > } > x += w; > for (item = curr; item != next; item = item->right) > - x = drawitem(item, x, 0, MIN(TEXTW(item->text), mw - x > - TEXTW(">"))); > + x = drawitem(item, x, 0, textw_clamp(item->text, mw - x > - TEXTW(">"))); > if (next) { > w = TEXTW(">"); > drw_setscheme(drw, scheme[SchemeNorm]); > -- > 2.34.1 > > From 2c1da37e87e219014d838aecc05bd38c4f66fd83 Mon Sep 17 00:00:00 2001 > From: NRK <[email protected]> > Date: Thu, 24 Mar 2022 00:37:55 +0600 > Subject: [PATCH 4/6] inputw: improve correctness and startup performance > > a massive amount of time inside readstdin() is spent trying to get the > max input width and then put it into inputw, only for it to get clamped > down to mw/3 inside setup(). > > it makes more sense to calculate inputw inside setup() once we have mw > available. similar to the last patch, i see noticeable startup > performance improvement: > > before -> after > 160ms -> 60ms > > additionally this will take fallback fonts into account compared to the > previous version, so it's not only more performant but also more correct. > --- > dmenu.c | 19 +++++++++---------- > 1 file changed, 9 insertions(+), 10 deletions(-) > > diff --git a/dmenu.c b/dmenu.c > index cde394b..d989d39 100644 > --- a/dmenu.c > +++ b/dmenu.c > @@ -547,8 +547,7 @@ static void > readstdin(void) > { > char buf[sizeof text], *p; > - size_t i, imax = 0, size = 0; > - unsigned int tmpmax = 0; > + size_t i, size = 0; > > /* read each line from stdin and add it to the item list */ > for (i = 0; fgets(buf, sizeof buf, stdin); i++) { > @@ -560,15 +559,9 @@ readstdin(void) > if (!(items[i].text = strdup(buf))) > die("cannot strdup %u bytes:", strlen(buf) + 1); > items[i].out = 0; > - drw_font_getexts(drw->fonts, buf, strlen(buf), &tmpmax, NULL); > - if (tmpmax > inputw) { > - inputw = tmpmax; > - imax = i; > - } > } > if (items) > items[i].text = NULL; > - inputw = items ? TEXTW(items[imax].text) : 0; > lines = MIN(lines, i); > } > > @@ -614,12 +607,13 @@ static void > setup(void) > { > int x, y, i, j; > - unsigned int du; > + unsigned int du, tmp; > XSetWindowAttributes swa; > XIM xim; > Window w, dw, *dws; > XWindowAttributes wa; > XClassHint ch = {"dmenu", "dmenu"}; > + struct item *item; > #ifdef XINERAMA > XineramaScreenInfo *info; > Window pw; > @@ -677,7 +671,12 @@ setup(void) > mw = wa.width; > } > promptw = (prompt && *prompt) ? TEXTW(prompt) - lrpad / 4 : 0; > - inputw = MIN(inputw, mw/3); > + for (item = items; item && item->text; ++item) { > + if ((tmp = textw_clamp(item->text, mw/3)) > inputw) { > + if ((inputw = tmp) == mw/3) > + break; > + } > + } > match(); > > /* create menu window */ > -- > 2.34.1 > > From 7dcadf228f2dc48aaf3e9416116e06cceca91cb8 Mon Sep 17 00:00:00 2001 > From: NRK <[email protected]> > Date: Thu, 24 Mar 2022 00:37:55 +0600 > Subject: [PATCH 5/6] drw_text: improve performance when there's no match > > this was the last piece of the puzzle, the case where we can't find any > font to draw the codepoint. > > in such cases, we use XftFontMatch() which is INSANELY slow. but that's > not the real problem. the real problem was we were continuously trying > to match the same thing over and over again. > > this patch introduces a small cache, which keeps track a couple > codepoints for which we know we won't find any matches. > > with this, i can dump lots of emojies into dmenu where some of them > don't have any matching font, and still not have dmenu lag insanely or > FREEZE completely when scrolling up and down. > > this also improves startup time, which will of course depend on the > system and all installed fonts; but on my system and test case i see the > following startup time drop: > > before -> after > 60ms -> 34ms > --- > drw.c | 13 ++++++++++++- > 1 file changed, 12 insertions(+), 1 deletion(-) > > diff --git a/drw.c b/drw.c > index 7d985b1..a50c9ee 100644 > --- a/drw.c > +++ b/drw.c > @@ -251,7 +251,7 @@ drw_rect(Drw *drw, int x, int y, unsigned int w, unsigned > int h, int filled, int > int > drw_text(Drw *drw, int x, int y, unsigned int w, unsigned int h, unsigned > int lpad, const char *text, int invert) > { > - int ty, ellipsis_x = 0; > + int i, ty, ellipsis_x = 0; > unsigned int tmpw, ew, ellipsis_w = 0, ellipsis_len, ellipsis_width; > XftDraw *d = NULL; > Fnt *usedfont, *curfont, *nextfont; > @@ -263,6 +263,9 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned > int h, unsigned int lp > FcPattern *match; > XftResult result; > int charexists = 0, overflow = 0; > + /* keep track of a couple codepoints for which we have no match. */ > + enum { nomatches_len = 64 }; > + static struct { long codepoint[nomatches_len]; unsigned int idx; } > nomatches; > > if (!drw || (render && !drw->scheme) || !text || !drw->fonts) > return 0; > @@ -346,6 +349,12 @@ drw_text(Drw *drw, int x, int y, unsigned int w, > unsigned int h, unsigned int lp > * character must be drawn. */ > charexists = 1; > > + for (i = 0; i < nomatches_len; ++i) { > + /* avoid calling XftFontMatch if we know we > won't find a match */ > + if (utf8codepoint == nomatches.codepoint[i]) > + goto no_match; > + } > + > fccharset = FcCharSetCreate(); > FcCharSetAddChar(fccharset, utf8codepoint); > > @@ -374,6 +383,8 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned > int h, unsigned int lp > curfont->next = usedfont; > } else { > xfont_free(usedfont); > + nomatches.codepoint[++nomatches.idx % > nomatches_len] = utf8codepoint; > +no_match: > usedfont = drw->fonts; > } > } > -- > 2.34.1 > Hi NRK, I applied all the patches. Thanks for your work on this! Now people can test the master branch for a while or see if there are any issues. If there are issues it's not a problem. It can be iterated on. After that we can also sync the drw.{c,h} code to dwm, sent and libsl. Thanks, -- Kind regards, Hiltjo
