On Fri, Mar 25, 2022 at 03:34:29PM +0600, NRK wrote:
> Hi,
> 
> Fixed a couple bugs/issues with the previous patches and attached the
> amended ones. I'm confident that these patches shouldn't have any
> remaining logic issues, but feel free to point them out in case there is.
> 
> Some notable changes from the previous patches:
> 
> * The drw related patches have separated. This is so that they can be
>   cleanly applied to libsl (and dwm). Should additionally help with the
>   review process.
> 
> * The truncation condition as well as remaining width to override has been
>   corrected.
> 
> * The TEXTW_CLAMP macro has been converted to a function to avoid
>   calculating width twice.
> 
> * Since the `invert` parameter in drw_text() does not get used when not
>   rendering, it's now being internally reused for specifying a minimum width.
> 
> Additionally, there's one *new* patch attached, which fixes the
> performance issue in case we have no font match for a given codepoint.
> 
> - NRK

> From 1425e94d0eb3cfbb4a9b73d433b9133bcb1b824c Mon Sep 17 00:00:00 2001
> From: NRK <[email protected]>
> Date: Thu, 24 Mar 2022 00:37:55 +0600
> Subject: [PATCH 1/6] drw_text: improve both performance and correctness
> 
> this patch makes some non-trivial changes, which significantly improves
> the performance of drawing large strings as well as fixes any issues
> regarding the printing of the ellipsis when string gets truncated.
> 
> * performance:
> 
> before there were two O(n) loops, one which finds how long we can go
> without changing font, and the second loop would (incorrectly) truncate
> the string if it's too big.
> 
> this patch merges the overflow calculation into the first loop and exits
> out when overflow is detected. when dumping lots of emojies into dmenu,
> i see some noticeable startup time improvement:
> 
> before -> after
> 460ms  -> 360ms
> 
> input latency when scrolling up/down is also noticeably better and can
> be tested with the following:
> 
>       for _ in $(seq 20); do
>               cat /dev/urandom | base64 | tr -d '\n' | head -c 1000000
>       echo
>       done | ./dmenu -l 10
> 
> * correctness:
> 
> the previous version would incorrectly assumed single byte chars and
> would overwrite them with '.' , this caused a whole bunch of obvious
> problems, including the ellipsis not getting rendered if then font
> changed.
> 
> in addition to exiting out when we detect overflow, this patch also
> keeps track of the last x-position where the ellipsis would fit. if we
> detect overflow, we simply make a recursing call to drw_text() at the
> ellipsis_x position and overwrite what was there.
> 
> so now the ellipsis will always be printed properly, regardless of
> weather the font changes or if the string is single byte char or not.
> 
> the idea of rendering the ellipsis on top incase of overflow was
> from Bakkeby <[email protected]>, thanks! however the original patch had
> some issues incorrectly truncating the prompt (-p flag) and cutting off
> emojies. those have been fixed in here.
> ---
>  drw.c | 56 ++++++++++++++++++++++++++++----------------------------
>  1 file changed, 28 insertions(+), 28 deletions(-)
> 
> diff --git a/drw.c b/drw.c
> index 4cdbcbe..e65d069 100644
> --- a/drw.c
> +++ b/drw.c
> @@ -251,12 +251,10 @@ drw_rect(Drw *drw, int x, int y, unsigned int w, 
> unsigned int h, int filled, int
>  int
>  drw_text(Drw *drw, int x, int y, unsigned int w, unsigned int h, unsigned 
> int lpad, const char *text, int invert)
>  {
> -     char buf[1024];
> -     int ty;
> -     unsigned int ew;
> +     int ty, ellipsis_x = 0;
> +     unsigned int tmpw, ew, ellipsis_w = 0, ellipsis_len, ellipsis_width;
>       XftDraw *d = NULL;
>       Fnt *usedfont, *curfont, *nextfont;
> -     size_t i, len;
>       int utf8strlen, utf8charlen, render = x || y || w || h;
>       long utf8codepoint = 0;
>       const char *utf8str;
> @@ -264,7 +262,7 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned 
> int h, unsigned int lp
>       FcPattern *fcpattern;
>       FcPattern *match;
>       XftResult result;
> -     int charexists = 0;
> +     int charexists = 0, overflow = 0;
>  
>       if (!drw || (render && !drw->scheme) || !text || !drw->fonts)
>               return 0;
> @@ -282,8 +280,9 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned 
> int h, unsigned int lp
>       }
>  
>       usedfont = drw->fonts;
> +     drw_font_getexts(usedfont, "...", 3, &ellipsis_width, NULL);
>       while (1) {
> -             utf8strlen = 0;
> +             ew = ellipsis_len = utf8strlen = 0;
>               utf8str = text;
>               nextfont = NULL;
>               while (*text) {
> @@ -291,9 +290,21 @@ drw_text(Drw *drw, int x, int y, unsigned int w, 
> unsigned int h, unsigned int lp
>                       for (curfont = drw->fonts; curfont; curfont = 
> curfont->next) {
>                               charexists = charexists || 
> XftCharExists(drw->dpy, curfont->xfont, utf8codepoint);
>                               if (charexists) {
> -                                     if (curfont == usedfont) {
> +                                     drw_font_getexts(curfont, text, 
> utf8charlen, &tmpw, NULL);
> +                                     if (ew + ellipsis_width <= w) {
> +                                             /* keep track where the 
> ellipsis still fits */
> +                                             ellipsis_x = x + ew;
> +                                             ellipsis_w = w - ew;
> +                                             ellipsis_len = utf8strlen;
> +                                     }
> +
> +                                     if (ew + tmpw > w) {
> +                                             overflow = 1;
> +                                             utf8strlen = ellipsis_len;
> +                                     } else if (curfont == usedfont) {
>                                               utf8strlen += utf8charlen;
>                                               text += utf8charlen;
> +                                             ew += tmpw;
>                                       } else {
>                                               nextfont = curfont;
>                                       }
> @@ -301,36 +312,25 @@ drw_text(Drw *drw, int x, int y, unsigned int w, 
> unsigned int h, unsigned int lp
>                               }
>                       }
>  
> -                     if (!charexists || nextfont)
> +                     if (overflow || !charexists || nextfont)
>                               break;
>                       else
>                               charexists = 0;
>               }
>  
>               if (utf8strlen) {
> -                     drw_font_getexts(usedfont, utf8str, utf8strlen, &ew, 
> NULL);
> -                     /* shorten text if necessary */
> -                     for (len = MIN(utf8strlen, sizeof(buf) - 1); len && ew 
> > w; len--)
> -                             drw_font_getexts(usedfont, utf8str, len, &ew, 
> NULL);
> -
> -                     if (len) {
> -                             memcpy(buf, utf8str, len);
> -                             buf[len] = '\0';
> -                             if (len < utf8strlen)
> -                                     for (i = len; i && i > len - 3; 
> buf[--i] = '.')
> -                                             ; /* NOP */
> -
> -                             if (render) {
> -                                     ty = y + (h - usedfont->h) / 2 + 
> usedfont->xfont->ascent;
> -                                     XftDrawStringUtf8(d, 
> &drw->scheme[invert ? ColBg : ColFg],
> -                                                       usedfont->xfont, x, 
> ty, (XftChar8 *)buf, len);
> -                             }
> -                             x += ew;
> -                             w -= ew;
> +                     if (render) {
> +                             ty = y + (h - usedfont->h) / 2 + 
> usedfont->xfont->ascent;
> +                             XftDrawStringUtf8(d, &drw->scheme[invert ? 
> ColBg : ColFg],
> +                                               usedfont->xfont, x, ty, 
> (XftChar8 *)utf8str, utf8strlen);
>                       }
> +                     x += ew;
> +                     w -= ew;
>               }
> +             if (render && overflow)
> +                     drw_text(drw, ellipsis_x, y, ellipsis_w, h, 0, "...", 
> invert);
>  
> -             if (!*text) {
> +             if (!*text || overflow) {
>                       break;
>               } else if (nextfont) {
>                       charexists = 0;
> -- 
> 2.34.1
> 

> From cebbfe70f25d52c290172640614cb592955ec67f Mon Sep 17 00:00:00 2001
> From: NRK <[email protected]>
> Date: Thu, 24 Mar 2022 02:00:00 +0600
> Subject: [PATCH 2/6] introduce drw_fontset_getwidth_clamp()
> 
> getting the width of a string is an O(n) operation, and in many cases
> users only care about getting the width upto a certain number.
> 
> instead of calling drw_fontset_getwidth() and *then* clamping the
> result, this patch introduces drw_fontset_getwidth_clamp() function,
> similar to strnlen(), which will stop once we reach n.
> 
> the `invert` parameter was overloaded internally to preserve the API,
> however library users should be calling drw_fontset_getwidth_clamp() and
> not depend upon internal behavior of drw_text().
> ---
>  drw.c | 19 +++++++++++++++++--
>  drw.h |  1 +
>  2 files changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/drw.c b/drw.c
> index e65d069..7d985b1 100644
> --- a/drw.c
> +++ b/drw.c
> @@ -268,7 +268,7 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned 
> int h, unsigned int lp
>               return 0;
>  
>       if (!render) {
> -             w = ~w;
> +             w = invert ? invert : ~invert;
>       } else {
>               XSetForeground(drw->dpy, drw->gc, drw->scheme[invert ? ColFg : 
> ColBg].pixel);
>               XFillRectangle(drw->dpy, drw->drawable, drw->gc, x, y, w, h);
> @@ -300,7 +300,13 @@ drw_text(Drw *drw, int x, int y, unsigned int w, 
> unsigned int h, unsigned int lp
>  
>                                       if (ew + tmpw > w) {
>                                               overflow = 1;
> -                                             utf8strlen = ellipsis_len;
> +                                             /* called from 
> drw_fontset_getwidth_clamp():
> +                                              * it wants the width AFTER the 
> overflow
> +                                              */
> +                                             if (!render)
> +                                                     x += tmpw;
> +                                             else
> +                                                     utf8strlen = 
> ellipsis_len;
>                                       } else if (curfont == usedfont) {
>                                               utf8strlen += utf8charlen;
>                                               text += utf8charlen;
> @@ -397,6 +403,15 @@ drw_fontset_getwidth(Drw *drw, const char *text)
>       return drw_text(drw, 0, 0, 0, 0, 0, text, 0);
>  }
>  
> +unsigned int
> +drw_fontset_getwidth_clamp(Drw *drw, const char *text, unsigned int n)
> +{
> +     unsigned int tmp = 0;
> +     if (drw && drw->fonts && text && n)
> +             tmp = drw_text(drw, 0, 0, 0, 0, 0, text, n);
> +     return MIN(n, tmp);
> +}
> +
>  void
>  drw_font_getexts(Fnt *font, const char *text, unsigned int len, unsigned int 
> *w, unsigned int *h)
>  {
> diff --git a/drw.h b/drw.h
> index 4c67419..fd7631b 100644
> --- a/drw.h
> +++ b/drw.h
> @@ -35,6 +35,7 @@ void drw_free(Drw *drw);
>  Fnt *drw_fontset_create(Drw* drw, const char *fonts[], size_t fontcount);
>  void drw_fontset_free(Fnt* set);
>  unsigned int drw_fontset_getwidth(Drw *drw, const char *text);
> +unsigned int drw_fontset_getwidth_clamp(Drw *drw, const char *text, unsigned 
> int n);
>  void drw_font_getexts(Fnt *font, const char *text, unsigned int len, 
> unsigned int *w, unsigned int *h);
>  
>  /* Colorscheme abstraction */
> -- 
> 2.34.1
> 

> From 557a6ecd4aae5b8baf4f1e8f324ad17efce72df0 Mon Sep 17 00:00:00 2001
> From: NRK <[email protected]>
> Date: Thu, 24 Mar 2022 02:04:04 +0600
> Subject: [PATCH 3/6] significantly improve performance on large strings
> 
> this replaces inefficient pattern of `MIN(TEXTW(..), n)` with
> drw_fontset_getwidth_clamp() instead, which is far more efficient when
> we only want up to a certain width.
> 
> dumping a decently sized (unicode) emoji file into dmenu, I see the
> startup time drop significantly with this patch.
> 
> before -> after
> 360ms  -> 160ms
> 
> this should also noticeably improve input latency (responsiveness) given
> that calcoffsets() and drawmenu() are pretty hot functions.
> ---
>  dmenu.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/dmenu.c b/dmenu.c
> index eca67ac..cde394b 100644
> --- a/dmenu.c
> +++ b/dmenu.c
> @@ -58,6 +58,13 @@ static Clr *scheme[SchemeLast];
>  static int (*fstrncmp)(const char *, const char *, size_t) = strncmp;
>  static char *(*fstrstr)(const char *, const char *) = strstr;
>  
> +static unsigned int
> +textw_clamp(const char *str, unsigned int n)
> +{
> +     unsigned int w = drw_fontset_getwidth_clamp(drw, str, n) + lrpad;
> +     return MIN(w, n);
> +}
> +
>  static void
>  appenditem(struct item *item, struct item **list, struct item **last)
>  {
> @@ -82,10 +89,10 @@ calcoffsets(void)
>               n = mw - (promptw + inputw + TEXTW("<") + TEXTW(">"));
>       /* calculate which items will begin the next page and previous page */
>       for (i = 0, next = curr; next; next = next->right)
> -             if ((i += (lines > 0) ? bh : MIN(TEXTW(next->text), n)) > n)
> +             if ((i += (lines > 0) ? bh : textw_clamp(next->text, n)) > n)
>                       break;
>       for (i = 0, prev = curr; prev && prev->left; prev = prev->left)
> -             if ((i += (lines > 0) ? bh : MIN(TEXTW(prev->left->text), n)) > 
> n)
> +             if ((i += (lines > 0) ? bh : textw_clamp(prev->left->text, n)) 
> > n)
>                       break;
>  }
>  
> @@ -172,7 +179,7 @@ drawmenu(void)
>               }
>               x += w;
>               for (item = curr; item != next; item = item->right)
> -                     x = drawitem(item, x, 0, MIN(TEXTW(item->text), mw - x 
> - TEXTW(">")));
> +                     x = drawitem(item, x, 0, textw_clamp(item->text, mw - x 
> - TEXTW(">")));
>               if (next) {
>                       w = TEXTW(">");
>                       drw_setscheme(drw, scheme[SchemeNorm]);
> -- 
> 2.34.1
> 

> From 2c1da37e87e219014d838aecc05bd38c4f66fd83 Mon Sep 17 00:00:00 2001
> From: NRK <[email protected]>
> Date: Thu, 24 Mar 2022 00:37:55 +0600
> Subject: [PATCH 4/6] inputw: improve correctness and startup performance
> 
> a massive amount of time inside readstdin() is spent trying to get the
> max input width and then put it into inputw, only for it to get clamped
> down to mw/3 inside setup().
> 
> it makes more sense to calculate inputw inside setup() once we have mw
> available. similar to the last patch, i see noticeable startup
> performance improvement:
> 
> before -> after
> 160ms  -> 60ms
> 
> additionally this will take fallback fonts into account compared to the
> previous version, so it's not only more performant but also more correct.
> ---
>  dmenu.c | 19 +++++++++----------
>  1 file changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/dmenu.c b/dmenu.c
> index cde394b..d989d39 100644
> --- a/dmenu.c
> +++ b/dmenu.c
> @@ -547,8 +547,7 @@ static void
>  readstdin(void)
>  {
>       char buf[sizeof text], *p;
> -     size_t i, imax = 0, size = 0;
> -     unsigned int tmpmax = 0;
> +     size_t i, size = 0;
>  
>       /* read each line from stdin and add it to the item list */
>       for (i = 0; fgets(buf, sizeof buf, stdin); i++) {
> @@ -560,15 +559,9 @@ readstdin(void)
>               if (!(items[i].text = strdup(buf)))
>                       die("cannot strdup %u bytes:", strlen(buf) + 1);
>               items[i].out = 0;
> -             drw_font_getexts(drw->fonts, buf, strlen(buf), &tmpmax, NULL);
> -             if (tmpmax > inputw) {
> -                     inputw = tmpmax;
> -                     imax = i;
> -             }
>       }
>       if (items)
>               items[i].text = NULL;
> -     inputw = items ? TEXTW(items[imax].text) : 0;
>       lines = MIN(lines, i);
>  }
>  
> @@ -614,12 +607,13 @@ static void
>  setup(void)
>  {
>       int x, y, i, j;
> -     unsigned int du;
> +     unsigned int du, tmp;
>       XSetWindowAttributes swa;
>       XIM xim;
>       Window w, dw, *dws;
>       XWindowAttributes wa;
>       XClassHint ch = {"dmenu", "dmenu"};
> +     struct item *item;
>  #ifdef XINERAMA
>       XineramaScreenInfo *info;
>       Window pw;
> @@ -677,7 +671,12 @@ setup(void)
>               mw = wa.width;
>       }
>       promptw = (prompt && *prompt) ? TEXTW(prompt) - lrpad / 4 : 0;
> -     inputw = MIN(inputw, mw/3);
> +     for (item = items; item && item->text; ++item) {
> +             if ((tmp = textw_clamp(item->text, mw/3)) > inputw) {
> +                     if ((inputw = tmp) == mw/3)
> +                             break;
> +             }
> +     }
>       match();
>  
>       /* create menu window */
> -- 
> 2.34.1
> 

> From 7dcadf228f2dc48aaf3e9416116e06cceca91cb8 Mon Sep 17 00:00:00 2001
> From: NRK <[email protected]>
> Date: Thu, 24 Mar 2022 00:37:55 +0600
> Subject: [PATCH 5/6] drw_text: improve performance when there's no match
> 
> this was the last piece of the puzzle, the case where we can't find any
> font to draw the codepoint.
> 
> in such cases, we use XftFontMatch() which is INSANELY slow. but that's
> not the real problem. the real problem was we were continuously trying
> to match the same thing over and over again.
> 
> this patch introduces a small cache, which keeps track a couple
> codepoints for which we know we won't find any matches.
> 
> with this, i can dump lots of emojies into dmenu where some of them
> don't have any matching font, and still not have dmenu lag insanely or
> FREEZE completely when scrolling up and down.
> 
> this also improves startup time, which will of course depend on the
> system and all installed fonts; but on my system and test case i see the
> following startup time drop:
> 
> before -> after
> 60ms   -> 34ms
> ---
>  drw.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/drw.c b/drw.c
> index 7d985b1..a50c9ee 100644
> --- a/drw.c
> +++ b/drw.c
> @@ -251,7 +251,7 @@ drw_rect(Drw *drw, int x, int y, unsigned int w, unsigned 
> int h, int filled, int
>  int
>  drw_text(Drw *drw, int x, int y, unsigned int w, unsigned int h, unsigned 
> int lpad, const char *text, int invert)
>  {
> -     int ty, ellipsis_x = 0;
> +     int i, ty, ellipsis_x = 0;
>       unsigned int tmpw, ew, ellipsis_w = 0, ellipsis_len, ellipsis_width;
>       XftDraw *d = NULL;
>       Fnt *usedfont, *curfont, *nextfont;
> @@ -263,6 +263,9 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned 
> int h, unsigned int lp
>       FcPattern *match;
>       XftResult result;
>       int charexists = 0, overflow = 0;
> +     /* keep track of a couple codepoints for which we have no match. */
> +     enum { nomatches_len = 64 };
> +     static struct { long codepoint[nomatches_len]; unsigned int idx; } 
> nomatches;
>  
>       if (!drw || (render && !drw->scheme) || !text || !drw->fonts)
>               return 0;
> @@ -346,6 +349,12 @@ drw_text(Drw *drw, int x, int y, unsigned int w, 
> unsigned int h, unsigned int lp
>                        * character must be drawn. */
>                       charexists = 1;
>  
> +                     for (i = 0; i < nomatches_len; ++i) {
> +                             /* avoid calling XftFontMatch if we know we 
> won't find a match */
> +                             if (utf8codepoint == nomatches.codepoint[i])
> +                                     goto no_match;
> +                     }
> +
>                       fccharset = FcCharSetCreate();
>                       FcCharSetAddChar(fccharset, utf8codepoint);
>  
> @@ -374,6 +383,8 @@ drw_text(Drw *drw, int x, int y, unsigned int w, unsigned 
> int h, unsigned int lp
>                                       curfont->next = usedfont;
>                               } else {
>                                       xfont_free(usedfont);
> +                                     nomatches.codepoint[++nomatches.idx % 
> nomatches_len] = utf8codepoint;
> +no_match:
>                                       usedfont = drw->fonts;
>                               }
>                       }
> -- 
> 2.34.1
> 

Hi NRK,

I applied all the patches. Thanks for your work on this!

Now people can test the master branch for a while or see if there are any
issues.  If there are issues it's not a problem. It can be iterated on.

After that we can also sync the drw.{c,h} code to dwm, sent and libsl.

Thanks,

-- 
Kind regards,
Hiltjo

Reply via email to