Hi Pádraig,

> I was doing some performance testing on cut(1) and noticed
> surprisingly slow per character iteration in cut -c1 (new code using
> lib/mcel).
> Then I noticed the same performance issue with wc -m.
> This was only with non-ASCII chars as both wc and lib/mcel have
> shortcuts for ASCII, only deferring to mbrtoc32() for multi-byte.
>
> Bruno you originally identified this inefficiency at:
> https://lists.gnu.org/r/bug-gnulib/2018-05/msg00173.html

But the proposed solution with factories (a factory that produces an
mbrtowc-like function for a given locale, and likewise for wcwidth) never
made it into Gnulib; it still sits on a local branch in my gnulib checkout.

> I.e. that it's best to avoid glibc's mbrtowc() so we can
> use gnulib's cached dispatch version.
>
> We do replace mbrtowc() on glibc always currently,
> but wc was changed to using mbrtoc32() in coreutils v9.4-37-g14d35d5ba
> which thus took the slower path since then I think.
>
> So in summary if I now ./configure ac_cv_func_mbrtowc=no
> (noting that coreutils already does
> AC_DEFINE([GNULIB_WCHAR_SINGLE_LOCALE], [1]):
> I get faster wc -m:
>
> $ time src/wc-before -m mb.in
> 66060288 mb.in
> real 0m2.717s
>
> $ time src/wc -m mb.in
> 66060288 mb.in
> real 0m1.232s
>
> If I remove these lines from lib/mcel.h
> and also have the above configure var set
> I get faster cut -c:
>
> -#ifdef __GLIBC__
> -# undef mbrtoc32
> -#endif
>
> $ time src/cut-before -c1 mb.in >/dev/null
> real 0m1.589s
>
> $ time src/cut -c1 mb.in >/dev/null
> real 0m0.626s
>
> Paul it seems like we should not try to second guess the mbrtoc32 config,
> especially as the default can be so significantly slower?
>
> Now this is all quite brittle

It's brittle because we asked the question "Which optimizations can we
build into Gnulib, so that programs that stick to the POSIX API get
accelerated?"

We did not have the courage to switch from a POSIX API to a Gnulib-only
API. And the POSIX API, unfortunately, is based on a hidden static locale
(that is not even available as a 'locale_t').

Bruno
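
For readers following along, the ASCII shortcut mentioned in the quoted
text can be sketched roughly as below. This is only an illustration and
not the actual wc or lib/mcel code: count_chars is a hypothetical helper,
it assumes an ASCII-compatible locale encoding such as UTF-8, and counting
each invalid byte as one character is a simplification chosen for the
sketch. Every non-ASCII character goes through mbrtoc32(), which consults
the hidden static locale, so this is the path whose cost the timings above
measure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/* Count the characters in BUF[0..LEN-1] according to the current
   LC_CTYPE locale.  Hypothetical helper for illustration only.  */
size_t
count_chars (const char *buf, size_t len)
{
  assert (buf != NULL || len == 0);

  mbstate_t state;
  memset (&state, 0, sizeof state);
  size_t count = 0;
  size_t i = 0;

  while (i < len)
    {
      unsigned char c = (unsigned char) buf[i];
      if (c < 0x80)
        {
          /* Fast path: an ASCII byte is one character, no library call.
             Assumes ASCII bytes never occur inside a multibyte sequence,
             which holds for UTF-8.  */
          i++;
          count++;
          continue;
        }

      /* Slow path: defer to mbrtoc32, which depends on the locale.  */
      char32_t ch;
      size_t n = mbrtoc32 (&ch, buf + i, len - i, &state);
      if (n == (size_t) -1 || n == (size_t) -2)
        {
          /* Invalid or truncated sequence: skip one byte, count it as
             one character (a simplification), and reset the state.  */
          memset (&state, 0, sizeof state);
          i++;
          count++;
        }
      else if (n == (size_t) -3)
        {
          /* A further char32_t from earlier input; no bytes consumed.  */
          count++;
        }
      else
        {
          /* N bytes formed one character; a decoded NUL reports 0.  */
          i += (n == 0 ? 1 : n);
          count++;
        }
    }
  return count;
}
```

A caller would normally run setlocale (LC_ALL, "") once before using
count_chars, so that mbrtoc32() decodes according to the user's encoding.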
