On Fri, Mar 08, 2019 at 02:39:32PM +0100, Ingo Schwarze wrote: > Hi, > > Lauri Tirkkonen wrote on Fri, Mar 08, 2019 at 12:43:11PM +0200: > > > I feel like xterm should just use the system wcwidth() to avoid these > > mismatches, so rudimentary diff to do that below. > > Absolutely, i strongly agree with that sentiment. > > Having a local character width table in an application program is > just wrong. Archaic operating systems like Oracle Solaris 11 that > have broken wcwidth(3) implementations should either fix their C > library or go to hell. > > I also agree with the details of the diff. It is minimal in so far > as it changes only one file, keeping the annoyance minimal when merging > new xterm releases. The function systemWcwidthOk() is horrific > enough that deleting it outright makes sense, to make sure that it > can never get called by accident. > > Note that the command line options -cjk_width and -mk_width and the > related resources remain broken. But i'd say people using them get > what they deserve. The disruption to updates that would be caused > by excising them from multiple files is probably better avoided. > > Unchanged diff left in place below for easy reference. > > OK to commit?
hi, I would prefer a diff that just add a &&!defined(__OpenBSD__) to the condition before the definition of systemWcwidthOk(). This will cause less risk of conflicts in future updates and clearly show the intention. but either way ok matthieu@ > Ingo > > > diff --git a/app/xterm/util.c b/app/xterm/util.c > index a3de4a005..9b6443683 100644 > --- a/app/xterm/util.c > +++ b/app/xterm/util.c > @@ -4980,61 +4980,6 @@ decode_keyboard_type(XtermWidget xw, XTERM_RESOURCE * > rp) > } > > #if OPT_WIDE_CHARS > -#if defined(HAVE_WCHAR_H) && defined(HAVE_WCWIDTH) > -/* > - * If xterm is running in a UTF-8 locale, it is still possible to encounter > - * old runtime configurations which yield incomplete or inaccurate data. > - */ > -static Bool > -systemWcwidthOk(int samplesize, int samplepass) > -{ > - wchar_t n; > - int oops = 0; > - > - for (n = 21; n <= 25; ++n) { > - wchar_t code = (wchar_t) dec2ucs(NULL, (unsigned) n); > - int system_code = wcwidth(code); > - int intern_code = mk_wcwidth(code); > - > - /* > - * Solaris 10 wcwidth() returns "2" for all of the line-drawing (page > - * 0x2500) and most of the geometric shapes (a few are excluded, just > - * to make it more difficult to use). Do a sanity check to avoid using > - * it. > - */ > - if ((system_code < 0 && intern_code >= 1) > - || (system_code >= 0 && intern_code != system_code)) { > - TRACE(("systemWcwidthOk: broken system line-drawing wcwidth\n")); > - oops += (samplepass + 1); > - break; > - } > - } > - > - for (n = 0; n < (wchar_t) samplesize; ++n) { > - int system_code = wcwidth(n); > - int intern_code = mk_wcwidth(n); > - > - /* > - * When this check was originally implemented, there were few if any > - * libraries with full Unicode coverage. Time passes, and it is > - * possible to make a full comparison of the BMP. There are some > - * differences: mk_wcwidth() marks some codes as combining and some > - * as single-width, differing from GNU libc. > - */ > - if ((system_code < 0 && intern_code >= 1) > - || (system_code >= 0 && intern_code != system_code)) { > - TRACE((".. width(U+%04X) = %d, expected %d\n", > - (unsigned) n, system_code, intern_code)); > - if (++oops > samplepass) > - break; > - } > - } > - TRACE(("systemWcwidthOk: %d/%d mismatches, allowed %d\n", > - oops, (int) n, samplepass)); > - return (oops <= samplepass); > -} > -#endif /* HAVE_WCWIDTH */ > - > void > decode_wcwidth(XtermWidget xw) > { > @@ -5045,8 +4990,7 @@ decode_wcwidth(XtermWidget xw) > switch (mode) { > default: > #if defined(HAVE_WCHAR_H) && defined(HAVE_WCWIDTH) > - if (xtermEnvUTF8() && > - systemWcwidthOk(xw->misc.mk_samplesize, xw->misc.mk_samplepass)) { > + if (xtermEnvUTF8()) { > my_wcwidth = wcwidth; > TRACE(("using system wcwidth() function\n")); > break; -- Matthieu Herrb