Hi Christoph,

Thanks for the elaborate research and the fix! I've merged it and pushed
things to the GitHub repo.
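For readers following along, the second fix proposed in the report quoted below (keep a private conversion state and reset it after an error) can be illustrated with a minimal standalone sketch. This is not the code merged into multitail. The helper names decode_lossy and use_utf8_locale are hypothetical, the locale names are assumptions that may not exist on every system, and the sketch uses mbrtowc instead of the mbsrtowcs call from the patch because mbrtowc takes an explicit byte count.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: try to select a UTF-8 LC_CTYPE locale.
 * The locale names are assumptions; availability varies by system.
 * Returns 1 on success, 0 if neither locale is installed. */
int use_utf8_locale(void)
{
    return setlocale(LC_CTYPE, "C.UTF-8") != NULL ||
           setlocale(LC_CTYPE, "en_US.UTF-8") != NULL;
}

/* Hypothetical helper: convert the multibyte string src into dst
 * (capacity dst_len wide characters, including the terminator).
 * Each byte that cannot be decoded becomes '?'. Returns the number
 * of wide characters written, not counting the terminator. */
size_t decode_lossy(const char *src, wchar_t *dst, size_t dst_len)
{
    mbstate_t state;    /* private state, not the libc-internal one */
    size_t remaining = strlen(src);
    size_t n = 0;

    assert(dst_len > 0);
    memset(&state, 0, sizeof state);

    while (remaining > 0 && n + 1 < dst_len) {
        wchar_t wc;
        size_t used = mbrtowc(&wc, src, remaining, &state);

        if (used == (size_t)-1 || used == (size_t)-2) {
            /* Invalid or truncated sequence: emit a marker, skip one
             * byte, and reset the state so later input still decodes. */
            wc = L'?';
            used = 1;
            memset(&state, 0, sizeof state);
        } else if (used == 0) {
            break;      /* decoded a null character; stop here */
        }
        dst[n++] = wc;
        src += used;
        remaining -= used;
    }
    dst[n] = L'\0';
    return n;
}
```

A caller would first call use_utf8_locale() and then pass the raw line to decode_lossy(). With the reproducer bytes from the report, the two invalid bytes should each turn into a question mark while the text after them is decoded normally, which is the behaviour the patch aims for.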

On Sun, May 15, 2016 at 11:54:00PM +0200, Christoph Biedl wrote:
> Package: multitail
> Version: 6.4.2-1
> Severity: normal
>
> Dear Maintainer,
>
> if the locale settings are somewhat utf-8-ish, illegal utf-8 character
> sequences break any future multitail output until the program is
> terminated. Instead of the actual text, the string "^`" is printed for
> each character, inverted.
>
>
> Reproducer:
>
>     printf '\xf8\xa7\nhello\n' | LANG=en_US.UTF-8 multitail -j
>
>
> Analysis:
>
> The mbsrtowcs invocation in mt.c might fail. The program does not catch
> this but uses the pre-defined value for wcur which is the null byte.
> The documentation states the value is undefined but apparently it's
> just left untouched. That null value is considered a control character
> and the code to print that generates the observed sequence (it should
> rather be "^@", different story).
>
> The surprising thing is any future invocations to mbsrtowcs will fail,
> too. Apparently the internal state is distorted in a way it cannot
> recover even when processing sound input. This might be an issue in
> glibc, I've filed #824429
>
>
> Workaround:
>
> Set LANG to C before calling multitail.
>
>
> Possible fixes:
>
> - Set LC_CTYPE to "C" instead of "" in main. Unless this has other
>   side effects.
> - Instead of using the internal state, provide one on our own,
>   and reset it upon error, see below.
>
> Either way it's a good idea to set wcur to the question mark to
> indicate a character conversion error. More elaborate options would be
> printing it inverted and/or using the replacement character U+FFFD for
> this. Figuring out side effects is left as an exercise to the reader.
>
>     Christoph
>
> --- a/mt.c
> +++ b/mt.c
> @@ -617,8 +617,12 @@ void do_color_print(proginfo *cur, char *use_string, int
> prt_start, int prt_end,
>
>  #ifdef UTF8_SUPPORT
>                 const char *dummy = &use_string[offset];
> -               wchar_t wcur = 0;
> -               mbsrtowcs(&wcur, &dummy, 1, NULL);
> +               wchar_t wcur;
> +               static mbstate_t state;
> +               if (mbsrtowcs(&wcur, &dummy, 1, &state) == -1) {
> +                       wcur = '?';
> +                       memset (&state, '\0', sizeof (mbstate_t));
> +               }
>  #else
>                 char wcur = use_string[offset];
>  #endif
>
>
> -- System Information:
> Debian Release: stretch/sid
>   APT prefers testing
>   APT policy: (500, 'testing')
> Architecture: amd64 (x86_64)
>
> Kernel: Linux 4.4.9 (SMP w/4 CPU cores)
> Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)
> Shell: /bin/sh linked to /bin/dash
> Init: unable to detect
>
> Versions of packages multitail depends on:
> ii  libc6         2.22-7
> ii  libncursesw5  6.0+20160319-1
> ii  libtinfo5     6.0+20160319-1
>
> multitail recommends no packages.
>
> multitail suggests no packages.
>
> -- no debconf information

Folkert van Heusden

-- 
Multitail is a tool for viewing log files and/or following the output
of running commands. Filtering, keyword coloring, merging, diff-view,
etc. http://www.vanheusden.com/multitail/
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com