Hi,

Ted Unangst wrote on Tue, Jul 11, 2017 at 09:41:36AM -0400:

> and it always sucks to lose information if somebody went to the
> trouble of recording the necessary accents already.

So here is a patch that makes putting UTF-8 characters into
fortune/datfiles safe.

Of course, we cannot protect people who go out of their way to
harm themselves.  In this case, that would require first requesting
LC_CTYPE=en_US.UTF-8 in the environment and then starting an xterm(1)
with the +u8 option, which specifically instructs it to ignore the
environment, to not be UTF-8 capable, and to interpret certain bytes
as terminal control sequences regardless.  If people do that, lots
of more important programs already become unsafe, as Stuart rightly
observed (well, kind of), up to and including basic tools like ls(1)
and ps(1).

Note that unlike in /usr/src/usr.bin/ssh/utf8.c, inspecting
nl_langinfo(3) is not required to distinguish locales.  On OpenBSD,
inspecting MB_CUR_MAX is sufficient (and KISS).  We don't need to
worry about giving the "portable fortune" team headaches when
porting to Linux and Solaris, right?  :)
Even with nl_langinfo(3), fortune(6) wouldn't be portable once
we put UTF-8 into the datfiles - that would require iconv(3)
overkill.

OK?
  Ingo


Index: fortune/fortune.6
===================================================================
RCS file: /cvs/src/games/fortune/fortune/fortune.6,v
retrieving revision 1.14
diff -u -p -r1.14 fortune.6
--- fortune/fortune.6	25 Sep 2015 17:37:23 -0000	1.14
+++ fortune/fortune.6	11 Jul 2017 17:21:52 -0000
@@ -177,6 +177,17 @@ the source code and a manual page for th
 can be found in
 .Pa /usr/src/games/fortune/strfile/ ,
 if it exists.
+.Sh ENVIRONMENT
+.Bl -tag -width LC_CTYPE
+.It Ev LC_CTYPE
+The character encoding
+.Xr locale 1 .
+If unset or set to
+.Qq C ,
+.Qq POSIX ,
+or an unsupported value, bytes that are not printable ASCII characters
+are replaced with question marks in the output.
+.El
 .Sh FILES
 .Bl -tag -width "/usr/share/games/fortune/*XX" -compact
 .It Pa /usr/share/games/fortune/*
Index: fortune/fortune.c
===================================================================
RCS file: /cvs/src/games/fortune/fortune/fortune.c,v
retrieving revision 1.58
diff -u -p -r1.58 fortune.c
--- fortune/fortune.c	30 Jun 2017 08:39:16 -0000	1.58
+++ fortune/fortune.c	11 Jul 2017 17:21:53 -0000
@@ -41,6 +41,7 @@
 #include <err.h>
 #include <fcntl.h>
 #include <limits.h>
+#include <locale.h>
 #include <stdbool.h>
 #include <stdio.h>
 #include <stdlib.h>
@@ -133,6 +134,7 @@ FILEDESC *
 void	 print_file_list(void);
 void	 print_list(FILEDESC *, int);
 void	 rot13(char *, size_t);
+void	 sanitize(unsigned char *cp);
 void	 sum_noprobs(FILEDESC *);
 void	 sum_tbl(STRFILE *, STRFILE *);
 __dead void	 usage(void);
@@ -148,6 +150,8 @@ regex_t regex;
 int
 main(int ac, char *av[])
 {
+	setlocale(LC_CTYPE, "");
+
 	if (pledge("stdio rpath", NULL) == -1) {
 		perror("pledge");
 		return 1;
@@ -192,6 +196,16 @@ rot13(char *p, size_t len)
 }
 
 void
+sanitize(unsigned char *cp)
+{
+	if (MB_CUR_MAX > 1)
+		return;
+	for (; *cp != '\0'; cp++)
+		if (!isprint(*cp) && !isspace(*cp))
+			*cp = '?';
+}
+
+void
 display(FILEDESC *fp)
 {
 	char	line[BUFSIZ];
@@ -202,6 +216,7 @@ display(FILEDESC *fp)
 	    !STR_ENDSTRING(line, fp->tbl); Fort_len++) {
 		if (fp->tbl.str_flags & STR_ROTATED)
 			rot13(line, strlen(line));
+		sanitize(line);
 		fputs(line, stdout);
 	}
 	(void) fflush(stdout);
@@ -1189,6 +1204,7 @@ matches_in_list(FILEDESC *list)
 					in_file = 1;
 				}
 				putchar('\n');
+				sanitize(Fortbuf);
 				(void) fwrite(Fortbuf, 1, (sp - Fortbuf), stdout);
 			}
 			sp = Fortbuf;
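
For anyone who wants to try the MB_CUR_MAX check outside of fortune(6),
here is a minimal standalone sketch.  sanitize() repeats the logic from
the patch above; sanitize_in_locale() is a hypothetical helper that is
not part of the patch and only exists to make the locale explicit.  It
assumes that the "C" locale is single-byte, i.e. MB_CUR_MAX == 1, as it
is on OpenBSD.

```c
#include <ctype.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/*
 * Same logic as sanitize() in the patch: in a single-byte locale
 * (MB_CUR_MAX == 1), overwrite every byte that is neither printable
 * nor whitespace with '?'.  In a multibyte locale, the string is
 * left untouched.
 */
static void
sanitize(unsigned char *cp)
{
	if (MB_CUR_MAX > 1)
		return;
	for (; *cp != '\0'; cp++)
		if (!isprint(*cp) && !isspace(*cp))
			*cp = '?';
}

/*
 * Hypothetical helper, not part of the patch: select the given
 * LC_CTYPE locale, sanitize buf in place, and return buf.
 * Returns NULL if the locale cannot be selected.
 */
static char *
sanitize_in_locale(const char *locale, char *buf)
{
	if (setlocale(LC_CTYPE, locale) == NULL)
		return NULL;
	sanitize((unsigned char *)buf);
	return buf;
}
```

Called with the "C" locale on the UTF-8 bytes of "naïve\n" (the ï being
0xc3 0xaf), this turns the buffer into "na??ve\n": both bytes of the
accented character are replaced, while the trailing newline survives
because of the isspace(3) test.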