Ingo Schwarze wrote:
> Hi Christian & Michael,
>
> Michael McConville wrote on Thu, Dec 24, 2015 at 04:19:03PM -0500:
> > Christian Heckendorf wrote:
>
> >> A couple of somewhat recent changes in NetBSD's libedit permit
> >> el_gets(3) to accept multibyte characters if the locale supports
> >> it.
>
> Ugh. The amount of indirection in that code is disturbing, and the
> amount of contortion is disgusting. Such stuff is highly error
> prone, in particular since the manual is way below OpenBSD standards
> (functions mentioned in the SYNOPSIS but not actually documented,
> vague statements, confusion regarding bytes versus characters, ...).
> Besides, it's not completely clear which parts of the interface are
> public and which are internal to the library.
>
> In the vicinity of this particular diff: The IGNORE_EXTCHARS flag
> appears to be private to the library, the users seems to have no
> way to change it. Otherwise, the existing code in el_getc(3)
> would already be broken because it clears the flag on exit even
> if it was set on entry. But as an internal flag, it's completely
> pointless. If CHARSET_IS_UTF8 is set, the present diff makes
> sure it is never set. If CHARSET_IS_UTF8 is not set, it has no
> effect because the only place where it is used also checks "bytes > 1",
> which cannot happen in the C locale.
>
> But if we want to stay in sync with upstream, freely borrowing Bob's
> whale flensing knife may not be the best idea.
>
> Also note that el_gets(3) is documented to return the number of
> characters read, but actually, callers assume it returns the
> number of *bytes* read, so what your diff is doing makes sense.
>
> Michael, as you already looked at NetBSD, is there a documentation
> update to go with this diff?
>

I'll jump in here. As far as I can tell, the man entry for this
function describing the value of count hasn't been touched since the
original commit for the document in 1997 (before the wide variants
existed). I also found the documentation to be a source of confusion.
Perhaps a change like this would help to clarify at least el_gets(3)
and el_wgets(3).

-- Christian

Index: editline.3
===================================================================
RCS file: /cvs/src/lib/libedit/editline.3,v
retrieving revision 1.39
diff -u -p -r1.39 editline.3
--- editline.3	14 Sep 2015 13:45:25 -0000	1.39
+++ editline.3	30 Dec 2015 20:07:59 -0000
@@ -208,7 +208,7 @@ state.
 .It Fn el_gets
 Read a line from the tty.
 .Fa count
-is modified to contain the number of characters read.
+is modified to contain the number of bytes read.
 Returns the line read if successful, or
 .Dv NULL
 if no characters were read or if an error occurred.
@@ -220,6 +220,13 @@ contains the error code that caused it.
 The return value may not remain valid across calls to
 .Fn el_gets
 and must be copied if the data is to be retained.
+.It Fn el_wgets
+Behaves the same way as
+.Fn el_gets
+except that
+.Fa count
+is modified to contain the number of wide characters read, rather
+than bytes.
 .It Fn el_getc
 Read a character from the tty.
 .Fa ch
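
For anyone skimming the thread, here is a rough sketch of why the
wording matters for callers. It does not call libedit at all; it only
assumes the buffer holds valid UTF-8, as el_gets(3) would hand back in
a UTF-8 locale, and shows that the byte count and the character count
of the same line differ. The helper name utf8_chars() is made up for
this illustration and is not part of any library.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of libedit: count the characters in
 * the first nbytes bytes of s, assuming s is valid UTF-8.
 * Continuation bytes have the bit pattern 10xxxxxx, so every byte
 * that does not match that pattern starts a new character.
 */
static size_t
utf8_chars(const char *s, size_t nbytes)
{
	size_t i, n = 0;

	for (i = 0; i < nbytes; i++)
		if (((unsigned char)s[i] & 0xC0) != 0x80)
			n++;
	return n;
}
```

With a line such as "h" "\xc3\xa9" "llo\n" (an e with acute accent
encoded as two bytes), strlen() gives 7 while utf8_chars() gives 6. A
caller that treats count as a byte length, for example to copy the
buffer, needs the former, which is what the changed wording for
el_gets now says.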