Ingo Schwarze wrote:

> Hi Christian & Michael,
> 
> Michael McConville wrote on Thu, Dec 24, 2015 at 04:19:03PM -0500:
> > Christian Heckendorf wrote:
> 
> >> A couple of somewhat recent changes in NetBSD's libedit permit
> >> el_gets(3) to accept multibyte characters if the locale supports
> >> it.
> 
> Ugh.  The amount of indirection in that code is disturbing, and the
> amount of contortion is disgusting.  Such stuff is highly error
> prone, in particular since the manual is way below OpenBSD standards
> (functions mentioned in the SYNOPSIS but not actually documented,
> vague statements, confusion regarding bytes versus characters, ...).
> Besides, it's not completely clear which parts of the interface are
> public and which are internal to the library.
> 
> In the vicinity of this particular diff:  The IGNORE_EXTCHARS flag
> appears to be private to the library; the user seems to have no
> way to change it.  Otherwise, the existing code in el_getc(3)
> would already be broken because it clears the flag on exit even
> if it was set on entry.  But as an internal flag, it's completely
> pointless.  If CHARSET_IS_UTF8 is set, the present diff makes
> sure it is never set.  If CHARSET_IS_UTF8 is not set, it has no
> effect because the only place where it is used also checks "bytes > 1",
> which cannot happen in the C locale.
> 
> But if we want to stay in sync with upstream, freely borrowing Bob's
> whale flensing knife may not be the best idea.
> 
> Also note that el_gets(3) is documented to return the number of
> characters read, but actually, callers assume it returns the
> number of *bytes* read, so what your diff is doing makes sense.
> 
> Michael, as you already looked at NetBSD, is there a documentation
> update to go with this diff?
> 

I'll jump in here. As far as I can tell, the manual's description
of the count argument to el_gets(3) hasn't been touched since the
manual was first committed in 1997 (before the wide variants
existed). I also found the documentation confusing.
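
To make the bytes-versus-characters distinction concrete, here is a
minimal standalone sketch. It is not libedit code: utf8_nchars() is a
made-up helper, and it assumes the buffer holds valid UTF-8. On a
platform where wchar_t holds a full code point, I would expect the
byte count to correspond to what callers of el_gets(3) assume and the
character count to what el_wgets(3) would report for the same input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of libedit: count the characters in
 * a UTF-8 buffer of nbytes bytes by skipping continuation bytes
 * (those of the form 10xxxxxx).  Assumes the input is valid UTF-8.
 */
size_t
utf8_nchars(const char *s, size_t nbytes)
{
	size_t i, n = 0;

	assert(s != NULL);
	for (i = 0; i < nbytes; i++)
		if (((unsigned char)s[i] & 0xC0) != 0x80)
			n++;	/* lead byte or ASCII: starts a new character */
	return n;
}
```

For the line "na\xc3\xafve" (the word "naive" spelled with an
i-diaeresis), strlen() gives 6 bytes while utf8_nchars() gives 5
characters, so the two descriptions of count only agree for pure
ASCII input.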

Perhaps a change like this would help to clarify at least el_gets(3)
and el_wgets(3).

--
Christian


Index: editline.3
===================================================================
RCS file: /cvs/src/lib/libedit/editline.3,v
retrieving revision 1.39
diff -u -p -r1.39 editline.3
--- editline.3  14 Sep 2015 13:45:25 -0000      1.39
+++ editline.3  30 Dec 2015 20:07:59 -0000
@@ -208,7 +208,7 @@ state.
 .It Fn el_gets
 Read a line from the tty.
 .Fa count
-is modified to contain the number of characters read.
+is modified to contain the number of bytes read.
 Returns the line read if successful, or
 .Dv NULL
 if no characters were read or if an error occurred.
@@ -220,6 +220,13 @@ contains the error code that caused it.
 The return value may not remain valid across calls to
 .Fn el_gets
 and must be copied if the data is to be retained.
+.It Fn el_wgets
+Behaves the same way as
+.Fn el_gets
+except that
+.Fa count
+is modified to contain the number of wide characters read, rather
+than bytes.
 .It Fn el_getc
 Read a character from the tty.
 .Fa ch
