That is:
- libedit has a wchar_t* buffer (el->el_line.buffer) and el_line calls
ct_encode_string to convert it to a char*.
- ct_encode_string calls wctomb, which it expects to produce UTF-8, but
because setlocale has not been called it actually outputs ASCII.
- el_line then uses ct_enc_width, which assumes UTF-8 and returns 2. So
the offset is adjusted by 2 even though only 1 byte was filled in.
- ftp obviously isn't happy about having a position after a \0, so it
goes boom.
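
To make the mismatch concrete, here is a small standalone sketch (not
libedit code). utf8_width(), locale_width() and cursor_offset() are
names made up for illustration: the first repeats the UTF-8 arithmetic
ct_enc_width does, the second asks wctomb how many bytes it really
writes, and the third sums per-character widths the way the cursor
offset is derived. It assumes the process is still in the default "C"
locale, as ftp is without a setlocale() call.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: width assumed by UTF-8-only arithmetic. */
static size_t
utf8_width(wchar_t wc)
{
	unsigned long c = (unsigned long)wc;

	if (c < 0x80)
		return 1;
	else if (c < 0x0800)
		return 2;
	else if (c < 0x10000)
		return 3;
	else if (c < 0x110000)
		return 4;
	return 0;	/* not a valid codepoint */
}

/*
 * Hypothetical helper: number of bytes wctomb() actually writes in the
 * current locale, or 0 if the character cannot be encoded there.
 */
static size_t
locale_width(wchar_t wc)
{
	char buf[MB_LEN_MAX];
	int n;

	(void)wctomb(NULL, L'\0');	/* reset any shift state */
	n = wctomb(buf, wc);
	assert(n <= (int)sizeof buf);
	return n < 0 ? 0 : (size_t)n;
}

/*
 * Hypothetical helper: byte offset of a cursor sitting after `cursor`
 * wide characters, using the given per-character width function.
 */
static size_t
cursor_offset(const wchar_t *buf, size_t cursor, size_t (*width)(wchar_t))
{
	size_t i, off = 0;

	for (i = 0; i < cursor; i++)
		off += width(buf[i]);
	return off;
}
```

For a line such as "Open" followed by U+00BA with the cursor at the
end, cursor_offset(line, 5, utf8_width) gives 6, while the same call
with locale_width cannot exceed 5 in the C locale because MB_CUR_MAX is
1 there. What wctomb does with U+00BA in the C locale (one byte, or an
error) differs between libcs, which is one more reason to trust its
return value rather than a precomputed width.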

The setlocale() change below will only fix the problem if LC_CTYPE or
LC_ALL is set to a UTF-8 locale. ftp still cores if pasting UTF-8 in
the C locale.

I think the right fix is for libedit to use the return value of wctomb
to adjust the offset rather than assuming UTF-8 and working out the
width itself.

Perhaps something like this (very lightly tested):

Index: chartype.c
===================================================================
RCS file: /cvs/src/lib/libedit/chartype.c,v
retrieving revision 1.4
diff -u -p -r1.4 chartype.c
--- chartype.c 17 Nov 2011 20:14:24 -0000 1.4
+++ chartype.c 31 Oct 2012 00:13:12 -0000
@@ -44,6 +44,8 @@
#define CT_BUFSIZ 1024
#ifdef WIDECHAR
+protected ssize_t ct_encode_char1(char *, size_t, Char);
+
protected void
ct_conv_buff_resize(ct_buffer_t *conv, size_t mincsize, size_t minwsize)
{
@@ -178,27 +180,25 @@ ct_decode_argv(int argc, const char *arg
protected size_t
ct_enc_width(Char c)
{
- /* UTF-8 encoding specific values */
- if (c < 0x80)
- return 1;
- else if (c < 0x0800)
- return 2;
- else if (c < 0x10000)
- return 3;
- else if (c < 0x110000)
- return 4;
- else
- return 0; /* not a valid codepoint */
+ char s[MB_CUR_MAX];
+
+ return ct_encode_char1(s, sizeof s, c);
}
protected ssize_t
ct_encode_char(char *dst, size_t len, Char c)
{
- ssize_t l = 0;
if (len < ct_enc_width(c))
return -1;
- l = ct_wctomb(dst, c);
+ return ct_encode_char1(dst, len, c);
+}
+protected ssize_t
+ct_encode_char1(char *dst, size_t len, Char c)
+{
+ ssize_t l = 0;
+
+ l = ct_wctomb(dst, c);
if (l < 0) {
ct_wctomb_reset;
l = 0;
On Tue, Oct 30, 2012 at 11:56:18PM +0000, Nicholas Marriott wrote:
> Hi
>
> The buffer isn't zero-terminated, it's the result of calling wctomb to
> convert the internal wchar_t* that libedit has into a char*.
>
> libedit works out the offset in el_line with ct_enc_width which rather
> foolishly makes the assumption that wctomb will convert to UTF-8, but
> ftp doesn't call setlocale so it just leaves it as ASCII.
>
> Try this:
>
> Index: main.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/ftp/main.c,v
> retrieving revision 1.85
> diff -u -p -r1.85 main.c
> --- main.c 26 Aug 2012 02:16:02 -0000 1.85
> +++ main.c 30 Oct 2012 23:52:34 -0000
> @@ -67,6 +67,7 @@
>
> #include <ctype.h>
> #include <err.h>
> +#include <locale.h>
> #include <netdb.h>
> #include <pwd.h>
> #include <stdio.h>
> @@ -90,6 +91,8 @@ main(volatile int argc, char *argv[])
> char *outfile = NULL;
> const char *errstr;
> int dumb_terminal = 0;
> +
> + setlocale(LC_CTYPE, "");
>
> ftpport = "ftp";
> httpport = "http";
>
> On Tue, Oct 30, 2012 at 10:31:16PM +0100, Otto Moerbeek wrote:
> > On Tue, Oct 30, 2012 at 10:17:12PM +0100, Otto Moerbeek wrote:
> >
> > > On Tue, Oct 30, 2012 at 08:59:27PM +0100, Juan Francisco Cantero Hurtado
> > > wrote:
> > >
> > > > On Tue, Oct 30, 2012 at 09:31:58AM +0100, Otto Moerbeek wrote:
> > > > > On Mon, Oct 29, 2012 at 06:43:13PM +0100, Juan Francisco Cantero
> > > > > Hurtado wrote:
> > > > >
> > > > > > > Chris Cappuccio sent me a mail saying he can't see the characters,
> > > > > > > only a question mark.
> > > > > > >
> > > > > > > I'm linking each character to its Wikipedia page, so you can
> > > > > > > copy-paste the character.
> > > > > >
> > > > > > On Thu, Oct 25, 2012 at 05:07:34AM +0200, Juan Francisco Cantero
> > > > > > Hurtado wrote:
> > > > > > > > This afternoon I was downloading a tarball from an OpenBSD mirror.
> > > > > > > > I pressed the "º" key and then the tab key. ftp crashed with a
> > > > > > > > segfault.
> > > > >
> > > > > Please also include your environment settings. It is likely locale
> > > > > plays a role here.
> > > > >
> > > > > At least env | grep LC
> > > > >
> > > >
> > > > I've tried the bug in amd64 without locales and also with
> > > > LC_TIME="es_ES.ISO8859-1" LC_CTYPE="en_US.UTF-8".
> > > >
> > > > The i386 system was a clean installation in a virtual machine.
> > >
> > > I can now reproduce using a terminal that accepts more than just low
> > > ascii.
> > >
> > > What I see is that when complete() is called the cursor position in
> > > the EditLine struct is not what it is supposed to be: it points a
> > > couple of bytes beyond the terminating NUL while it is supposed to
> > > point to the NUL. That causes confusion in the scanner, which gets
> > > the argument list count wrong.
> >
> > Ehh, the buffer is not NUL terminated, but the observation still
> > holds: the cursor position is a couple of bytes further than it
> > should be.
> >
> > >
> > > The root of the problem seems to be inside the editline lib.
> > >
> > > Cc:ing nicm@, maybe he has a clue
> > >
> > > -Otto
> > >
> > >
> > > >
> > > > >
> > > > > > https://en.wikipedia.org/wiki/%C2%BA
> > > > > > >
> > > > > > > > Steps to reproduce:
> > > > > > > # ftp ftp.fr.openbsd.org
> > > > > > > user and password
> > > > > > > ascii art
> > > > > > > > ftp> cd pub/Openº <- Here press the tab key
> > > > > > https://en.wikipedia.org/wiki/%C2%BA
> > > > > > > segmentation fault (core dumped) ftp ftp.fr.openbsd.org
> > > > > > >
> > > > > > > > It also crashes with the letters "Á" and "Ñ".
> > > > > > https://en.wikipedia.org/wiki/%C3%81
> > > > > > https://en.wikipedia.org/wiki/%C3%91
> > > > > > >
> > > > > > > Tested in:
> > > > > > > > - A snapshot from yesterday. i386. root account. console/ksh
> > > > > > > >   without locales.
> > > > > > > > - A snapshot from a few days ago. amd64. user. urxvt/zsh with
> > > > > > > >   utf8 locales.
> > > > > > >
> > > > > > > > I also tested the bug in a remote session with OpenBSD 4.7 and
> > > > > > > > ftp works without problems.
> > > > > > > >
> > > > > > > > I've updated the code of usr.bin/ftp to 2012-10-01 and
> > > > > > > > 2012-01-01 and tried both versions. ftp also crashes.
> > > > > > >
> > > > > > > Backtrace:
> > > > > > > Thread 1 (process 3436):
> > > > > > > #0 memcpy (dst0=0x9d4160, src0=Variable "src0" is not available.
> > > > > > > ) at /usr/src/lib/libc/string/bcopy.c:115
> > > > > > > #1 0x000000000040432b in complete (el=Variable "el" is not
> > > > > > > available.
> > > > > > > ) at /usr/src/usr.bin/ftp/complete.c:313
> > > > > > > #2 0x000000000041eb84 in el_wgets (el=0x20da64800,
> > > > > > > nread=0x7f7ffffe3ebc) at read.c:612
> > > > > > > #3 0x000000000041ef8d in el_gets (el=0x20da64800, nread=Variable
> > > > > > > "nread" is not available.
> > > > > > > ) at eln.c:78
> > > > > > > #4 0x000000000040e55f in cmdscanner (top=Variable "top" is not
> > > > > > > available.
> > > > > > > ) at /usr/src/usr.bin/ftp/main.c:465
> > > > > > > #5 0x000000000040eb7c in main (argc=1, argv=0x7f7ffffe4398) at
> > > > > > > /usr/src/usr.bin/ftp/main.c:369
> > > > > > >
> > > > > > > > Let me know if you need more info or whatever :)
> > > > > > >
> > > > > > > Cheers.
> > > > > > >
> > > > > >
> > > >
> > > > --
> > > > Juan Francisco Cantero Hurtado http://juanfra.info