Re: Fix handling of ed addresses consisting only of separators
Hi, I've been using this patch for the past 6 months and haven't noticed any regressions. However, during this timespan I became aware of additional POSIX compatibility issues in OpenBSD ed: `3 2p` as well as addresses with interleaved `,` and `;` separators (e.g. `1,;`) are also evaluated incorrectly. These additional compatibility issues are unrelated to the issue fixed by the proposed patch. Is there a general interest in improving the POSIX compatibility of OpenBSD ed? If so, what would need to be done in order to get the proposed patch committed? Greetings, Sören Sören Tempel wrote: > PING. > > I have executed the ed tests (available in bin/ed/test) with this patch > applied and the proposed patch doesn't seem to introduce any new > regressions. > > If the patch needs to be revised further, please let me know. The only > thing I can think of right now is that it might make sense to rename the > `recur` variable to `sawseparator` or something along those lines. > > Greetings, > Sören > > Sören Tempel wrote: > > Hello, > > > > The POSIX specification of ed(1) includes a table in the rationale > > section which (among others) mandates the following address handling > > rules [1]: > > > > Address Addr1 Addr2 > > ,, $ $ > > ;; $ $ > > > > Unfortunately, OpenBSD does not correctly handle these two example > > addresses. As an example, consider the following `,,p` print command: > > > > printf "a\nfoo\nbar\nbaz\n.\n,,p\nQ\n" | ed > > > > This should only print the last line (as `,,p` should expand to `$,$p`) > > but instead prints all lines with OpenBSD ed. This seems to be a > > regression introduced by Jerome Frgagic in 2017 [2]. Prior to this > > change, `,,p` as well as `;;p` correctly printed the last line. The > > aforementioned change added a recursive invocation of next_addr() which > > does not update the local first variable (causing the second address > > separator to be mistaken for the first). The patch below fixes this by > > tracking recursive invocations of next_addr() via an additional function > > parameter. > > > > See also: https://austingroupbugs.net/view.php?id=1582 > > > > [1]: > > https://pubs.opengroup.org/onlinepubs/9699919799/utilities/ed.html#tag_20_38_18 > > [2]: > > https://github.com/openbsd/src/commit/b4c905bbf3932a50011ce3436b1e22acc742cfeb > > > > diff --git bin/ed/main.c bin/ed/main.c > > index 231d021ef19..ca1aeb102b3 100644 > > --- bin/ed/main.c > > +++ bin/ed/main.c > > @@ -64,7 +64,7 @@ void signal_hup(int); > > void signal_int(int); > > void handle_winch(int); > > > > -static int next_addr(void); > > +static int next_addr(int); > > static int check_addr_range(int, int); > > static int get_matching_node_addr(regex_t *, int); > > static char *get_filename(int); > > @@ -296,7 +296,7 @@ extract_addr_range(void) > > > > addr_cnt = 0; > > first_addr = second_addr = current_addr; > > - while ((addr = next_addr()) >= 0) { > > + while ((addr = next_addr(0)) >= 0) { > > addr_cnt++; > > first_addr = second_addr; > > second_addr = addr; > > @@ -328,7 +328,7 @@ extract_addr_range(void) > > > > /* next_addr: return the next line address in the command buffer */ > > static int > > -next_addr(void) > > +next_addr(int recur) > > { > > char *hd; > > int addr = current_addr; > > @@ -382,11 +382,15 @@ next_addr(void) > > case '%': > > case ',': > > case ';': > > - if (first) { > > + if (first && !recur) { > > ibufp++; > > addr_cnt++; > > second_addr = (c == ';') ? current_addr : 1; > > - if ((addr = next_addr()) < 0) > > + > > + // If the next address is omitted (e.g. "," or > > ";") or > > + // if it is also a separator (e.g. ",," or > > ";;") then > > + // return the last address (as per the omission > > rules). > > + if ((addr = next_addr(1)) < 0) > > addr = addr_last; > > break; > > }
Re: ed: Make use of stderr compliant with POSIX
PING. I think this is a fairly obvious POSIX compliance issue with an easy fix. Sören Tempel wrote: > Hello, > > Currently, the OpenBSD ed implementation incorrectly writes information > to standard error that should be written to standard out (as per POSIX). > > For the read and write command the POSIX standard states the following: > > If the command is successful, the number of bytes written shall > be written to standard output [...] > > However, OpenBSD ed presently writes the number of bytes (for both the > read and the write command) to standard error. Furthermore, the POSIX > standard mandates that the infamous "?\n" error string should also be > written to standard output while OpenBSD ed presently writes the string > to standard error. > > Both cases are fixed by the patch below. > > OK? > > diff --git bin/ed/io.c bin/ed/io.c > index 97306be16..e75bedd8b 100644 > --- bin/ed/io.c > +++ bin/ed/io.c > @@ -64,7 +64,7 @@ read_file(char *fn, int n) > return ERR; > } > if (!scripted) > - fprintf(stderr, "%d\n", size); > + printf("%d\n", size); > return current_addr - n; > } > > @@ -166,7 +166,7 @@ write_file(char *fn, char *mode, int n, int m) > return ERR; > } > if (!scripted) > - fprintf(stderr, "%d\n", size); > + printf("%d\n", size); > return n ? m - n + 1 : 0; > } > > diff --git bin/ed/main.c bin/ed/main.c > index 231d021ef..674741c7c 100644 > --- bin/ed/main.c > +++ bin/ed/main.c > @@ -184,7 +184,7 @@ top: > signal(SIGINT, signal_int); > if (sigsetjmp(env, 1)) { > status = -1; > - fputs("\n?\n", stderr); > + fputs("\n?\n", stdout); > seterrmsg("interrupt"); > } else { > init_buffers(); > @@ -196,7 +196,7 @@ top: > strlcpy(old_filename, *argv, > sizeof old_filename); > } else if (argc) { > - fputs("?\n", stderr); > + fputs("?\n", stdout); > if (**argv == '\0') > seterrmsg("invalid filename"); > if (!interactive) > @@ -215,7 +215,7 @@ top: > continue; > } else if (n == 0) { > if (modified && !scripted) { > - fputs("?\n", stderr); > + fputs("?\n", stdout); > seterrmsg("warning: file modified"); > if (!interactive) { > if (garrulous) > @@ -250,7 +250,7 @@ top: > break; > case EMOD: > modified = 0; > - fputs("?\n", stderr); /* give warning */ > + fputs("?\n", stdout); /* give warning */ > seterrmsg("warning: file modified"); > if (!interactive) { > if (garrulous) > @@ -271,7 +271,7 @@ top: > quit(3); > break; > default: > - fputs("?\n", stderr); > + fputs("?\n", stdout); > if (!interactive) { > if (garrulous) > fprintf(stderr,
ed: Make use of stderr compliant with POSIX
Hello, Currently, the OpenBSD ed implementation incorrectly writes information to standard error that should be written to standard out (as per POSIX). For the read and write command the POSIX standard states the following: If the command is successful, the number of bytes written shall be written to standard output [...] However, OpenBSD ed presently writes the number of bytes (for both the read and the write command) to standard error. Furthermore, the POSIX standard mandates that the infamous "?\n" error string should also be written to standard output while OpenBSD ed presently writes the string to standard error. Both cases are fixed by the patch below. OK? diff --git bin/ed/io.c bin/ed/io.c index 97306be16..e75bedd8b 100644 --- bin/ed/io.c +++ bin/ed/io.c @@ -64,7 +64,7 @@ read_file(char *fn, int n) return ERR; } if (!scripted) - fprintf(stderr, "%d\n", size); + printf("%d\n", size); return current_addr - n; } @@ -166,7 +166,7 @@ write_file(char *fn, char *mode, int n, int m) return ERR; } if (!scripted) - fprintf(stderr, "%d\n", size); + printf("%d\n", size); return n ? m - n + 1 : 0; } diff --git bin/ed/main.c bin/ed/main.c index 231d021ef..674741c7c 100644 --- bin/ed/main.c +++ bin/ed/main.c @@ -184,7 +184,7 @@ top: signal(SIGINT, signal_int); if (sigsetjmp(env, 1)) { status = -1; - fputs("\n?\n", stderr); + fputs("\n?\n", stdout); seterrmsg("interrupt"); } else { init_buffers(); @@ -196,7 +196,7 @@ top: strlcpy(old_filename, *argv, sizeof old_filename); } else if (argc) { - fputs("?\n", stderr); + fputs("?\n", stdout); if (**argv == '\0') seterrmsg("invalid filename"); if (!interactive) @@ -215,7 +215,7 @@ top: continue; } else if (n == 0) { if (modified && !scripted) { - fputs("?\n", stderr); + fputs("?\n", stdout); seterrmsg("warning: file modified"); if (!interactive) { if (garrulous) @@ -250,7 +250,7 @@ top: break; case EMOD: modified = 0; - fputs("?\n", stderr); /* give warning */ + fputs("?\n", stdout); /* give warning */ seterrmsg("warning: file modified"); if (!interactive) { if (garrulous) @@ -271,7 +271,7 @@ top: quit(3); break; default: - fputs("?\n", stderr); + fputs("?\n", stdout); if (!interactive) { if (garrulous) fprintf(stderr,
Re: Fix handling of ed addresses consisting only of separators
PING. I have executed the ed tests (available in bin/ed/test) with this patch applied and the proposed patch doesn't seem to introduce any new regressions. If the patch needs to be revised further, please let me know. The only thing I can think of right now is that it might make sense to rename the `recur` variable to `sawseparator` or something along those lines. Greetings, Sören Sören Tempel wrote: > Hello, > > The POSIX specification of ed(1) includes a table in the rationale > section which (among others) mandates the following address handling > rules [1]: > >Address Addr1 Addr2 >,, $ $ >;; $ $ > > Unfortunately, OpenBSD does not correctly handle these two example > addresses. As an example, consider the following `,,p` print command: > > printf "a\nfoo\nbar\nbaz\n.\n,,p\nQ\n" | ed > > This should only print the last line (as `,,p` should expand to `$,$p`) > but instead prints all lines with OpenBSD ed. This seems to be a > regression introduced by Jerome Frgagic in 2017 [2]. Prior to this > change, `,,p` as well as `;;p` correctly printed the last line. The > aforementioned change added a recursive invocation of next_addr() which > does not update the local first variable (causing the second address > separator to be mistaken for the first). The patch below fixes this by > tracking recursive invocations of next_addr() via an additional function > parameter. > > See also: https://austingroupbugs.net/view.php?id=1582 > > [1]: > https://pubs.opengroup.org/onlinepubs/9699919799/utilities/ed.html#tag_20_38_18 > [2]: > https://github.com/openbsd/src/commit/b4c905bbf3932a50011ce3436b1e22acc742cfeb > > diff --git bin/ed/main.c bin/ed/main.c > index 231d021ef19..ca1aeb102b3 100644 > --- bin/ed/main.c > +++ bin/ed/main.c > @@ -64,7 +64,7 @@ void signal_hup(int); > void signal_int(int); > void handle_winch(int); > > -static int next_addr(void); > +static int next_addr(int); > static int check_addr_range(int, int); > static int get_matching_node_addr(regex_t *, int); > static char *get_filename(int); > @@ -296,7 +296,7 @@ extract_addr_range(void) > > addr_cnt = 0; > first_addr = second_addr = current_addr; > - while ((addr = next_addr()) >= 0) { > + while ((addr = next_addr(0)) >= 0) { > addr_cnt++; > first_addr = second_addr; > second_addr = addr; > @@ -328,7 +328,7 @@ extract_addr_range(void) > > /* next_addr: return the next line address in the command buffer */ > static int > -next_addr(void) > +next_addr(int recur) > { > char *hd; > int addr = current_addr; > @@ -382,11 +382,15 @@ next_addr(void) > case '%': > case ',': > case ';': > - if (first) { > + if (first && !recur) { > ibufp++; > addr_cnt++; > second_addr = (c == ';') ? current_addr : 1; > - if ((addr = next_addr()) < 0) > + > + // If the next address is omitted (e.g. "," or > ";") or > + // if it is also a separator (e.g. ",," or > ";;") then > + // return the last address (as per the omission > rules). > + if ((addr = next_addr(1)) < 0) > addr = addr_last; > break; > }
Fix handling of ed addresses consisting only of separators
Hello, The POSIX specification of ed(1) includes a table in the rationale section which (among others) mandates the following address handling rules [1]: Address Addr1 Addr2 ,, $ $ ;; $ $ Unfortunately, OpenBSD does not correctly handle these two example addresses. As an example, consider the following `,,p` print command: printf "a\nfoo\nbar\nbaz\n.\n,,p\nQ\n" | ed This should only print the last line (as `,,p` should expand to `$,$p`) but instead prints all lines with OpenBSD ed. This seems to be a regression introduced by Jerome Frgagic in 2017 [2]. Prior to this change, `,,p` as well as `;;p` correctly printed the last line. The aforementioned change added a recursive invocation of next_addr() which does not update the local first variable (causing the second address separator to be mistaken for the first). The patch below fixes this by tracking recursive invocations of next_addr() via an additional function parameter. See also: https://austingroupbugs.net/view.php?id=1582 [1]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/ed.html#tag_20_38_18 [2]: https://github.com/openbsd/src/commit/b4c905bbf3932a50011ce3436b1e22acc742cfeb diff --git bin/ed/main.c bin/ed/main.c index 231d021ef19..ca1aeb102b3 100644 --- bin/ed/main.c +++ bin/ed/main.c @@ -64,7 +64,7 @@ void signal_hup(int); void signal_int(int); void handle_winch(int); -static int next_addr(void); +static int next_addr(int); static int check_addr_range(int, int); static int get_matching_node_addr(regex_t *, int); static char *get_filename(int); @@ -296,7 +296,7 @@ extract_addr_range(void) addr_cnt = 0; first_addr = second_addr = current_addr; - while ((addr = next_addr()) >= 0) { + while ((addr = next_addr(0)) >= 0) { addr_cnt++; first_addr = second_addr; second_addr = addr; @@ -328,7 +328,7 @@ extract_addr_range(void) /* next_addr: return the next line address in the command buffer */ static int -next_addr(void) +next_addr(int recur) { char *hd; int addr = current_addr; @@ -382,11 +382,15 @@ next_addr(void) case '%': case ',': case ';': - if (first) { + if (first && !recur) { ibufp++; addr_cnt++; second_addr = (c == ';') ? current_addr : 1; - if ((addr = next_addr()) < 0) + + // If the next address is omitted (e.g. "," or ";") or + // if it is also a separator (e.g. ",," or ";;") then + // return the last address (as per the omission rules). + if ((addr = next_addr(1)) < 0) addr = addr_last; break; }
[PATCH] Fix ed shell command when stdout isn't line-buffered
Hello, The ed shell escape command currently behaves incorrectly if standard output isn't line-buffered. As an example, consider the following ed(1) invocations: $ cat /tmp/ed-cmd !echo foo !! $ ed < /tmp/ed-cmd foo ! echo foo foo ! $ ed < /tmp/ed-cmd > /tmp/ed-out.txt $ cat /tmp/ed-out.txt foo foo ! echo foo ! The output of the first and second invocation differs in regards to when the modified command-string of the !! command is output. In the first invocation, it is output before the command is executed (which is the behavior mandated by POSIX). However, in the second invocation it is output **after** the command was executed (which is incorrect). This is due to the fact that in the second invocation standard output is a file and hence not line-buffered. Unfortunately, ed(1) does currently not flush all I/O buffers before executing the command using system(3). As such, output of the command will be written to standard output before any data buffered by ed itself is written. The patch below fixes this issue by flushing all open output streams before executing the command using system(3). Alternatively, it may also be sufficient to only flush stdout and (maybe) stderr. This issue was originally discovered in GNU ed where it has been fixed since. See: https://lists.gnu.org/archive/html/bug-ed/2021-12/msg0.html Greetings, Sören Tempel diff --git bin/ed/main.c bin/ed/main.c index dc73dd92b..1e3cd8bdb 100644 --- bin/ed/main.c +++ bin/ed/main.c @@ -852,6 +852,7 @@ exec_command(void) return ERR; GET_COMMAND_SUFFIX(); if (sflags) printf("%s\n", shcmd + 1); + fflush(NULL); /* flush any buffered I/O */ system(shcmd + 1); if (!scripted) printf("!\n"); break;
Re: ksh: add support for bracketed paste mode
',OF_CMDLINE }, { "csh-history", 0,OF_ANY }, /* non-standard */ diff --git sh.h sh.h index 93beef31d46..652a1f6dd06 100644 --- sh.h +++ sh.h @@ -134,6 +134,9 @@ extern const struct option sh_options[]; enum sh_flag { FEXPORT = 0,/* -a: export all */ FBRACEEXPAND, /* enable {} globbing */ +#ifdef EMACS + FBBRACKETPASTE, /* bracketed paste mode */ +#endif FBGNICE,/* bgnice */ FCOMMAND, /* -c: (invocation) execute specified command */ FCSHHISTORY,/* csh-style history enabled */ Sören Tempel wrote: > Hello, > > The diff below adds support for bracketed paste mode to ksh. Bracketed > paste mode is a set of special escape sequences, which are employed by > many terminal emulators to allow programs run inside of them to > distinguish pasted text from typed-in text [0]. This is useful for > preventing pasted text from accidentally executing commands in the > application it was pasted to. Commonly this problem arises when copying > text from a web browser to a shell since the user may have copied > hidden text from the web page which may contain control characters such > as \n [1]. > > Bracketed paste mode is supported by all mainstream terminal emulators > including xterm, urxvt, and gnome-terminal. The implementation proposed > here was tested with urxvt. However, since we cannot determine in > advance whether the utilized terminal emulator supports these escape > sequences the feature must be explicitly activated using the set > builtin. I haven't tested the diff extensively yet but it is rather > simple so I don't expect any breakages. One limitation of the proposed > implementation is that in only works in emacs mode. Would be happy to > address any feedback you might have. > > Greetings, > Sören > > [0]: https://www.xfree86.org/4.7.0/ctlseqs.html#Bracketed%20Paste%20Mode > [1]: https://thejh.net/misc/website-terminal-copy-paste > > diff --git bin/ksh/edit.c bin/ksh/edit.c > index 3089d195d..4b6d45050 100644 > --- bin/ksh/edit.c > +++ bin/ksh/edit.c > @@ -150,12 +150,23 @@ x_puts(const char *s) > shf_putc(*s++, shl_out); > } > > +static void > +x_paste_mode(bool onoff) > +{ > + if (!Flag(FBBRACKETPASTE)) > + return; > + > + printf((onoff) ? BRPASTE_INT : BRPASTE_DEINT); > + fflush(stdout); > +} > + > bool > x_mode(bool onoff) > { > static bool x_cur_mode; > boolprev; > > + x_paste_mode(onoff); > if (x_cur_mode == onoff) > return x_cur_mode; > prev = x_cur_mode; > diff --git bin/ksh/edit.h bin/ksh/edit.h > index 0b604cd64..8cc774f01 100644 > --- bin/ksh/edit.h > +++ bin/ksh/edit.h > @@ -34,6 +34,12 @@ extern X_chars edchars; > #define XCF_FULLPATH BIT(2) /* command completion: store full path */ > #define XCF_COMMAND_FILE (XCF_COMMAND|XCF_FILE) > > +/* https://www.xfree86.org/4.7.0/ctlseqs.html#Bracketed%20Paste%20Mode */ > +#define BRPASTE_INT "\033[?2004h" > +#define BRPASTE_DEINT"\033[?2004l" > +#define BRPASTE_PRE kb_encode("^[[200~") > +#define BRPASTE_POST kb_encode("^[[201~") > + > /* edit.c */ > int x_getc(void); > void x_flush(void); > diff --git bin/ksh/emacs.c bin/ksh/emacs.c > index 694c402ff..96136263e 100644 > --- bin/ksh/emacs.c > +++ bin/ksh/emacs.c > @@ -118,6 +118,7 @@ staticchar*xmp; /* mark pointer */ > static char*killstack[KILLSIZE]; > static int killsp, killtp; > static int x_literal_set; > +static int x_brack_paste; > static int x_arg_set; > static char*macro_args; > static int prompt_skip; > @@ -203,6 +204,8 @@ static intx_fold_lower(int); > static int x_fold_upper(int); > static int x_set_arg(int); > static int x_comment(int); > +static int x_brack_paste_start(int); > +static int x_brack_paste_end(int); > #ifdef DEBUG > static int x_debug_info(int); > #endif > @@ -260,6 +263,8 @@ static const struct x_ftab x_ftab[] = { > { x_fold_upper, "upcase-word", XF_ARG }, > { x_set_arg,"set-arg", XF_NOBIND }, > { x_comment,"comment", 0 }, > + { x_brack_paste_start, "bracketed-paste-start",0 }, > + { x_brack_paste_end,"bracketed-paste-end", 0 }, > { 0, 0, 0 }, > #ifdef DEBUG > { x_debug_info, "debug-info", 0 }, > @@ -316,6 +321,8 @@ x_
Re: Patch: ksh: fix input handling for 4 byte UTF-8 sequences
Ingo Schwarze wrote: > It would, and in principle, that would be an improvement. > But i think editline(3) code quality is insufficent for use in a > shell. It's all quite messy and hastily and sloppily written. > I tried to polish some of it in the past, but got distracted, > so editline(3) code is still full of stuff that needs review > and quality improvement. > > Actually, i'm a bit scared that sftp(1) uses it. Then again, i'm not > aware that it caused any major vulnerabilities in the past, and the > OpenSSH developers are not at all reckless people, so i am sure they > know what they are doing. Coincidentally, I actually found two non-critical out-of-bounds memory accesses in the NetBSD version of editline two years ago [1] [2]. Quickly checking the OpenBSD code, it seems the associated fixes haven't made their way into the OpenBSD editline version yet. Not sure if/how this is normally done but maybe it makes sense to (partially) sync the OpenBSD version with the NetBSD one? The latter has accumulated a few general improvements over the past years. Either way, I do understand your resentment towards using editline in ksh. Greetings, Sören [1]: https://github.com/NetBSD/src/commit/e93ca0319937d29fabb0a98abfdbef65a55dba9a [2]: https://github.com/NetBSD/src/commit/f6dff047a3ee27c332a60cef9c76355093a4e05e
Re: Patch: ksh: fix input handling for 4 byte UTF-8 sequences
Ingo Schwarze wrote: > Hi, Hello, > Which problem needs fixing: > Of the four-byte UTF-8 sequences, only a subset is identified by the > existing code. The other four-byte UTF-8 sequences still get chopped > up resulting in individual bytes being passed on. > > > I'm also adding a few comments as suggested by jca@. Parsing of UTF-8 > is less trivial than one might think, witnessed once again by the fact > that i got this code wrong in the first place. > > I also changed "cc & 0x20" to "cc > 0x9f" and "cc & 0x30" to "cc > 0x8f" > for uniformity and readabilty - UTF-8-parsing is bad enough without > needless micro-optimization, right? Nice, wasn't aware that you also had a patch ready. Sounds good to me and also fixes the problem I originally experienced with 4 byte UTF-8 sequences. > Note that even with the patch below, moving backward and forward > over a blowfish icon on the command line still does not work because > the character is width 2 and the ksh code intentionally does not > use wcwidth(3). But maybe it improves something in tmux? Not sure. Character movements over emojis (e.g. U+1F421) are currently broken because the ksh code doesn't correctly determine the amount of columns needed for a given character (i.e. what you would normally do with wcwidth). I tried fixing this but without wchar.h doing so seemed very cumbersome. Inputting emojis works with your patch though and was broken previously. > Either way, unless it causes regressions, this (or a further improved > version) should go in because what is there is clearly wrong. > > OK? Your diff looks good to me. BTW: Is there any reason why ksh doesn't use editline for all its line editing needs? That would allow handling all these nitty-gritty details in a central place. Greetings, Sören
Re: ksh: fix input handling for 4 byte UTF-8 sequences
Ping. Sören Tempel wrote: > Hello, > > Currently, ksh does not correctly calculate the length of 4 byte UTF-8 > sequences in emacs input mode. For demonstration purposes try inputting > an emoji (e.g. U+1F421) at your shell prompt. These 4 byte sequences can > be identified by checking if the first four bits are set and the fifth > bit isn't. The current check for identifying these 4 byte sequences is > incorrect. > > The patch below fixes this, thereby allowing users to enter emojis > (and other 4 byte UTF-8 sequences) at their shell prompt in emacs mode: > > Greetings, > Sören > > diff --git bin/ksh/emacs.c bin/ksh/emacs.c > index 694c402ff..970a0989d 100644 > --- bin/ksh/emacs.c > +++ bin/ksh/emacs.c > @@ -1851,7 +1851,7 @@ x_e_getu8(char *buf, int off) > return -1; > buf[off++] = c; > > - if (c == 0xf4) > + if ((c & 0xf8) == 0xf0) > len = 4; > else if ((c & 0xf0) == 0xe0) > len = 3;
ksh: fix input handling for 4 byte UTF-8 sequences
Hello, Currently, ksh does not correctly calculate the length of 4 byte UTF-8 sequences in emacs input mode. For demonstration purposes try inputting an emoji (e.g. U+1F421) at your shell prompt. These 4 byte sequences can be identified by checking if the first four bits are set and the fifth bit isn't. The current check for identifying these 4 byte sequences is incorrect. The patch below fixes this, thereby allowing users to enter emojis (and other 4 byte UTF-8 sequences) at their shell prompt in emacs mode: Greetings, Sören diff --git bin/ksh/emacs.c bin/ksh/emacs.c index 694c402ff..970a0989d 100644 --- bin/ksh/emacs.c +++ bin/ksh/emacs.c @@ -1851,7 +1851,7 @@ x_e_getu8(char *buf, int off) return -1; buf[off++] = c; - if (c == 0xf4) + if ((c & 0xf8) == 0xf0) len = 4; else if ((c & 0xf0) == 0xe0) len = 3;
ksh: add support for bracketed paste mode
Hello, The diff below adds support for bracketed paste mode to ksh. Bracketed paste mode is a set of special escape sequences, which are employed by many terminal emulators to allow programs run inside of them to distinguish pasted text from typed-in text [0]. This is useful for preventing pasted text from accidentally executing commands in the application it was pasted to. Commonly this problem arises when copying text from a web browser to a shell since the user may have copied hidden text from the web page which may contain control characters such as \n [1]. Bracketed paste mode is supported by all mainstream terminal emulators including xterm, urxvt, and gnome-terminal. The implementation proposed here was tested with urxvt. However, since we cannot determine in advance whether the utilized terminal emulator supports these escape sequences the feature must be explicitly activated using the set builtin. I haven't tested the diff extensively yet but it is rather simple so I don't expect any breakages. One limitation of the proposed implementation is that in only works in emacs mode. Would be happy to address any feedback you might have. Greetings, Sören [0]: https://www.xfree86.org/4.7.0/ctlseqs.html#Bracketed%20Paste%20Mode [1]: https://thejh.net/misc/website-terminal-copy-paste diff --git bin/ksh/edit.c bin/ksh/edit.c index 3089d195d..4b6d45050 100644 --- bin/ksh/edit.c +++ bin/ksh/edit.c @@ -150,12 +150,23 @@ x_puts(const char *s) shf_putc(*s++, shl_out); } +static void +x_paste_mode(bool onoff) +{ + if (!Flag(FBBRACKETPASTE)) + return; + + printf((onoff) ? BRPASTE_INT : BRPASTE_DEINT); + fflush(stdout); +} + bool x_mode(bool onoff) { static bool x_cur_mode; boolprev; + x_paste_mode(onoff); if (x_cur_mode == onoff) return x_cur_mode; prev = x_cur_mode; diff --git bin/ksh/edit.h bin/ksh/edit.h index 0b604cd64..8cc774f01 100644 --- bin/ksh/edit.h +++ bin/ksh/edit.h @@ -34,6 +34,12 @@ extern X_chars edchars; #define XCF_FULLPATH BIT(2) /* command completion: store full path */ #define XCF_COMMAND_FILE (XCF_COMMAND|XCF_FILE) +/* https://www.xfree86.org/4.7.0/ctlseqs.html#Bracketed%20Paste%20Mode */ +#define BRPASTE_INT"\033[?2004h" +#define BRPASTE_DEINT "\033[?2004l" +#define BRPASTE_PREkb_encode("^[[200~") +#define BRPASTE_POST kb_encode("^[[201~") + /* edit.c */ intx_getc(void); void x_flush(void); diff --git bin/ksh/emacs.c bin/ksh/emacs.c index 694c402ff..96136263e 100644 --- bin/ksh/emacs.c +++ bin/ksh/emacs.c @@ -118,6 +118,7 @@ static char*xmp; /* mark pointer */ static char*killstack[KILLSIZE]; static int killsp, killtp; static int x_literal_set; +static int x_brack_paste; static int x_arg_set; static char*macro_args; static int prompt_skip; @@ -203,6 +204,8 @@ static int x_fold_lower(int); static int x_fold_upper(int); static int x_set_arg(int); static int x_comment(int); +static int x_brack_paste_start(int); +static int x_brack_paste_end(int); #ifdef DEBUG static int x_debug_info(int); #endif @@ -260,6 +263,8 @@ static const struct x_ftab x_ftab[] = { { x_fold_upper, "upcase-word", XF_ARG }, { x_set_arg,"set-arg", XF_NOBIND }, { x_comment,"comment", 0 }, + { x_brack_paste_start, "bracketed-paste-start",0 }, + { x_brack_paste_end,"bracketed-paste-end", 0 }, { 0, 0, 0 }, #ifdef DEBUG { x_debug_info, "debug-info", 0 }, @@ -316,6 +321,8 @@ x_emacs(char *buf, size_t len) } x_literal_set = 0; + x_brack_paste = 0; + x_arg = -1; x_last_command = NULL; while (1) { @@ -353,6 +360,13 @@ x_emacs(char *buf, size_t len) } } + /* In bracketed paste mode only allow x_brack_paste_end, +* to quit this mode, for all other commands insert a literal. */ + if (x_brack_paste && (submatch == 1 && kmatch)) { + if (kmatch->ftab->xf_func != x_brack_paste_end) + submatch = 0; + } + if (submatch == 1 && kmatch) { if (kmatch->ftab->xf_func == x_ins_string && kmatch->args && !macro_args) { @@ -1479,6 +1493,10 @@ x_init_emacs(void) TAILQ_INIT(&kblist); + /* bracketed paste mode */ + kb_add_string(x_brack_paste_start, NULL, BRPASTE_PRE); + kb_add_string(x_brack_paste_end,NULL, BRPASTE_POST); + /* man page order */ kb_add(x_abort, CTRL('G'), 0); kb_add(x_mv_back, CTRL('B'), 0); @@ -1984,6 +2002,21 @@ x_comment(int c) return KSTD; }
rwho on OpenBSD 5.6
Hi all, I just updated to OpenBSD 5.6 and I was happy to see that rcp, rsh, rshd, rwho, rwhod, etc have been removed (at least according to the Changelog). However, the upgrade instructions fail to mention that files like /etc/rc.d/rwhod or /usr/bin/rwho should be removed. Sören.