Re: delete ligature support for Arabic "la" from the less(1) command line
Hi Ingo,

Thanks for your effort on Unicode support. I hope my feedback as a
native Persian speaker will be helpful.

Ingo Schwarze wrote:
> If i understand correctly, xterm(1) does indeed have that problem.
> I prepared a test file that contains, in this order,
>
>  - some Latin characters
>  - the Arabic word "la" ("no"), i.e. first LAM, then ALEF
>  - some more Latin characters
>  - the Arabic word "al" ("the"), i.e. first ALEF, then LAM
>  - some final Latin characters
>
> And indeed, xterm(1) does not respect the writing direction of the
> individual words.  When cat(1)'ing the file to stdout, both xterm(1)
> and konsole(1) show all the words from left to right, but *inside*
> each word, konsole(1) uses the correct writing direction: right to
> left for Arabic and left to right for Latin.  For example, in the
> Arabic word "al", konsole(1) correctly shows the ALEF right of the
> LAM, whereas xterm(1) wrongly shows the ALEF left of the LAM.

There are many rules. Each letter / character has a direction by
itself. For example, English letters are LTR (left-to-right) and
Arabic / Persian letters are RTL, but some characters, say symbols,
have no direction of their own. For example, when you write:

  'A' '+' 'B'

it should be displayed as is (here '+' is LTR), but when you write:

  'A' ALEF '+' LAM 'B'

the '+' should be displayed on the left side of ALEF (here '+' is RTL):

  'A' LAM '+' ALEF 'B'

I think you need to detect all maximal non-LTR substrings (which don't
start or end with a symbol) inside LTR strings to render them
correctly. There are also RTL / LTR control characters in Unicode
which manipulate this behaviour.

> I'm not entirely sure this has much to do with ligatures, though.
> What matters for building ligatures is only the logical ordering,
> the ordering in *time* so to speak, i.e. what comes before and what
> comes after.  LAM before ALEF has to become the ligature glyph "la",
> whereas ALEF before LAM remains two glyphs.
> Technically, the question of ordering in space, whether glyphs are
> painted onto the screen right to left or left to right, only comes
> into play after characters have already been combined into glyphs.
>
> Actually, now that you bring up the topic, i see another situation
> where less(1) causes an issue.  Let's use konsole(1) and not xterm(1)
> such that we get the correct writing direction, and let's put the
> word "al" onto the screen.  No ligature here, so that part of the
> topic is suspended for a moment.  Now let's slowly scroll right in
> one-column steps.  All is fine as long as the word "al" is completely
> visible on screen.  But when the final letter LAM of "al" is in the
> last (leftmost) column of the screen and you scroll right one more
> column, something weird happens, even in konsole(1).  You would
> expect the final letter LAM to scroll off screen first and the initial
> letter ALEF to remain on the screen for a little longer.  Instead,
> less(1) incorrectly thinks the *initial* letter of the word scrolls
> off screen first, and it tells xterm(1) to display the ALEF in the
> leftmost column of the screen while the LAM just went off-screen.
> That looks weird because there is no word in that text beginning
> with ALEF.

It's a difficult problem. You need to consider all maximal non-LTR
substrings, and all LTR / RTL modifiers.

Also consider a file with long RTL lines. Users prefer to see the
beginning of lines (in all languages, readers read from the start), so
less(1) should display the rightmost part of each line, and when the
user scrolls the text to the right, less(1) should display the left
side of each line.

I think that if xterm had a complete RTL mode with swapped right and
left keys, it might solve many problems. In your example, in an RTL
xterm there would be no right scroll (because of the swapped keys), and
when you scroll less(1) to the left, less(1) would correctly scroll the
initial letter off screen.
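The reordering rule described earlier in this message (reverse each
maximal non-LTR substring that neither starts nor ends with a symbol)
can be sketched in C as follows. This is only an illustration of that
one rule, not the full Unicode bidirectional algorithm and not code
from xterm(1) or less(1). The names char_dir() and
reorder_for_display() are hypothetical, and the classification is a
crude assumption: ASCII letters are LTR, the Hebrew and Arabic blocks
are RTL, everything else is neutral.

```c
#include <stddef.h>
#include <stdint.h>

enum dir { DIR_LTR, DIR_RTL, DIR_NEUTRAL };

/* Crude per-character direction (assumption, see the text above). */
static enum dir
char_dir(uint32_t cp)
{
	if (cp >= 0x0590 && cp <= 0x06FF)	/* Hebrew and Arabic blocks */
		return DIR_RTL;
	if ((cp >= 'A' && cp <= 'Z') || (cp >= 'a' && cp <= 'z'))
		return DIR_LTR;
	return DIR_NEUTRAL;
}

/*
 * Reverse, in place, every maximal run that starts and ends with an
 * RTL character and contains no LTR character, so that a terminal
 * painting strictly left to right shows the run right to left.
 */
void
reorder_for_display(uint32_t *s, size_t n)
{
	size_t i = 0;

	while (i < n) {
		size_t start, end, j, a, b;

		if (char_dir(s[i]) != DIR_RTL) {
			i++;
			continue;
		}
		/* Extend over RTL and neutral characters ... */
		start = end = i;
		for (j = i + 1; j < n && char_dir(s[j]) != DIR_LTR; j++) {
			/* ... but remember the last RTL one. */
			if (char_dir(s[j]) == DIR_RTL)
				end = j;
		}
		/* Trailing neutrals after 'end' stay where they are. */
		for (a = start, b = end; a < b; a++, b--) {
			uint32_t tmp = s[a];
			s[a] = s[b];
			s[b] = tmp;
		}
		i = j;
	}
}
```

Called on the sequence 'A' ALEF '+' LAM 'B' (ALEF is U+0627, LAM is
U+0644), this yields 'A' LAM '+' ALEF 'B', while 'A' '+' 'B' is left
unchanged. Explicit RTL / LTR control characters are ignored here.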
Of course, that will not work on complex mixed RTL / LTR texts, but it
solves the problem in the most common situations.

> This means that being able to properly view Arabic or Farsi text
> with the default OpenBSD terminal emulator and pager would require
>
>  1. bidi support in xterm(1)
>     to render Farsi words with the correct writing direction
>  2. ligature support in xterm(1)
>     to correctly connect letters
>  3. bidi support in less(1)
>     to correctly scroll parts of words on and off screen, horizontally

Given the previous example (a file with long RTL lines), I don't agree
with bidi support in less(1).

>  4. ligature support in less(1)
>     for correct columnation
>
> As far as i understand, you are saying that the extremely fragmentary
> support for item 4 which we happen to have right now is not really
> useful without items 1-3, and even when using konsole(1), which does
> have items 1 and 2, implementing item 3 before item 4 would make
> sense because item 3 is more important.
>
> So my understanding is that you are not objecting to the patch because
Re: delete ligature support for Arabic "la" from the less(1) command line
Hello Mohammadreza,

Mohammadreza Abdollahzadeh wrote on Sun, Sep 01, 2019 at 09:40:16AM +0430:

> Persian is my native language and I think that the major problem that
> all RTL (Right-To-Left) languages like Persian and Arabic currently
> suffer from is the lack of BiDi (Bidirectionality) support in console
> and terminal environments like xterm(1).  KDE konsole(1) supports bidi
> and that's why it shows ligatures correctly.
> I think any attempt to fix such problems must first start with adding
> bidi support to xterm and other terminal environments.

Thank you for your feedback!

If i understand correctly, xterm(1) does indeed have that problem.
I prepared a test file that contains, in this order,

 - some Latin characters
 - the Arabic word "la" ("no"), i.e. first LAM, then ALEF
 - some more Latin characters
 - the Arabic word "al" ("the"), i.e. first ALEF, then LAM
 - some final Latin characters

And indeed, xterm(1) does not respect the writing direction of the
individual words.  When cat(1)'ing the file to stdout, both xterm(1)
and konsole(1) show all the words from left to right, but *inside*
each word, konsole(1) uses the correct writing direction: right to
left for Arabic and left to right for Latin.  For example, in the
Arabic word "al", konsole(1) correctly shows the ALEF right of the
LAM, whereas xterm(1) wrongly shows the ALEF left of the LAM.

I'm not entirely sure this has much to do with ligatures, though.
What matters for building ligatures is only the logical ordering,
the ordering in *time* so to speak, i.e. what comes before and what
comes after.  LAM before ALEF has to become the ligature glyph "la",
whereas ALEF before LAM remains two glyphs.  Technically, the
question of ordering in space, whether glyphs are painted onto the
screen right to left or left to right, only comes into play after
characters have already been combined into glyphs.

Actually, now that you bring up the topic, i see another situation
where less(1) causes an issue.
Let's use konsole(1) and not xterm(1) such that we get the correct
writing direction, and let's put the word "al" onto the screen.
No ligature here, so that part of the topic is suspended for a moment.
Now let's slowly scroll right in one-column steps.  All is fine as
long as the word "al" is completely visible on screen.  But when the
final letter LAM of "al" is in the last (leftmost) column of the
screen and you scroll right one more column, something weird happens,
even in konsole(1).  You would expect the final letter LAM to scroll
off screen first and the initial letter ALEF to remain on the screen
for a little longer.  Instead, less(1) incorrectly thinks the
*initial* letter of the word scrolls off screen first, and it tells
xterm(1) to display the ALEF in the leftmost column of the screen
while the LAM just went off-screen.  That looks weird because there
is no word in that text beginning with ALEF.

This means that being able to properly view Arabic or Farsi text
with the default OpenBSD terminal emulator and pager would require

 1. bidi support in xterm(1)
    to render Farsi words with the correct writing direction
 2. ligature support in xterm(1)
    to correctly connect letters
 3. bidi support in less(1)
    to correctly scroll parts of words on and off screen, horizontally
 4. ligature support in less(1)
    for correct columnation

As far as i understand, you are saying that the extremely fragmentary
support for item 4 which we happen to have right now is not really
useful without items 1-3, and even when using konsole(1), which does
have items 1 and 2, implementing item 3 before item 4 would make
sense because item 3 is more important.

So my understanding is that you are not objecting to the patch because
the fragmentary support for item 4 is practically useless in isolation.
The following is not related to this patch, but i think it makes
sense to mention it here: regarding the future, i think items 1 and 3
are much easier to support than items 2 and 4 because bidi support,
if i understand correctly, only needs one bit of information per
character because it only needs to know whether the character is
part of a right to left or left to right script, so the complexity
on the libc level, where we want complexity least of all places, is
comparable to other boolean character properties like those listed
in the iswalnum(3) manual page.  Realistically, though, bidi support
would still be a large project, and i don't think it makes sense to
tackle it any time soon.

Ligature support feels much worse than bidi support because the
mapping required is not merely

  character -> boolean

but

  (character + character) -> character,

which is more complicated than even the

  (character + character) -> -1/0/+1

mapping required for collation support - and we decided that we
don't want collation support in libc because it would cause excessive
complexity.  Admittedly, collations are strongly locale-dependent,
while i'm not sure ligatures are locale-depe
Re: delete ligature support for Arabic "la" from the less(1) command line
Hi Ingo,

Persian is my native language and I think that the major problem that
all RTL (Right-To-Left) languages like Persian and Arabic currently
suffer from is the lack of BiDi (Bidirectionality) support in console
and terminal environments like xterm(1). KDE konsole(1) supports bidi
and that's why it shows ligatures correctly.

I think any attempt to fix such problems must first start with adding
bidi support to xterm and other terminal environments.

best regards.
Re: delete ligature support for Arabic "la" from the less(1) command line
Ingo Schwarze wrote:
> I have no idea how many of those work in konsole(1) - but i'm sure
> none of those, except the four LAM WITH ALEF discussed here, work
> with less(1), so i think support for LAM WITH ALEF provided no value
> in the first place.  The way it is implemented, with an ad-hoc table
> inside less(1) of character combinations that form ligatures, is
> just wrong and not sustainable by any stretch of the imagination,
> i think.
>
> On top of that, how characters combine in Arabic is strongly context
> dependent; even the syllable "la" forms a different ligature depending
> on whether it is isolated or at the end of a longer word, and none
> of the context dependencies are implemented in less(1) anyway.
>
> And finally, people say the situation in many Indian languages is
> even more dire than in Arabic, so what our less(1) tries to do is
> almost certainly completely useless for those languages, even if
> we would expand the ad-hoc table.
>
> So, i propose to delete support for combining characters into
> ligatures from our less(1): at this point, it is only used for
> typing at the less prompt anyway (and not for the file displayed),
> only for Arabic, and only for the single ligature "la".  If we ever
> want better ligature support in the future, i think we would have
> to make a fresh start anyway - and i think there are many other
> things to do before that.

I did less practical research than you did when I looked at this bit
of code but your conclusions match mine: this is an attempt at an
implementation of a tiny subset of the vastly complex problem of
digital typesetting of the Arabic alef-bet. Keeping the code is
probably worse than no solution at all, because (as you noted) it's
the wrong implementation in the wrong place and "improving" it by
adding more combination rules here would be a mistake.

--Evan Silberman
delete ligature support for Arabic "la" from the less(1) command line
Hi,

i have to admit that i am neither able to speak nor to write nor to
understand the Arabic language nor the Arabic script, but here is
my current, probably incomplete understanding of what our less(1)
program is trying to do with Arabic ligatures.  If somebody is
reading this who is able to read and write Arabic or an Indian
language heavily using ligatures, feedback is highly welcome.

Arabic is a cursive script, which means that when writing Arabic,
characters do not map 1:1 to glyphs.  Instead, there are rules about
how adjacent characters attach to each other, forming ligatures.

As an extremely simple example, consider the Arabic adverb "la",
which means the same as the English adverb "no".  It consists of
the two letters U+0644 LAM and U+0627 ALEF, the LAM appearing before
(i.e. to the right of) the ALEF.  However, you do not write both
letters separately.  Instead, the ALEF leans forward (to the left)
and attaches to the LAM, forming the glyph U+FEFB, ARABIC LIGATURE
LAM WITH ALEF ISOLATED FORM.

When displayed in a fixed width font, that ligature only occupies
a single display column just like any other Arabic or Latin glyph.
The LAM WITH ALEF glyph is not a double-width glyph like Japanese
or Chinese characters typically are.

So, when this happens, you have four bytes of UTF-8 forming two
Unicode characters, and *together*, these two characters occupy only
one single display column.

Note that in the default configuration, our xterm(1) is not able
to display Arabic characters at all.  But even when you run

  xterm -fa arabic

or

  xterm -fa fixed

which uses FreeType support instead of the default X toolkit font
support, such that xterm(1) does become able to display single Arabic
characters, it still displays the word "la" incorrectly, failing to
generate the required ligature and instead displaying the two
characters LAM and ALEF separately.
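The "four bytes, two characters, one column" arithmetic above can be
checked with a small C sketch. It is not taken from less(1); the names
decode_utf8() and measure() are hypothetical. It assumes well-formed
UTF-8 input and that every character is one column wide except an
ALEF (U+0627) directly following a LAM (U+0644), which is assumed to
add no column because it joins the ligature.

```c
#include <stddef.h>
#include <stdint.h>

struct text_size {
	size_t bytes;	/* UTF-8 bytes */
	size_t chars;	/* Unicode characters (code points) */
	size_t columns;	/* display columns under the assumptions above */
};

/* Decode one well-formed UTF-8 sequence; return its length in bytes. */
static size_t
decode_utf8(const unsigned char *p, uint32_t *cp)
{
	if (p[0] < 0x80) {
		*cp = p[0];
		return 1;
	}
	if ((p[0] & 0xE0) == 0xC0) {
		*cp = ((uint32_t)(p[0] & 0x1F) << 6) | (p[1] & 0x3F);
		return 2;
	}
	if ((p[0] & 0xF0) == 0xE0) {
		*cp = ((uint32_t)(p[0] & 0x0F) << 12) |
		    ((uint32_t)(p[1] & 0x3F) << 6) | (p[2] & 0x3F);
		return 3;
	}
	*cp = ((uint32_t)(p[0] & 0x07) << 18) |
	    ((uint32_t)(p[1] & 0x3F) << 12) |
	    ((uint32_t)(p[2] & 0x3F) << 6) | (p[3] & 0x3F);
	return 4;
}

struct text_size
measure(const char *s)
{
	const unsigned char *p = (const unsigned char *)s;
	struct text_size r = { 0, 0, 0 };
	uint32_t prev = 0, cp;

	while (*p != '\0') {
		size_t len = decode_utf8(p, &cp);

		r.bytes += len;
		r.chars++;
		/* ALEF right after LAM shares the LAM's column. */
		if (!(prev == 0x0644 && cp == 0x0627))
			r.columns++;
		prev = cp;
		p += len;
	}
	return r;
}
```

For the word "la", encoded as the bytes D9 84 D8 A7, measure() reports
4 bytes, 2 characters and 1 column. Wide characters and combining
accents are deliberately left out of this sketch.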
So i installed konsole-18.12.0p1 for testing (which pulls in
ridiculous amounts of dependencies, dozens of them, but oh well,
i guess support for advanced Unicode features isn't trivial).
The konsole(1) program does display the word "la" correctly,
as a ligature.

Now, running less(1) inside konsole(1), i found that columnation
is already subtly broken.  As long as the "la" ligature is visible
on screen, all is fine.  Now scroll to the right until the "la"
appears in the first screen column.  Then scroll one more column
to the right by pressing "1 RIGHTARROW".  Now you see *half* the
ligature, i.e. an isolated ALEF, in the first column of the screen,
even though the Arabic word does not contain an isolated ALEF.
Besides, we just attempted to scroll the "la" off screen, so the
ALEF now appears in the column one to the right of where the "la"
should actually be, and all the rest of the line is shifted one
column to the right, too, so columnation is now off by one.
Scrolling back left, columnation recovers to correct display.
I strongly suspect i broke that during my previous UTF-8 cleanup
work on less(1).

However, LAM WITH ALEF is literally the only ligature that less(1)
supports, together with three variations (with MADDA above, with
HAMZA above, and with HAMZA below).  But there are hundreds of
ligatures in Arabic, see

  https://www.unicode.org/charts/PDF/UFB50.pdf
  https://www.unicode.org/charts/PDF/UFE70.pdf

I have no idea how many of those work in konsole(1) - but i'm sure
none of those, except the four LAM WITH ALEF discussed here, work
with less(1), so i think support for LAM WITH ALEF provided no value
in the first place.  The way it is implemented, with an ad-hoc table
inside less(1) of character combinations that form ligatures, is
just wrong and not sustainable by any stretch of the imagination,
i think.
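To make the shape of such an ad-hoc table concrete, here is a minimal
C sketch of a (character + character) -> character lookup for the four
LAM WITH ALEF combinations mentioned above. It is an illustration
only, not the table from less(1); the names struct ligature, lam_alef
and combine() are hypothetical. It maps to the isolated forms only,
which is exactly the kind of missing context dependency criticized in
this thread.

```c
#include <stddef.h>
#include <stdint.h>

struct ligature {
	uint32_t first;		/* character that comes first in the text */
	uint32_t second;	/* character that follows it */
	uint32_t joined;	/* presentation form of the ligature */
};

/* LAM followed by an ALEF variant, isolated presentation forms only. */
static const struct ligature lam_alef[] = {
	{ 0x0644, 0x0622, 0xFEF5 },	/* LAM + ALEF WITH MADDA ABOVE */
	{ 0x0644, 0x0623, 0xFEF7 },	/* LAM + ALEF WITH HAMZA ABOVE */
	{ 0x0644, 0x0625, 0xFEF9 },	/* LAM + ALEF WITH HAMZA BELOW */
	{ 0x0644, 0x0627, 0xFEFB },	/* LAM + ALEF */
};

/* Return the ligature for the pair, or 0 if the pair does not combine. */
uint32_t
combine(uint32_t first, uint32_t second)
{
	size_t i;

	for (i = 0; i < sizeof(lam_alef) / sizeof(lam_alef[0]); i++) {
		if (lam_alef[i].first == first &&
		    lam_alef[i].second == second)
			return lam_alef[i].joined;
	}
	return 0;
}
```

Note that the lookup depends on order: combine(0x0644, 0x0627) yields
U+FEFB, while combine(0x0627, 0x0644), i.e. the word "al", yields 0.
Covering the hundreds of other ligatures, or the final forms, would
mean growing this table by hand.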
On top of that, how characters combine in Arabic is strongly context
dependent; even the syllable "la" forms a different ligature depending
on whether it is isolated or at the end of a longer word, and none
of the context dependencies are implemented in less(1) anyway.

And finally, people say the situation in many Indian languages is
even more dire than in Arabic, so what our less(1) tries to do is
almost certainly completely useless for those languages, even if
we would expand the ad-hoc table.

So, i propose to delete support for combining characters into
ligatures from our less(1): at this point, it is only used for
typing at the less prompt anyway (and not for the file displayed),
only for Arabic, and only for the single ligature "la".  If we ever
want better ligature support in the future, i think we would have
to make a fresh start anyway - and i think there are many other
things to do before that.

Note that this only removes support for combining characters into
ligatures that can also stand on their own; support for purely
combining accents like U+0300 COMBINING GRAVE ACCENT and U+3099
COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK remains intact.

OK?
  Ingo


Index: charset.c
=
Re: less(1): `!' command
On Fri, Dec 22 2017, Stuart Henderson wrote:
> On 2017/12/22 19:47, Nicholas Marriott wrote:
>> I don't think we should bring ! back.
>>
>> I wanted to remove v and | (and some other stuff) shortly afterwards,
>> but several people objected.
>>
>> I did suggest having a lightweight less in base for most people and
>> adding the full upstream less to ports for the stuff we don't want to
>> maintain (like we do for eg libevent) but other people didn't like
>> that idea.
>
> less(1) can already be made more lightweight by setting LESSSECURE=1.
> (I quite like this even without the reduced pledge, my biggest annoyance
> with less is when I accidentally press 'v').
>
> Any opinions on switching the default?

Makes sense to me, I can live without the 's' command.

ok jca@

> Index: main.c
> ===
> RCS file: /cvs/src/usr.bin/less/main.c,v
> retrieving revision 1.35
> diff -u -p -u -1 -2 -r1.35 main.c
> --- main.c	17 Sep 2016 15:06:41 -	1.35
> +++ main.c	22 Dec 2017 22:19:04 -
> @@ -87,17 +87,17 @@ main(int argc, char *argv[])
>  
> -	secure = 0;
> +	secure = 1;
>  	s = lgetenv("LESSSECURE");
> -	if (s != NULL && *s != '\0')
> -		secure = 1;
> +	if (s != NULL && strcmp(s, "0") == 0)
> +		secure = 0;
>  
>  	if (secure) {
>  		if (pledge("stdio rpath wpath tty", NULL) == -1) {
>  			perror("pledge");
>  			exit(1);
>  		}
>  	} else {
>  		if (pledge("stdio rpath wpath cpath fattr proc exec tty", NULL) == -1) {
>  			perror("pledge");
>  			exit(1);
>  		}
>  	}
> Index: less.1
> ===
> RCS file: /cvs/src/usr.bin/less/less.1,v
> retrieving revision 1.52
> diff -u -p -r1.52 less.1
> --- less.1	24 Oct 2016 13:46:58 -	1.52
> +++ less.1	22 Dec 2017 22:17:28 -
> @@ -1674,9 +1674,7 @@ differences in invocation syntax, the
>  .Ev LESSEDIT
>  variable can be changed to modify this default.
>  .Sh SECURITY
> -When the environment variable
> -.Ev LESSSECURE
> -is set to 1,
> +Normally,
>  .Nm
>  runs in a "secure" mode.
>  This means these features are disabled:
> @@ -1698,6 +1696,10 @@ Metacharacters in filenames, such as "*"
>  .It " "
>  Filename completion (TAB, ^L).
>  .El
> +.Pp
> +To enable these features, set the environment variable
> +.Ev LESSSECURE
> +to 0.
>  .Sh COMPATIBILITY WITH MORE
>  If the environment variable
>  .Ev LESS_IS_MORE

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE
Re: less(1): `!' command
On Fri, 22 Dec 2017 22:21:12 +, Stuart Henderson wrote:
> On 2017/12/22 19:47, Nicholas Marriott wrote:
> > I don't think we should bring ! back.
> >
> > I wanted to remove v and | (and some other stuff) shortly
> > afterwards, but several people objected.
> >
> > I did suggest having a lightweight less in base for most people and
> > adding the full upstream less to ports for the stuff we don't want
> > to maintain (like we do for eg libevent) but other people didn't
> > like that idea.
>
> less(1) can already be made more lightweight by setting LESSSECURE=1.
> (I quite like this even without the reduced pledge, my biggest annoyance
> with less is when I accidentally press 'v').
>
> Any opinions on switching the default?

I thought about that possibility too, and I mostly agree with the idea
as I also run less(1) in secure mode very often, but it is nevertheless
quite irrelevant to the original concern, which is that, when one
chooses not to run less(1) in secure mode, whether that mode is the
default one or not, it is inconsistent, for multiple reasons, to have
removed the `!' command, but not `v' nor `|'.

Until some form of agreement can be reached on that issue, I have
reverted the removal of `!' in my personal tree, so I still pay the
exact same price as everybody else ("proc exec"), but at least I now
get something useful out of that.

Regards,

kshe
Re: less(1): `!' command
On December 22, 2017 11:21:12 PM GMT+01:00, Stuart Henderson wrote:
>On 2017/12/22 19:47, Nicholas Marriott wrote:
>> I don't think we should bring ! back.
>>
>> I wanted to remove v and | (and some other stuff) shortly afterwards,
>> but several people objected.
>>
>> I did suggest having a lightweight less in base for most people and
>> adding the full upstream less to ports for the stuff we don't want to
>> maintain (like we do for eg libevent) but other people didn't like
>> that idea.
>
>less(1) can already be made more lightweight by setting LESSSECURE=1.
>(I quite like this even without the reduced pledge, my biggest
>annoyance with less is when I accidentally press 'v').
>
>Any opinions on switching the default?

An interesting twist on this is that if someone is currently
(mistakenly) using LESSSECURE=0, e.g. because they do not want their
system to be "less secure", they currently achieve the intended goal,
because any non-empty value enables secure mode. After this change,
the same setting would disable it.

Not sure if misconfigured systems are our main focus, but given the
name "less", the aforementioned mistake doesn't strike me as totally
unreasonable.

/Alexander

>Index: main.c
>===
>RCS file: /cvs/src/usr.bin/less/main.c,v
>retrieving revision 1.35
>diff -u -p -u -1 -2 -r1.35 main.c
>--- main.c	17 Sep 2016 15:06:41 -	1.35
>+++ main.c	22 Dec 2017 22:19:04 -
>@@ -87,17 +87,17 @@ main(int argc, char *argv[])
> 
>-	secure = 0;
>+	secure = 1;
> 	s = lgetenv("LESSSECURE");
>-	if (s != NULL && *s != '\0')
>-		secure = 1;
>+	if (s != NULL && strcmp(s, "0") == 0)
>+		secure = 0;
> 
> 	if (secure) {
> 		if (pledge("stdio rpath wpath tty", NULL) == -1) {
> 			perror("pledge");
> 			exit(1);
> 		}
> 	} else {
> 		if (pledge("stdio rpath wpath cpath fattr proc exec tty", NULL) == -1) {
> 			perror("pledge");
> 			exit(1);
> 		}
> 	}
>Index: less.1
>===
>RCS file: /cvs/src/usr.bin/less/less.1,v
>retrieving revision 1.52
>diff -u -p -r1.52 less.1
>--- less.1	24 Oct 2016 13:46:58 -	1.52
>+++ less.1	22 Dec 2017 22:17:28 -
>@@ -1674,9 +1674,7 @@ differences in invocation syntax, the
> .Ev LESSEDIT
> variable can be changed to modify this default.
> .Sh SECURITY
>-When the environment variable
>-.Ev LESSSECURE
>-is set to 1,
>+Normally,
> .Nm
> runs in a "secure" mode.
> This means these features are disabled:
>@@ -1698,6 +1696,10 @@ Metacharacters in filenames, such as "*"
> .It " "
> Filename completion (TAB, ^L).
> .El
>+.Pp
>+To enable these features, set the environment variable
>+.Ev LESSSECURE
>+to 0.
> .Sh COMPATIBILITY WITH MORE
> If the environment variable
> .Ev LESS_IS_MORE
Re: less(1): `!' command
On 2017/12/22 19:47, Nicholas Marriott wrote:
> I don't think we should bring ! back.
>
> I wanted to remove v and | (and some other stuff) shortly afterwards,
> but several people objected.
>
> I did suggest having a lightweight less in base for most people and
> adding the full upstream less to ports for the stuff we don't want to
> maintain (like we do for eg libevent) but other people didn't like
> that idea.

less(1) can already be made more lightweight by setting LESSSECURE=1.
(I quite like this even without the reduced pledge, my biggest annoyance
with less is when I accidentally press 'v').

Any opinions on switching the default?

Index: main.c
===
RCS file: /cvs/src/usr.bin/less/main.c,v
retrieving revision 1.35
diff -u -p -u -1 -2 -r1.35 main.c
--- main.c	17 Sep 2016 15:06:41 -	1.35
+++ main.c	22 Dec 2017 22:19:04 -
@@ -87,17 +87,17 @@ main(int argc, char *argv[])
 
-	secure = 0;
+	secure = 1;
 	s = lgetenv("LESSSECURE");
-	if (s != NULL && *s != '\0')
-		secure = 1;
+	if (s != NULL && strcmp(s, "0") == 0)
+		secure = 0;
 
 	if (secure) {
 		if (pledge("stdio rpath wpath tty", NULL) == -1) {
 			perror("pledge");
 			exit(1);
 		}
 	} else {
 		if (pledge("stdio rpath wpath cpath fattr proc exec tty", NULL) == -1) {
 			perror("pledge");
 			exit(1);
 		}
 	}
Index: less.1
===
RCS file: /cvs/src/usr.bin/less/less.1,v
retrieving revision 1.52
diff -u -p -r1.52 less.1
--- less.1	24 Oct 2016 13:46:58 -	1.52
+++ less.1	22 Dec 2017 22:17:28 -
@@ -1674,9 +1674,7 @@ differences in invocation syntax, the
 .Ev LESSEDIT
 variable can be changed to modify this default.
 .Sh SECURITY
-When the environment variable
-.Ev LESSSECURE
-is set to 1,
+Normally,
 .Nm
 runs in a "secure" mode.
 This means these features are disabled:
@@ -1698,6 +1696,10 @@ Metacharacters in filenames, such as "*"
 .It " "
 Filename completion (TAB, ^L).
 .El
+.Pp
+To enable these features, set the environment variable
+.Ev LESSSECURE
+to 0.
 .Sh COMPATIBILITY WITH MORE
 If the environment variable
 .Ev LESS_IS_MORE
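The effect of the main.c hunk on the meaning of LESSSECURE can be
isolated in two small C functions. This is a sketch for comparison,
not code from less(1): the names secure_old() and secure_new() are
hypothetical, and the parameter s stands for whatever
lgetenv("LESSSECURE") returned (NULL when the variable is unset).

```c
#include <stddef.h>
#include <string.h>

/* Before the diff: secure mode only if the variable is set and non-empty. */
int
secure_old(const char *s)
{
	return s != NULL && *s != '\0';
}

/* After the diff: secure mode unless the variable is exactly "0". */
int
secure_new(const char *s)
{
	return !(s != NULL && strcmp(s, "0") == 0);
}
```

With the variable unset, secure_old() gives 0 and secure_new() gives 1,
which is the switched default. The value "0" is the one case where an
existing setting flips the other way: secure_old("0") is 1 because the
string is non-empty, while secure_new("0") is 0.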
Re: less(1): `!' command
I don't think we should bring ! back.

I wanted to remove v and | (and some other stuff) shortly afterwards, but
several people objected.

I did suggest having a lightweight less in base for most people and adding
the full upstream less to ports for the stuff we don't want to maintain
(like we do for eg libevent) but other people didn't like that idea.

On 17 December 2017 at 15:48, kshe wrote:

> On Sat, 16 Dec 2017 21:52:44 +, Theo de Raadt wrote:
> > > On Sat, 16 Dec 2017 19:39:27 +, Theo de Raadt wrote:
> > > > > On Sat, 16 Dec 2017 18:13:16 +, Jiri B wrote:
> > > > > > On Sat, Dec 16, 2017 at 04:55:44PM +, kshe wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > Would a patch to bring back the `!' command to less(1) be
> > > > > > > accepted?  The commit message for its removal explains
> > > > > > > that ^Z should be used instead, but that obviously does
> > > > > > > not work if less(1) is run from something else than an
> > > > > > > interactive shell, for example when reading manual pages
> > > > > > > from a vi(1) instance spawned directly by `xterm -e vi'
> > > > > > > in a window manager or by `neww vi' in a tmux(1) session.
> > > > > >
> > > > > > Why should less be able to spawn other programs? This would
> > > > > > undermine all pledge work.
> > > > >
> > > > > Because of at least `v' and `|', less(1) already is able to
> > > > > invoke arbitrary programs, and accordingly needs the "proc
> > > > > exec" promise, so bringing `!' back would not change anything
> > > > > from a security perspective (otherwise, I would obviously not
> > > > > have made such a proposition).
> > > > >
> > > > > In fact, technically, what I want to do is still currently
> > > > > possible: from any less(1) instance, one may use `v' to invoke
> > > > > vi(1), and then use vi(1)'s own `!' command as desired.  So
> > > > > the functionality of `!' is still there; it was only made more
> > > > > difficult to reach for no apparent reason.
> > > >
> > > > No apparent reason?
> > > >
> > > > Good you have an opinion.  I have a different opinion: We should
> > > > look for rarely used functionality and gut it.
> > >
> > > I completely agree, and I also completely agree with the rest of
> > > what you said.  However, in this particular case, the functionality
> > > of `!' is still fully (albeit indirectly) accessible, as shown
> > > above, and this is why its deletion, when not immediately followed
> > > by that of `|' and `v', made little sense for me.
> >
> > Oh, so you don't agree.  Or do you.  I can't tell.  You haven't made
> > up your mind enough to have a final position?
>
> In the case of less(1), the underlying functionality of `!' (invoking
> arbitrary programs) has not been removed at all, as `!' itself was only
> one way amongst others of doing that.  Therefore, I would have preferred
> that such an endeavour be conducted in steps at least as large as a
> pledge(2) category.  You may say this is absolutist, but, in the end,
> users might actually be more inclined to accept such removals if they
> come with, and thus are justified by, a real and immediate security
> benefit, like stricter pledge(2) promises, rather than some vague
> theoretical explanation about the global state of their software
> environment.
>
> > [...]
> >
> > > May I go ahead and prepare a patch to remove "proc exec" entirely?
> >
> > Sure you could try, and see who freaks out.  Exactly what the plan was
> > all along.
>
> The minimal diff below does that.  If it is accepted, further cleanups
> would need to follow (in particular, removing a few unused variables and
> functions), and of course the manual would also need some adjustments.
>
> Index: cmd.h
> ===
> RCS file: /cvs/src/usr.bin/less/cmd.h,v
> retrieving revision 1.10
> diff -u -p -r1.10 cmd.h
> --- cmd.h	6 Nov 2015 15:58:01 -	1.10
> +++ cmd.h	17 Dec 2017 12:23:00 -
> @@ -42,12 +42,12 @@
>  #define	A_FF_LINE	29
>  #define	A_BF_LINE	30
>  #define	A_VERSION	31
> -#define	A_VISUAL	32
> +/* 32 unused */
>  #define	A_F_WINDOW	33
>  #define	A_B_WINDOW	34
>  #define	A_F_BRACKET	35
>  #define	A_B_BRACKET	36
> -#define	A_PIPE	37
> +/* 37 unused */
>  #define	A_INDEX_FILE	38
>  #define	A_UNDO_SEARCH	39
>  #define	A_FF_SCREEN	40
> Index: command.c
> ===
> RCS file: /cvs/src/usr.bin/less/command.c,v
> retrieving revision 1.31
> diff -u -p -r1.31 command.c
> --- command.c	12 Jan 2017 20:32:01 -	1.31
> +++ command.c	17 Dec 2017 12:23:00 -
> @@ -241,12 +241,6 @@ exec_mca(void)
>  		/* If tag structure is loaded then clean it up. */
>  		cleantag
Re: less(1): `!' command
On Sat, 16 Dec 2017 21:52:44 +, Theo de Raadt wrote: > > On Sat, 16 Dec 2017 19:39:27 +, Theo de Raadt wrote: > > > > On Sat, 16 Dec 2017 18:13:16 +, Jiri B wrote: > > > > > On Sat, Dec 16, 2017 at 04:55:44PM +, kshe wrote: > > > > > > Hi, > > > > > > > > > > > > Would a patch to bring back the `!' command to less(1) be accepted? > > > > > > The > > > > > > commit message for its removal explains that ^Z should be used > > > > > > instead, > > > > > > but that obviously does not work if less(1) is run from something > > > > > > else > > > > > > than an interactive shell, for example when reading manual pages > > > > > > from a > > > > > > vi(1) instance spawned directly by `xterm -e vi' in a window > > > > > > manager or > > > > > > by `neww vi' in a tmux(1) session. > > > > > > > > > > Why should less be able to spawn another programs? This would > > > > > undermine > > > > > all pledge work. > > > > > > > > Because of at least `v' and `|', less(1) already is able to invoke > > > > arbitrary programs, and accordingly needs the "proc exec" promise, so > > > > bringing `!' back would not change anything from a security perspective > > > > (otherwise, I would obviously not have made such a proposition). > > > > > > > > In fact, technically, what I want to do is still currently possible: > > > > from any less(1) instance, one may use `v' to invoke vi(1), and then use > > > > vi(1)'s own `!' command as desired. So the functionality of `!' is > > > > still there; it was only made more difficult to reach for no apparent > > > > reason. > > > > > > No apparent reason? > > > > > > Good you have an opinion. I have a different opinion: We should look > > > for rarely used functionality and gut it. > > > > I completely agree, and I also completely agree with the rest of what > > you said. However, in this particular case, the functionality of `!' 
is > > still fully (albeit indirectly) accessible, as shown above, and this is > > why its deletion, when not immediately followed by that of `|' and `v', > > made little sense for me. > > Oh, so you don't agree. Or do you. I can't tell. You haven't made up > your mind enough to have a final position? In the case of less(1), the underlying functionality of `!' (invoking arbitrary programs) has not been removed at all, as `!' itself was only one way amongst others of doing that. Therefore, I would have preferred that such an endeavour be conducted in steps at least as large as a pledge(2) category. You may say this is absolutist, but, in the end, users might actually be more inclined to accept such removals if they come with, and thus are justified by, a real and immediate security benefit, like stricter pledge(2) promises, rather than some vague theoretical explanation about the global state of their software environment. > [...] > > > May I go ahead and prepare a patch to remove "proc exec" entirely? > > Sure you could try, and see who freaks out. Exactly what the plan was > all along. The minimal diff below does that. If it is accepted, further cleanups would need to follow (in particular, removing a few unused variables and functions), and of course the manual would also need some adjustments. 
Index: cmd.h
===
RCS file: /cvs/src/usr.bin/less/cmd.h,v
retrieving revision 1.10
diff -u -p -r1.10 cmd.h
--- cmd.h 6 Nov 2015 15:58:01 - 1.10
+++ cmd.h 17 Dec 2017 12:23:00 -
@@ -42,12 +42,12 @@
 #define	A_FF_LINE	29
 #define	A_BF_LINE	30
 #define	A_VERSION	31
-#define	A_VISUAL	32
+/* 32 unused */
 #define	A_F_WINDOW	33
 #define	A_B_WINDOW	34
 #define	A_F_BRACKET	35
 #define	A_B_BRACKET	36
-#define	A_PIPE	37
+/* 37 unused */
 #define	A_INDEX_FILE	38
 #define	A_UNDO_SEARCH	39
 #define	A_FF_SCREEN	40
Index: command.c
===
RCS file: /cvs/src/usr.bin/less/command.c,v
retrieving revision 1.31
diff -u -p -r1.31 command.c
--- command.c 12 Jan 2017 20:32:01 - 1.31
+++ command.c 17 Dec 2017 12:23:00 -
@@ -241,12 +241,6 @@ exec_mca(void)
 		/* If tag structure is loaded then clean it up. */
 		cleantags();
 		break;
-	case A_PIPE:
-		if (secure)
-			break;
-		(void) pipe_mark(pipec, cbuf);
-		error("|done", NULL);
-		break;
 	}
 }
 
@@ -1396,35 +1390,6 @@ again:
 			c = getcc();
 			goto again;
 
-		case A_VISUAL:
-			/*
-			 * Invoke an editor on the input file.
-			 */
-			if (secure) {
-
Re: less(1): `!' command
> On Sat, 16 Dec 2017 19:39:27 +, Theo de Raadt wrote: > > > On Sat, 16 Dec 2017 18:13:16 +, Jiri B wrote: > > > > On Sat, Dec 16, 2017 at 04:55:44PM +, kshe wrote: > > > > > Hi, > > > > > > > > > > Would a patch to bring back the `!' command to less(1) be accepted? > > > > > The > > > > > commit message for its removal explains that ^Z should be used > > > > > instead, > > > > > but that obviously does not work if less(1) is run from something else > > > > > than an interactive shell, for example when reading manual pages from > > > > > a > > > > > vi(1) instance spawned directly by `xterm -e vi' in a window manager > > > > > or > > > > > by `neww vi' in a tmux(1) session. > > > > > > > > Why should less be able to spawn another programs? This would undermine > > > > all pledge work. > > > > > > Because of at least `v' and `|', less(1) already is able to invoke > > > arbitrary programs, and accordingly needs the "proc exec" promise, so > > > bringing `!' back would not change anything from a security perspective > > > (otherwise, I would obviously not have made such a proposition). > > > > > > In fact, technically, what I want to do is still currently possible: > > > from any less(1) instance, one may use `v' to invoke vi(1), and then use > > > vi(1)'s own `!' command as desired. So the functionality of `!' is > > > still there; it was only made more difficult to reach for no apparent > > > reason. > > > > No apparent reason? > > > > Good you have an opinion. I have a different opinion: We should look > > for rarely used functionality and gut it. > > I completely agree, and I also completely agree with the rest of what > you said. However, in this particular case, the functionality of `!' is > still fully (albeit indirectly) accessible, as shown above, and this is > why its deletion, when not immediately followed by that of `|' and `v', > made little sense for me. Oh, so you don't agree. Or do you. I can't tell. 
You haven't made up your mind enough to have a final position? > Either the commands that require "proc exec" should all be removed along > with that promise, or `!' should be brought back without any pledge(2) > modifications. That is pretty absolutist. The universe is not always consistent, and neither is OpenBSD. The final decisions haven't been made yet, because we haven't gauged the usage patterns. > But currently it really feels like a big waste (for both > parties) to request such high privileges, and then to do almost nothing > useful with them. Request? pledge isn't a "request" system. It is a second specification of the program about the maximum it believes it will use, and therefore it is a hard brake. At the moment the feature set still needs "proc exec". So the specification isn't a waste, it is accurate. > If the plan really was to get rid of all such commands eventually, what > exactly is preventing that from happening now? The plan was to get rid of ! in a few commands, then later get rid of a few more of them, and see where we end up. With such plans, we don't always act all in one step, because then it is too easy to get embroiled in just that one battle and forget about the other things which also need doing. Also it is impossible to ask the community because petty fights result and provide inaccurate usage assessments. There are many other things to do. As a result, our universe is not always consistent. This is an example. > May I go ahead and prepare a patch to remove "proc exec" entirely? Sure you could try, and see who freaks out. Exactly what the plan was all along.
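To make the "specification, not request" point concrete, here is a minimal C sketch of a program declaring its own ceiling. The promise strings, the macro names and the helpers choose_promises(), has_promise() and declare_ceiling() are all hypothetical and are not taken from less(1); only pledge(2) itself is a real interface, and it is called only when building on OpenBSD. Elsewhere the call is stubbed out and nothing is enforced.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#ifdef __OpenBSD__
#include <unistd.h>	/* pledge(2) */
#endif

/* Illustrative promise strings; NOT copied from less(1). */
#define PROMISES_WITH_ESCAPES	"stdio rpath tty proc exec"
#define PROMISES_VIEW_ONLY	"stdio rpath tty"

/*
 * The promise string follows from the feature set: as long as any
 * command can start another program, "proc" and "exec" must stay.
 */
static const char *
choose_promises(int can_run_programs)
{
	return can_run_programs ? PROMISES_WITH_ESCAPES : PROMISES_VIEW_ONLY;
}

/* Return 1 if the space-separated list contains exactly this promise. */
static int
has_promise(const char *promises, const char *name)
{
	size_t n = strlen(name);
	const char *p = promises;

	while (*p != '\0') {
		size_t len = strcspn(p, " ");

		if (len == n && strncmp(p, name, n) == 0)
			return 1;
		p += len;
		p += strspn(p, " ");
	}
	return 0;
}

/*
 * Declare the upper bound.  On OpenBSD, system calls outside the
 * promises are refused by the kernel after this call returns.
 */
static int
declare_ceiling(const char *promises)
{
	assert(promises != NULL);
#ifdef __OpenBSD__
	return pledge(promises, NULL);
#else
	(void)promises;		/* no pledge(2) here; nothing is enforced */
	return 0;
#endif
}
```

A program would call declare_ceiling(choose_promises(1)) once, early in main(). In this sketch, removing the last command that starts another program is what allows choose_promises(0) to be used, which is the kind of stricter promise discussed above.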
Re: less(1): `!' command
On Sat, 16 Dec 2017 19:39:27 +, Theo de Raadt wrote: > > On Sat, 16 Dec 2017 18:13:16 +, Jiri B wrote: > > > On Sat, Dec 16, 2017 at 04:55:44PM +, kshe wrote: > > > > Hi, > > > > > > > > Would a patch to bring back the `!' command to less(1) be accepted? The > > > > commit message for its removal explains that ^Z should be used instead, > > > > but that obviously does not work if less(1) is run from something else > > > > than an interactive shell, for example when reading manual pages from a > > > > vi(1) instance spawned directly by `xterm -e vi' in a window manager or > > > > by `neww vi' in a tmux(1) session. > > > > > > Why should less be able to spawn another programs? This would undermine > > > all pledge work. > > > > Because of at least `v' and `|', less(1) already is able to invoke > > arbitrary programs, and accordingly needs the "proc exec" promise, so > > bringing `!' back would not change anything from a security perspective > > (otherwise, I would obviously not have made such a proposition). > > > > In fact, technically, what I want to do is still currently possible: > > from any less(1) instance, one may use `v' to invoke vi(1), and then use > > vi(1)'s own `!' command as desired. So the functionality of `!' is > > still there; it was only made more difficult to reach for no apparent > > reason. > > No apparent reason? > > Good you have an opinion. I have a different opinion: We should look > for rarely used functionality and gut it. I completely agree, and I also completely agree with the rest of what you said. However, in this particular case, the functionality of `!' is still fully (albeit indirectly) accessible, as shown above, and this is why its deletion, when not immediately followed by that of `|' and `v', made little sense for me. Either the commands that require "proc exec" should all be removed along with that promise, or `!' should be brought back without any pledge(2) modifications. 
But currently it really feels like a big waste (for both parties) to request such high privileges, and then to do almost nothing useful with them. If the plan really was to get rid of all such commands eventually, what exactly is preventing that from happening now? May I go ahead and prepare a patch to remove "proc exec" entirely? Regards, kshe
Re: less(1): `!' command
> > Would a patch to bring back the `!' command to less(1) be accepted? The > > commit message for its removal explains that ^Z should be used instead, > > but that obviously does not work if less(1) is run from something else > > than an interactive shell, for example when reading manual pages from a > > vi(1) instance spawned directly by `xterm -e vi' in a window manager or > > by `neww vi' in a tmux(1) session. > > Why should less be able to spawn another programs? This would undermine > all pledge work. It does not undermine any pledge work at all. The strategy is reduction of "many programs have ways to break out to full system call operation, but why?" Fixing all of these concerns won't happen in a day. We are boiling this frog slowly.
Re: less(1): `!' command
> On Sat, 16 Dec 2017 18:13:16 +, Jiri B wrote: > > On Sat, Dec 16, 2017 at 04:55:44PM +, kshe wrote: > > > Hi, > > > > > > Would a patch to bring back the `!' command to less(1) be accepted? The > > > commit message for its removal explains that ^Z should be used instead, > > > but that obviously does not work if less(1) is run from something else > > > than an interactive shell, for example when reading manual pages from a > > > vi(1) instance spawned directly by `xterm -e vi' in a window manager or > > > by `neww vi' in a tmux(1) session. > > > > Why should less be able to spawn another programs? This would undermine > > all pledge work. > > Because of at least `v' and `|', less(1) already is able to invoke > arbitrary programs, and accordingly needs the "proc exec" promise, so > bringing `!' back would not change anything from a security perspective > (otherwise, I would obviously not have made such a proposition). > > In fact, technically, what I want to do is still currently possible: > from any less(1) instance, one may use `v' to invoke vi(1), and then use > vi(1)'s own `!' command as desired. So the functionality of `!' is > still there; it was only made more difficult to reach for no apparent > reason. No apparent reason? Good you have an opinion. I have a different opinion: We should look for rarely used functionality and gut it. Over the last 40 years people have felt a desire to add all possible features and options to all commands, and no one ever considered the impact of having all programs able to reach all system calls, and that these features are being installed in all program operating environments. Then someone adds less(1) to a script which requires security, and just like that it has none. 
The entire environment is poisoned, and people are pushed to jump to other environments which aren't poisoned in this way, until enough people arrive there, the feature explosion happens there also resulting in "reach all the system calls", and we're stuck in the same rut again. I don't think all programs should be able to run all other programs. As a result I support the idea of trying to find the things people don't actually use, and removing them incrementally. '|' should be on the list next. But you don't. Luckily you have other choices. Are you prepared to die on this hill that less must support '!'? If so, there's that FreeBSD hill over there..
Re: less(1): `!' command
On Sat, 16 Dec 2017 18:13:16 +, Jiri B wrote: > On Sat, Dec 16, 2017 at 04:55:44PM +, kshe wrote: > > Hi, > > > > Would a patch to bring back the `!' command to less(1) be accepted? The > > commit message for its removal explains that ^Z should be used instead, > > but that obviously does not work if less(1) is run from something else > > than an interactive shell, for example when reading manual pages from a > > vi(1) instance spawned directly by `xterm -e vi' in a window manager or > > by `neww vi' in a tmux(1) session. > > Why should less be able to spawn another programs? This would undermine > all pledge work. Because of at least `v' and `|', less(1) already is able to invoke arbitrary programs, and accordingly needs the "proc exec" promise, so bringing `!' back would not change anything from a security perspective (otherwise, I would obviously not have made such a proposition). In fact, technically, what I want to do is still currently possible: from any less(1) instance, one may use `v' to invoke vi(1), and then use vi(1)'s own `!' command as desired. So the functionality of `!' is still there; it was only made more difficult to reach for no apparent reason. Regards, kshe
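As a rough illustration of why any command that starts another program pulls in both promises, the following C sketch runs a program and waits for it. The helper run_program() is hypothetical and is not how less(1) implements `v' or `|'; it only shows the two steps involved, using standard POSIX calls. Under pledge(2), fork(2) falls under "proc" and execve(2) (reached here through execvp(3)) falls under "exec".

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from less): run argv[0] with the given
 * arguments and return its exit status, or -1 on failure.
 */
static int
run_program(char *const argv[])
{
	pid_t pid;
	int status;

	assert(argv != NULL && argv[0] != NULL);

	pid = fork();			/* needs the "proc" promise */
	if (pid == -1)
		return -1;
	if (pid == 0) {
		execvp(argv[0], argv);	/* needs the "exec" promise */
		_exit(127);		/* only reached if the exec failed */
	}

	/* Wait for the child, retrying if a signal interrupts the wait. */
	while (waitpid(pid, &status, 0) == -1) {
		if (errno != EINTR)
			return -1;
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

An editor command and a shell escape would both end up in a routine of this shape, which is why, in this simplified picture, removing only one of them leaves the required promises unchanged.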
Re: less(1): `!' command
On Sat, Dec 16, 2017 at 04:55:44PM +, kshe wrote: > Hi, > > Would a patch to bring back the `!' command to less(1) be accepted? The > commit message for its removal explains that ^Z should be used instead, > but that obviously does not work if less(1) is run from something else > than an interactive shell, for example when reading manual pages from a > vi(1) instance spawned directly by `xterm -e vi' in a window manager or > by `neww vi' in a tmux(1) session. Why should less be able to spawn other programs? This would undermine all pledge work. IIUC your vi scenario, you are not spawning 'vi' from less but the opposite way. That should work. j.
less(1): `!' command
Hi, Would a patch to bring back the `!' command to less(1) be accepted? The commit message for its removal explains that ^Z should be used instead, but that obviously does not work if less(1) is run from something else than an interactive shell, for example when reading manual pages from a vi(1) instance spawned directly by `xterm -e vi' in a window manager or by `neww vi' in a tmux(1) session. If not, then at least documentation for this command should be removed properly (I cannot provide a diff as this file contains raw backspace characters): $ cd /usr/src/usr.bin/less/ $ printf '99d\nwq\n' | ed - less.hlp Regards, kshe