Re: start cleaning up UTF-8 processing in less(1)
Hi, Stefan Sperling wrote on Sat, Feb 23, 2019 at 04:19:02PM +0100: > Your diff looks good to me. > And I can't see how it could make this situation any worse either. thanks for checking; i committed the first patch. To be able to continue with the less cleanup i started, i have to explain some theory first, and i'm kindly asking for advice which functionality is desired. Personally, i'd like to make less simpler and safer, removing some anachronistic, dangerous functionality, see below. Considerable complication in the less code results from support for https://en.wikipedia.org/wiki/ANSI_escape_code sequences. Note that i will *not* propose to delete that support completely, but only some more arcane and more dangerous parts of it, see at the very end of this mail. There are three ways to write an ANSI escape sequence; i'll have to explain all three in turn. First syntax: ESCAPE-BRACKET $ printf "normal \x1b[43m yellow \x1b[0m normal\n" Each ANSI escape sequence is introduced by the two-byte sequence consisting of the ASCII escape character followed by an ASCII opening square bracket. This form is relatively benign because it is both valid ASCII and valid UTF-8 and looks identical in both encodings. In our default configuration of xterm(1), which is moderately secure without being paranoid, the above command works out of the box, showing you the word "yellow" on a yellow background. It doesn't matter what LC_CTYPE is, it always works. By default, less(1) does *not* interpret such escape sequences, but escapes them and shows them as follows: normal ESC[43m yellow ESC[0m normal where the two instances of the string "ESC" are highlighted in reverse video. But if you run less(1) with the -R option, it passes the escape sequences on to the terminal, and again you see the word "yellow" on a yellow background, inside less(1). Horizontal scrolling works correctly, preserving vertical alignment of display columns, with the two escape sequences taking up zero columns each in an xterm(1). Your mileage may vary on exotic terminals, though. I'm conivinced this behaviour ought to be preserved. Second syntax: conventional 8-bit CSI - $ LC_CTYPE=C printf "normal \23343m yellow \2330m normal\n" Here, each ANSI escape sequence is introduced by the single byte 0x9b (CSI = control sequence introducer) instead of by the two-byte sequence ESC-[. Obviously, this implies that the second syntax is incompatible with UTF-8, so we always have to make sure we have LC_CTYPE=C set when doing anything with the second syntax. Note that i had to give the CSI bytes in octal in the printf command above; \x9b43m would not work because printf(1) would be unable to figure out where the hexadecimal character number ends. In our default xterm(1) configuration, this syntax is *NOT* supported. The above command prints the same as $ printf "normal \xef\xbf\27543m yellow \xef\xbf\2750m normal\n" where \xef\xbf\275 is the Unicode replacement character U+FFFD substituted for the (unsafe, ill-encoded) CSI. I you really want to, you can use the second syntax on OpenBSD, but that is *ABSOLUTELY NOT RECOMMENDED*. You would have to run xterm(1) in the unsafe, legacy, so called "conventional 8bit" mode as follows: $ LC_CTYPE=C xterm +lc # This is a really BAD idea. In such an xterm(1), the line $ LC_CTYPE=C printf "normal \23343m yellow \2330m normal\n" does show the word "yellow" in front of a yellow background. Reading the less(1) source code, i guess that it *intends* to support this, but it doesn't actually work for me: $ export LC_CTYPE=C $ printf "normal \23343m yellow \2330m normal\n" | less -R results in normal <9B>43m yellow <9B>0m normal for me, with the two strings "<9B>" highlighted in reverse video. Only with $ printf "normal \23343m yellow \2330m normal\n" | less -r do i see yellow background - but that is utterly useless because -r completely breaks column counting and wrapping in less(1), and besides it is irrelevent because -r doesn't need to inspect anything but simply passes *everything* through no matter what. I think it is better to completely remove support for the second syntax from less than to repair it. It is rarely used because it is incompatible with Unicode, it provides no advantages compared to the first syntax, but it painfully complicates the code. Third syntax: UTF-8 CSI --- $ printf "normal \xc2\23343m yellow \xc2\2330m normal\n" Here, each ANSI escape sequence is introduced by the Unicode character U+009B, represented in its two-byte UTF-8 encoding \xc2\x9b instead of by the two-byte sequence ESC-[. Obviously, this only mskes sense with a UTF-8 locale. However, the xterm documentation says in the file /usr/xenocara/app/xterm/ctlseqs.ms : It is not possible to use a C1 control obtained from decoding the UTF-8 text, because that would require reprocessing the data.
Re: start cleaning up UTF-8 processing in less(1)
Hi Todd, Todd C. Miller wrote on Mon, Feb 25, 2019 at 01:06:02PM -0700: > On Mon, 25 Feb 2019 19:43:36 +0100, Ingo Schwarze wrote: >> Todd C. Miller wrote on Mon, Feb 25, 2019 at 09:45:12AM -0700: >>> On Mon, 25 Feb 2019 12:39:41 +0100, Ingo Schwarze wrote: >>>> Index: line.c >> [...] >>>> @@ -469,11 +469,10 @@ in_ansi_esc_seq(void) >>>> * Search backwards for either an ESC (which means we ARE in a seq); >>>> * or an end char (which means we're NOT in a seq). >>>> */ >>>> - for (p = [curr]; p > linebuf; ) { >>>> - LWCHAR ch = step_char(, -1, linebuf); >>>> - if (IS_CSI_START(ch)) >>>> + for (p = linebuf + curr - 1; p >= linebuf; p--) { >>> Since curr can be 0, can this lead to be a single byte underflow? >> No, in that case (which logically means the line buffer is empty), >> the end condition p >= linebuf is false right away, the loop >> is never entered, the function returns 0 right away and at the >> call site, the first if brach (containing "curr--") isn't entered >> either. > Strictly speaking, the result of "p = linebuf + curr - 1" is undefined > when curr < 1. There is a special case in the standard when the > result is one past the end of an array but no corresponding case > for one element before the array. In practice, it is unlikely to > matter. Ouch, i keep forgetting that one, thanks for reminding me. In that case, let's just use an index rather than a pointer; diff otherwise unchanged. OK? Ingo Index: cvt.c === RCS file: /cvs/src/usr.bin/less/cvt.c,v retrieving revision 1.8 diff -u -p -r1.8 cvt.c --- cvt.c 17 Sep 2016 15:06:41 - 1.8 +++ cvt.c 26 Feb 2019 01:31:31 - @@ -77,7 +77,7 @@ cvt_text(char *odst, char *osrc, int *ch dst--; } while (dst > odst && !IS_ASCII_OCTET(*dst) && !IS_UTF8_LEAD(*dst)); - } else if ((ops & CVT_ANSI) && IS_CSI_START(ch)) { + } else if ((ops & CVT_ANSI) && ch == ESC) { /* Skip to end of ANSI escape sequence. */ src++; /* skip the CSI start char */ while (src < src_end) Index: filename.c === RCS file: /cvs/src/usr.bin/less/filename.c,v retrieving revision 1.26 diff -u -p -r1.26 filename.c --- filename.c 29 Oct 2017 17:10:55 - 1.26 +++ filename.c 26 Feb 2019 01:31:31 - @@ -348,7 +348,7 @@ bin_file(int f) pend = [n]; for (p = data; p < pend; ) { LWCHAR c = step_char(, +1, pend); - if (ctldisp == OPT_ONPLUS && IS_CSI_START(c)) { + if (ctldisp == OPT_ONPLUS && c == ESC) { do { c = step_char(, +1, pend); } while (p < pend && is_ansi_middle(c)); Index: less.h === RCS file: /cvs/src/usr.bin/less/less.h,v retrieving revision 1.28 diff -u -p -r1.28 less.h --- less.h 30 Dec 2018 23:09:58 - 1.28 +++ less.h 26 Feb 2019 01:31:31 - @@ -38,8 +38,6 @@ #defineIS_SPACE(c) isspace((unsigned char)(c)) #defineIS_DIGIT(c) isdigit((unsigned char)(c)) -#defineIS_CSI_START(c) (((LWCHAR)(c)) == ESC || (((LWCHAR)(c)) == CSI)) - #ifndef TRUE #defineTRUE1 #endif @@ -151,9 +149,7 @@ struct textlist { #defineAT_INDET(1 << 7) /* Indeterminate: either bold or underline */ #defineCONTROL(c) ((c)&037) - #defineESC CONTROL('[') -#defineCSI ((unsigned char)'\233') #defineS_INTERRUPT 01 #defineS_STOP 02 Index: line.c === RCS file: /cvs/src/usr.bin/less/line.c,v retrieving revision 1.23 diff -u -p -r1.23 line.c --- line.c 24 Feb 2019 04:54:36 - 1.23 +++ line.c 26 Feb 2019 01:31:31 - @@ -230,7 +230,7 @@ pshift(int shift) */ while (shifted <= shift && from < curr) { c = linebuf[from]; - if (ctldisp == OPT_ONPLUS && IS_CSI_START(c)) { + if (ctldisp == OPT_ONPLUS && c == ESC) { /* Keep cumulative effect. */ linebuf[to] = c; attr[to++] = attr[from++]; @@ -463,17 +463,16 @@ backc(void) static int in_ansi_esc_seq(void) { - char *p; + int i; /* * Search ba
less(1) UTF-8 cleanup: backc()
Hi, the first step in cleaning up do_append() is the subroutine backc(), which is only called from a single place, namely in do_append(). Its purpose is to back up over the previous positive-width character during the application of backspace-encoding for "bold" or "underlined". The task here is to get rid of the data type LWCHAR and to stop calling the bad function step_char(). I tried to figure out whether the content of linebuf[] is guaranteed to always be valid UTF-8, but i still don't feel completely convinced. For example, the function forw_raw_line() in line.c looks suspicious, as if it might potentially copy arbitrary, invalidly encoded data into linebuf[]. Consequently, i consider it safer to perform complete error handling in backc(), at least for now. The new code for the two backward iterations in backc() looks slightly repetive. But since in the first iteration (for ch), failure means there is no previous character whatsoever and we have to return failure, whereas for the second iteration (for prev_ch), failure only means that we need to call pwidth() without passing a previous character, error handling is different, so trying to unify both backward iterations would probably make the code more complicated rather than simpler. Note that while prev_ch failure must not prevent the width measurement of ch with pwidth(), if that returns 0 (i.e. when ch is a combining accent), iterating to the preceding character only makes sense when the preceding character is valid, so i'm adding "if (prev_ch == L'\0') return (0);". The lack of such a check in the old code may have been a bug, but i'm not completely sure it could actually be triggered in some way. Finally, when mbtowc(3) fails, portability requires to reset the internal state of mbtowc(3). That's a no-op on OpenBSD, and it is unlikely to actually be required for UTF-8 even on other systems because internal state is only supposed to be useful for stateful encodings, but i consider it good idiomatic coding to include the reset anyway. I'm also adding it to one place where i forgot about it in an earlier patch. Of course, it is only required for len == -1, but it does no harm for the (less likely) case of stray intervening continuation bytes (i + len < curr). OK? Ingo Index: line.c === RCS file: /cvs/src/usr.bin/less/line.c,v retrieving revision 1.26 diff -u -p -r1.26 line.c --- line.c 1 Mar 2019 16:41:37 - 1.26 +++ line.c 2 Mar 2019 11:54:41 - @@ -437,23 +437,51 @@ pwidth(wchar_t ch, int a, wchar_t prev_c static int backc(void) { - LWCHAR prev_ch; - char *p = linebuf + curr; - LWCHAR ch = step_char(, -1, linebuf + lmargin); - int width; + wchar_t ch, prev_ch; + int i, len, width; + + i = curr - 1; + if (utf_mode) { + while (i >= lmargin && IS_UTF8_TRAIL(linebuf[i])) + i--; + } + if (i < lmargin) + return (0); + if (utf_mode) { + len = mbtowc(, linebuf + i, curr - i); + if (len == -1 || i + len < curr) { + (void)mbtowc(NULL, NULL, MB_CUR_MAX); + return (0); + } + } else + ch = linebuf[i]; /* This assumes that there is no '\b' in linebuf. */ while (curr > lmargin && column > lmargin && (!(attr[curr - 1] & (AT_ANSI|AT_BINARY { - curr = p - linebuf; - prev_ch = step_char(, -1, linebuf + lmargin); + curr = i--; + if (utf_mode) { + while (i >= lmargin && IS_UTF8_TRAIL(linebuf[i])) + i--; + } + if (i < lmargin) + prev_ch = L'\0'; + else if (utf_mode) { + len = mbtowc(_ch, linebuf + i, curr - i); + if (len == -1 || i + len < curr) { + (void)mbtowc(NULL, NULL, MB_CUR_MAX); + prev_ch = L'\0'; + } + } else + prev_ch = linebuf[i]; width = pwidth(ch, attr[curr], prev_ch); column -= width; if (width > 0) return (1); + if (prev_ch == L'\0') + return (0); ch = prev_ch; } - return (0); } @@ -536,8 +564,10 @@ store_char(LWCHAR ch, char a, char *rep, break; if (i >= 0) { w = mbtowc(_ch, linebuf + i, curr - i); - if (w == -1 || i + w < curr) + if (w == -1 || i + w < curr) { + (void)mbtowc(NULL, NULL, MB_CUR_MAX);
Re: mandoc -Tlint systat.1 fix
Hi David, David Gwynne wrote on Mon, Mar 04, 2019 at 08:59:12PM +1000: > lint thinks uvm_swap_get() looks like a function name, > so this uses .Fn to mark it up as one. > > ok? Sure. Given that the function uvm_swap_get() appears to be important enough to be mentioned even in a userland manual page, would it maybe also make sense to document it in uvm(9)? Not sure, just asking. If so, then the line would become ".Xr uvm_swap_get 9" afterwards. Yours, Ingo > Index: systat.1 > === > RCS file: /cvs/src/usr.bin/systat/systat.1,v > retrieving revision 1.110 > diff -u -p -r1.110 systat.1 > --- systat.1 25 Jul 2018 17:24:14 - 1.110 > +++ systat.1 4 Mar 2019 10:58:15 - > @@ -697,7 +697,8 @@ swap pages in use > .It swpgonly > in use swap pages not in RAM > .It nswget > -fault called uvm_swap_get() > +fault called > +.Fn uvm_swap_get > .It nanon > total anon's > .Pp
Re: update ctype data to unicode 10
Hi Andrew, Lauri Tirkkonen wrote on Fri, Feb 22, 2019 at 01:57:01AM +0200: > Hi, the recent perl-5.28.1 and related unicore update brought the > unicode data from version 8.0.0 to version 10.0.0. That fixes some > character classifications (eg. emoji characters gained East_Asian_Width > value 'Wide', which causes them to correctly get a wcwidth() of 2). But > the ctype source data needs to be regenerated with this new perl/unicore > to gain the benefits. > > So I've done just that: > cd /usr/src/share/locale/ctype && ./gen_ctype_utf8.pl > en_US.UTF-8.src > and the resulting diff is below. You could obviously run this yourself - > I'm only including the diff because it took quite a long time to run the > script (177m08.01s real). > > The resulting LC_CTYPE generated from this new source gives a wcwidth() > of 2 to eg. U+1F3EE, as expected (it used to be 1 with unicode 8.0 > data). I looked through the diff (not checking each and every line thoroughly, but watching out for stuff that looks suspicious, and doing occasional random sampling tests), and i did not spot anything that looks wrong. Much of it is simply additions of new characters, which are unlikely to cause regressions in the first place, and some parts change width one to width two, which appears to be intended by what Lauri says above and which certainly isn't very dangerous either. So if the diff agrees with what you got, Andrew, it's OK schwarze@. Yours, Ingo
Re: update ctype data to unicode 10
Hi Andrew, Andrew Fresh wrote on Thu, Feb 21, 2019 at 08:22:16PM -0700: > On Fri, Feb 22, 2019 at 01:57:01AM +0200, Lauri Tirkkonen wrote: >> Hi, the recent perl-5.28.1 and related unicore update brought the >> unicode data from version 8.0.0 to version 10.0.0. That fixes some >> character classifications (eg. emoji characters gained East_Asian_Width >> value 'Wide', which causes them to correctly get a wcwidth() of 2). But >> the ctype source data needs to be regenerated with this new perl/unicore >> to gain the benefits. >> >> So I've done just that: >> cd /usr/src/share/locale/ctype && ./gen_ctype_utf8.pl > en_US.UTF-8.src >> and the resulting diff is below. You could obviously run this yourself - > I meant to do that and make sure it was OK with schwarze@, I'm certainly OK with the basic idea of doing an update in this way. > so it is OK afresh1@, I guess once we are confident that not just the idea, but the specific diff is OK, you'll not only get to OK it, but even to commit it, Andrew. :) > although I didn't compare your output to mine. Should be trivial to do, or would that cause any inconvenience? It seems like a free additional test to me, as another easy test to make sure that nothing unexpected went awry. >> I'm only including the diff because it took quite a long time to run the >> script (177m08.01s real). > There are a lot of unicode symbols. Someday if I get super bored I'll > write something to do it in parallel :-) I clearly prefer simplicity over performance in this respect. I'll now have a look at the diff itself to see whether anything looks suspicious. Yours, Ingo
Re: start cleaning up UTF-8 processing in less(1)
Hi, Nicholas Marriott wrote on Mon, Feb 25, 2019 at 10:14:03AM +: > Ingo Schwarze wrote: >> During the upcoming cleanup steps, let use retain full support for >> the first (ESC-[) syntax and lets us completely delete support for >> the second and third CSI syntaxes (single-byte CSI and UTF-8 >> single-character two-byte CSI). >> >> If you are OK with that plan, i'll send diffs implementing that. > Very little, if anything at all, uses 8-bit or UTF-8 CSI, > getting rid of it is a good idea. Thank you for all your feedback. So here is a patch doing that. The central part is dropping the definitions related to the 8-bit and the UTF-8 CSI from less.h, keeping only the definitions needed for ESC-[. In the future, these forms of the CSI will be treated exactly like other control characters: e.g., encoded for display. The rest of the diff mechanically resolves the IS_CSI_START() macro, replacing it by a simple "== ESC" test. Casting is needed at none of the places: the types involved are simply: '[' int (value 0x5b < 0x7f) 037 int (value 0x1f < 0x7f) -- apply & --> ESC int (value 0x1b < 0x7f) So there is no danger in comparing ESC using == or != with any of the types char, unsigned char (which are covered by the int range during promotion) or LWCHAR == unsigned long, in which case the (positve) constant ESC gets sign extended, which is value-preserving for a positive integer. The effects of the various chunks are, in detail: * cvt.c, cvt_text(): 8-bit and UTF-8 CSI are no longer recognized as ANSI escape sequences but copied just like any other control characters. * filename.c, bin_file(): 8-bit and UTF-8 CSI are no longer skipped but counted as binary characters just like any other binary characters. (change in behaviour confirmed by testing) * line.c, pshift(): No longer treat 8-bit and UTF-8 CSI sequences specially when shifting a line left. (shifting left and right was tested) * line.c, in_ansi_esc_seq(): No longer recognize 8-bit and UTF-8 CSI sequences. This allows simplifying the loop to iterate over bytes rather than over characters, getting rif of one LWCHAR variable and one call to the pesky function step_char(-1}. (it was tested that ESC-[ still works) * line.c, store_char(): No longer recognize 8-bit and UTF-8 CSI sequences. Again, minus one LWCHAR and minus one step_char(-1). * line.c, do_append(): No longer pass 8-bit and UTF-8 CSI unencoded, encode them like any other control character. This survived light testing. All code touched will be touched again in the upcoming cleanup because all these functions still contain step_char() or get_wchar(), but this step simplifies many of the upcoming cleanup steps. Already minus six lines of code and minus seven macro calls in this patch alone. OK? Ingo Index: cvt.c === RCS file: /cvs/src/usr.bin/less/cvt.c,v retrieving revision 1.8 diff -u -p -r1.8 cvt.c --- cvt.c 17 Sep 2016 15:06:41 - 1.8 +++ cvt.c 25 Feb 2019 11:25:38 - @@ -77,7 +77,7 @@ cvt_text(char *odst, char *osrc, int *ch dst--; } while (dst > odst && !IS_ASCII_OCTET(*dst) && !IS_UTF8_LEAD(*dst)); - } else if ((ops & CVT_ANSI) && IS_CSI_START(ch)) { + } else if ((ops & CVT_ANSI) && ch == ESC) { /* Skip to end of ANSI escape sequence. */ src++; /* skip the CSI start char */ while (src < src_end) Index: filename.c === RCS file: /cvs/src/usr.bin/less/filename.c,v retrieving revision 1.26 diff -u -p -r1.26 filename.c --- filename.c 29 Oct 2017 17:10:55 - 1.26 +++ filename.c 25 Feb 2019 11:25:38 - @@ -348,7 +348,7 @@ bin_file(int f) pend = [n]; for (p = data; p < pend; ) { LWCHAR c = step_char(, +1, pend); - if (ctldisp == OPT_ONPLUS && IS_CSI_START(c)) { + if (ctldisp == OPT_ONPLUS && c == ESC) { do { c = step_char(, +1, pend); } while (p < pend && is_ansi_middle(c)); Index: less.h === RCS file: /cvs/src/usr.bin/less/less.h,v retrieving revision 1.28 diff -u -p -r1.28 less.h --- less.h 30 Dec 2018 23:09:58 - 1.28 +++ less.h 25 Feb 2019 11:25:38 - @@ -38,8 +38,6 @@ #defineIS_SPACE(c) isspace((unsigned char)(c)) #defineIS_DIGIT(c) isdigit((unsigned char)(c)) -#defineIS_CSI_START(c) (((LWCHAR)(c)) == ESC || (((LWCHAR)(c
Re: nm(1): return on malloc error
Hi, Benjamin Baier wrote on Sat, Mar 02, 2019 at 10:10:40AM +0100: > On malloc error symtab is unmapped, so proceeding on will lead > to a NULL pointer dereference. > When malloc fails we should return like the MMAP case does. while i'm certainly not experienced with nm(1), this change looks correct to me. OK to commit? Note that returning implies that the program might attempt to process further files, which is of dubious value: it is likely to fail, too, and in the unlikely case of success, that's maybe even worse: the output from subsequent files might cause the user to miss the error message about the malloc failure... But given that mmap(2) failure already behaves like that, switching to just err(3) out on resource exhaustion looks like a larger change which i'm not planning to push for, even though it would make sense to me. Here is the patch again, in standard format. Yours Ingo Index: nm.c === RCS file: /cvs/src/usr.bin/nm/nm.c,v retrieving revision 1.53 diff -p -U8 -r1.53 nm.c --- nm.c27 Oct 2017 16:47:08 - 1.53 +++ nm.c3 Mar 2019 15:12:28 - @@ -376,16 +376,17 @@ show_symtab(off_t off, u_long len, const MMAP(symtab, len, PROT_READ, MAP_PRIVATE|MAP_FILE, fileno(fp), off); if (symtab == MAP_FAILED) return (1); namelen = sizeof(ar_head.ar_name); if ((p = malloc(sizeof(ar_head.ar_name))) == NULL) { warn("%s: malloc", name); MUNMAP(symtab, len); + return (1); } printf("\nArchive index:\n"); num = betoh32(*symtab); strtab = (char *)(symtab + num + 1); for (ps = symtab + 1; num--; ps++, strtab += strlen(strtab) + 1) { if (fseeko(fp, betoh32(*ps), SEEK_SET)) { warn("%s: fseeko", name);
Re: nm(1): return on malloc error
Hi, Benjamin Baier wrote on Sat, Mar 02, 2019 at 10:10:40AM +0100: > On malloc error symtab is unmapped, so proceeding on will lead > to a NULL pointer dereference. > When malloc fails we should return like the MMAP case does. Committed. Thanks for the patch (and to those who checked it). Ingo > Index: nm.c > === > RCS file: /cvs/src/usr.bin/nm/nm.c,v > retrieving revision 1.53 > diff -u -p -u -C10 -r1.53 nm.c > *** nm.c 27 Oct 2017 16:47:08 - 1.53 > --- nm.c 20 Feb 2019 17:34:01 - > *** show_symtab(off_t off, u_long len, const > *** 374,393 > --- 374,394 > restore = ftello(fp); > > MMAP(symtab, len, PROT_READ, MAP_PRIVATE|MAP_FILE, fileno(fp), off); > if (symtab == MAP_FAILED) > return (1); > > namelen = sizeof(ar_head.ar_name); > if ((p = malloc(sizeof(ar_head.ar_name))) == NULL) { > warn("%s: malloc", name); > MUNMAP(symtab, len); > + return (1); > } > > printf("\nArchive index:\n"); > num = betoh32(*symtab); > strtab = (char *)(symtab + num + 1); > for (ps = symtab + 1; num--; ps++, strtab += strlen(strtab) + 1) { > if (fseeko(fp, betoh32(*ps), SEEK_SET)) { > warn("%s: fseeko", name); > rval = 1; > break; >
less(1) UTF-8 cleanup: store_char()
Hi, Nicholas Marriott wrote on Tue, Feb 26, 2019 at 07:20:47AM +: > Looks good, ok nicm Thanks to all of you for checking. Here is the next step, slowly coming closer to the meat of the matter: Start cleaning up the function store_char(), which takes a wide character, receiving both the Unicode codepoint "LWCHAR ch" and the UTF-8 multibyte representation "char *rep" as (redundant) input, behaves specially when ch is part of an ESC[...m ANSI sequence, otherwise measures the display width of ch, checks that it still fits on the screen, copies the multibyte representation to the line buffer, and advances the global variables "curr" (byte position) and "column" (visual display position). This is the only function calling in_ansi_esc_seq(), and having that function separate is actually counter-productive: if an escape sequence is found, then store_char() currently does the same backward iteration again if the sequence is incomplete and needs to be deleted. Consequently, merging in_ansi_esc_seq() into store_char() simplifies the code. If ch neither continues an ANSI sequence begun earlier nor starts a new one, the code needs to find the previous character in order to measure the width of the current one with pwidth(): the current one might be BACKSPACE. Currently, this backward iteration is done with the horrific, buggy function step_char(-1) which we aim to delete. Instead, iterate backwards until we find something that is not a UTF-8 continuation byte (i.e. which is an ASCII byte or a UTF-8 start byte) then decode the character with the standard function mbtowc(3). For extra safety, let's not only check that the previous character is valid (mbtowc() != -1) but also that there are no stray continuation bytes between the previous character and the character currently being added (i + w == curr). Maybe invalid UTF-8 can't actually make it into the line buffer, but let's better play it safe. If there is no previous character on the line, if it is invalid, or if there are stray continuation bytes after it, use a dummy blank, which means that in case of doubt, we assume that a backspace backs up by one visual column. Note that the function still contains a call to the buggy function utf_len() that we want to delete. Ultimately, most instances of utf_len() will be replaced with the standard function mblen(3), but we can't do that right now in store_char(). At this point, the function store_char() may still receive bogus characters incorrectly parsed with get_wchar(), and handling these at least consistently (even though not correctly) requires sticking with utf_len() for now. I tested that emboldening with "c\bc" and underlining with "c\b_" still works, even when "c" is a width two East Asian character, and that behaviour is sane with intervening stray continuation bytes. OK? Ingo Index: line.c === RCS file: /cvs/src/usr.bin/less/line.c,v retrieving revision 1.24 diff -u -p -r1.24 line.c --- line.c 26 Feb 2019 11:01:54 - 1.24 +++ line.c 26 Feb 2019 12:26:42 - @@ -458,27 +458,6 @@ backc(void) } /* - * Are we currently within a recognized ANSI escape sequence? - */ -static int -in_ansi_esc_seq(void) -{ - int i; - - /* -* Search backwards for either an ESC (which means we ARE in a seq); -* or an end char (which means we're NOT in a seq). -*/ - for (i = curr - 1; i >= 0; i--) { - if (linebuf[i] == ESC) - return (1); - if (!is_ansi_middle(linebuf[i])) - return (0); - } - return (0); -} - -/* * Is a character the end of an ANSI escape sequence? */ int @@ -512,6 +491,7 @@ is_ansi_middle(LWCHAR ch) static int store_char(LWCHAR ch, char a, char *rep, off_t pos) { + int i; int w; int replen; char cs; @@ -529,22 +509,43 @@ store_char(LWCHAR ch, char a, char *rep, } } - if (ctldisp == OPT_ONPLUS && in_ansi_esc_seq()) { - if (!is_ansi_end(ch) && !is_ansi_middle(ch)) { + w = -1; + if (ctldisp == OPT_ONPLUS) { + /* +* Set i to the beginning of an ANSI escape sequence +* that was begun and not yet ended, or to -1 otherwise. +*/ + for (i = curr - 1; i >= 0; i--) { + if (linebuf[i] == ESC) + break; + if (!is_ansi_middle(linebuf[i])) + i = 0; + } + if (i >= 0 && !is_ansi_end(ch) && !is_ansi_middle(ch)) { /* Remove whole unrecognized sequence. */ - do { - curr--; - } while (curr > 0 && linebuf[curr] != ESC); + curr = i; return (0); } -
Re: archaic quotes in calendar
Hi, Anthony J. Bentley wrote on Wed, Jan 30, 2019 at 04:01:30AM -0700: > Jason McIntyre writes: >> in the man page you have used Sq. that will make it mark up the same as >> it already does: >> >> (`*') > There's a difference: Ted uses a UTF-8 locale where Sq shows up as > pretty Unicode quotes. Literally typing `' doesn't (not in the terminal, > anyway). > > mandoc should render Sq as ' ' in ASCII mode. We fixed Dq the same way > in both mandoc and groff a couple of years ago (`` '' to " "). To finally conclude this thread: after a thorough discussion on the list , this patch, and the corresponding patch to groff, has been rejected. While some liked our arguments of making output nicer with Unicode-copatible fonts and avoiding ASCII usage that conflicts with Unicode, no consensus could be reached because substantial objections were raised, too. The main objection is that using 0x60 for the opening single quote (as an alternative to using it for a grave accent) and 0x27 for the closing single quote (as an alternative to using it for an apostrophe), according to the latest version of the ASCII standard, is a valid way to use the ASCII character encoding and has been valid since these bytes were frist assigned meaning in the 1965 edition of ASCII - even though this usage is no longer compatible with ISO-LATIN nor with Unicode. Moreover, it is the traditional way in which *roff used ASCII, it is compatible with the way many traditional terminals (including DEC VT-100 and VT-220) display these ASCII characters, and several groff developers are still using fonts for ASCII display that display ` ' as symmetric single quotes. Even our own Gallant console font does that. Several groff developers dislike the idea of changing established groff behaviour, even for the somewhat old-fashioned ASCII output device, unless there are strong technical reasons, and i had to concede that this change would be more of the "nice to have" kind and isn't strongly required for anything. The case Dq -> ``/" is substantially different because in ASCII, double quotes - both opening and closing - were always clearly defined as ". Consequently, people using a modern Unicode-compatible font for ASCII output will have to live with the slightly ugly single quotes, likely for good. Needless to say, if you can use the -T utf8 output device, for example by running with LC_CTYPE=en_US.UTF-8, the issue doesn't occur in the first place, and you already get nice single quotes right now, without needing any code changes. Yours, Ingo > Index: usr.bin/mandoc/chars.c > === > RCS file: /cvs/src/usr.bin/mandoc/chars.c,v > retrieving revision 1.48 > diff -u -p -r1.48 chars.c > --- usr.bin/mandoc/chars.c15 Dec 2018 19:30:19 - 1.48 > +++ usr.bin/mandoc/chars.c30 Jan 2019 10:55:08 - > @@ -100,7 +100,7 @@ static struct ln lines[] = { > { "rq", "\"", 0x201d }, > { "Lq", "\"", 0x201c }, > { "Rq", "\"", 0x201d }, > - { "oq", "`",0x2018 }, > + { "oq", "\'", 0x2018 }, > { "cq", "\'", 0x2019 }, > { "aq", "\'", 0x0027 }, > { "dq", "\"", 0x0022 },
start cleaning up UTF-8 processing in less(1)
Hi, Evan Silberman, by posting a well-founded bug report to bugs@, just drew my attention to the pigsty of having a hand-rolled re-implemention of Unicode character property handling within less(1). The root of the evil is the existence of the custom definition typedef unsigned long LWCHAR; in the file less.h, instead of simply using wchar_t where needed. My first thought was "maybe they are doing that to store Unicode codepoints portably on platforms where the values stored in wchar_t are not Unicode codepoints" - but it turns out less(1) is *already* broken on such platforms because the file cvt.c, function cvt_text(), already calls iswupper(3) and towlower(3) on LWCHAR values, i.e. on Unicode codepoints. And some other functions are even worse than that: * search.c, is_ucase() calls isupper(3) - without *w*! - on an LWCHAR value, which makes no sense whatsoever. * charset.c, control_char(), calls iscntrl((unsigned char)c) on an LWCHAR value - which is absurd to the point of not even being funny any longer, rather making you cry. When i first saw that, i was struck with incredulity, but it is really truncating valid codepoints to their lowest eight bits and then handing those meaningless bits to iscntrl(3)... Of course it would be possible to root out all the insanity in one bold step. But it would be a very large diff, prone to introducing bugs, hard to review, so i decided against that route. Instead, for the moment, let us assume that wchar_t always stores Unicode codepoints and is always contained within the "unsigned long" range. Both will always remain true on OpenBSD. Making these assumptions allows me to proceed step by step, with small, focussed diffs that will be easy to argue about. The end goal is to get rid of the LWCHAR data type and to delete most of the charset.c file, maybe even all of it, without adding any new code. Once that goal is reached, i expect our less(1) will be even *more* portable than it is now, because once that is reached, everything that requires Unicode character properties will be done with wchar_t, which is likely to just work even on platforms where wchar_t uses some arcane format instead of simply storing Unicode codepoints. Pursuing that goal, i first tried to proceed bottom-up, i.e. by starting to first replace the lowest-level functions like get_wchar() and step_char(), and then work upwards. But that failed miserably. What these low-level functions are doing is generally so wrong, so absurd, that it is unreasonable to simply replace them: the internal APIs they define are practically unusable by the callers. Instead, all call sites have to be inspected one by one, figuring out what the code *should* do at the call sites and then implement that with standard functions like mbtowc(3) and iswprint(3). I combed through the less source code and identified 35 functions in seven files that have to be partially rewritten, totally rewritten, or deleted because what they are doing is wrong, in most cases completely absurd. I don't see any alternative to tackling them one by one. I decided to start with the file line.c because it is central to the purpose of less: displaying lines of text to the screen. To quickly get to the important function store_char(), which intends to save a character into the line output buffer, i first have to clean up the function pwidth() that it uses. The function pwidth() intends to measure the display width of a character, i.e. how many columns it takes up on the screen: 0, 1, or 2. So it has to become a shallow wrapper around wcwidth(3). While i doubt the wisdom of the addition of display widths of ANSI escape sequences that it is doing at the end - on any sane terminal, such sequences should never take up any additional columns - i decided to stay focussed on UTF-8 issues and just leave the code related to attributes unchanged for now. Here is the benfit of the following diff: it gets rid of - two LWCHAR function arguments - two calls to is_wide_char() which uses an outdated table - one call to is_composing_char() which uses an outdated table - one call to is_combining_char() which uses an outdated table - one call to control_char() which is totally broken - one call to is_ascii_char() which is a trivial function - two tests of the ideosyncratic "utf_mode" variable None of the horrific features listed above can be deleted just yet because they are also in use at other places in less. Of course, the diff *does* change behaviour: character display widths (for the purpose at hand) get updated to the latest version of Unicode in our tree, and the effects of any bugs contained in the functions listed above silently vanish. That should only make things better and not cause regressions. I didn't check whether no longer calling control_char() here fixes some bugs; that is quite possible, but depends on which arguments pwidth() is called with, which i didn't try to research. The new version
less(1) UTF-8 bugfix and cleanup: search.c
Hi, the following is a very simple patch to completely clean up the file less/search.c with respect to UTF-8 handling. It also fixes an outright bug: Searching for uppercase UTF-8 characters currently doesn't work because passing a Unicode codepoint (in this case, the "ch" retrieved with step_char()) to isupper(3) is just totally wrong. The new loop is fairly standard. Invalid bytes are simply skipped. OK? Ingo P.S. I'm sending this even though my pappend() patch https://marc.info/?l=openbsd-tech=155249735725712 is still looking for review. Both are touching different files and are independent of each other. The cleanup is now maybe about half done with only one or two functions remaining in the most affected file line.c, but the files cmdbuf.c, cvt.c, and filename.c still remain, so i might have to speed up a bit to get it done before release comes too close. Index: search.c === RCS file: /cvs/src/usr.bin/less/search.c,v retrieving revision 1.19 diff -u -p -r1.19 search.c --- search.c2 Aug 2017 19:35:57 - 1.19 +++ search.c14 Mar 2019 13:48:59 - @@ -75,12 +75,14 @@ static struct pattern_info filter_info; static int is_ucase(char *str) { - char *str_end = str + strlen(str); - LWCHAR ch; + wchar_t ch; + int len; - while (str < str_end) { - ch = step_char(, +1, str_end); - if (isupper(ch)) + for (; *str != '\0"; str += len) { + if ((len = mbtowc(, str, MB_CUR_MAX)) == -1) { + mbtowc(NULL, NULL, MB_CUR_MAX); + len = 1; + } else if (iswupper(ch)) return (1); } return (0);
Re: less(1) UTF-8 bugfix and cleanup: search.c
Hi Stefan, Stefan Sperling wrote on Thu, Mar 14, 2019 at 04:11:00PM +0100: > Yes, OK. Thanks for checking. Here is the next step outside line.c, completely cleaning up the file less/filename.c with respect to UTF-8 handling. The loop needed is very similar to the one in search.c except that the case data[i] == '\0' can occur here, resulting in a 0 return value from mbtowc(3), which has to be counted as a binary character. Note that the patch does change behaviour even for LC_CTYPE=C. Currently, arcane ASCII (C0) control characters like 0x01 SOH, 0x02 STX, and so on are not counted as binary, but i think they should. Currently, with LC_CTYPE=C, only bytes with the high bit set are counted as binary. The following patch counts everything as binary except: * printable characters * whitespace characters * the 0x08 backspace character * and (with -R only) the 0x1b escape character There is no need to handle the ASCII and UTF-8 cases separately. With LC_CTYPE=C, the new code will advance byte by byte and count non-ASCII bytes as binary as it should. The third (n - i) argument to mbtowc(3) prevents buffer overruns; no harm is done when n - i exceeds MB_CUR_MAX. The additional, inner loop on escape sequences has no effect and can be deleted. The function is_ansi_middle() only returns true for printable ASCII characters, and for those, the outer loop is already good enough. Neither the old nor the new code tries to check whether the escape sequence ends in a valid or invalid manner. One could count the ESC as binary for invalid sequences - but it is not quite clear which sequences should be considered valid and i don't think this minor detail is worth adding any complexity. The function only provides crude heuristics in the first place. This patch cleans up the only caller of the bad function binary_char(), which can consequently be deleted. While here, use ssize_t for the return value of read(2) rather than int, even though it isn't a bug since the nbytes argument is always the constant 256. And avoid an idiom that relies on undefined behaviour by moving p out of the buffer on read(2) failure; the latter change does the job more portably with one variable less on the stack. OK? Ingo Index: filename.c === RCS file: /cvs/src/usr.bin/less/filename.c,v retrieving revision 1.27 diff -u -p -r1.27 filename.c --- filename.c 26 Feb 2019 11:01:54 - 1.27 +++ filename.c 14 Mar 2019 18:15:54 - @@ -334,25 +334,25 @@ fcomplete(char *s) int bin_file(int f) { - int n; - int bin_count = 0; char data[256]; - char *p; - char *pend; + ssize_t i, n; + wchar_t ch; + int bin_count, len; if (!seekable(f)) return (0); if (lseek(f, (off_t)0, SEEK_SET) == (off_t)-1) return (0); n = read(f, data, sizeof (data)); - pend = [n]; - for (p = data; p < pend; ) { - LWCHAR c = step_char(, +1, pend); - if (ctldisp == OPT_ONPLUS && c == ESC) { - do { - c = step_char(, +1, pend); - } while (p < pend && is_ansi_middle(c)); - } else if (binary_char(c)) + bin_count = 0; + for (i = 0; i < n; i += len) { + len = mbtowc(, data + i, n - i); + if (len <= 0) { + bin_count++; + len = 1; + } else if (iswprint(ch) == 0 && iswspace(ch) == 0 && + data[i] != '\b' && + (ctldisp != OPT_ONPLUS || data[i] != ESC)) bin_count++; } /* Index: charset.c === RCS file: /cvs/src/usr.bin/less/charset.c,v retrieving revision 1.21 diff -u -p -r1.21 charset.c --- charset.c 20 Apr 2017 21:23:16 - 1.21 +++ charset.c 14 Mar 2019 18:15:54 - @@ -146,18 +146,6 @@ init_charset(void) } /* - * Is a given character a "binary" character? - */ -int -binary_char(LWCHAR c) -{ - if (utf_mode) - return (is_ubin_char(c)); - c &= 0377; - return (!isprint((unsigned char)c) && !iscntrl((unsigned char)c)); -} - -/* * Is a given character a "control" character? */ int Index: funcs.h === RCS file: /cvs/src/usr.bin/less/funcs.h,v retrieving revision 1.20 diff -u -p -r1.20 funcs.h --- funcs.h 1 Mar 2019 14:31:34 - 1.20 +++ funcs.h 14 Mar 2019 18:15:54 - @@ -56,7 +56,6 @@ void ch_init(int, int); void ch_close(void); int ch_getflags(void); void init_charset(void); -int binary_char(LWCHAR); int control_char(LWCHAR); char *prchar(LWCHAR); char *prutfchar(LWCHAR);
Re: mount(8): remove unnecessary white space in man page
Hi Alessandro, Alessandro Gallo wrote on Sun, Mar 10, 2019 at 02:12:21PM +0100: > There seems to be an unneeded extra space in the mount(8) man page. > The following diff fixes that: Committed, thanks. Ingo > Index: mount.8 > === > RCS file: /cvs/src/sbin/mount/mount.8,v > retrieving revision 1.89 > diff -u -p -u -p -r1.89 mount.8 > --- mount.8 18 Jan 2018 08:57:12 - 1.89 > +++ mount.8 10 Mar 2019 13:07:06 - > @@ -318,7 +318,7 @@ command: > # mount -a -t nonfs,mfs > .Ed > .Pp > -mounts all file systems except those of type NFS and MFS . > +mounts all file systems except those of type NFS and MFS. > .Pp > .Nm > will attempt to execute a program in > > Thank you, > Alessandro >
less(1) UTF-8 cleanup: pappend()
Hi Todd, Todd C. Miller wrote on Tue, Mar 12, 2019 at 09:14:43AM -0600: > This looks like an improvement to me. OK millert@ Thanks for checking the do_append() patch, i committed it. The next step is to clean up pappend(), which reads one byte of a multibyte character per call and outputs the multibyte character once it is complete. As long as the character is not yet complete, the bytes are cached in the static buffer mbc_buf; the current number of bytes already cached is mbc_buf_index, the byte position of the start byte is mbc_pos. The static variable mbc_buf_len stores the number of bytes expected in the current character according to the start byte. When using standard functions, this information is irrelevant, so i'm deleting the variable, which simplifies global state. The proper tool to distinguish valid, incomplete, and invalid multibyte characters is mbrtowc(3). The simpler and generally more recommended function mbtowc(3) which i used in earlier patches cannot be used for this purpose because it does not distinguish incomplete from invalid sequences. The single call to mbrtowc(3) replaces one call of each of the following functions and macros that we want to get rid of: IS_ASCII_OCTET(), IS_UTF8_LEAD(), utf_len(), IS_UTF8_TRAIL(), and is_utf8_well_formed(). Some additional observations: * A goto is replaced by a proper loop. * In case of an invalid start byte, calling store_prchar() directly is simpler than the detour via flush_mbc_buf(). * When the line overflows, rather than first storing the number of bytes to back up in mbc_buf_index, then moving it to r at the end of the function, it is simpler to store it into r right away. * The patch replaces 11 lines of code with 10 lines of comments. :-) Tests done: * ASCII CR bytes are correctly shown encoded as ^M after ASCII bytes, after valid UTF-8 characters, and when interrupting incomplete UTF-8 characters. * Incomplete UTF-8 characters are correctly shown in encoded form when interrupted by ASCII characters, by valid UTF-8 characters, or by invalid bytes. * Width 2 Unicode characters correctly wrap to the next line when their left half would be in the last column of the screen. * Encoded representations of invalid bytes and of valid, but non-printable UTF-8 characters correctly wrap to the next line in their entirety unless all their output bytes still fit on the current screen output line. * All tests were also performed with LC_CTYPE=C. OK? Ingo Index: line.c === RCS file: /cvs/src/usr.bin/less/line.c,v retrieving revision 1.28 diff -u -p -r1.28 line.c --- line.c 13 Mar 2019 11:22:56 - 1.28 +++ line.c 13 Mar 2019 16:07:35 - @@ -64,7 +64,6 @@ extern off_t start_attnpos; extern off_t end_attnpos; static char mbc_buf[MAX_UTF_CHAR_LEN]; -static int mbc_buf_len = 0; static int mbc_buf_index = 0; static off_t mbc_pos; @@ -129,7 +128,6 @@ prewind(void) column = 0; cshift = 0; overstrike = 0; - mbc_buf_len = 0; is_null_line = 0; pendc = '\0'; lmargin = 0; @@ -683,6 +681,9 @@ flush_mbc_buf(off_t pos) int pappend(char c, off_t pos) { + mbstate_t mbs; + size_t sz; + wchar_t ch; int r; if (pendc) { @@ -696,13 +697,12 @@ pappend(char c, off_t pos) } if (c == '\r' && bs_mode == BS_SPECIAL) { - if (mbc_buf_len > 0) /* utf_mode must be on. */ { + if (mbc_buf_index > 0) /* utf_mode must be on. */ { /* Flush incomplete (truncated) sequence. */ r = flush_mbc_buf(mbc_pos); - mbc_buf_index = r + 1; - mbc_buf_len = 0; + mbc_buf_index = 0; if (r) - return (mbc_buf_index); + return (r + 1); } /* @@ -718,40 +718,43 @@ pappend(char c, off_t pos) if (!utf_mode) { r = do_append((LWCHAR) c, NULL, pos); } else { - /* Perform strict validation in all possible cases. */ - if (mbc_buf_len == 0) { -retry: - mbc_buf_index = 1; - *mbc_buf = c; - if (IS_ASCII_OCTET(c)) { - r = do_append((LWCHAR) c, NULL, pos); - } else if (IS_UTF8_LEAD(c)) { - mbc_buf_len = utf_len(c); + for (;;) { + if (mbc_buf_index == 0) mbc_pos = pos; - return (0); - } else { - /* UTF8_INVALID or stray UTF8_TRAIL */ - r = flush_mbc_buf(pos); - } - } else if
Re: xman.man: typo in mandesc section
Hi Alejandro, we normally avoid trivial changes to X11 documentation. Such changes provide little benefit but make life harder for matthieu@. If the typo still exists upstream, please report it upstream, and it will automatically appear in OpenBSD with the next regular update. Yours, Ingo Alejandro G. Peregrina wrote on Fri, Mar 08, 2019 at 01:04:10PM +0100: > Index: xman.man > === > RCS file: /cvs/xenocara/app/xman/man/xman.man,v > retrieving revision 1.2 > diff -u -p -u -r1.2 xman.man > --- xman.man 28 Sep 2013 16:23:02 - 1.2 > +++ xman.man 8 Mar 2019 11:34:51 - > @@ -152,7 +152,7 @@ n(n) New > o(o) Old > > .fi > -Xman will read any section that is of the from \fIman\fP, where > +Xman will read any section that is of the form \fIman\fP, where > is an upper or lower case letter (they are treated distinctly) or > a numeral (0-9). Be warned, however, that man(__appmansuffix__) and > catman(__adminmansuffix__) will not search directories that are non-standard.
Re: xterm and wcwidth()
Hi, Lauri Tirkkonen wrote on Fri, Mar 08, 2019 at 12:43:11PM +0200: > I feel like xterm should just use the system wcwidth() to avoid these > mismatches, so rudimentary diff to do that below. Absolutely, i strongly agree with that sentiment. Having a local character width table in an application program is just wrong. Archaic operating systems like Oracle Solaris 11 that have broken wcwidth(3) implementations should either fix their C library or go to hell. I also agree with the details of the diff. It is minimal in so far as it changes only one file, keeping the annoyance minimal when merging new xterm releases. The function systemWcwidthOk() is horrific enough that deleting it outright makes sense, to make sure that it can never get called by accident. Note that the command line options -cjk_width and -mk_width and the related resources remain broken. But i'd say people using them get what they deserve. The disruption to updates that would be caused by excising them from multiple files is probably better avoided. Unchanged diff left in place below for easy reference. OK to commit? Ingo diff --git a/app/xterm/util.c b/app/xterm/util.c index a3de4a005..9b6443683 100644 --- a/app/xterm/util.c +++ b/app/xterm/util.c @@ -4980,61 +4980,6 @@ decode_keyboard_type(XtermWidget xw, XTERM_RESOURCE * rp) } #if OPT_WIDE_CHARS -#if defined(HAVE_WCHAR_H) && defined(HAVE_WCWIDTH) -/* - * If xterm is running in a UTF-8 locale, it is still possible to encounter - * old runtime configurations which yield incomplete or inaccurate data. - */ -static Bool -systemWcwidthOk(int samplesize, int samplepass) -{ -wchar_t n; -int oops = 0; - -for (n = 21; n <= 25; ++n) { - wchar_t code = (wchar_t) dec2ucs(NULL, (unsigned) n); - int system_code = wcwidth(code); - int intern_code = mk_wcwidth(code); - - /* -* Solaris 10 wcwidth() returns "2" for all of the line-drawing (page -* 0x2500) and most of the geometric shapes (a few are excluded, just -* to make it more difficult to use). Do a sanity check to avoid using -* it. -*/ - if ((system_code < 0 && intern_code >= 1) - || (system_code >= 0 && intern_code != system_code)) { - TRACE(("systemWcwidthOk: broken system line-drawing wcwidth\n")); - oops += (samplepass + 1); - break; - } -} - -for (n = 0; n < (wchar_t) samplesize; ++n) { - int system_code = wcwidth(n); - int intern_code = mk_wcwidth(n); - - /* -* When this check was originally implemented, there were few if any -* libraries with full Unicode coverage. Time passes, and it is -* possible to make a full comparison of the BMP. There are some -* differences: mk_wcwidth() marks some codes as combining and some -* as single-width, differing from GNU libc. -*/ - if ((system_code < 0 && intern_code >= 1) - || (system_code >= 0 && intern_code != system_code)) { - TRACE((".. width(U+%04X) = %d, expected %d\n", - (unsigned) n, system_code, intern_code)); - if (++oops > samplepass) - break; - } -} -TRACE(("systemWcwidthOk: %d/%d mismatches, allowed %d\n", - oops, (int) n, samplepass)); -return (oops <= samplepass); -} -#endif /* HAVE_WCWIDTH */ - void decode_wcwidth(XtermWidget xw) { @@ -5045,8 +4990,7 @@ decode_wcwidth(XtermWidget xw) switch (mode) { default: #if defined(HAVE_WCHAR_H) && defined(HAVE_WCWIDTH) - if (xtermEnvUTF8() && - systemWcwidthOk(xw->misc.mk_samplesize, xw->misc.mk_samplepass)) { + if (xtermEnvUTF8()) { my_wcwidth = wcwidth; TRACE(("using system wcwidth() function\n")); break;
Re: xterm and wcwidth()
Hi, Ted Unangst wrote on Fri, Mar 08, 2019 at 03:24:56PM -0500: > Matthieu Herrb wrote: >> I would prefer a diff that just add a &&!defined(__OpenBSD__) to the >> condition before the definition of systemWcwidthOk(). This will cause >> less risk of conflicts in future updates and clearly show the >> intention. > If you prefer that, I would suggest the following to avoid patching multiple > places. Just short circuit the function. Makes sense, i committed that version. Thanks for reporting and for all the feedback, Ingo >>> #if OPT_WIDE_CHARS >>> -#if defined(HAVE_WCHAR_H) && defined(HAVE_WCWIDTH) >>> -/* >>> - * If xterm is running in a UTF-8 locale, it is still possible to encounter >>> - * old runtime configurations which yield incomplete or inaccurate data. >>> - */ >>> -static Bool >>> -systemWcwidthOk(int samplesize, int samplepass) >>> -{ >>> -wchar_t n; >>> -int oops = 0; > #ifdef __OpenBSD__ > return 1; > #endif >>> - >>> -for (n = 21; n <= 25; ++n) { >>> - wchar_t code = (wchar_t) dec2ucs(NULL, (unsigned) n); >>> - int system_code = wcwidth(code); >>> - int intern_code = mk_wcwidth(code);
Re: xterm and wcwidth()
Hi, Ted Unangst wrote on Fri, Mar 08, 2019 at 12:57:17PM -0500: > Lauri Tirkkonen wrote: >> in other words, xterm is (and was) using its own idea of how many >> columns characters take up. The way this manifested itself to me was >> that I received some email that contained emoji characters in the >> subject, and they look fine in mutt when I use another terminal (st on >> OpenBSD, or termux on my Android phone), but on 'xterm -fa "DejaVu Sans >> Mono"' on OpenBSD, only the left half of these characters is rendered, >> and mutt's index background color indicating the selection doesn't reach >> the right edge of the screen: libc wcwidth() returns 2, but xterm's idea >> seems to be 1 column. > Sigh, here we go again. Some systems have a bad function, so let's replace it > with a slightly better version, and then never update it. If there are bugs in > the libc wcwidth function, we'd very much like to know and fix them, not have > these workarounds. I'm counting that as OK tedu@, but i'll wait a bit before committing in case matthieu@ (or someone else) wants to speak up. Yours, Ingo
Re: EC_POINT_new.3 glitch
Hi Jan, Jan Stary wrote on Mon, Mar 18, 2019 at 08:38:04AM +0100: > This seems to be a missed newline. Committed, thanks. Ingo > Index: EC_POINT_new.3 > === > RCS file: /cvs/src/lib/libcrypto/man/EC_POINT_new.3,v > retrieving revision 1.9 > diff -u -p -r1.9 EC_POINT_new.3 > --- EC_POINT_new.329 Mar 2018 20:56:49 - 1.9 > +++ EC_POINT_new.318 Mar 2019 07:36:50 - > @@ -472,7 +472,8 @@ on error. > .Pp > .Fn EC_POINT_hex2point > returns the pointer to the > -.Vt EC_POINT supplied or > +.Vt EC_POINT > +supplied or > .Dv NULL > on error. > .Sh SEE ALSO
Re: Unescaped backslashes in manual pages
Hi Peter, Peter Piwowarski wrote on Tue, Mar 19, 2019 at 10:03:20PM -0400: > qsort(3) has in its EXAMPLE a C \n with its backslash unescaped; a quick > grep through manpages showed similar errors in CONF_modules_load_file.3. Ouch! Thanks for the report, committed. Ingo > Index: lib/libc/stdlib/qsort.3 > === > RCS file: /cvs/src/lib/libc/stdlib/qsort.3,v > retrieving revision 1.24 > diff -u -p -r1.24 qsort.3 > --- lib/libc/stdlib/qsort.3 22 Jan 2019 06:49:17 - 1.24 > +++ lib/libc/stdlib/qsort.3 19 Mar 2019 16:49:59 - > @@ -179,7 +179,7 @@ main() > > qsort(array, N, sizeof(array[0]), cmp); > for (i = 0; i < N; i++) > - printf("%s\n", array[i]); > + printf("%s\en", array[i]); > } > > > Index: lib/libcrypto/man/CONF_modules_load_file.3 > === > RCS file: /cvs/src/lib/libcrypto/man/CONF_modules_load_file.3,v > retrieving revision 1.7 > diff -u -p -r1.7 CONF_modules_load_file.3 > --- lib/libcrypto/man/CONF_modules_load_file.322 Mar 2018 21:08:22 > - 1.7 > +++ lib/libcrypto/man/CONF_modules_load_file.319 Mar 2019 16:49:59 > - > @@ -163,7 +163,7 @@ Load a configuration file and print out > file considered fatal): > .Bd -literal > if (CONF_modules_load_file(NULL, NULL, 0) <= 0) { > - fprintf(stderr, "FATAL: error loading configuration file\n"); > + fprintf(stderr, "FATAL: error loading configuration file\en"); > ERR_print_errors_fp(stderr); > exit(1); > } > @@ -174,7 +174,7 @@ by "myapp", tolerate missing files, but > .Bd -literal > if (CONF_modules_load_file(NULL, "myapp", > CONF_MFLAGS_IGNORE_MISSING_FILE) <= 0) { > - fprintf(stderr, "FATAL: error loading configuration file\n"); > + fprintf(stderr, "FATAL: error loading configuration file\en"); > ERR_print_errors_fp(stderr); > exit(1); > } > @@ -185,7 +185,7 @@ error, missing configuration file ignore > .Bd -literal > if (CONF_modules_load_file("/something/app.cnf", "myapp", > CONF_MFLAGS_IGNORE_MISSING_FILE) <= 0) { > - fprintf(stderr, "WARNING: error loading configuration file\n"); > + fprintf(stderr, "WARNING: error loading configuration file\en"); > ERR_print_errors_fp(stderr); > } > .Ed > @@ -198,17 +198,17 @@ long eline; > > fp = fopen("/somepath/app.cnf", "r"); > if (fp == NULL) { > - fprintf(stderr, "Error opening configuration file\n"); > + fprintf(stderr, "Error opening configuration file\en"); > /* Other missing configuration file behaviour */ > } else { > cnf = NCONF_new(NULL); > if (NCONF_load_fp(cnf, fp, ) == 0) { > - fprintf(stderr, "Error on line %ld of configuration file\n", > + fprintf(stderr, "Error on line %ld of configuration file\en", > eline); > ERR_print_errors_fp(stderr); > /* Other malformed configuration file behaviour */ > } else if (CONF_modules_load(cnf, "appname", 0) <= 0) { > - fprintf(stderr, "Error configuring application\n"); > + fprintf(stderr, "Error configuring application\en"); > ERR_print_errors_fp(stderr); > /* Other configuration error behaviour */ > }
Re: Adding fstab(5) fs_mntops to mount_nfs(8)
Hi Kurt, Kurt Mosiejczuk wrote on Mon, Mar 11, 2019 at 09:32:41AM -0400: > As far as I can tell, no man page documents the fs_mntops form of the > mount_nfs(8) options for fstab(5). Most of us who use them either know > from some other documentation, reading the source, or just doing a > copy-paste from somwhere. > > I went through the source and wrote them down, and here is a first stab > at documenting them in mount_nfs(8). I like the general idea, but have lots of comments regarding specific details. > I can't really think of a better place to put this information than > mount_nfs(8), but welcome suggestions. I also welcome feedback on how > best to push this information into the man page. I can't think of a better place than mount_nfs(8) either. One aspect that i overlooked when we initially discussed this (because i didn't actually read the code back then, i only looked at it right now checking the details): the fstab(5) options can also be given via -o and vice versa, and that ought to be documented. Equivalence is guaranteed because mount(8) actually forks and execs mount_nfs(8), passing the data taken from fstab(5) via argv[]. So i suggest the following style: .It Fl 2 No or Fl o Cm nfsv2 Use the NFS Version 2 protocol. Regarding the command line options taking arguments (-agIRrtwx): It appears these can be specified as mount -o -t=... on the command line or as ... nfs ro,-t=... in fstab(5). According to my testing (and my reading of the code), the forms mount -o timeo=... ... nfs ro,timeo=... that you are documenting do *not* work. I suspect you confused the field names in the struct nfs_args in the source code with the option names to be used in -o/fstab(5). So, we have four classes of options: * no arguments, equivalent -o/fstab(5) option exists -2 = -o nfsv2 /* wrong name in your diff? */ -3 = -o nfsv3 /* wrong name in your diff? */ -b = -o bg -c = -o noconn -d = -o dumbtimer /* missing in your diff */ -i = -o intr -l = -o rdirplus/* wrong name in your diff? */ -s = -o soft -T = -o tcp -U = -o mntudp /* wrong name in your diff? */ * no arguments, can be specified with -o/fstab(5) only: -o resvport/* missing in your diff */ -o noac * takes an argument, no explicit -o equivalent -a arg = -o -a=arg /* wrong name in your diff */ -g arg = -o -g=arg /* wrong name in your diff */ -I arg = -o -I=arg /* wrong name in your diff */ -R arg = -o -R=arg /* wrong name in your diff */ -r arg = -o -r=arg /* wrong name in your diff */ -t arg = -o -t=arg /* wrong name in your diff */ -w arg = -o -w=arg /* wrong name in your diff */ -x arg = -o -x=arg /* wrong name in your diff */ * takes an argument, can be specified with -o/fstab(5) only: -o port=arg -o acdirmax=arg -o acdirmin=arg -o acregmax=arg -o acregmin=arg Below -o, it should probably be said that all other options described in mount_nfs(8) 1. can be passed to mount(8) like "mount -o -b,-t=timeout" or like "mount -o bg,-t=timeout" 2. and can be specified in fstab(5) like ... /mountpoint nfs options,-b,-t=timeout ... or like ... /mountpoint nfs options,bg,-t=timeout ... This is not intended as a suggestion for the wording, but only as an indication of the content that probably needs to be conveyed. > Index: mount_nfs.8 > === > RCS file: /cvs/src/sbin/mount_nfs/mount_nfs.8,v > retrieving revision 1.39 > diff -u -p -r1.39 mount_nfs.8 > --- mount_nfs.8 6 Jun 2009 20:58:50 - 1.39 > +++ mount_nfs.8 10 Mar 2019 19:28:01 - > @@ -66,27 +66,48 @@ It implements the mount protocol as desc > .%T "NFS: Network File System Version 3 Protocol Specification" , > Appendix I. > .Pp > -The options are as follows: > +The options are as follows. > +Many of them can be specified permanently in the > +.Fa fs_mntopts > +field of the > +.Xr fstab 5 > +configuration file for use with the one-argument form of > +.Xr mount 8 . Even though i suggested this wording, this change should probably be reverted, given the above analysis. I think you need to rework you diff. Yours, Ingo
less(1) UTF-8 cleanup: do_append()
Hi Nicholas, Nicholas Marriott wrote on Tue, Mar 12, 2019 at 11:05:07AM +: > Sorry, ok nicm if you haven't committed this already. No i hadn't, but now i have, thank you very much for checking the patch. Here is the next step, cleaning up most parts of the crucial function do_append() which decides how to handle backspaces and overstrikes, in particular what to print literally and what to encode. Four bad function calls have to be replaced here: 1. The call to the bad function get_wchar() is used to find the character already present at the current position. Replacing it with mbtowc(3) also eliminates LWCHAR. In case of failure - which may no be likely since at least usually, the linebuf[] will contain valid UTF-8 - setting prev_ch to NUL makes sure that whatever is already there will simply be replaced. I think linebuf[curr] cannot be NUL at this point because only backc() sets overstrike, and just having backed up, *something* will be there. But even if linebuf[curr] *is* NUL and hence mbtowc(3) returns 0, the new code should do the right thing and simply append. 2. The calls to the bad functions is_composing_char() and is_combining_char() have to be replaced with wcwidth(3). That also eliminates the second call to get_wchar(). If wcwidth(3) fails (i.e. ch is not printable), we simply have to treat it like a width 1 character. 3. The call to the bad function control_char() has to be eliminated. Start by considering two cases separately. In utf_mode, we have is_ascii_char(ch), i.e. ch <= 0x7f. In that case, control_char() is just iscntrl(3), which is identical to !isprint(3). In !utf_mode, we know that do_append() only gets called with ch <= 0xff, and control_char() is iscntrl(3) || !isprint(3). In that case and expression, the iscntrl(3) is obviously redundant. So we can simply use !isprint(3) for both cases, which is also a logical way of expressing the condition because this "else if" clause is supposed to handle non-printable single-byte characters. 4. The call to the bad function is_ubin_char() intends to handle non-printable Unicode characters, so the right function to use is simply !iswprint(3) from . One part of do_append() still remains broken: It still calls prutfchar() which is broken because it calls is_ubin_char() and control_char(), because it accepts some invalid input, and because it generates some invalid output. But cleaning that up is a separate matter. Here are byte sequences i tested: * ASCII char backspace same ASCII char again -> bold * underscore backspace underscore in between -> bold * ASCII char backspace underscore-> italic * underscore backspace underscore in between -> italic * underscore backspace ASCII char-> italic * UTF-8 character backspace same UTF-8 again -> bold * UTF-8 character backspace undercore-> italic * underscore backspace UTF-8 character -> italic * invalid byte backspace underscore -> <88>^H_ * ASCII char UTF-8 combining accent backspace same ASCII char and combining accent again -> bold * underscore backspace ASCII char UTF-8 combining accent -> italic * UTF-8 character UTF-8 combining accent backspace same UTF-8 character and combining accent again -> bold * underscore backspace UTF-8 character UTF-8 combining accent -> italic * ASCII char UTF-8 combining accent backspace non-printable ASCII char -> accented char invisible, overwritten by the encoded representation of the non-printable char * ASCII char UTF-8 combining accent backspace non-printable UTF-8 character -> accented char invisible, overwritten by the encoded representation of the non-printable character * ASCII controls show as caret-letter * UTF-8 controls show as I tested both with LC_CTYPE=en_US.UTF-8 and LC_CTYPE=C. Note the following case is dubious: * ASCII char UTF-8 combining accent backspace underscore Arguably, this should underline the accented character, but actually, the accent disappears with both the old and the new code because it is overwritten by the next character. I'm not trying to fix that with this patch. * same for UTF-8 character UTF-8 combining accent backspace underscore OK? Ingo Index: line.c === RCS file: /cvs/src/usr.bin/less/line.c,v retrieving revision 1.27 diff -u -p -r1.27 line.c --- line.c 12 Mar 2019 11:59:24 - 1.27 +++ line.c 12 Mar 2019 14:33:55 - @@ -16,6 +16,7 @@ */ #include +#include #include "charset.h" #include "less.h" @@ -774,8 +775,8 @@ retry: static int do_append(LWCHAR ch, char *rep, off_t pos) { + wchar_t prev_ch; int a; - LWCHAR prev_ch; a = AT_NORMAL; @@ -813,7 +814,10 @@ do_append(LWCHAR ch, char *rep,
delete useless setlocale(3) in spell(1)
Hi, looking through my trees for diffs that were forgotten, i found this one. Spell uses nothing locale-dependent, and it will never need support for non-English characters because the basic algorithms used are specific to the English language in the first place. So delete and setlocale(3) without functional change. While here, make usage() static and return() from main rather than exit(3). Minimially tweaked diff originally from Jan Stary. OK? Ingo Index: spellprog.c === RCS file: /cvs/src/usr.bin/spell/spellprog.c,v retrieving revision 1.12 diff -u -p -r1.12 spellprog.c --- spellprog.c 10 Oct 2015 19:11:04 - 1.12 +++ spellprog.c 28 Oct 2016 07:53:39 - @@ -72,7 +72,6 @@ #include #include #include -#include #include #include #include @@ -108,7 +107,7 @@ char*estrdup(const char *); voidise(void); voidprint_word(FILE *); voidztos(char *); -__dead void usage(void); +static void __dead usage(void); /* from look.c */ int look(unsigned char *, unsigned char *, unsigned char *); @@ -247,8 +246,6 @@ main(int argc, char **argv) struct stat sb; FILE *file, *found; - setlocale(LC_ALL, ""); - if (pledge("stdio rpath wpath cpath", NULL) == -1) err(1, "pledge"); @@ -316,7 +313,7 @@ main(int argc, char **argv) if (ch == EOF) { if (found != NULL) fclose(found); - exit(0); + return (0); } } for (cp = word, dp = original; cp < ep; ) @@ -345,7 +342,7 @@ lcase: file = stdout; } - exit(0); + return (0); } void @@ -788,7 +785,7 @@ dict(char *bp, char *ep) return (rval); } -__dead void +static void __dead usage(void) { extern char *__progname;
delete useless setlocale(3) in uuencode(1)
Hi, uuencode(1) and uudecode(1) do nothing locale-related and i don't see how they ever might. So delete and setlocale(3). While here, sort headers, make usage() static, return from main rather than exit(3), and drop two redundant case statements. This is also a minimally tweaked version of a patch from Jan Stary. OK? Ingo Index: uudecode/uudecode.c === RCS file: /cvs/src/usr.bin/uudecode/uudecode.c,v retrieving revision 1.23 diff -u -p -r1.23 uudecode.c --- uudecode/uudecode.c 3 Jan 2016 14:43:20 - 1.23 +++ uudecode/uudecode.c 28 Oct 2016 08:34:42 - @@ -43,20 +43,19 @@ #include #include #include -#include +#include #include #include #include #include #include #include -#include static const char *infile, *outfile; static FILE *infp, *outfp; static int base64, cflag, iflag, oflag, pflag, rflag, sflag; -static voidusage(void); +static void __dead usage(void); static int decode(void); static int decode2(void); static int uu_decode(void); @@ -83,7 +82,6 @@ main(int argc, char *argv[]) pmode = MODE_B64DECODE; } - setlocale(LC_ALL, ""); while ((ch = getopt(argc, argv, optstr[pmode])) != -1) { switch(ch) { case 'c': @@ -154,7 +152,7 @@ main(int argc, char *argv[]) infp = stdin; rval = decode(); } - exit(rval); + return (rval); } static int @@ -442,7 +440,7 @@ base64_decode(void) "error decoding base64 input stream")); } -static void +static void __dead usage(void) { switch (pmode) { Index: uuencode/uuencode.c === RCS file: /cvs/src/usr.bin/uuencode/uuencode.c,v retrieving revision 1.13 diff -u -p -r1.13 uuencode.c --- uuencode/uuencode.c 9 Oct 2015 01:37:09 - 1.13 +++ uuencode/uuencode.c 28 Oct 2016 08:34:42 - @@ -40,7 +40,6 @@ #include #include -#include #include #include #include @@ -49,7 +48,7 @@ void encode(void); void base64_encode(void); -static void usage(void); +static void __dead usage(void); FILE *output; int mode; @@ -81,7 +80,6 @@ main(int argc, char *argv[]) pmode = MODE_B64ENCODE; } - setlocale(LC_ALL, ""); while ((ch = getopt(argc, argv, optstr[pmode])) != -1) { switch (ch) { case 'm': @@ -90,7 +88,6 @@ main(int argc, char *argv[]) case 'o': outfile = optarg; break; - case '?': default: usage(); } @@ -118,7 +115,6 @@ main(int argc, char *argv[]) #defineRW (S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP|S_IROTH|S_IWOTH) mode = RW & ~umask(RW); break; - case 0: default: usage(); } @@ -137,7 +133,7 @@ main(int argc, char *argv[]) encode(); if (ferror(output)) errx(1, "write error"); - exit(0); + return 0; } /* ENC is the basic 1 character encoding function to make a char printing */ @@ -220,7 +216,7 @@ encode(void) (void)fprintf(output, "%c\nend\n", ENC('\0')); } -static void +static void __dead usage(void) { switch (pmode) {
delete useless setlocale(3) from touch(1)
Hi, i initially thought touch(1) might be tricky because it is concerned with times, but it turned out is does almost all of the parsing by hand, supporting only specific numeric formats and nothing like month names. It does use strptime(3), but only for %F and %T neither of which is locale-dependent. Careful line-by-line review of the code turned up nothing that might use the locale, not even LC_TIME. OK for Jan Stary's patch? Ingo Index: touch.c === RCS file: /cvs/src/usr.bin/touch/touch.c,v retrieving revision 1.25 diff -u -p -r1.25 touch.c --- touch.c 9 Oct 2015 01:37:09 - 1.25 +++ touch.c 28 Oct 2016 09:28:29 - @@ -41,7 +41,6 @@ #include #include #include -#include #include #include @@ -49,7 +48,7 @@ void stime_arg1(char *, struct timespec void stime_arg2(char *, int, struct timespec *); void stime_argd(char *, struct timespec *); void stime_file(char *, struct timespec *); -__dead voidusage(void); +static void __dead usage(void); int main(int argc, char *argv[]) @@ -58,8 +57,6 @@ main(int argc, char *argv[]) int aflag, cflag, mflag, ch, fd, len, rval, timeset; char*p; - (void)setlocale(LC_ALL, ""); - if (pledge("stdio rpath wpath cpath fattr", NULL) == -1) err(1, "pledge"); @@ -145,7 +142,7 @@ main(int argc, char *argv[]) warn("%s", *argv); } } - exit(rval); + return rval; } #defineATOI2(s)((s) += 2, ((s)[-2] - '0') * 10 + ((s)[-1] - '0')) @@ -324,7 +321,7 @@ terr: errx(1, tsp[1] = tsp[0]; } -__dead void +static void __dead usage(void) { (void)fprintf(stderr,
Re: games/fortune translation fix
Hi Scott, Scott Cheloha wrote on Tue, Feb 05, 2019 at 12:02:39AM -0600: > Oof, folks I think we've missed the forest for the trees here. > > By focussing on the minutiae of the Latin translation we've discarded > the English motto ("through to the stars") that imho > anchored the whole thing. > > As noted previously, this phrase was a modern invention with a hasty > Latin translation conscripted to it to add some credence so you could > sew it onto a vest or stamp it onto a medal. > > The intent of the whole thing was a poetic English verse, not an > accurate Latin translation. The Latin was subordinate, as evidenced > by the difficult translation. The intended audience spoke English. > > But okay, because it isn't attributed to anyone or anything in our > file I can see how we focussed in on that and worked from the Latin > toward a more correct English translation. And translation is > puzzle-solving and puzzle-solving is fun. > > I empathize. > > But... I humbly suggest we partially revert and instead add an > attribution. I tried to find the original source but couldn't. So unless somebody finds out who originally coined the phrase and for which purpose, giving an attribution looks like a thoroughly bad idea to me. > Some quick googling suggests "Per ardua ad astra" That's not even the same motto. Are you arguing that a different phrase should be cited? If so, why? If anything, "per aspera" appears to be more widely used, and less exclusively focussed on a military meaning. > -Per aspera ad astra. (Through hardship to immortality.) > +Per ardua ad astra. (Through adversity to the stars.) > + -- Motto of the Royal Air Force I dislike that for the reasons stated above. If somebody wants to continue this discussion, it should probably be moved to misc@. Yours, Ingo
Re: install(1) could fail due to race
Ted Unangst wrote on Sun, Jan 27, 2019 at 10:37:52AM -0500: > Ingo Schwarze wrote: >> If people here agree with the general direction of making -S the >> default and removing the fragile non-S mode (see the patch below), >> i'll run a full make build and make release and then ask for OKs. > Just checking we didn't forget about this. Seems the right thing to do. Committed now, hoping that it will *not* cause fireworks in deraadt@'s NFS-based build infrastructure. Yours, Ingo
Re: install(1) could fail due to race
Hi Ted, Ted Unangst wrote on Sun, Jan 27, 2019 at 10:37:52AM -0500: > Ingo Schwarze wrote: >> If people here agree with the general direction of making -S the >> default and removing the fragile non-S mode (see the patch below), >> i'll run a full make build and make release and then ask for OKs. > Just checking we didn't forget about this. Seems the right thing to do. So, i finally came round to doing a full make build and make release for both base and xenocara on amd64, and everything succeeded. Consequently, i'm now looking for an explicit OK. Appending the full patch again for ease of reference. Yours, Ingo Index: install.1 === RCS file: /cvs/src/usr.bin/xinstall/install.1,v retrieving revision 1.30 diff -u -p -r1.30 install.1 --- install.1 13 May 2016 17:51:15 - 1.30 +++ install.1 6 Feb 2019 23:50:46 - @@ -101,7 +101,7 @@ Create directories. Missing parent directories are created as required. This option cannot be used with the .Fl B , b , C , c , -.Fl f , p , S , +.Fl f , p , or .Fl s options. @@ -141,15 +141,12 @@ except if the target file doesn't alread then preserve the modification time of the file. .It Fl S Safe copy. -Normally, -.Nm -unlinks an existing target before installing the new file. -With the -.Fl S -flag a temporary file is used and then renamed to be -the target. -The reason this is safer is that if the copy or -rename fails, the existing target is left untouched. +This is the default. +This option has no effect and is supported only for compatibility. +When installing a file, a temporary file is created and written first +in the destination directory, then atomically renamed. +This avoids both race conditions and the destruction of existing +files in case of write failures. .It Fl s .Nm exec's the command @@ -186,18 +183,8 @@ Default is .Sh FILES .Bl -tag -width INS@XX -compact .It Pa INS@XX -If either -.Fl S -option is specified, or the -.Fl C -or -.Fl p -option is used in conjunction with the -.Fl s -option, temporary files named INS@XX, -where XX is decided by -.Xr mkstemp 3 , -are created in the target directory. +Temporary files created in the target directory by +.Xr mkstemp 3 . .El .Sh EXIT STATUS .Ex -std install Index: xinstall.c === RCS file: /cvs/src/usr.bin/xinstall/xinstall.c,v retrieving revision 1.67 diff -u -p -r1.67 xinstall.c --- xinstall.c 16 Sep 2018 02:44:07 - 1.67 +++ xinstall.c 6 Feb 2019 23:50:46 - @@ -60,7 +60,7 @@ #define NOCHANGEBITS (UF_IMMUTABLE | UF_APPEND | SF_IMMUTABLE | SF_APPEND) #define BACKUP_SUFFIX ".old" -int dobackup, docompare, dodest, dodir, dopreserve, dostrip, safecopy; +int dobackup, docompare, dodest, dodir, dopreserve, dostrip; int mode = S_IRWXU|S_IRGRP|S_IXGRP|S_IROTH|S_IXOTH; char pathbuf[PATH_MAX], tempfile[PATH_MAX]; char *suffix = BACKUP_SUFFIX; @@ -73,7 +73,6 @@ void install(char *, char *, u_long, u_i void install_dir(char *, int); void strip(char *); void usage(void); -intcreate_newfile(char *, struct stat *); intcreate_tempfile(char *, char *, size_t); intfile_write(int, char *, size_t, int *, int *, int); void file_flush(int, int); @@ -129,7 +128,7 @@ main(int argc, char *argv[]) docompare = dopreserve = 1; break; case 'S': - safecopy = 1; + /* For backwards compatibility. */ break; case 's': dostrip = 1; @@ -148,17 +147,13 @@ main(int argc, char *argv[]) argv += optind; /* some options make no sense when creating directories */ - if ((safecopy || docompare || dostrip) && dodir) + if ((docompare || dostrip) && dodir) usage(); /* must have at least two arguments, except when creating directories */ if (argc < 2 && !dodir) usage(); - /* need to make a temp copy so we can compare stripped version */ - if (docompare && dostrip) - safecopy = 1; - /* get group and owner id's */ if (group != NULL && gid_from_group(group, ) == -1) { gid = strtonum(group, 0, GID_MAX, ); @@ -265,54 +260,30 @@ install(char *from_name, char *to_name, err(1, "%s", from_name); } - if (safecopy) { - to_fd = create_tempfile(to_name, tempfile, sizeof(tempfile)); - if (to_fd < 0) - err(1, "%s", tempfile); - } else if (docompare && !dostrip) { - if ((to_fd = open(to_name, O_RDONLY, 0)) < 0) - err(1, "%s", to_name); - } else { -
Re: ASN1_get_object.3
Hi Kristaps, Kristaps Dzonsons BSD.LV wrote on Wed, Jan 30, 2019 at 12:17:16PM +0100: >> Attached is a manpage for ASN1_get_object(3), ASN1_put_object(3), and >> ASN1_put_eoc(3). > This is not complete without ASN1_object_size(), as otherwise one can't > scan over children, which is necessary when parsing anything but > trivially nested objects. This is because the actual length of an ASN.1 > object depends upon... a lot of stuff. Are you sure that you shouldn't be using the documented functions d2i_ASN1_TYPE(1) d2i_ASN1_OBJECT(1) instead? I'm not at all an ASN.1 wizard, but from a casual look, it seems to me that maybe d2i_ASN1_TYPE(3) handles some of the complications that ASN1_get_object(1) apparently causes - even though d2i_ASN1_TYPE(3) is most definitely horribly designed, too, given that CAVEATS and BUGS combined are longer than the DESCRIPTION. Maybe ASN1_get_object() is intentionally undocumented and only contained in by accident? I'm not saying i'm opposed to documenting the function(s), but i'd like input from a LibreSSL ASN.1 wizard which interface(s) application software should use. Maybe tedu@ qualifies, not sure, he himself will know... :-) *If* we come to the conclusion that these functions should be documented rather than deprecated, then i think from just reading the first paragraph of the DESCRIPTION in d2i_ASN1_TYPE(3) and ASN1_get_object(3), it ought to be crystal clear what the difference is - whereas right now, both say about the same. *If* we come to the conclusion that these functions should be documented, you should probably send an updated draft taking tedu@'s feedback into account. Yours, Ingo
Re: games/fortune translation fix
Hi Jason, oh well, these files are a mess, a random collection of funny and not so funny stuff... I dislike this one, too, for several reasons. 1. While "ad astra per aspera" sometimes occurs, the word order "per aspera ad astra" is much more commonly used. It sounds much better - not only because it respects the logical chronological order, but even more so for metric reasons: "per aspera ad astra" follows a loosely trochaic rhythm " - | X- (x)-X- " while"ad astra per aspera" has no discernible metric whatsoever: " - | X- | - X-- ". 2. It doesn't actually appear to be an antique or even medieval latin proverb but merely a modern invention. 3. According to my latin dictionary, "ad astra" did occur in antiquity, in a strongly metaphoric sense where "astra = heaven, immortality, sphere and home of the gods" rather than "stars" - though the only source cited there is Vergilius, Aeneis: Macte nova virtute, puer, sic itur ad astra. literal (hence somewhat misleading) translation: Blessings on your young courage, boy; that's the way to the stars. free translation, capturing the meaning and register better: Go on and increase in valor, O boy! this is the path to immortality. See e.g. https://en.wikiquote.org/wiki/Aeneid Seneca also appears to use it in the context of a demi-god (Hercules) interacting with the Olympians. So "ad astra" appears to be a rather unsual phrase, highly poetic, suited to heroic legends about gods and super-human men interacting directly with the greatest gods. 4. According to my latin dictionary, "asper" is a very common adjective with a wide range of everyday concrete meanings: rough, uneven, sharp, coarse, gruff, wild, disagreeable, sorrowful, severe, rancorous, ... and more. It was sometimes - but much less commonly, it appears - used as a noun, though usually with a qualification in the genitive, e.g. "aspera maris" = tempests (Tacitus). When used alone as a noun, it appears to have a relatively narrow, figurative, relatively mild meaning of "tribulations, annoying trouble, misfortune"; translations like "hardship" or "adversity" appear to exceed the severity of "asper(a)", making it sound too much like serious emergency and distress. So a fitting translation of "per aspera ad astra", approximating the meaning and stylistic register of both parts, might be "overcoming annoying problems to be elevated immortality" which sounds, yes, ridiculous. The word "aspiration" is definitely a blatant mistranslation. So if you want to keep the entry, i'd recommend Per aspera ad astra. (Through hardship to immortality.) because that is probably how it is commonly understood today, with a weakened sense of "immortality" = "greatness, being remembered for one's achievements after death", and it is also a compromise not too far deviating from the actual meaning of the latin words, even if sharpening "aspera" a bit and weakening "astra" somewhat to smooth out content and and stylistic register. I think while the literal translation "to the stars" might work in a heroic legend, it is quite misleading out of context and ought to be fixed. Yours, Ingo P.S. I don't know why i looked at this so closely given my disdain for these files - but from time to time, it appears i fail to sufficiently tame my appetite for literature. > i agree "aspiration" looks like a mistake. i suspect the intention > was "asperity", which means harshness and rigour. there is a verb, > asperate, but i think it's a bit obsolete. "Asperity" certainly matches the *adjective* "asper", but it matches the *noun* "aspera" much less than "hardship" or "adversity" or simply "trouble". > i'm a bit reluctant to just follow wikipedia blindly. i think the > latin is pural, and "hardships" doesn;t sound awesome when plural. The grammatical form is unimportant for the translation in this case. "aspara" (pl.) = "hardship, adversity" (sing.) is OK, just like you can translate "glasses" (pl.) to "Brille" (german, sing.). The idea in the Latin word is that it is plural because more than one unfortunate element is required to cause hardship, but that is already adequately expressed in the singular form "hardship". > also i prefer "by" to "through": since it's latin, a little > archaicism is good. Not really. It is fake latin, not a real proverb, but a modern invention. So it should sound natural and modern. > "adversity" would be easily understood and have the correct meaning > (i think). but the translation you recommend (by Finn) restructures > the phrase. i think it should begin "To the stars" (that's a minus > for wikipedia too). so "To the stars by adversity." You should really invert the order to improve the metric and follow chronologic order as well as the usual wording. > though i suspect "asperity" would have most of us reaching for our >
Re: games/fortune translation fix
Hi Pascal, Pascal Stumpf wrote on Sun, Feb 03, 2019 at 11:13:12AM +0100: > That is not correct because [...] Which illustrates yet again that getting good grades at school doesn't imply real understanding; i dared to try and research the matter anyway because i forgot that with pascal@, we have somebody in our group who really understands classical languages and literature. Thanks for your insight, Pascal. >> So if you want to keep the entry, i'd recommend >> >> Per aspera ad astra. (Through hardship to immortality.) > ok pascal@ Committed. Ingo
Re: update to PF pfctl(8) and pf.conf(5) manpages
Hi Alexandr, Alexandr Nedvedicky wrote on Wed, Apr 17, 2019 at 12:09:10AM +0200: > my oracle fellow pointed out [1] a PF documentation can be improved > a bit, when it comes to newly introduced 'pfctl -FR' (a reset flush > modifier). I've decided to make manpage changes in separate diff as > I expect some discussion on how much detailed the manpage should be. > The diff here is my proposal. First off, thank you for working on the pf.conf(5) manual page. I appreciate that very much. To me, the pf.conf(5) manual page feels like a manual page of considerably above-average importance, yet i have often felt it could probably be improved. Improving it isn't easy though: the manual is long and complex, and the subject matter is more complicated than for most other manual pages. So this is a place where expert attention from somebody who actively works on the code is particularly helpful. [...] > From 'pfctl -FR' perspective the most important is to document whether > particular option is parameter for PF driver (a.k.a. runtime option) > or is just interpreted by parser as rules are being loaded. > Documenting default values is just bonus, which I think might be > useful for administrators. On the other hand it adds more work to > keep pf.conf(5) up-to-date when doing changes in code. > I full understand the trade here between being precise and up-to-date. I don't feel strongly about mentioning the defaults either way. But i tend to think that if something is important enough to provide users with a knob to tweak it, then they will probably need to know what the default is before they can even start to figure out whether they might need to tweak it. So my guess would be that every time people read the description of the option, they will also profit from seeing the default right away. If others feel the risk is too high that these defaults might get outdated in the manual page, is there a central place in the code you can point readers to for looking up the defaults? If so, pointing to that place might be a viable compromise. If no such place exists (which might imply the defaults are scattered / hard to find in the code?) that would strengthen the argument to document them. The below nits are not objections, nor conditions, nor reproaches. You are welcome to incorporate them; then again, such details can also be fixed after commit when needed. The main job, in particular in a tricky page like this, is getting the content correct! [...] > diff --git a/share/man/man5/pf.conf.5 b/share/man/man5/pf.conf.5 > index 247ceef40a5..ab6b8c0447e 100644 > --- a/share/man/man5/pf.conf.5 > +++ b/share/man/man5/pf.conf.5 > @@ -1129,12 +1129,24 @@ can be used. > .Xr pf 4 > may be tuned for various situations using the > .Ic set > -command. > +command. Two sorts of options should be distinguished. New sentence, new line: this needs to be command. +Two sorts... The same small problem occurs again below, several times. I'm not annotating all the instances. For spotting such trivial mistakes, mandoc -T lint pf.conf.5 can be useful. The word "should" is often somewhat confusing because it can leave doubt whether something is a syntax requirement, or a stylistic recommendation, or even a plan to change the code and the rules in the future. I'd prefer stating facts, for example: command. +There are two kinds of options: +.Em runtime > +.Em runtime A capital letter would be needed at the beginning of the sentence, but instead, i'd prefer the colon as shown above. > +options, which define parameters for The word "the" is missing here. > +.Xr pf 4 > +driver and +driver, and > +.Em parser > +options, which fine-tune interpretation of rules, while > +they are being loaded from file. The runtime options s/from file/from the file/ ... and again, new sentence, new line. > +may be restored to their default values using: > +.Pp > +.Dl # pfctl -FReset > +.Pp > + Please show a blank between an option and its argument, unless the syntax *requires* that there be no blank. Blank lines are not allowed in mdoc(7), except inside .Bd -unfilled, and .Pp is needless before .Bl without -compact, so: +may be restored to their default values using: +.Pp +.Dl # pfctl -F Reset .Bl -tag -width Ds Alternatively, you could say, interrupting the text flow less: +may be restored to their default values using the +.Xr pfctl 8 +.Fl F Cm Reset option. > .Bl -tag -width Ds > .It Ic set Cm block-policy drop | return > The > .Cm block-policy > -option sets the default behaviour for the packet > +parser option sets the default behaviour for the packet That's indeed a nice, concise way to make the distinction. > .Ic block > action: > .Pp > @@ -1146,8 +1158,13 @@ A TCP RST is returned for blocked TCP packets, > an ICMP UNREACHABLE is returned for blocked UDP packets, > and all other packets are silently dropped. > .El > +.Pp > +The default value is > +.Cm drop . > .It Ic set Cm
Re: 63 bit certificate ID is libressl affected?
Hi Tom, Tom Smyth wrote on Wed, Mar 13, 2019 at 08:32:20PM +: > Just saw the following article and i was wondering if libressl > Might be affected by the bug also > Top bit being set to 0 always making an effective 63 bits rather than 64 > bits If i understand the article you quote correctly, is is about a minor bug in some CA softwares (i.e. softwares used to operate certificate authorities). As far as i am aware, LibreSSL does *not* include any software that can be used to operate a certificate authority. The "openssl ca" subcommand of openssl(1) definitely does not count. The openssl(1) utility is a *testing tool* and must not be used for any kind of production purposes. So i don't see how LibreSSL could possibly be affected. If you still think it might be, please consider stating more precisely which part of LibreSSL (i.e. which library function) you fear might be broken in precisely which way. > https://www.theregister.co.uk/2019/03/13/tls_cert_revoke_ejbca_config/ My impression is that the most important sentence from that article is the following: While the serial number security issue is largely theoretical - 63 bits leaves plenty of space to fend off collision attacks, even if it's not compliant with the spec [...] That means this is unlikely to be a security issue in the first place but looks more like a minor bug where some software is gratutiosly violating a specification. Sure, specifications should not be set aside without a good reason, and certainly not accidentally, and bugs ought to be fixed, but i fail to see any indication that this bug might be more important than other run-of-the-mill bugs. Please take this with a grain of salt: while i did occasionally work on LibreSSL documentation in the past, my knowledge and experience in matters of cryptography and PKI is very limited. But i thought quick feedback might help to discourage people from panicking. Also, if you want to continue this discussion, i suggest moving to misc@. You didn't include a patch! ;-) Yours, Ingo
Re: caesar(6) to accept negative argument
Hi, tleguern wrote on Wed, May 15, 2019 at 03:36:57PM +0100: > This little patch makes caesar(6) useful at both encrypting and > decrypting texts by allowing a negative rotation. Committed, thanks. > A similar patch was proposed by Dieter Rauschenberger in 2008 with > little response Well, it's a game, so people sometimes forget how critical it is to publish errata for it as fastly as possible. ;-) Yours, Ingo CVSROOT:/cvs Module name:src Changes by: schwa...@cvs.openbsd.org2019/05/15 09:59:24 Modified files: games/caesar : caesar.c Log message: patch from to support backward rotation, hoping to save somebody's life from the Leather Goddesses of Phobos
Re: intro.8: .Xr rcctl 8 -- http://man.openbsd.org/intro.8
Hi Christian, Christian Weisgerber wrote on Sun, May 26, 2019 at 01:19:01AM -: > Not sure what do about "ELF(3)" in elf(3). Index: elf.3 === RCS file: /cvs/src/lib/libelf/elf.3,v retrieving revision 1.2 diff -u -p -r1.2 elf.3 --- elf.3 19 Mar 2019 02:31:35 - 1.2 +++ elf.3 26 May 2019 01:46:30 - @@ -615,8 +615,11 @@ descriptor itself. .Xr ar 5 , .Xr elf 5 .Sh HISTORY -The original ELF(3) API was developed for Unix System V. -The current implementation of the ELF(3) API appeared in +The original +.Nm +API was developed for +.At V . +The current implementation of the API appeared in .Fx 7.0 . .Sh AUTHORS The ELF library was written by Index: elf_flagdata.3 === RCS file: /cvs/src/lib/libelf/elf_flagdata.3,v retrieving revision 1.1 diff -u -p -r1.1 elf_flagdata.3 --- elf_flagdata.3 1 Feb 2019 05:27:37 - 1.1 +++ elf_flagdata.3 26 May 2019 01:46:30 - @@ -34,7 +34,7 @@ .Nm elf_flagphdr , .Nm elf_flagscn , .Nm elf_flagshdr -.Nd manipulate flags associated with ELF(3) data structures +.Nd manipulate flags associated with ELF data structures .Sh LIBRARY .Lb libelf .Sh SYNOPSIS @@ -65,7 +65,9 @@ and .Ar scn denote the data structures whose flags need to be changed. These values should have been returned by prior calls to -functions in the ELF(3) API set: +functions in the +.Xr elf 3 +API set: .Bl -bullet -compact .It Argument @@ -175,7 +177,9 @@ function and the .Dv ELF_F_ARCHIVE and .Dv ELF_F_ARCHIVE_SYSV -flags are an extension to the ELF(3) API. +flags are an extension to the +.Xr elf 3 +API. .Sh ERRORS These functions may fail with the following errors: .Bl -tag -width "[ELF_E_RESOURCE]" Index: elf_open.3 === RCS file: /cvs/src/lib/libelf/elf_open.3,v retrieving revision 1.1 diff -u -p -r1.1 elf_open.3 --- elf_open.3 1 Feb 2019 05:27:37 - 1.1 +++ elf_open.3 26 May 2019 01:46:30 - @@ -43,7 +43,9 @@ The functions .Fn elf_open and .Fn elf_openmemory -are extensions to the ELF(3) API, for the internal use of the +are extensions to the +.Xr elf 3 +API, for the internal use of the Elftoolchain project. Portable applications should not use these functions. .Pp @@ -71,7 +73,9 @@ specifies the size of the memory area in The function returns a pointer to a ELF descriptor if successful, or NULL if an error occurred. .Sh COMPATIBILITY -These functions are non-standard extensions to the ELF(3) API set. +These functions are non-standard extensions to the +.Xr elf 3 +API set. .Pp The behavior of these functions differs from their counterparts .Xr elf_begin 3 Index: gelf.3 === RCS file: /cvs/src/lib/libelf/gelf.3,v retrieving revision 1.1 diff -u -p -r1.1 gelf.3 --- gelf.3 1 Feb 2019 05:27:37 - 1.1 +++ gelf.3 26 May 2019 01:46:31 - @@ -192,7 +192,10 @@ Copy back an ELF symbol table entry. .Xr elf 3 , .Xr elf 5 .Sh HISTORY -The GELF(3) API first appeared in System V Release 4. +The +.Nm +API first appeared in +.At V.4 . This implementation of the API first appeared in .Fx 7.0 . .Sh AUTHORS Index: gelf_newehdr.3 === RCS file: /cvs/src/lib/libelf/gelf_newehdr.3,v retrieving revision 1.1 diff -u -p -r1.1 gelf_newehdr.3 --- gelf_newehdr.3 1 Feb 2019 05:27:38 - 1.1 +++ gelf_newehdr.3 26 May 2019 01:46:31 - @@ -134,7 +134,9 @@ The function uses a type of .Ft "void *" for its returned value. -This differs from some other implementations of the ELF(3) API, which use an +This differs from some other implementations of the +.Xr elf 3 +API, which use an .Ft "unsigned long" return type. .Sh ERRORS Index: gelf_newphdr.3 === RCS file: /cvs/src/lib/libelf/gelf_newphdr.3,v retrieving revision 1.1 diff -u -p -r1.1 gelf_newphdr.3 --- gelf_newphdr.3 1 Feb 2019 05:27:38 - 1.1 +++ gelf_newphdr.3 26 May 2019 01:46:31 - @@ -103,7 +103,9 @@ The function uses a type of .Ft "void *" for its returned value. -This differs from some other implementations of the ELF(3) API, which use an +This differs from some other implementations of the +.Xr elf 3 +API, which use an .Ft "unsigned long" return type. .Sh ERRORS
Re: proot.1: Formatting fix
Hi Fabio, Fabio Scotoni wrote on Sat, Jun 08, 2019 at 12:15:57PM +0200: > This patch fixes a newline in proot.1 preventing LOCKDIR from > being rendered correctly. Thanks for reporting. I committed the patch shown below. > Note that this causes the input line to be very long, which > seemed okay to me Well, technically, mandoc and groff accept lines of arbitrary length, and we now longer care about very old formatters that may have had length limitations, but it still isn't good style. > because dhcp-options.5 does the same (e.g. on line 184). In that file, using .Xo or backslash line continuation would be better, but it hardly warrants a commit. Yours, Ingo CVSROOT:/cvs Module name:src Changes by: schwa...@cvs.openbsd.org2019/06/08 08:48:53 Modified files: share/man/man1 : proot.1 Log message: Missing macro, reported by Fabio Scotoni . While here, also insert missing whitespace before punctuation. Index: proot.1 === RCS file: /cvs/src/share/man/man1/proot.1,v retrieving revision 1.4 retrieving revision 1.5 diff -u -r1.4 -r1.5 --- proot.1 29 May 2019 19:37:06 - 1.4 +++ proot.1 8 Jun 2019 14:48:53 - 1.5 @@ -225,8 +225,8 @@ .It All the ports specific sub directories if they are defined, namely -.Cm DISTDIR , WRKOBJDIR, LOGDIR, PACKAGE_REPOSITORY , PLIST_REPOSITORY , -LOCKDIR. +.Cm DISTDIR , WRKOBJDIR , LOGDIR , PACKAGE_REPOSITORY , PLIST_REPOSITORY , +.Cm LOCKDIR . .It The .Cm PORTSDIR
Re: [diff] events.html - video links for BSDCan 2019
Hi Ross, Ross L Richardson wrote on Tue, Jun 18, 2019 at 08:34:41PM +1000: > The patch below just adds links for the videos currently available. Committed, thanks. > Notes: > - Lines were already > 80 characters, so I haven't folded. I took the liberty to fold the lines touched. I think that the habit of disregarding the 80 character rule of thumb in HTML code is not a good habit - and extending beyond column 240 even less so. Reading diffs of HTML code is already harder than reading diffs of C code in the first place, no need to make it even worse. > - Links are to the context of the BSDCan 2019 playlist (which may not > be what is wanted). Right, i deleted the list, index, and time attributes. Youtube channels tend to edit their playlists, and then such links get outdated. So i doubt mentioning the playlist warrants any additional maintenance effort. Besides, YT typically suggests related content anyway, sometimes even related playlists. Yours, Ingo > Index: events.html > === > RCS file: /cvs/www/events.html,v > retrieving revision 1.1173 > diff -u -p -r1.1173 events.html > --- events.html 1 Jun 2019 23:12:47 - 1.1173 > +++ events.html 18 Jun 2019 10:25:47 - > @@ -69,13 +69,13 @@ May 15-18, 2019, Ottawa, Canada. > Bob Beck - > https://github.com/bob-beck/libtls/blob/master/TUTORIAL.md;>libtls > for beginners conference tutorial > Theo Buehler - Design and verification of the TLS 1.3 handshake state > machine in LibreSSL (slides) > -Florian Obser - https://man.openbsd.org/unwind.8;>unwind(8) > a privilege-separated, validating DNS recursive nameserver for every laptop > (slides) > -Stefan Sperling - Building an accessible OpenBSD laptop (Enabling secure > and functional computing for a person with severe disabilities) ( href="papers/bsdcan2019-accessible-openbsd-laptop.pdf">slides) > +Florian Obser - https://man.openbsd.org/unwind.8;>unwind(8) > a privilege-separated, validating DNS recursive nameserver for every laptop > (slides, href="https://www.youtube.com/watch?v=88SoI49nO4olist=PLeF8ZihVdpFegPoAKppaDSoYmsBvpnSZvindex=8t=0s;>video) > +Stefan Sperling - Building an accessible OpenBSD laptop (Enabling secure > and functional computing for a person with severe disabilities) ( href="papers/bsdcan2019-accessible-openbsd-laptop.pdf">slides, href="https://www.youtube.com/watch?v=Ma_Y1hVmK8olist=PLeF8ZihVdpFegPoAKppaDSoYmsBvpnSZvindex=6t=0s;>video) > Antoine Jacoutot - href="https://man.openbsd.org/syspatch.8;>syspatch(8), The Boring Healing > Potion (slides) > -Alexander Bluhm - Measuring Performance on OpenBSD ( href="papers/bsdcan2019-perform-slides.pdf">slides) > +Alexander Bluhm - Measuring Performance on OpenBSD ( href="papers/bsdcan2019-perform-slides.pdf">slides, href="https://www.youtube.com/watch?v=s6rAXaHylFMlist=PLeF8ZihVdpFegPoAKppaDSoYmsBvpnSZvindex=15t=0s;>video) > Bob Beck > Unveil in OpenBSD > -Jan Klemkow - Network booted OpenBSD Workstations ( href="papers/bsdcan2019_netboot.pdf">slides) > +Jan Klemkow - Network booted OpenBSD Workstations ( href="papers/bsdcan2019_netboot.pdf">slides, href="https://www.youtube.com/watch?v=kFqHXfWEB4olist=PLeF8ZihVdpFegPoAKppaDSoYmsBvpnSZvindex=19t=0s;>video) > > >
Re: sftp(1) manual diff
Hi, Jason McIntyre wrote on Mon, Jun 17, 2019 at 04:58:26PM +0100: > On Mon, Jun 17, 2019 at 05:31:16PM +0200, Tim van der Molen wrote: >> Jason McIntyre (2019-06-17 15:02 +0200): >>> On Mon, Jun 17, 2019 at 02:47:09PM +0200, Tim van der Molen wrote: sftp(1) has this: reput [-Ppr] [local-path] remote-path Resume upload of [local-path]. Equivalent to put with the -a flag set. remote-path should be marked optional, not local-path. Probably a pasto from reget. >>> ok by me, but i have another question - shouldn;t the usage for reput >>> and reget show the -f flag too? i know it's unrelated to your diff, but >>> two birds and all that. >> Yes, you're right. >> I see there's also a -R flag (equivalent to -r), so I added that one as >> well (it's also listed in sftp's help command). > ok, please wait for a bit of feedback. i suspect -R is deliberately > undocumented (maybe to mirror cp). I seriously doubt that. See here: $ sftp localhost schwarze@localhost's password: Connected to localhost. sftp> help Available commands: [...] get [-afPpRr] remote [local] Download file reget [-fPpRr] remote [local] Resume download file reput [-fPpRr] [local] remote Resume upload file Both the -Pp and the -Rr pairs are consistently documented in the help() string in sftp.c, so the omission of -R in the manual page looks like an oversight. It makes little sense to me to consistently document -P but to be inconsistent about -R. >> And then I noticed the help command also has the original issue. While >> there I might as well restore alphabetical order. >> (It's not two birds, but a can of worms!) > it's always a can of worms. Right, taming the worms is hard work. In reality, they won't just grab their mothers and fly away. Anyway, OK for the diff below. Feel free to reconsider the alphabetical order of "reget reput rename" if you want to. ;-) Yours, Ingo >> Index: sftp.1 >> === >> RCS file: /cvs/src/usr.bin/ssh/sftp.1,v >> retrieving revision 1.125 >> diff -p -u -r1.125 sftp.1 >> --- sftp.1 22 Jan 2019 06:58:31 - 1.125 >> +++ sftp.1 17 Jun 2019 15:27:11 - >> @@ -404,7 +404,7 @@ extension. >> Quit >> .Nm sftp . >> .It Xo Ic get >> -.Op Fl afPpr >> +.Op Fl afPpRr >> .Ar remote-path >> .Op Ar local-path >> .Xc >> @@ -446,9 +446,11 @@ or >> flag is specified, then full file permissions and access times are >> copied too. >> .Pp >> -If the >> +If either the >> +.Fl R >> +or >> .Fl r >> -flag is specified then directories will be copied recursively. >> +flag is specified, then directories will be copied recursively. >> Note that >> .Nm >> does not follow symbolic links when performing recursive transfers. >> @@ -545,7 +547,7 @@ Create remote directory specified by >> .It Ic progress >> Toggle display of progress meter. >> .It Xo Ic put >> -.Op Fl afPpr >> +.Op Fl afPpRr >> .Ar local-path >> .Op Ar remote-path >> .Xc >> @@ -588,9 +590,11 @@ or >> flag is specified, then full file permissions and access times are >> copied too. >> .Pp >> -If the >> +If either the >> +.Fl R >> +or >> .Fl r >> -flag is specified then directories will be copied recursively. >> +flag is specified, then directories will be copied recursively. >> Note that >> .Nm >> does not follow symbolic links when performing recursive transfers. >> @@ -600,7 +604,7 @@ Display remote working directory. >> Quit >> .Nm sftp . >> .It Xo Ic reget >> -.Op Fl Ppr >> +.Op Fl fPpRr >> .Ar remote-path >> .Op Ar local-path >> .Xc >> @@ -612,12 +616,12 @@ with the >> .Fl a >> flag set. >> .It Xo Ic reput >> -.Op Fl Ppr >> -.Op Ar local-path >> -.Ar remote-path >> +.Op Fl fPpRr >> +.Ar local-path >> +.Op Ar remote-path >> .Xc >> Resume upload of >> -.Op Ar local-path . >> +.Ar local-path . >> Equivalent to >> .Ic put >> with the >> Index: sftp.c >> === >> RCS file: /cvs/src/usr.bin/ssh/sftp.c,v >> retrieving revision 1.192 >> diff -p -u -r1.192 sftp.c >> --- sftp.c 7 Jun 2019 03:47:12 - 1.192 >> +++ sftp.c 17 Jun 2019 15:27:11 - >> @@ -263,8 +263,6 @@ help(void) >> " filesystem containing 'path'\n" >> "exit Quit sftp\n" >> "get [-afPpRr] remote [local] Download file\n" >> -"reget [-fPpRr] remote [local] Resume download file\n" >> -"reput [-fPpRr] [local] remote Resume upload file\n" >> "help Display this help text\n" >> "lcd path Change local directory to >> 'path'\n" >> "lls [ls-options [path]]Display local directory >> listing\n" >> @@ -278,6 +276,8 @@ help(void) >> "put [-afPpRr] local [remote] Upload file\n" >> "pwd
Re: [patch] improve strptime(3) %z timezone parsing
Hi, > That one looks correct. OK millert@ Committed, thanks for checking! While here, i noticed ugly preprocessor macros. Let's make our future life easier by unifdefing a bit. When compiling with -g0, there is no object change. Note that if TM_ZONE is not defined, wcsftime.c doesn't currently even compile. Talk about code in the tree that is never tested. OK? Ingo Index: localtime.c === RCS file: /cvs/src/lib/libc/time/localtime.c,v retrieving revision 1.59 diff -u -p -r1.59 localtime.c --- localtime.c 19 Sep 2016 12:48:21 - 1.59 +++ localtime.c 10 May 2019 22:17:14 - @@ -46,7 +46,7 @@ ** in which Daylight Saving Time is never observed. ** 4. They might reference tzname[0] after setting to a time zone ** in which Standard Time is never observed. -** 5. They might reference tm.TM_ZONE after calling offtime. +** 5. They might reference tm.tm_zone after calling offtime. ** What's best to do in the above cases is open to debate; ** for now, we just set things up so that in any of the five cases ** WILDABBR is used. Another possibility: initialize tzname[0] to the @@ -214,14 +214,8 @@ DEF_WEAK(tzname); static struct tm tm; -#ifdef USG_COMPAT long timezone = 0; intdaylight = 0; -#endif /* defined USG_COMPAT */ - -#ifdef ALTZONE -time_t altzone = 0; -#endif /* defined ALTZONE */ static long detzcode(const char *codep) @@ -255,13 +249,8 @@ settzname(void) tzname[0] = wildabbr; tzname[1] = wildabbr; -#ifdef USG_COMPAT daylight = 0; timezone = 0; -#endif /* defined USG_COMPAT */ -#ifdef ALTZONE - altzone = 0; -#endif /* defined ALTZONE */ if (sp == NULL) { tzname[0] = tzname[1] = (char *)gmt; return; @@ -273,16 +262,10 @@ settzname(void) const struct ttinfo *ttisp = >ttis[sp->types[i]]; tzname[ttisp->tt_isdst] = >chars[ttisp->tt_abbrind]; -#ifdef USG_COMPAT if (ttisp->tt_isdst) daylight = 1; if (!ttisp->tt_isdst) timezone = -(ttisp->tt_gmtoff); -#endif /* defined USG_COMPAT */ -#ifdef ALTZONE - if (ttisp->tt_isdst) - altzone = -(ttisp->tt_gmtoff); -#endif /* defined ALTZONE */ } /* ** Finally, scrub the abbreviations. @@ -1274,9 +1257,7 @@ localsub(const time_t *timep, long offse result = timesub(, ttisp->tt_gmtoff, sp, tmp); tmp->tm_isdst = ttisp->tt_isdst; tzname[tmp->tm_isdst] = >chars[ttisp->tt_abbrind]; -#ifdef TM_ZONE - tmp->TM_ZONE = >chars[ttisp->tt_abbrind]; -#endif /* defined TM_ZONE */ + tmp->tm_zone = >chars[ttisp->tt_abbrind]; return result; } @@ -1325,21 +1306,19 @@ gmtsub(const time_t *timep, long offset, } _THREAD_PRIVATE_MUTEX_UNLOCK(gmt); result = timesub(timep, offset, gmtptr, tmp); -#ifdef TM_ZONE /* ** Could get fancy here and deliver something such as ** "UTC+" or "UTC-" if offset is non-zero, ** but this is no time for a treasure hunt. */ if (offset != 0) - tmp->TM_ZONE = wildabbr; + tmp->tm_zone = wildabbr; else { if (gmtptr == NULL) - tmp->TM_ZONE = (char *)gmt; + tmp->tm_zone = (char *)gmt; else - tmp->TM_ZONE = gmtptr->chars; + tmp->tm_zone = gmtptr->chars; } -#endif /* defined TM_ZONE */ return result; } @@ -1508,9 +1487,7 @@ timesub(const time_t *timep, long offset idays -= ip[tmp->tm_mon]; tmp->tm_mday = (int) (idays + 1); tmp->tm_isdst = 0; -#ifdef TM_GMTOFF - tmp->TM_GMTOFF = offset; -#endif /* defined TM_GMTOFF */ + tmp->tm_gmtoff = offset; return tmp; } Index: private.h === RCS file: /cvs/src/lib/libc/time/private.h,v retrieving revision 1.38 diff -u -p -r1.38 private.h --- private.h 24 Oct 2015 18:13:18 - 1.38 +++ private.h 10 May 2019 22:17:14 - @@ -9,12 +9,9 @@ */ /* OpenBSD defaults */ -#define TM_GMTOFF tm_gmtoff -#define TM_ZONEtm_zone #define PCTS 1 #define ALL_STATE 1 #define STD_INSPIRED 1 -#define USG_COMPAT 1 /* ** This header is for use ONLY with the time conversion code. Index: strftime.c === RCS file: /cvs/src/lib/libc/time/strftime.c,v retrieving revision 1.30 diff -u -p -r1.30 strftime.c --- strftime.c 21 Sep 2016 04:38:57 - 1.30 +++ strftime.c 10 May 2019 22:17:14 - @@ -456,12 +456,9 @@ label:
Re: [patch] remove reference to HOSTALIASES in hostname(7)
Hi, Theo de Raadt wrote on Mon, Apr 15, 2019 at 12:56:35PM -0600: > Hiltjo Posthuma wrote: >> I noticed the man page hostname(7) still references the environment >> variable HOSTALIASES. This functionality seems to be removed in the >> commit: >> https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/lib/libc/asr/asr.c?rev=1.50=text/x-cvsweb-markup > Your deletion is highly excessive. I believe the following is the right amount of text to remove. The description of the file format can go because it does not appear to be used for anything else. In particular, hosts(5) contains its own file format description, which is different. OK? Ingo Index: hostname.7 === RCS file: /cvs/src/share/man/man7/hostname.7,v retrieving revision 1.10 diff -u -p -r1.10 hostname.7 --- hostname.7 29 May 2017 12:13:50 - 1.10 +++ hostname.7 9 May 2019 17:17:47 - @@ -52,19 +52,6 @@ which must generally translate the name Hostnames are resolved by the Internet name resolver in the following fashion. .Pp -If the name consists of a single component, i.e., contains no dot, -and if the environment variable -.Ev HOSTALIASES -is set to the name of a file, -that file is searched for any string matching the input hostname. -The file should consist of lines made up of two whitespace separated strings, -the first of which is the hostname alias, -and the second of which is the complete hostname -to be substituted for that alias. -If a case-insensitive match is found between the hostname to be resolved -and the first field of a line in the file, the substituted name is looked -up with no further processing. -.Pp If the input name ends with a trailing dot, the trailing dot is removed, and the remaining name is looked up with no further processing.
Re: [patch] improve strptime(3) %z timezone parsing
Hi, Theo de Raadt wrote on Sun, Mar 24, 2019 at 12:48:03PM -0600: > Hiltjo Posthuma wrote: >> On Sun, Feb 24, 2019 at 01:11:39PM +0100, Hiltjo Posthuma wrote: >>> 2. The military/nautical UTC offsets are [...] broken by design. >> As discussed previously for point 2 I think it is fine to either >> return NULL and not handle military zones like most libc's do > My position is we should delete support. So here is a diff to delete support for the military time zones. Just error out when encountering them. That's the best thing we can do because their meaning is ill-defined in the first place. I'm not mixing anything else into this diff. The other bugs should be handled separately. OK? Ingo Index: strptime.3 === RCS file: /cvs/src/lib/libc/time/strptime.3,v retrieving revision 1.27 diff -u -p -r1.27 strptime.3 --- strptime.3 21 Feb 2019 19:10:32 - 1.27 +++ strptime.3 9 May 2019 18:29:19 - @@ -242,15 +242,7 @@ time or .Ql Standard .Pq Dq S -time; -a single letter military timezone specified as: -.Dq A -through -.Dq I -and -.Dq K -through -.Dq Y . +time. .It Cm \&%Z timezone name or no characters when timezone information is unavailable. .It Cm \&%% Index: strptime.c === RCS file: /cvs/src/lib/libc/time/strptime.c,v retrieving revision 1.25 diff -u -p -r1.25 strptime.c --- strptime.c 21 Feb 2019 19:10:32 - 1.25 +++ strptime.c 9 May 2019 18:29:19 - @@ -520,25 +520,6 @@ literal: bp = ep; continue; } - - if ((*bp >= 'A' && *bp <= 'I') || - (*bp >= 'L' && *bp <= 'Y')) { -#ifdef TM_GMTOFF - /* Argh! No 'J'! */ - if (*bp >= 'A' && *bp <= 'I') - tm->TM_GMTOFF = - ('A' - 1) - (int)*bp; - else if (*bp >= 'L' && *bp <= 'M') - tm->TM_GMTOFF = 'A' - (int)*bp; - else if (*bp >= 'N' && *bp <= 'Y') - tm->TM_GMTOFF = (int)*bp - 'M'; -#endif -#ifdef TM_ZONE - tm->TM_ZONE = NULL; /* XXX */ -#endif - bp++; - continue; - } return NULL; } offs = 0;
Re: [patch] improve strptime(3) %z timezone parsing
Hi Todd, Todd C. Miller wrote on Fri, May 10, 2019 at 02:08:45PM -0600: > On Fri, 10 May 2019 16:52:35 +0200, Ingo Schwarze wrote: >> Here is a patch to fix the code. > OK millert@ for that part. Thanks, committed. >> The change to %Z is exactly what Hiltjo sent. >> The current code for %z is unnecessarily complicated. >> Rather than fixing it, i simply rewrote it from scratch. >> I like it when a bugfix results in -28 +11 LOC and better readability. > I don't believe this handles the "[+-]hh" form. Ouch. No, it does not. Thanks for spotting the regression. The following patch preserves the parsing behaviour and correctly stores the number of seconds into tm_gmtoff. OK? Ingo $ ./z +1 NULL $ ./z +02 7200 $ ./z +02x 7200 (rest "x") $ ./z +02: 7200 $ ./z +02:x 7200 (rest "x") $ ./z +02:1 NULL $ ./z +02:01 7260 $ ./z +02:013 7260 (rest "3") Index: strptime.c === RCS file: /cvs/src/lib/libc/time/strptime.c,v retrieving revision 1.27 diff -u -p -r1.27 strptime.c --- strptime.c 10 May 2019 20:24:58 - 1.27 +++ strptime.c 10 May 2019 20:43:35 - @@ -519,32 +519,17 @@ literal: } return NULL; } - offs = 0; - for (i = 0; i < 4; ) { - if (isdigit(*bp)) { - offs = offs * 10 + (*bp++ - '0'); - i++; - continue; - } - if (i == 2 && *bp == ':') { - bp++; - continue; - } - break; - } - switch (i) { - case 2: - offs *= 100; - break; - case 4: - i = offs % 100; - if (i >= 60) - return NULL; - /* Convert minutes into decimal */ - offs = (offs / 100) * 100 + (i * 50) / 30; - break; - default: + if (!isdigit(bp[0]) || !isdigit(bp[1])) return NULL; + offs = ((bp[0]-'0') * 10 + (bp[1]-'0')) * SECSPERHOUR; + bp += 2; + if (*bp == ':') + bp++; + if (isdigit(*bp)) { + offs += (*bp++ - '0') * 10 * SECSPERMIN; + if (!isdigit(*bp)) + return NULL; + offs += (*bp++ - '0') * SECSPERMIN; } if (neg) offs = -offs;
less(1) UTF-8 cleanup: cmd_putstr()
Hi, let's start tackling the last file which is using stuff from charset.c: the file cmdbuf.c. The first dirty function is cmd_putstr(). Fortunately, it doesn't actually need to handle non-ASCII characters. The function cmd_putstr() is called only from command.c. Almost all callers pass literal ASCII strings. The only exception is start_mca(), which passes on its second argument. Again, almost all callers of start_mca() pass literal ASCII strings. The only exception is mca_opt_char(), which passes the return value of opt_prompt(), defined in option.c. The argument of opt_prompt() is the static variable curropt, which is only set to non-NULL from findopt_name() or findopt() in opttbl.c. Both of these functions only ever return literal ASCII strings contained in the static struct loption option[]. To summarize, cmd_putstr() is only ever called with ASCII strings, so it can be radically simplified, deleting two LWCHAR variables, one call to step_char(), is_composing_char(), is_combining_char(), and is_wide_char() each. OK? Ingo Index: cmdbuf.c === RCS file: /cvs/src/usr.bin/less/cmdbuf.c,v retrieving revision 1.16 diff -u -p -r1.16 cmdbuf.c --- cmdbuf.c17 Sep 2016 15:06:41 - 1.16 +++ cmdbuf.c9 May 2019 12:22:51 - @@ -117,29 +117,16 @@ clear_cmd(void) } /* - * Display a string, usually as a prompt for input into the command buffer. + * Display an ASCII string, usually as a prompt for input, + * into the command buffer. */ void cmd_putstr(char *s) { - LWCHAR prev_ch = 0; - LWCHAR ch; - char *endline = s + strlen(s); while (*s != '\0') { - char *ns = s; - ch = step_char(, +1, endline); - while (s < ns) - putchr(*s++); - if (!utf_mode) { - cmd_col++; - prompt_col++; - } else if (!is_composing_char(ch) && - !is_combining_char(prev_ch, ch)) { - int width = is_wide_char(ch) ? 2 : 1; - cmd_col += width; - prompt_col += width; - } - prev_ch = ch; + putchr(*s++); + cmd_col++; + prompt_col++; } }
Re: grep ".one\|.two" doesn't work on OpenBSD. Is it expected?
Hi, Todd C. Miller wrote on Mon, May 20, 2019 at 01:22:21PM -0600: > On Mon, 20 May 2019 20:01:12 +0200, Juan Francisco Cantero Hurtado wrote: >> The grep command works with GNU, NetBSD, FreeBSD and BusyBox. It fails >> on OpenBSD and Solaris 11. I'm suggesting upstream to change the command >> to "grep -e ".datapack" -e ".histpack"" but I would like to know if this >> is a bug or we just don't support the | in the grep patterns. > That's GNU regexp format, not POSIX. The standard way to do this > is with an extended regular expression using egrep. E.g. > > cat test.txt | egrep ".datapack|.histpack" > > though you should escape the '.' if you want it to match literally. I just checked whether the meaning of "\|" is properly documented, and it is: re_format(7): POSIX leaves some aspects of RE syntax and semantics open; '**' marks decisions on these aspects that may not be fully portable to other POSIX implementations. [...] An atom is [...] a '\' followed by one of the characters '^.[$()|*+?{\' (matching that character taken as an ordinary character, as if the '\' had not been present**), [...] So, there is nothing to fix in the manual, and besides, our choice does not violate POSIX: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03 9.3.2 BRE Ordinary Characters An ordinary character is a BRE that matches itself: any character in the supported character set, except for the BRE special characters listed in BRE Special Characters. The interpretation of an ordinary character preceded by an unescaped ( '\\' ) is undefined, except for: [...] Since '|' is an ordinary character, "\|" causes undefined behaviour. Yours, Ingo
Re: cp(1) add -l hard link -s symlink options
Hi Tracey, Tracey Emery wrote on Thu, May 23, 2019 at 02:35:10PM -0600: > Attached is a proposed diff for cp(1). It adds the -l (hard link) > and -s (symlink) options. I don't like that. That's exactly what can be done with ln(1) in a standard way. There is no value in making every tool do everything - quite to that contrary, that only causes gratuitous complication and confusion. > These options are available in GNU cp, FreeBSD cp, > and the -l option is at least in NetBSD and Dragonfly. > > I needed the -l option to use the system cp for rsnapshots, instead of > their native_cl_al function. Hopefully, this will speed up my backups. If the rsnapshots uses such options, i think you should send patches to the rsnapshots project instead. Tell them to use POSIX features instead of relying on pointless GNUisms. Yours, Ingo
Re: update to PF pfctl(8) and pf.conf(5) manpages
Hi Alexandr, Alexandr Nedvedicky wrote on Wed, May 08, 2019 at 04:55:57PM +0200: > below is third iteration of pf.conf.5 manpage. OK schwarze@ A few final nits inline. Thank you again for doing this work. Ingo > diff --git a/share/man/man5/pf.conf.5 b/share/man/man5/pf.conf.5 [...] > @@ -1237,6 +1244,22 @@ Various limits can be combined on a single line: > .Bd -literal -offset indent > set limit { states 2, frags 2000, src-nodes 2000 } > .Ed > +.Pp > +.Xr pf 4 > +has the following defaults: > +.Bl -column table-entries PFR_KENTRY_HIWAT_SMALL platform_dependent > +.It states Ta Dv PFSTATE_HIWAT Ta Pq 10 > +.It tables Ta Dv PFR_KTABLE_HIWAT Ta Pq 1000 > +.It table-entries Ta Dv PFR_KENTRY_HIWAT Ta Pq 20 > +.It table-entries Ta Dv PFR_KENTRY_HIWAT_SMALL Ta Pq 10 > +.It frags Ta Dv NMBCLUSTERS/32 Ta Pq platform dependent Ta Dv NMBCLUSTERS Ns /32 Ta would be even better. > +.El > +.Pp > +.Cm NMBCLUSTERS Please make this line .Dv NMBCLUSTERS > +defines the total number of packets which can exist in-system at any one > time. > +Refer to > +.In machine/param.h > +to find out a value for system sepcific values. s/sepcific/specific/ The following would sound better; is it accurate? Refer to .In machine/param.h for the platform-specific value. > @@ -1393,12 +1425,13 @@ set syncookies adaptive (start 25%, end 12%) > .It Ic set Cm timeout Ar variable value > .Bl -tag -width "src.track" -compact > .It Cm frag > -Seconds before an unassembled fragment is expired. > +Seconds before an unassembled fragment is expired (60 by default). > .It Cm interval > -Interval between purging expired states and fragments. > +Interval between purging expired states and fragments (10 seconds by > default). > .It Cm src.track > Length of time to retain a source tracking entry after the last state > -expires. > +expires (0 by default, which means there is no global limit. > +The value is defined by rule which creates state.). I think articles are needed: The value is defined by the rule which creates the state). > @@ -1493,6 +1526,10 @@ set limit states 10 > .Pp > With 9000 state table entries, the timeout values are scaled to 50% > (tcp.first 60, tcp.established 43200). > +.Pp > +.Dq pfctl -F Reset > +restores default values for those options: debug, all limit options, > +loginterface, reassemble, skip, syncookies, all timeouts. > .El > .Sh QUEUEING > Packets can be assigned to queues for the purpose of bandwidth I think this would sound better: restores default values for the following options: debug, all limit options, loginterface, reassemble, skip, syncookies, and all timeouts. Also, the new lines should go after the .El, not before it, like this: .Pp With 9000 state table entries, the timeout values are scaled to 50% (tcp.first 60, tcp.established 43200). .El .Pp .Dq pfctl -F Reset restores default values for the following options: debug, all limit options, loginterface, reassemble, skip, syncookies, and all timeouts. .Sh QUEUEING Either way, this seems mature enough for getting it in. IMHO, no need to resend it again. Yours, Ingo
Re: [patch] improve strptime(3) %z timezone parsing
Hi Ted, Ted Unangst wrote on Thu, May 09, 2019 at 04:16:40PM -0400: > Ingo Schwarze wrote: >> I'm not mixing anything else into this diff. The other bugs should >> be handled separately. > Works for me. (with additional comment removal) Thanks for checking, committed. Now let's get to the more serious part. Hiltjo observed that %Z and %z produce wrong results. First of all, neither POSIX nor XPG define tm_gmtoff nor %Z nor %z: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/time.h.html https://pubs.opengroup.org/onlinepubs/9699919799/functions/strptime.html So i think the best way to find out what tm_gmtoff should be is to understand how programs in our own tree use it. Here is a (hopefully) complete list of the users in OpenBSD base. The following files expect it to contain seconds: - lib/libc/time/strftime.c - lib/libc/time/wcsftime.c - usr.sbin/smtpd/to.c - usr.sbin/snmpd/mib.c - usr.sbin/cron/cron.c - usr.bin/ftp/util.c - usr.bin/cvs/entries.c - usr.bin/rcs/rcstime.c - gnu/usr.bin/perl/time64.c I failed to find any users that do *not* expect seconds. So my conclusion is that the documentation is right and what the code in strptime.c does is wrong. Here is a patch to fix the code. OK? The change to %Z is exactly what Hiltjo sent. The current code for %z is unnecessarily complicated. Rather than fixing it, i simply rewrote it from scratch. I like it when a bugfix results in -28 +11 LOC and better readability. Yours, Ingo Index: strptime.c === RCS file: /cvs/src/lib/libc/time/strptime.c,v retrieving revision 1.26 diff -u -p -r1.26 strptime.c --- strptime.c 10 May 2019 12:49:16 - 1.26 +++ strptime.c 10 May 2019 14:40:49 - @@ -497,7 +497,7 @@ literal: ep = _find_string(bp, , nast, NULL, 4); if (ep != NULL) { #ifdef TM_GMTOFF - tm->TM_GMTOFF = -5 - i; + tm->TM_GMTOFF = (-5 - i) * SECSPERHOUR; #endif #ifdef TM_ZONE tm->TM_ZONE = (char *)nast[i]; @@ -509,7 +509,7 @@ literal: if (ep != NULL) { tm->tm_isdst = 1; #ifdef TM_GMTOFF - tm->TM_GMTOFF = -4 - i; + tm->TM_GMTOFF = (-4 - i) * SECSPERHOUR; #endif #ifdef TM_ZONE tm->TM_ZONE = (char *)nadt[i]; @@ -519,33 +519,16 @@ literal: } return NULL; } - offs = 0; - for (i = 0; i < 4; ) { - if (isdigit(*bp)) { - offs = offs * 10 + (*bp++ - '0'); - i++; - continue; - } - if (i == 2 && *bp == ':') { - bp++; - continue; - } - break; - } - switch (i) { - case 2: - offs *= 100; - break; - case 4: - i = offs % 100; - if (i >= 60) - return NULL; - /* Convert minutes into decimal */ - offs = (offs / 100) * 100 + (i * 50) / 30; - break; - default: + if (!isdigit(bp[0]) || !isdigit(bp[1])) return NULL; - } + offs = ((bp[0]-'0') * 10 + (bp[1]-'0')) * SECSPERHOUR; + bp += 2; + if (*bp == ':') + bp++; + if (!isdigit(bp[0]) || !isdigit(bp[1])) + return NULL; + offs += ((bp[0]-'0') * 10 + (bp[1]-'0')) * SECSPERMIN; + bp += 2; if (neg) offs = -offs; tm->tm_isdst = 0; /* XXX */
Re: mdoc.7: fix Dd example
Hi Klemens, Jason McIntyre wrote on Sun, May 19, 2019 at 08:58:35PM +0100: > On Sun, May 19, 2019 at 09:52:39PM +0200, Klemens Nanni wrote: >> On Sun, May 19, 2019 at 07:31:19PM +0100, Jason McIntyre wrote: >>> no. just use $Mdocdate$ and it gets expanded on commit. just like >>> when you add $OpenBSD$: you don;t fill that in, right? >> OK, will do so. No, I'll just wite `$OpenBSD$'. >>> use $Mdocdate$ and let expansion happen, or just use the >>> old format of "month day, year". >>> is that clear, or did the docs mislead you somehow? >> Now it is, thanks. I had no idea of old vs. new, so it confused me >> to see warnings in both variations. Thanks for the report. It was indeed a bad idea that mandoc(1) complained about style when your style was actually perfectly fine, so i just committed the patch below. >> To be clear: what is the "new" format? > originally we wrote the date every time we bumped the page, in the > format: > > month day, year > > now you just add an Mdocdate id before *first* commit, and thereafter > there's no need to touch it. it updates every commit. like any other > rcsid. >> I think I also misunderstood the `$Mdocdate$ | month day, year' syntax. >> This means either `$Mdocdate$' *or* `May 19, 2019`, not both together >> like I did. > correct. I also checked the mdoc(7) manual page and came to the conclusion that the description of .Dd is already very good. MANUAL STRUCTURE shows the canonical form for a new page. MACRO OVERVIEW correctly shows all forms that authors are supposed to type in. MACRO REFERENCE describes the details correctly and provides a perfect set of examples. So it seems to me the bad message was the only problem here. Yours, Ingo Log Message: --- Do not print the style message "missing date" when the date is given as "$Mdocdate$" without an actual date. That is the canonical way to write a new manual page and not bad style at all. Misleading message reported by kn@ on tech@. Modified Files: -- mandoc: mandoc.c Revision Data - Index: mandoc.c === RCS file: /home/cvs/mandoc/mandoc/mandoc.c,v retrieving revision 1.114 retrieving revision 1.115 diff -Lmandoc.c -Lmandoc.c -u -p -r1.114 -r1.115 --- mandoc.c +++ mandoc.c @@ -541,10 +541,10 @@ mandoc_normdate(struct roff_man *man, ch /* No date specified: use today's date. */ - if (in == NULL || *in == '\0' || strcmp(in, "$" "Mdocdate$") == 0) { + if (in == NULL || *in == '\0') mandoc_msg(MANDOCERR_DATE_MISSING, ln, pos, NULL); + if (in == NULL || *in == '\0' || strcmp(in, "$" "Mdocdate$") == 0) return time2a(time(NULL)); - } /* Valid mdoc(7) date format. */
Re: print regress results
Hi Alexander, Alexander Bluhm wrote on Wed, May 15, 2019 at 12:01:00AM -0400: > Regress prints FAILED in the middle of the make output, this is > hard to watch. I agree this is a nuisance. I have often wondered whether the result was "PASS" or "FAIL" after doing longer regression runs in the past. > tb@ asked me to print a PASSED at the end. That would indeed be an improvement. > As the make processes cannot hold state over several targets or > directories, I create a regress log. It is placed in the top > directory where you invoke make regress. All failures are logged > there. if there are none, PASSED is printed as final output. That concept feels somewhat messy and fragile. The /^FAIL/ regular expression might misfire - at least test writers must be aware that tests must not frint FAIL at the beginning of output lines, so maybe you should document this? In principle, it would be cleaner to use the exit status of recursive make invocations. Then again, there are multiple places where make is invoked recursively, so i'm not sure that is feasible. If it isn't, or if a cleaner way would cause excessive complication, i'm not totally opposed to grepping log files, even though i don't particularly like it. Apart from the general approach, i'm not convinced the details are quite right just yet. For starters, it feels even more messy that the REGRESS_LOG variable invades bsd.subdir.mk - not sure whether that can somehow be avoided. Here are some examples of what i consider undesirable side effects: $ cd /usr/src/bin/ $ make -n obj rm -f -- /usr/src/bin/regress.log [...] I'm not convinced that "make obj" should attempt to change anything in the source tree, apart from creating "obj" symlinks. In particular, having "make obj" delete anything doesn't feel quite right. $ make -n rm -f -- /usr/src/bin/regress.log [...] While i see the point of deleting regress.log at the beginning of a regression run, it doesn't feel quite right at the beginning of a make run that has nothing to do with regression testing. Also, it doesn't feel quite right that the regress.log file is deleted from the *src* directory. Shouldn't it be deleted from the *obj* directory instead? I mean, like this: $ cd /usr/src/regress/usr.bin/mandoc/mdoc/Ad/ $ make -n regress rm -f -- /usr/src/regress/usr.bin/mandoc/mdoc/Ad/obj/regress.log [...] Then, what about this: $ make -dl regress mandoc -Ios=OpenBSD -Tascii [...] diff -au [...] [...] PASSED $ make -dl clean rm -f noarg.mandoc_ascii font.mandoc_ascii rm -f noarg.in_man noarg.mandoc_man font.in_man font.mandoc_man rm -f noarg.mandoc_lint rm -f noarg.mandoc_markdown font.mandoc_markdown $ ls obj/ regress.log $ make cleandir $ ls obj/ regress.log I would expect that both "make clean" and "make cleandir" remove the file regress.log, but they don't appear to. Evem if i remove "obj" and run "make regress" without running "make obj" before it, the file regress.log survives "make clean" and "make cleandir", this time not even below obj, but in the source tree itself. (Et ceterum censeo support for running without "make obj" esse delendam to simplify stuff, but that's a different matter.) $ cd /usr/src/regress/usr.bin/mandoc/tbl/ $ make -dl regress rm -f -- /usr/src/regress/usr.bin/mandoc/tbl/regress.log for entry in opt layout mod data macro; do [...] ===> opt diff -au [...] [...] PASSED ===> layout diff -au [...] [...] PASSED [...] ===> macro diff -au [...] [...] PASSED $ vi opt/box.out_ascii # bad guy manipulating the output $ make -dl regress rm -f -- /usr/src/regress/usr.bin/mandoc/tbl/regress.log for entry in opt layout mod data macro; do [...] ===> opt diff -au [...] *** Error 1 in opt (../../Makefile.inc:82 'box.diff_ascii') FAILED *** Error 1 in target 'regress' (ignored) diff -au [...] [...] FAIL usr.bin/mandoc/tbl/opt box ===> layout diff -au [...] [...] FAIL usr.bin/mandoc/tbl/opt box [...] ===> macro diff -au [...] [...] FAIL usr.bin/mandoc/tbl/opt box So in a recursive run, the ^FAIL lines are re-printed for each subsequent directory. Isn't the goal to print the summary once, at the end? Also, the file regress.log is created in the source tree in this case, even though i ran "make obj" before. Not sure how to prevent that because usr.bin/mandoc/tbl/ neither has nor needs an obj/ symlink because in contains only subdirs, no tests of its own. All the same, it doesn't feel quite right. These are not strong objections, you are maintaining this subsystem, but i have some doubts whether this particular patch is mature enough for commit. Yours, Ingo
Re: remove regress maxtime
Hi Alexander, Alexander Bluhm wrote on Tue, May 14, 2019 at 02:16:19AM -0400: > Does anyone use the REGRESS_MAXTIME feature? Not me. > I would like to remove it. > > - The timeout based on CPU seconds is pretty useless. > Most hanging tests sleep and do not spin. > - A timeout cannot be distinguished from failure. > - I have never triggered it. > - Only 3 tests set it. > - It comes in my way for a new fail/pass reporting feature. I like the deletion. Getting rid of features that work poorly, are fragile, are rarely used and get in the way of maintenance and development is good. Simplicity is good, and having one robust way of doing things rather than some rarely-tested options is good. Please delete the "BUGS AND LIMITATIONS" section in bsd.regress.mk(5), too, which had a wrong name, anyway. > Also do _SKIP_FAIL cleanup. Sure, why not. Yours, Ingo > Index: regress/lib/libpthread/Makefile.inc > === > RCS file: /data/mirror/openbsd/cvs/src/regress/lib/libpthread/Makefile.inc,v > retrieving revision 1.10 > diff -u -p -r1.10 Makefile.inc > --- regress/lib/libpthread/Makefile.inc 19 Aug 2012 18:55:16 - > 1.10 > +++ regress/lib/libpthread/Makefile.inc 14 May 2019 05:56:28 - > @@ -10,5 +10,3 @@ CFLAGS+=-Wall # -Werror > #DEBUG= -ggdb > CFLAGS+= -DSRCDIR='"${.CURDIR}"' > CFLAGS+= -I${.CURDIR}/../include > - > -REGRESS_MAXTIME?=30 > Index: regress/lib/libpthread/pthread_specific/Makefile > === > RCS file: > /data/mirror/openbsd/cvs/src/regress/lib/libpthread/pthread_specific/Makefile,v > retrieving revision 1.2 > diff -u -p -r1.2 Makefile > --- regress/lib/libpthread/pthread_specific/Makefile 22 Dec 2013 11:08:31 > - 1.2 > +++ regress/lib/libpthread/pthread_specific/Makefile 14 May 2019 05:56:28 > - > @@ -2,6 +2,4 @@ > > PROG=pthread_specific > > -REGRESS_MAXTIME=60 > - > .include > Index: regress/lib/libpthread/stdarg/Makefile > === > RCS file: > /data/mirror/openbsd/cvs/src/regress/lib/libpthread/stdarg/Makefile,v > retrieving revision 1.5 > diff -u -p -r1.5 Makefile > --- regress/lib/libpthread/stdarg/Makefile26 Dec 2013 16:22:55 - > 1.5 > +++ regress/lib/libpthread/stdarg/Makefile14 May 2019 05:56:28 - > @@ -4,6 +4,4 @@ PROG= stdarg > > CFLAGS+= -I${.CURDIR}/../include > > -REGRESS_MAXTIME=10 > - > .include > Index: share/man/man5/bsd.regress.mk.5 > === > RCS file: /data/mirror/openbsd/cvs/src/share/man/man5/bsd.regress.mk.5,v > retrieving revision 1.18 > diff -u -p -r1.18 bsd.regress.mk.5 > --- share/man/man5/bsd.regress.mk.5 2 Dec 2018 11:46:31 - 1.18 > +++ share/man/man5/bsd.regress.mk.5 14 May 2019 05:33:37 - > @@ -80,9 +80,6 @@ Points to the fully-qualified path of a > results are appended. > Defaults to > .Pa /dev/null . > -.It Ev REGRESS_MAXTIME > -Maximum limit of CPU seconds to spend on the regression test. > -Exceeding this time will result in a failure being logged. > .It Ev REGRESS_ROOT_TARGETS > Targets for which root access is required to run the test. > The > Index: share/mk/bsd.regress.mk > === > RCS file: /data/mirror/openbsd/cvs/src/share/mk/bsd.regress.mk,v > retrieving revision 1.17 > diff -u -p -r1.17 bsd.regress.mk > --- share/mk/bsd.regress.mk 3 Dec 2018 22:30:04 - 1.17 > +++ share/mk/bsd.regress.mk 14 May 2019 05:43:43 - > @@ -39,14 +39,14 @@ REGRESS_SKIP_TARGETS=run-regress-${PROG} > . endif > .endif > > -.if defined(REGRESS_SLOW_TARGETS) && ${REGRESS_SKIP_SLOW} != no > +.if defined(REGRESS_SLOW_TARGETS) && ${REGRESS_SKIP_SLOW:L} != no > REGRESS_SKIP_TARGETS+=${REGRESS_SLOW_TARGETS} > .endif > > -.if ${REGRESS_FAIL_EARLY} != no > -_SKIP_FAIL= > +.if ${REGRESS_FAIL_EARLY:L} != no > +_REGRESS_IGNORE_FAIL= > .else > -_SKIP_FAIL=- > +_REGRESS_IGNORE_FAIL=- > .endif > > .if defined(REGRESS_ROOT_TARGETS) > @@ -102,26 +102,13 @@ regress: .SILENT > @echo -n "SKIP " ${_REGRESS_OUT} > @echo SKIPPED > . else > -# XXX - we need a better method to see if a test fails due to timeout or just > -# normal failure. > -. if !defined(REGRESS_MAXTIME) > - ${_SKIP_FAIL}if cd ${.CURDIR} && ${MAKE} ${RT}; then \ > + ${_REGRESS_IGNORE_FAIL} if ${MAKE} -C ${.CURDIR} ${RT}; then \ > echo -n "SUCCESS " ${_REGRESS_OUT} ; \ > else \ > echo -n "FAIL " ${_REGRESS_OUT} ; \ > echo FAILED ; \ > false; \ > fi > -. else > - ${_SKIP_FAIL}if cd ${.CURDIR} && \ > - (ulimit -t ${REGRESS_MAXTIME} ; ${MAKE} ${RT}); then \ > - echo -n "SUCCESS " ${_REGRESS_OUT} ; \ > - else \ > - echo -n "FAIL (possible timeout) "
Re: Don't use '.Sx' to refer to external sections
Hi Stephen, Stephen Gregoratto wrote on Tue, May 14, 2019 at 01:08:00PM +1000: > It was mentioned previously that the use of .Sx to refer to a section in > a different manual page is incorrect. I grepped through the src tree to > see if there were any cases of this and I found a couple. Thanks, i committed the relevant parts of the patch. Try not to mix unrelated, needless changes into patches, though; i did not commit the parts cited below. Yours, Ingo > diff --git a/share/man/man4/ip6.4 b/share/man/man4/ip6.4 > index 8acb7a762..a40b78f3a 100644 > --- a/share/man/man4/ip6.4 > +++ b/share/man/man4/ip6.4 > @@ -94,7 +94,8 @@ struct ip6_hdr { > All fields are in network-byte order. > Any options specified (see > .Sx Options > -below) must also be specified in network-byte order. > +below) > +must also be specified in network-byte order. > .Pp > .Va ip6_flow > specifies the flow ID. That change provides no benefit. > diff --git a/share/man/man9/syscall.9 b/share/man/man9/syscall.9 > index 6a20b052e..1eb78609f 100644 > --- a/share/man/man9/syscall.9 > +++ b/share/man/man9/syscall.9 > @@ -175,9 +175,9 @@ else > .Pp > The > .Dq SYSCALL_DEBUG > -parts of the code are explained in the section > +parts of the code are explained in the > .Sx Debugging > -later in the document. > +section below. > For the > .Dq KTRACE > portions of the code refer to the I committed that one separately. > diff --git a/usr.sbin/vmctl/vmctl.8 b/usr.sbin/vmctl/vmctl.8 > index 144d774c8..0a7195a3c 100644 > --- a/usr.sbin/vmctl/vmctl.8 > +++ b/usr.sbin/vmctl/vmctl.8 > @@ -208,8 +208,8 @@ configure a gateway address on the VM host side, > and run a simple DHCP/BOOTP server for the VM. > See > .Sx LOCAL INTERFACES > -below for more information on how addresses are calculated and assigned when > -using the > +below for more information on how addresses are calculated and assigned > +when using the > .Fl L > option. > .It Fl m Ar size That change provides no benefit.
clean up LC_COLLATE and LC_CTYPE in sort(1)
Hi, after my LC_NUMERIC cleanup for sort(1) went in (thanks to tb@ for the review), i'd like to adress the rest of locale dependency. Large amounts of extremely ugly code in sort(1) - many hundreds of lines - deal with LC_COLLATE, which we don't support now and have no intention to support in the future. The code is very repetitive and currently written to handle three cases: 1. byte_sort == true && sort_mb_cur_max == 1 That is the only mode currently supported on OpenBSD. It means everything uses the POSIX locale and ASCII. 2. byte_sort == false && sort_mb_cur_max == 1 That will never be supported on OpenBSD. It handles 8-bit single byte character encodings which are incompatible with UTF-8, for example ISO-LATIN-1. 3. byte_sort == false && sort_mb_cur_max > 1 Even though i doubt we will ever do it, that could theoretically happen on OpenBSD in the remote future, if we ever choose to implement collation support for UTF-8 locales. Handling case 3 would be a massive undertaking - not just a matter of improving Unicode support, but also forcing us to maintain many different UTF-8 locales for many different languages, which means extremely messy stuff invading the C library. During the Belgrade EuroBSDCon a few years ago, i talked to Baptiste Daroussin who had just implemented LC_COLLATE in FreeBSD libc and who was utterly scared by the complexity. Knowing ourselves, we would be scared even more once we got there. So it will definitely not happen quickly. Then again, ruling that out for good is maybe not a decision to make in this particular patch. Consequently, the byte_sort variable can be deleted immediately, killing case 2 for good, but i'm keeping the sort_mb_cur_max variable as a global constant for now, even though more than half of the code it controls is currently dead code. Since none of our single-byte character and string functions are locale dependent, we can also zap LC_CTYPE while here. After committing this patch, i shall re-indent bwscoll() properly in a separate commit, but i'm not including that in the patch sent out here because it would make the patch unreadable. OK? Ingo Index: bwstring.c === RCS file: /cvs/src/usr.bin/sort/bwstring.c,v retrieving revision 1.7 diff -u -p -r1.7 bwstring.c --- bwstring.c 1 Apr 2015 22:38:08 - 1.7 +++ bwstring.c 14 May 2019 15:19:54 - @@ -40,9 +40,6 @@ #include "bwstring.h" #include "sort.h" -bool byte_sort; -size_t sort_mb_cur_max = 1; - static wchar_t **wmonths; static char **cmonths; @@ -686,22 +683,20 @@ bwscoll(const struct bwstring *bws1, con if (len1 <= offset) return (len2 <= offset) ? 0 : -1; - else { + if (len2 <= offset) return 1; - else { + len1 -= offset; len2 -= offset; if (sort_mb_cur_max == 1) { const unsigned char *s1, *s2; + int res; s1 = bws1->data.cstr + offset; s2 = bws2->data.cstr + offset; - if (byte_sort) { - int res = 0; - if (len1 > len2) { res = memcmp(s1, s2, len2); if (!res) @@ -714,66 +709,6 @@ bwscoll(const struct bwstring *bws1, con res = memcmp(s1, s2, len1); return res; - - } else { - int res = 0; - size_t i, maxlen; - - i = 0; - maxlen = len1; - - if (maxlen > len2) - maxlen = len2; - - while (i < maxlen) { - /* goto next non-zero part: */ - while ((i < maxlen) && - !s1[i] && !s2[i]) - ++i; - - if (i >= maxlen) - break; - - if (s1[i] == 0) { - if (s2[i] == 0) - /* NOTREACHED */ - err(2, "bwscoll error 01"); - else -
Re: update to PF pfctl(8) and pf.conf(5) manpages
Hi Alexandr, here are a few additional minor remarks... Alexandr Nedvedicky wrote on Mon, Apr 29, 2019 at 08:59:55PM +0200: > diff --git a/sbin/pfctl/pfctl.8 b/sbin/pfctl/pfctl.8 > index b7e941991ba..5e2c57f6bc2 100644 > --- a/sbin/pfctl/pfctl.8 > +++ b/sbin/pfctl/pfctl.8 > @@ -198,7 +198,11 @@ Flush the tables. > .It Fl F Cm osfp > Flush the passive operating system fingerprints. > .It Fl F Cm Reset > -Reset limits, timeouts and options back to default settings. > +Reset limits, timeouts and other options back to default settings. > +See > +.Xr pf.conf 5 > +.Sx 'OPTIONS' > +for details. I know that jmc@ suggested this, but unfortunately, it's incorrect. You cannot use .Sx to refer to a section in a *different* manual page. What the above code creates is a link to a section in the pfctl(8) manual page itself that would have to start with .Sh 'OPTIONS' and that doesn't exist, so it results in a dead link. Besides, kn@ already mentioned that combining .Sx with quoting is not a good idea. I would simply write: See the OPTIONS section in .Xr pf.conf 5 for details. > diff --git a/share/man/man5/pf.conf.5 b/share/man/man5/pf.conf.5 > index 247ceef40a5..b043a6ed725 100644 > --- a/share/man/man5/pf.conf.5 > +++ b/share/man/man5/pf.conf.5 [...] > @@ -1235,6 +1245,25 @@ Various limits can be combined on a single line: > .Bd -literal -offset indent > set limit { states 2, frags 2000, src-nodes 2000 } > .Ed > +.Pp > +.Xr pf 4 > +has the following defaults > +.Bl -column table-entries PFR_KENTRY_HIWAT_SMALL platform_dependent > +.It states Ta Dv PFSTATE_HIWAT Ta Pq 10 > +.It tables Ta Dv PFR_KTABLE_HIWAT Ta Pq 1000 > +.It table-entries Ta Dv PFR_KENTRY_HIWAT Ta Pq 20 > +.It table-entries Ta Dv PFR_KENTRY_HIWAT_SMALL Ta Pq 10 > +.It frags Ta Dv NMBCLUSTERS/32 Ta Pq platform dependent > +.El > +.Pp > +.Cm NMBCLUSTERS > +defines the total number of packets which can exist in-system at any one > time. > +Refer to > +.Sy sys/arch/`uname -p`/param.h I'm still suggesting .In machine/param.h It's easier for users to refer to installed header files than to search the kernel source tree. [...] > @@ -1290,6 +1329,10 @@ instead of being dropped; > the reassembled packet will have the > .Dq dont-fragment > bit cleared. > +The default value is yes. The default value is .Cm yes . [...] > @@ -1388,15 +1435,20 @@ The thresholds for entering and leaving syncookie > mode can be specified using: > set syncookies adaptive (start 25%, end 12%) > .Ed > .El > +.Pp > +.Xr pfctl 8 > +.Fl F Cm Reset > +restores customized setting back to default. > .It Ic set Cm timeout Ar variable value > .Bl -tag -width "src.track" -compact > .It Cm frag > -Seconds before an unassembled fragment is expired. > +Seconds before an unassembled fragment is expired (60 by default). > .It Cm interval > -Interval between purging expired states and fragments. > +Interval between purging expired states and fragments (10 by default). Maybe "(10 seconds by default)"? Just in this one case, the others seem fine. [...] > @@ -1408,13 +1460,13 @@ Tuning these values may improve the performance of the > firewall at the risk of dropping valid idle connections. > .Pp > .Bl -tag -width Ds -compact > -.It Cm tcp.closed > +.It Cm tcp.closed No (90 seconds by default) .It Cm tcp.closed Pq 90 seconds by default would seem simpler and more natural to me and is shorter. Not a big deal though, your usage of .No isn't incorrect. Yours, Ingo
Re: [PATCH] [www] innovations.html - add unwind(8) to the list
Hi Raf, Raf Czlonka wrote on Tue, Apr 30, 2019 at 12:18:17PM +0100: > unwind(8) looks like a good candidate for inclusion into > innovations.html[0]. > > While there, add "released with ..." to the preceding entry. > > [0] https://www.openbsd.org/innovations.html Thanks, committed. Ingo > Index: innovations.html > === > RCS file: /cvs/www/innovations.html,v > retrieving revision 1.72 > diff -u -p -r1.72 innovations.html > --- innovations.html 11 Apr 2019 01:14:07 - 1.72 > +++ innovations.html 30 Apr 2019 11:16:32 - > @@ -804,7 +804,12 @@ > > https://man.openbsd.org/rad.8;>rad(8): > Written and maintained by Florian Obser. > -Imported July 10, 2018. > +Imported July 10, 2018; released with OpenBSD 6.4. > + > + > +https://man.openbsd.org/unwind.8;>unwind(8): > +Written and maintained by Florian Obser. > +Imported January 23, 2019; released with OpenBSD 6.5. > > >
Re: update to PF pfctl(8) and pf.conf(5) manpages
Hi, Jason McIntyre wrote on Mon, Apr 29, 2019 at 09:53:03PM +0100: > ah, so singular is correct. but it needs an article of some sort. how > about: > > .Xr pfctl 8 > .Fl F Cm Reset > restores this value to its default. > > to be honest, i don;t like it when we Xr like this, or mark up so much. > i would go for > > .Dq pfctl -F Reset > restores this value to its default. I'm OK with both forms. Note that there is value in having "Cm Reset" in at least one place in pf.conf.5, such that the following commands bring up pf.conf(5): $ man -k Cm=Reset $ man -k any~Reset > or .Li instead of .Dq maybe Please don't. We just deprecated .Li (to simplify the language, such that authors need to learn fewer macros) and mdoc(7) now says: Li word ... Request a typewriter (literal) font. Deprecated because on terminal output devices, this is usually indistinguishable from normal text. For literal displays, use Ql (in-line), Dl (single line), or Bd -literal (multi-line) instead. Typewriter font isn't critical in such a case, so either .Ql or .Dq is fine (or the full markup with Xr Fl Cm). +.Xr pfctl 8 +.Fl F Cm Reset +restores customized settings back to default. +.sp >>> what's the .sp for? if you want vertical blank space, use .Pp >> I think it was suggested by 'mandoc -T lint', I suspect you have seen the message "blank line in fill mode, using .sp" with an earlier version of your patch. That message does *not* intend to ask you to insert an .sp request (you should never use .sp in a manual page, it is low level roff) - it merely explains how a blank line (which shouldn't be there) is treated by mandoc. I just improved the description of the message in the mandoc(1) manual page to avoid the misunderstanding you succumbed to: Warnings related to plain text blank line in fill mode, using .sp (mdoc) The meaning of blank input lines is only well-defined in non-fill mode: In fill mode, line breaks of text input lines are not supposed to be significant. However, for compatibility with groff, blank lines in fill mode are formatted like sp requests. To request a paragraph break, use Pp instead of a blank line. > ok, .sp, which i always forget about, is basically roff for .Pp. just > use .Pp if you need the vertical spacing. Correct. Right before .It in a non-compat list, it's redundant, though. (tcp.first 60, tcp.established 43200). +.Pp +.Xr pfctl 8 +.Fl F Cm Reset +restores customized any timeout setting back to default. >>> this reads weird! >> I've fixed that to: >> restores any customized timeout setting back to default. > maybe > restores this setting to its default value. The sentence applies to the whole, long "set timeout variable value" list item, so plural (settings) is needed. Maybe: restores all .Cm timeout settings to their respective defaults. Yours, Ingo
less(1) UTF-8 cleanup: pshift()
Hi, thanks for checking the pappend() and filename.c patches, those are now committed. Here is basic cleanup of the last major function in line.c, pshift(). Several minor issues still remain in the file, but those are for later. This gets rid of two LWCHAR variables, one call to utf_len(), get_wchar(), is_composing_char(), is_combining_char(), and control_char() each and two calls to is_wide_char(). Note the change from "continue" to "break" in the "width == 2" block. It is an improvement because the double-width character did not move off-screen completely, its second half is still there, represented as a single-width space. So whatever follows it does absolutely not move off-screen, not even if what follows is zero width, so it must not be inspected by the loop. Since the variable names are far from intuitive, improve and add comments. Also, the comment above pshift() is simply wrong. The function does not discard a specific number of characters. Tested by running LC_CTYPE=en_US.UTF-8 less -# 1 LC_CTYPE=en_US.UTF-8 less -# 1 -R LC_CTYPE=C less -# 1 on my set of test files containing a wide variety of combinations of valid and invalid UTF-8 and control sequences, then scrolling horizontally with the arrow keys. OK? Ingo Index: line.c === RCS file: /cvs/src/usr.bin/less/line.c,v retrieving revision 1.29 diff -u -p -r1.29 line.c --- line.c 7 May 2019 14:16:16 - 1.29 +++ line.c 7 May 2019 20:52:26 - @@ -32,11 +32,11 @@ int ntabstops = 1; /* Number of tabstop int tabdefault = 8;/* Default repeated tabstops */ off_t highest_hilite; /* Pos of last hilite in file found so far */ -static int curr; /* Index into linebuf */ -static int column; /* Printable length, accounting for backspaces, etc. */ +static int curr; /* Total number of bytes in linebuf */ +static int column; /* Display columns needed to show linebuf */ static int overstrike; /* Next char should overstrike previous char */ static int is_null_line; /* There is no current line */ -static int lmargin;/* Left margin */ +static int lmargin;/* Index in linebuf of start of content */ static char pendc; static off_t pendpos; static char *end_ansi_chars; @@ -202,20 +202,21 @@ plinenum(off_t pos) /* * Shift the input line left. - * This means discarding N printable chars at the start of the buffer. + * Starting at lmargin, some bytes are discarded from the linebuf, + * until the number of display columns needed to show these bytes + * would exceed the argument. */ static void pshift(int shift) { - LWCHAR prev_ch = 0; - unsigned char c; - int shifted = 0; - int to; - int from; - int len; - int width; - int prev_attr; - int next_attr; + int shifted = 0; /* Number of display columns already discarded. */ + int from; /* Index in linebuf of the current character. */ + int to; /* Index in linebuf to move this character to. */ + int len; /* Number of bytes in this character. */ + int width = 0;/* Display columns needed for this character. */ + int prev_attr;/* Attributes of the preceding character. */ + int next_attr;/* Attributes of the following character. */ + unsigned char c; /* First byte of current character. */ if (shift > column - lmargin) shift = column - lmargin; @@ -241,44 +242,41 @@ pshift(int shift) } continue; } - - width = 0; - if (!IS_ASCII_OCTET(c) && utf_mode) { - /* Assumes well-formedness validation already done. */ - LWCHAR ch; - - len = utf_len(c); - if (from + len > curr) - break; - ch = get_wchar(linebuf + from); - if (!is_composing_char(ch) && - !is_combining_char(prev_ch, ch)) - width = is_wide_char(ch) ? 2 : 1; - prev_ch = ch; + wchar_t ch; + /* +* Before this point, UTF-8 validity was already +* checked, but for additional safety, treat +* invalid bytes as single-width characters +* if they ever make it here. Similarly, treat +* non-printable characters as width 1. +*/ + len = mbtowc(, linebuf + from, curr - from); + if (len == -1) + len = width = 1; + else if ((width = wcwidth(ch)) == -1) + width
clean up LC_NUMERIC in sort(1)
Hi, it was noticed years ago that our implementation of sort(1) is very messy. It contains huge amounts of dead code -- many hundreds of lines -- and many wrapper functions around standard C library functions that are in very bad taste and mostly pointless. However, nobody ever came round to cleaning it up. As a first simple step, this patch deletes LC_NUMERIC support. LC_NUMERIC is an absolutely terrible idea that we should never support anywhere, neither now nor in the future. The benefit is getting rid of * Four global variables (extern, not even file static). * Two static variables in sort.c. * One pointless wrapper function, conv_mbtowc(). * Substantial parts of the code in set_locale(). * One obscure environment variable. * Some so-called "debug" output that is always the same. In order to not do too much in a single diff, this diff leaves LC_NUMERIC and LC_CTYPE alone even though both are NOOPs on OpenBSD. That's work for another day. However, let's already drop LC_MESSAGES and LC_TIME from the manual page together with LC_NUMERIC: they are of course NOOPs and don't even seem to have dead tentacles in the code. The test suite is still fine. OK? Ingo Index: coll.c === RCS file: /cvs/src/usr.bin/sort/coll.c,v retrieving revision 1.11 diff -u -p -r1.11 coll.c --- coll.c 11 Dec 2015 21:41:51 - 1.11 +++ coll.c 6 May 2019 11:59:44 - @@ -46,12 +46,6 @@ struct key_specs *keys; size_t keys_num = 0; -wint_t symbol_decimal_point = L'.'; -/* there is no default thousands separator in collate rules: */ -wint_t symbol_thousands_sep = 0; -wint_t symbol_negative_sign = L'-'; -wint_t symbol_positive_sign = L'+'; - static int wstrcoll(struct key_value *kv1, struct key_value *kv2, size_t offset); static int gnumcoll(struct key_value*, struct key_value *, size_t offset); static int monthcoll(struct key_value*, struct key_value *, size_t offset); @@ -701,7 +695,7 @@ read_number(struct bwstring *s0, int *si while (iswblank(bws_get_iter_value(s))) s = bws_iterator_inc(s, 1); - if (bws_get_iter_value(s) == (wchar_t)symbol_negative_sign) { + if (bws_get_iter_value(s) == L'-') { *sign = -1; s = bws_iterator_inc(s, 1); } @@ -716,16 +710,13 @@ read_number(struct bwstring *s0, int *si smain[*main_len] = bws_get_iter_value(s); s = bws_iterator_inc(s, 1); *main_len += 1; - } else if (symbol_thousands_sep && - (bws_get_iter_value(s) == (wchar_t)symbol_thousands_sep)) - s = bws_iterator_inc(s, 1); - else + } else break; } smain[*main_len] = 0; - if (bws_get_iter_value(s) == (wchar_t)symbol_decimal_point) { + if (bws_get_iter_value(s) == L'.') { s = bws_iterator_inc(s, 1); while (iswdigit(bws_get_iter_value(s)) && *frac_len < MAX_NUM_SIZE) { Index: coll.h === RCS file: /cvs/src/usr.bin/sort/coll.h,v retrieving revision 1.1 diff -u -p -r1.1 coll.h --- coll.h 17 Mar 2015 17:45:13 - 1.1 +++ coll.h 6 May 2019 11:59:44 - @@ -123,14 +123,6 @@ typedef int (*listcoll_t)(struct sort_li extern struct key_specs *keys; extern size_t keys_num; -/* - * Main localised symbols. These must be wint_t as they may hold WEOF. - */ -extern wint_t symbol_decimal_point; -extern wint_t symbol_thousands_sep; -extern wint_t symbol_negative_sign; -extern wint_t symbol_positive_sign; - /* funcs */ cmpcoll_t get_sort_func(struct sort_mods *sm); Index: sort.1 === RCS file: /cvs/src/usr.bin/sort/sort.1,v retrieving revision 1.58 diff -u -p -r1.58 sort.1 --- sort.1 24 Feb 2019 09:57:43 - 1.58 +++ sort.1 6 May 2019 11:59:44 - @@ -177,10 +177,6 @@ Unknown strings are considered smaller t .It Fl n , Fl Fl numeric-sort , Fl Fl sort Ns = Ns Cm numeric An initial numeric string, consisting of optional blank space, optional minus sign, and zero or more digits (including decimal point) -.\" with -.\" optional radix character and thousands -.\" separator -.\" (as defined in the current locale), is sorted by arithmetic value. Leading blank characters are ignored. .It Fl R , Fl Fl random-sort , Fl Fl sort Ns = Ns Cm random @@ -498,18 +494,6 @@ which has no equivalent. .Sh ENVIRONMENT .Bl -tag -width Fl -.It Ev GNUSORT_NUMERIC_COMPATIBILITY -If defined -.Fl t -will not override the locale numeric symbols, that is, thousand -separators and decimal separators. -By default, if we specify -.Fl t -with the same symbol as the thousand separator or decimal point, -the symbol will be treated as the field separator. -Older behavior was less definite:
Re: perldoc: fix man output & formatting
Hi Andrew, Andrew Daugherity wrote on Thu, May 02, 2019 at 10:53:37AM -0500: > Also, their ToMan patch has a previously-included hunk > for MANWIDTH=tty, All that does is suppress a warning message "non-numeric MANWIDTH" when a user has MANWIDTH=tty in their environment. No idea why any user would have that. I see no reason to merge that part of the patch to OpenBSD. > but it doesn't seem to do anything for me?) No, it doesn't do anything on OpenBSD because _get_columns() never gets called when mandoc(1) is being used. Then again, there is no real reason why _get_columns() isn't used with mandoc: mandoc does support setting the output width. Not via "-rLL=" though, that is a groffism. If people want perldoc(1) to respect the MANWIDTH environment variable, we could commit the following patch (and feed it upstream). I don't really care either way. Regarding the patch below, note that i'm deliberately not doing any validation or truncation. People who trust environment variables enough to use them but not enough to use them without validation seem silly to me. On the one hand, the existing code in ToMan.pm already does validation on MANWIDTH which does not need to be repeated. On the other hand, even if it wouldn't, why not leave the validation to mandoc(1), which already validates the -O width= argument? In mandoc(1)/man(1) itself, i decided against supporting the MANWIDTH variable. In general, i hate environment variables. They stealthily change the behaviour of programs, making debugging harder and causing needless headaches about security boundaries when leaking from one process into its children. In man(1) and mandoc(1), supporting MANWIDTH would be pointless because you can already use the -O width= option if you want to, which is clearer and less sneaky. Yours, Ingo Index: ToMan.pm === RCS file: /cvs/src/gnu/usr.bin/perl/cpan/Pod-Perldoc/lib/Pod/Perldoc/ToMan.pm,v retrieving revision 1.8 diff -u -p -r1.8 ToMan.pm --- ToMan.pm13 Feb 2019 21:15:14 - 1.8 +++ ToMan.pm3 May 2019 02:49:03 - @@ -227,6 +227,10 @@ sub _collect_nroff_switches { push @render_switches, '-rLL=' . (int $c) . 'n' if $cols > 80; } + if( $self->_is_mandoc ) { + push @render_switches, '-Owidth=' . $self->_get_columns; + } + # I hear persistent reports that adding a -c switch to $render # solves many people's problems. But I also hear that some mans # don't have a -c switch, so that unconditionally adding it here
Re: perldoc: fix man output & formatting
Hi Todd & Andrew, Andrew Fresh wrote on Thu, May 02, 2019 at 09:53:29AM -0700: > On Thu, May 02, 2019 at 10:21:15AM -0600, Todd C. Miller wrote: >> On Thu, 02 May 2019 10:53:37 -0500, Andrew Daugherity wrote: >>> I reported this to FreeBSD ports a couple months ago [2], and they >>> provided a fix [3] which repairs the -oMan output, and makes that the >>> default. Their fix applies cleanly to the OpenBSD tree and works, but >>> I have no idea why mandoc requires resetting $?. (Also, their ToMan >>> patch has a previously-included hunk for MANWIDTH=tty, but it doesn't >>> seem to do anything for me?) >> There problem is a missing waitpid() call, so $? is not actually >> set. Something like the following should be better, though I haven't >> tested it yet. > This seems to work fine for me, OK afresh1@ OK schwarze@, too. > I submitted this upstream as well, so hopefully shouldn't have to keep > the patch for too long. > https://github.com/mrallen1/Pod-Perldoc/pull/39 Thanks. > Still looking at defaulting to -oMan, but probably a good idea. I agree in principle that -oMan ought to become the default. Yours, Ingo >> Index: gnu/usr.bin/perl/cpan/Pod-Perldoc/lib/Pod/Perldoc/ToMan.pm >> === >> RCS file: >> /cvs/src/gnu/usr.bin/perl/cpan/Pod-Perldoc/lib/Pod/Perldoc/ToMan.pm,v >> retrieving revision 1.8 >> diff -u -p -u -r1.8 ToMan.pm >> --- gnu/usr.bin/perl/cpan/Pod-Perldoc/lib/Pod/Perldoc/ToMan.pm 13 Feb >> 2019 21:15:14 - 1.8 >> +++ gnu/usr.bin/perl/cpan/Pod-Perldoc/lib/Pod/Perldoc/ToMan.pm 2 May >> 2019 16:16:05 - >> @@ -352,6 +352,9 @@ sub _filter_through_nroff { >> close $writer; >> $self->debug( "Done writing\n" ); >> >> +# wait for it to exit >> +waitpid( $pid, 0 ); >> + >> # read any leftovers >> $done .= do { local $/; <$reader> }; >> $self->debug( sprintf "Done reading. Output is %d bytes\n", >> > I wish life had an UNDO function. I didn't know you were *that* cruel. All professional Chess and Go players are going to starve because no game will ever get finished.
Re: kqueue.2: formatting fixes and minor HISTORY expansion
Hi Fabio, Fabio Scotoni wrote on Thu, May 02, 2019 at 03:33:42PM +0200: > I've taken a stab at improving kqueue.2 formatting. > Most of the changes are markup fixes. All your formatting decisions are good. > I used ".Dv NULL" over plain "null" in accordance with > lib/libc/stdlib/malloc.3 rev. 1.113. You are right. There isn't much wrong with using other manual pages as examples of good formatting practices (the risk of picking bad ones is relatively small in OpenBSD). The authoritative sources are, in this case: $ LESS=+t man -O tag=Dv mdoc https://mandoc.bsd.lv/mdoc/appendix/markup.html (search for "null pointer") > I also added a note to the HISTORY section that kqueue()/kevent() have > been available in OpenBSD since 2.9; > the wording matches growfs(8). Yes, that is the standard wording. > I'm not sure how to handle the undocumented EPERM that is returned if a > pledge(2) does not include "proc" when an attempt is made to attach to a > process with EVFILT_PROC. > It does feel somewhat non-obvious, but is noted in neither pledge(2) nor > kqueue(2). Seems like a good question to me, but i'll not mix that into a mere formatting cleanup commit. Committed with minor tweaks, thanks. Ingo
Re: perldoc: fix man output & formatting
Hi Andrew, Andrew Fresh wrote on Fri, May 03, 2019 at 06:24:16PM -0700: > I committed this after moving the waitpid down a few lines, after the > last read from the filehandle. Oops. Sorry for missing that, and thanks for committing it, and for catching that additional issue. > I will say that `perldoc -oterm unicook` looks a lot better with the > right fonts and LC_CTYPE=en_US.UTF-8 than the output from -oman, but it > looks a lot worse with LC_CTYPE=C so probably -oman is a better default. Specifically, LC_CTYPE=en_US.UTF-8 LESS=-R perldoc -oterm # produces UTF-8 (fine) LC_CTYPE=en_US.UTF-8 perldoc -oman # produces UTF-8 (fine) # with the patch below LC_CTYPE=C LESS=-R perldoc -oterm unicook# produces UTF-8 (bad) LC_CTYPE=C perldoc -oman unicook # produces ASCII (good) So -oterm is definitely a bad default, and fixing it may be a waste of time. I see no point in maintaining alternative output modes that provide little to no benefit. > Perhaps I can figure out how to get ToMan to do the right thing in the > right locale someday. The first chunk in the patch below seems to be all that is needed. IIRC, we already do a similar thing in pod2man(1). Of course, both chunks can be OK'ed / objected to independently, they are completely independent of each other. I'm merely leaving both in place such that they don't get forgotten. Yours, Ingo Index: ToMan.pm === RCS file: /cvs/src/gnu/usr.bin/perl/cpan/Pod-Perldoc/lib/Pod/Perldoc/ToMan.pm,v retrieving revision 1.9 diff -u -p -r1.9 ToMan.pm --- ToMan.pm4 May 2019 01:14:34 - 1.9 +++ ToMan.pm5 May 2019 14:39:01 - @@ -144,7 +144,9 @@ sub _get_podman_switches { # # See RT #77465 # -#push @switches, 'utf8' => 1; +# Then again, do *not* comment it out on OpenBSD: +# mandoc handles UTF-8 input just fine. +push @switches, 'utf8' => 1; $self->debug( "Pod::Man switches are [@switches]\n" ); @@ -225,6 +227,10 @@ sub _collect_nroff_switches { my $c = $cols * 39 / 40; $cols = $c > $cols - 2 ? $c : $cols -2; push @render_switches, '-rLL=' . (int $c) . 'n' if $cols > 80; + } + + if( $self->_is_mandoc ) { + push @render_switches, '-Owidth=' . $self->_get_columns; } # I hear persistent reports that adding a -c switch to $render
Re: perldoc: fix man output & formatting
Hi Andrew, Andrew Fresh wrote on Sun, May 05, 2019 at 02:44:58PM -0700: > On Sun, May 05, 2019 at 04:53:05PM +0200, Ingo Schwarze wrote: >> The first chunk in the patch below seems to be all that is needed. >> IIRC, we already do a similar thing in pod2man(1). > This is true, we do similar in pod2man and so the same fix here seems > correct. > > When trying to push this upstream I will probably set it to 1 only if > $self->_is_mandoc, and then someone else can figure out adding it if > they're stuck on groff. Right, let someone else figure out which ancient versions of groff support UTF-8 in which ways... :-) Solaris 11 has 1.22 (2013), Solaris 9 has 1.21 (2009), and rumour has it that MacOS X might still have 1.19 (2004/05)... > (that is assuming that there's not some crappy mandoc that's popular > on some strange system that perl wants to support, I very much doubt that, unless some system uses mandoc that i never heard about. Look at https://mandoc.bsd.lv/ports.html : The oldest mandoc still in use anywhere is 1.14.3, on Alpine and on Ubuntu-oldstable Aardvark. Oh wait, Debian-stable Stretch still has 1.13.3, but even that should be just fine: according to https://mandoc.bsd.lv/devhistory.html , direct UTF-8 input to mandoc works since 1.13.2 (Dec. 2014). Well, DragonFly still has 1.13.1, but they are totally outdated anyway and while they do ship an ancient mandoc, they don't actually use it for anything. > but I will suggest it anyway. Sounds like a safe bet. > It does appear that the -Owidth patch does what Pod::Perldoc wants it > to. So, although I find it harder to read when it spans a wide > terminal, True, but using MANWIDTH and falling back to stty(1) is an upstream user interface decision which we probably shouldn't interfere with. It has nothing to do with the rendering backend in use. If people dislike the upstream default, they can use a narrower window or "export MANWIDTH=80" or "stty columns 80". > both of these are: > OK afresh1@ Both committed, thanks for checking. Ingo > > Index: ToMan.pm > > === > > RCS file: > > /cvs/src/gnu/usr.bin/perl/cpan/Pod-Perldoc/lib/Pod/Perldoc/ToMan.pm,v > > retrieving revision 1.9 > > diff -u -p -r1.9 ToMan.pm > > --- ToMan.pm4 May 2019 01:14:34 - 1.9 > > +++ ToMan.pm5 May 2019 14:39:01 - > > @@ -144,7 +144,9 @@ sub _get_podman_switches { > > # > > # See RT #77465 > > # > > -#push @switches, 'utf8' => 1; > > +# Then again, do *not* comment it out on OpenBSD: > > +# mandoc handles UTF-8 input just fine. > > +push @switches, 'utf8' => 1; > > > > $self->debug( "Pod::Man switches are [@switches]\n" ); > > > > @@ -225,6 +227,10 @@ sub _collect_nroff_switches { > > my $c = $cols * 39 / 40; > > $cols = $c > $cols - 2 ? $c : $cols -2; > > push @render_switches, '-rLL=' . (int $c) . 'n' if $cols > 80; > > + } > > + > > + if( $self->_is_mandoc ) { > > + push @render_switches, '-Owidth=' . $self->_get_columns; > > } > > > > # I hear persistent reports that adding a -c switch to $render
less(1) UTF-8 cleanup: cvt.c
Hi Todd, Todd C. Miller wrote on Wed, May 08, 2019 at 06:36:30AM -0600: > On Tue, 07 May 2019 23:32:20 +0200, Ingo Schwarze wrote: >> Here is basic cleanup of the last major function in line.c, pshift(). > Looks good. OK millert@ Thanks for checking, i will commit the line.c patch later today. Here is a patch cleaning up all major issues in the file cvt.c. They are all in the function cvt_text(), which is used for setting up patterns and for searching. One minor issue remains, usage of the not-so-nice function is_ansi_middle(), but this doesn't seem the right time yet to worry about that detail. This patch deletes the abominable function put_wchar() which accepts many invalid codepoints and happily produces invalid results, replacing its only call with the standard function wctomb(3). Note that uncharacteristically, no error handling is needed after the call to wctomb(3) because the wchar_t passed into it just came out of either mbtowc(3) with a positive return value or out of towlower(3), so we already know it is valid and not L'\0'. The patch also gets rid of one automatic LWCHAR variable and one call to the bad function step_char(). Note that memory allocation before calls to cvt_text() - in file pattern.c, function compile_pattern() and in file search.c, function search_range() - looks suspicious. In theory, the UTF-8 representation of an uppercase character might contain fewer bytes than the UTF-8 representation of the lowercase character towlower() converts it to, in which case cvt_text() would write beyond the end of the dst buffer. I don't know whether uppercase characters with this property actually exist in Unicode or might appear in the future - but no doubt the Unicode character table is dark and full of terrors. Then again, that's a different (potential) issue which i won't try to fix with the present patch. I tested that searching for strings containing non-ASCII UTF-8 characters still works with -I, with -i, and without either, in all cases with the correct case-related behaviour. OK? Ingo Index: charset.c === RCS file: /cvs/src/usr.bin/less/charset.c,v retrieving revision 1.22 diff -u -p -r1.22 charset.c --- charset.c 7 May 2019 14:26:38 - 1.22 +++ charset.c 8 May 2019 14:03:01 - @@ -329,48 +329,6 @@ get_wchar(const char *p) } /* - * Store a character into a UTF-8 string. - */ -void -put_wchar(char **pp, LWCHAR ch) -{ - if (!utf_mode || ch < 0x80) { - /* 0xxx */ - *(*pp)++ = (char)ch; - } else if (ch < 0x800) { - /* 110x 10xx */ - *(*pp)++ = (char)(0xC0 | ((ch >> 6) & 0x1F)); - *(*pp)++ = (char)(0x80 | (ch & 0x3F)); - } else if (ch < 0x1) { - /* 1110 10xx 10xx */ - *(*pp)++ = (char)(0xE0 | ((ch >> 12) & 0x0F)); - *(*pp)++ = (char)(0x80 | ((ch >> 6) & 0x3F)); - *(*pp)++ = (char)(0x80 | (ch & 0x3F)); - } else if (ch < 0x20) { - /* 0xxx 10xx 10xx 10xx */ - *(*pp)++ = (char)(0xF0 | ((ch >> 18) & 0x07)); - *(*pp)++ = (char)(0x80 | ((ch >> 12) & 0x3F)); - *(*pp)++ = (char)(0x80 | ((ch >> 6) & 0x3F)); - *(*pp)++ = (char)(0x80 | (ch & 0x3F)); - } else if (ch < 0x400) { - /* 10xx 10xx 10xx 10xx 10xx */ - *(*pp)++ = (char)(0xF0 | ((ch >> 24) & 0x03)); - *(*pp)++ = (char)(0x80 | ((ch >> 18) & 0x3F)); - *(*pp)++ = (char)(0x80 | ((ch >> 12) & 0x3F)); - *(*pp)++ = (char)(0x80 | ((ch >> 6) & 0x3F)); - *(*pp)++ = (char)(0x80 | (ch & 0x3F)); - } else { - /* 110x 10xx 10xx 10xx 10xx 10xx */ - *(*pp)++ = (char)(0xF0 | ((ch >> 30) & 0x01)); - *(*pp)++ = (char)(0x80 | ((ch >> 24) & 0x3F)); - *(*pp)++ = (char)(0x80 | ((ch >> 18) & 0x3F)); - *(*pp)++ = (char)(0x80 | ((ch >> 12) & 0x3F)); - *(*pp)++ = (char)(0x80 | ((ch >> 6) & 0x3F)); - *(*pp)++ = (char)(0x80 | (ch & 0x3F)); - } -} - -/* * Step forward or backward one character in a string. */ LWCHAR Index: cvt.c === RCS file: /cvs/src/usr.bin/less/cvt.c,v retrieving revision 1.9 diff -u -p -r1.9 cvt.c --- cvt.c 26 Feb 2019 11:01:54 - 1.9 +++ cvt.c 8 May 2019 14:03:01 - @@ -60,7 +60,8 @@ cvt_text(char *odst, char *osrc, int *ch char *edst = odst; char *src; char *src_end; - LWCHAR ch; + wchar_t ch; + int len; if (
Re: rtable_walk(9)
Hi Martin, Martin Pieuchot wrote on Thu, Jul 11, 2019 at 05:18:41PM -0300: > Index: rtable_walk.9 > === > RCS file: rtable_walk.9 > diff -N rtable_walk.9 > --- /dev/null 1 Jan 1970 00:00:00 - > +++ rtable_walk.9 11 Jul 2019 20:16:23 - > @@ -0,0 +1,68 @@ > +.\" $OpenBSD$ > +.\" > +.\" Copyright (c) 2019 Martin Pieuchot > +.\" > +.\" Permission to use, copy, modify, and distribute this software for any > +.\" purpose with or without fee is hereby granted, provided that the above > +.\" copyright notice and this permission notice appear in all copies. > +.\" > +.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES > +.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF > +.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR > +.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES > +.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN > +.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF > +.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. > +.\" > +.Dd $Mdocdate$ > +.Dt RTABLE_WALK 9 > +.Os > +.Sh NAME > +.Nm rtable_walk > +.Nd iterate over a routing table > +.Sh SYNOPSIS > +.In net/rtable.h > +.Ft int > +.Fn rtable_walk "unsigned int rtableid" "sa_family_t af" \ > +"struct rtentry **prt" "int (*func)(struct rtentry *, void *, unsigned int)" > \ > +"void *arg" While this isn't incorrect, i suggest the more readable .Ft int .Fo rtable_walk .Fa "unsigned int rtableid" .Fa "sa_family_t af" .Fa "struct rtentry **prt" .Fa "int (*func)(struct rtentry *, void *, unsigned int)" .Fa "void *arg" .Fc for functions with long arguments or with more than one or two arguments. > +.Sh DESCRIPTION > +The > +.Fn rtable_walk > +function iterates over the routing table > +.Fa rtableid > +and applies > +.Fa func > +to all entries of address family > +.Fa af . > +.Pp > +The iteration is interrupted as soon as > +.Fa func > +returns a non-zero value. > +If > +.Fa prt > +is non-null Please consider the more usual form: is not .Dv NULL These are not objections but merely suggestions. Yours, Ingo > when the iteration is interrupted, it is set to the current > +routing entry. > +In that case > +.Fn rtfree > +must be called on the routing entry pointed by > +.Fa prt . > +.Sh CONTEXT > +.Fn rtable_walk > +can be called during autoconf or from process context. > +.Sh RETURN VALUES > +.Fn rtable_walk > +returns any non-zero value returned by > +.Fa func . > +It may also fail with: > +.Pp > +.Bl -tag -width Er -compact > +.It Bq Er EAFNOSUPPORT > +A routing table with ID of > +.Fa rtableid > +and address family of > +.Fa af > +doesn't exist. > +.El > +.Sh SEE ALSO > +.Xr rtfree 9
Re: sysupgrade: select sets to install
Hi Theo, Theo de Raadt wrote on Tue, Jul 09, 2019 at 09:23:25AM -0600: > Klemens Nanni wrote: >> I think sysupgrade should, if at all, use the same semantics as the >> installer. That is, something like `sysugprade -S '-* b*'" to upgrade >> nothing but kernels and base. >> >> Such options offer great potential for users to shoot themselves in the >> foot by doing partial upgrades; I am not really sold on the idea, yet. > From time to time I consider merging all the sets into baseXX.tgz. That sounds reasonable to me. Having separate sets was probably useful in the 1980ies, but nowadays, it provides little benefit in a general-purpose operating system, and getting rid of it would reduce maintenance effort and recurring confusion when people shoot themselves in the foot by not installings parts of the operating system they actually want to use, then asking questions why their system isn't working as expected. By the way, on amd64, merging in game65 would make base65 1.3% larger, man65 3.5%, and even comp65 only 35%. It seems similar to avoiding flavours in ports if those flavours provide little benefit: KISS. Then again, merging the sets causes some work and churn and certainly isn't an urgent task, but eventually and at a convenient time, i expect that it should and will happen. Yours, Ingo
Re: ed(1) man page doesn't mention use of single / and ?
Hi Jason, Jason McIntyre wrote on Sun, Jul 07, 2019 at 08:56:43PM +0100: > this is getting silly - we're mixing talking about how /re/ and ?re? > work as addresses and as regular expressions in other places. dropping > the trailing [/?] does not apply to regular expressions everywhere. Actually, it does, though you are right in so far as the three cases (addresses, global searches, and replacements) differ in subtle ways. > even if i have this wrong (to be fair, not that unlikely), we should > just concentrate on one thing at a time. We have now taken care of the case of addresses. The case of replacements is already described correctly: > regarding how dropping two delimiters works in an 's' command, although > i'm not sure really of the reason, you can do: > 1,$s/re > which effectively drops /replacement/ Yes, this is described as follows: If one or two of the last delimiters is omitted, then the last line affected is printed as though the print suffix p were specified. Admittedly, the grammar is awkward ("two of things is omitted"), and it is not said explicitly that omitting both delimiters implies deleting the re without a replacement - but one could argue that is implicitly to be expected because omitting the middle delimiter implies that the replacement can only be empty. If you think this should be improved, we can polish it later. So what remains to be decsribed is the case of global searches (gGvV). Here is what POSIX says about those: If the closing delimiter of an RE or of a replacement string (for example, '/' ) in a g, G, s, v, or V command would be the last character before a , that delimiter can be omitted, in which case the addressed line shall be written. Our implementation appears to comply. While documenting this, also fix a few glitches in the vicinity: * The wording "the current address is set to the last line affected by command-list" can easily be misunderstood. The temporal sense of "last" is intended, but it can easily be read as the last (i.e. highest numbered) line in the file. So let's better talk about the "last command executed", not about the "last line". You can easily see that this is correct with a command like g/.../+p\ -p . which, for each matching line, first prints the following line, then the matching line itself, and leaves the matching line (printed by the last command) as the current line, not the following line, even though the following line was also "affected" and comes later in the file. * Avoid the confusing wording "a newline alone in command-list". The newline is not *in* the command list, but it comes after the command list, terminating it. * For G, explicitly say that an empty command list has a different effect compared to g: for g, it does the same as p, whereas for G, it does nothing at all. There is nothing to do for v and V, which mostly point to g and G only. OK for the patch below? Yours, Ingo Index: ed.1 === RCS file: /cvs/src/bin/ed/ed.1,v retrieving revision 1.72 diff -u -p -r1.72 ed.1 --- ed.112 Jul 2019 19:28:48 - 1.72 +++ ed.112 Jul 2019 22:17:47 - @@ -377,7 +377,7 @@ The current address is set to the line c command-list is executed. At the end of the .Ic g -command, the current address is set to the last line affected by command-list. +command, the current address remains as set by the last command executed. If no lines were matched, the current line number remains unchanged. .Pp @@ -392,22 +392,26 @@ Any commands are allowed, except for .Ic v , and .Ic V . -A newline alone in command-list is equivalent to a +An empty +.Ar command-list +is equivalent to a .Ic p command. +In this case, the trailing slash can be omitted. .Sm off .It (1,$) Ic G No / Ar re No / .Sm on Interactively edits the addressed lines matching a regular expression .Ar re . +The trailing slash after +.Ar re +can be omitted. For each matching line, the line is printed, the current address is set, and the user is prompted to enter a .Ar command-list . At the end of the .Ic G -command, the current address is set to the last line affected by -.Pq the last -command-list. +command, the current address remains as set by the last command executed. If no lines were matched, the current line number remains unchanged. .Pp @@ -415,11 +419,10 @@ The format of .Ar command-list is the same as that of the .Ic g -command. -A newline alone acts as a null command list. +command, but an empty command list does nothing. A single .Sq & -repeats the last non-null command list. +repeats the last non-empty command list. .It Ic H Toggles the printing of error explanations. By default, explanations are not printed.
Re: ldomctl: improve usage
Hi Klemens, Klemens Nanni wrote on Sun, Jun 30, 2019 at 01:04:17PM +0200: > On Sat, Jun 22, 2019 at 11:20:10AM -0600, Theo de Raadt wrote: >> The problem is when I'm on screens that don't have scroll-back, those 9 >> lines have scrolled other information off the top, and then I've had to >> repeat the operations, or if not possible, been more frustrated. > Should we change those programs and refer to manual page or show a > synopsis one-liner instead? Yes, i think programs emitting excessive amounts of text (when called with a typo, and even more so when doing that always on startup, like gdb(1)) should be fixed to shut the hell up. This is Unix. > I get the scroll-back argument, but I dare say it's a weak one given > that this is something users can fix most of the time, e.g. by using > tmux. I find it offensive being told to use tmux(1) merely because some programs like to chew my ears off. I get it that many developers find tmux(1) useful for many purposes, but the availability of add-on tools allowing workarounds is no excuse for programs being annoying in my face. Also, with tmux(1), i might get scrollback. But having the immediate command history remain on screen is still much better than being able to scroll around and having to look for the relevant lines buried among long, irrelevant output. Yes, long usage is mostly irrelvant: a typo doesn't mean i need to re-read a long list of commands provided. > I'd really like to have a valuable usage in ldomctl and haven't > come up with a better and still consistent way of doing so. There is consensus two parts of documentation are useful: * A short usage message shown after typos, typically one or two lines. * The manual page, containing complete and concise reference documentation. Personally, i wouldn't object to exceptionally complicated programs (like openrsync(1) or ldomctl(1)) providing command summaries (typically one line per command) *when the user explicitly asks for it* with an option like -h or --help. But i do see Theo's argument that likely we don't really need three different levels of detail, adding additional maintenance effort and danger of getting outdated. > ssh-keygen is another good example of long usage that is most probably > not going to change back to a simpler one; I just stumbled over it due > to wrong usage and was therefore reminded of this mail thread. > > What do other (ldomctl) users say? This isn't really a question for ldomctl users at this point. Usage should be very short as a matter consistency of the operating system; in that sense, i do consider ssh-keygen(1) an offender. *If* we decide that it would be valuable for selected programs to provide an intermediate -h or --help, only then would it become a question for ldomctl users whether ldomctl(8) is one of them. But so far, i don't see consensus that we want to allow adding such -h or --help options to base programs. I'm mildly in favour if in favour at all, and i have heard strong opposition, with the reasonable argument that it adds maintenance effort for little gain because it's not really that hard to find the information quickly in a real manual page. Yours, Ingo
Re: sleep(1): simplify argument parsing
Hi Scott, Scott Cheloha wrote on Sun, Jun 30, 2019 at 06:49:28AM -0500: > This is cleaner, shorter. > > - Remove the intermediate variables and just build the timespec >directly. > > - Use for-loops to consolidate initialization/incrementation of cp >into one spot in each loop. > > - Use one loop to parse fractions of a second: less duplicate code >keeps the fraction path shorter. Expand the comment to detail why >this is fine. We then explain the magic multiplier at the point >of use instead of making you read the next loop to find out what >happened in the prior loop. I like that. It is easier to understand and survived my code review and testing. > ok? OK schwarze@ Ingo > Index: sleep.c > === > RCS file: /cvs/src/bin/sleep/sleep.c,v > retrieving revision 1.27 > diff -u -p -r1.27 sleep.c > --- sleep.c 10 Jan 2019 16:41:10 - 1.27 > +++ sleep.c 30 Jun 2019 11:22:58 - > @@ -49,9 +49,8 @@ int > main(int argc, char *argv[]) > { > int ch; > - time_t secs = 0, t; > + time_t t; > char *cp; > - long nsecs = 0; > struct timespec rqtp; > int i; > > @@ -71,40 +70,32 @@ main(int argc, char *argv[]) > if (argc != 1) > usage(); > > - cp = *argv; > - while ((*cp != '\0') && (*cp != '.')) { > + timespecclear(); > + > + /* Handle whole seconds. */ > + for (cp = *argv; *cp != '\0' && *cp != '.'; cp++) { > if (!isdigit((unsigned char)*cp)) > errx(1, "seconds is invalid: %s", *argv); > - t = (secs * 10) + (*cp++ - '0'); > - if (t / 10 != secs) /* oflow */ > + t = (rqtp.tv_sec * 10) + (*cp - '0'); > + if (t / 10 != rqtp.tv_sec) /* overflow */ > errx(1, "seconds is too large: %s", *argv); > - secs = t; > + rqtp.tv_sec = t; > } > > - /* Handle fractions of a second */ > + /* > + * Handle fractions of a second. The multiplier divides to zero > + * after nine digits so anything more precise than a nanosecond is > + * validated but not used. > + */ > if (*cp == '.') { > - cp++; > - for (i = 1; i > 0; i /= 10) { > - if (*cp == '\0') > - break; > + i = 1; > + for (cp++; *cp != '\0'; cp++) { > if (!isdigit((unsigned char)*cp)) > errx(1, "seconds is invalid: %s", *argv); > - nsecs += (*cp++ - '0') * i; > - } > - > - /* > - * We parse all the way down to nanoseconds > - * in the above for loop. Be pedantic about > - * checking the rest of the argument. > - */ > - while (*cp != '\0') { > - if (!isdigit((unsigned char)*cp++)) > - errx(1, "seconds is invalid: %s", *argv); > + rqtp.tv_nsec += (*cp - '0') * i; > + i /= 10; > } > } > - > - rqtp.tv_sec = secs; > - rqtp.tv_nsec = nsecs; > > if (timespecisset()) { > if (nanosleep(, NULL) == -1)
Re: new man page: rcsfile.5
Hi Fabio, Fabio Scotoni wrote on Tue, Apr 23, 2019 at 01:45:22PM +0200: > While familiarizing myself with OpenRCS, > I noticed that there's no rcsfile(5). > This may be handy for people whose RCS file got corrupted for whatever > reasons. > Other *NIXes that ship SCCS tend to have a sccsfile(4), > so I figured it could be worthwhile to have something of that nature for > OpenRCS. I like this. Especially when a file format is human-readable, non-trivial, and easy to describe, describing it is worthwhile. I think the source should go to /usr/src/usr.bin/rcs/rcs.5 and it should be installed to /usr/share/man/man5/rcs.5. By the way, i recently had to hand-edit RCS files, for merging the histories from multiple different historical CVS repositories. This manual page is useful for such work. > Feedback would be appreciated, Some minor remarks inline. They are not critical, though; polishing of manual pages can also be done in-tree. > as would be proofreading by someone who's > done more work with RCS files than I have. That wouldn't be me. Either way, an additional OK to import would be appreciated. Yours, Ingo At the top of the file, a Copyright notice and license is missing. Please conisder using /usr/share/misc/license.template with the comment markers changed to .\" . > .Dd $Mdocdate$ > .Dt RCSFILE 5 > .Os > .Sh NAME > .Nm rcsfile > .Nd format of an RCS file > .Sh DESCRIPTION > An RCS file is a text file. > It consists of multiple sections, each separated by two empty lines. > RCS files should not be edited by hand; > the RCS programs should be used instead. the .Xr rcs 1 programs should be used instead. > .Pp > The RCS file consists of four sections: > .Bl -tag -width "deltatexts" -compact > .It admin > Administrative data about the RCS file itself. > .It deltas > List of deltas, i.e. metadata about changes. > .It desc > A description of what the file. > .It deltatexts > List of the changes made by each delta. > .El Please avoid inventing names that don't actually appear in the file. I suggest using: The RCS file consists of four sections: .Bl -enum -with 2n -compact .It Administrative data about the RCS file itself. .It List of deltas, i.e. metadata about changes. .It A description of what the file. .It List of the changes made by each delta. .El In that case, the structure of the rest of the text would remain more or less unchanged. Alternatively, you could shorten the structure as follows: programs should be used instead. .Ss Administrative data The first section contains administrative data about the RCS file itself. ... .Ss List of deltas And so on. > .Pp > The RCS file begins with admin data. > Each entry in this section consists of either one or multiple values. > Entries that take one value are specified on one line: > The keyword and its corresponding value are separated by > a tab character. > Entries that take multiple values begin with a keyword on one line; > the values are specified on subsequent lines, one per line and > prefixed with a tab character. > If the list is empty, the semicolon immediately follows the keyword. > Entries are terminated with a semicolon. > The entries are: > .Bl -tag -width Ds > .It Cm head Ar rev I'd prefer "Ic head" (and similarly in the following) because "head" is a stand-alone keyword, not an argument to something else. > Highest revision number. > .It Cm branch Ar rev > Revision number that specifies the default branch (optional). > .It Cm access > List of users that are allowed to check in new revisions with > .Xr ci 1 . > This keyword is not followed by a tab; > instead, the list of users is specified with one username per line. > The list is terminated by a single semicolon. > .It Cm symbols > List of symbolic names used as aliases for specific revisions. > Each entry consists of a symbolic name followed by a colon and > the revision. > .It Cm locks > List of locked revisions. > Each entry consists of a username followed by a colon and > the locked revision. > .It Cm strict > Indicates that locks are strict; > if missing, locks are not strict. > This entry takes no value and immediately follows the list of locks > without a newline. > .It Cm comment @ Ns Ar leader Ns Cm @ > Contains the comment leader surrounded by at signs. > .It Cm expand @ Ns Ar mode Ns Cm @ > Indicates the keyword substitution mode. > See > .Xr rcs 1 > for the possible keyword substitution modes. > If this entry is not given in the admin section, > .Ar kv > must be assumed. > .El > .Pp > The RCS file continues with a list of deltas. > Deltas are separated by empty lines. > The list begins with the head and ends with the initial revision. > Each entry begins with the revision number, followed by a newline. > After that, the structure follows the admin section. > .Bl -tag -width Ds > .It Cm date Ar .mm.dd.HH.MM.SS > Indicates the date and time the revision was checked in. Indicates the date and UTC time the ... > The date consists of the year, the
Re: new man page: rcsfile.5
Hi Fabio, i applied a very small number of minor tweaks and plan to commit the patch below tomorrow based on OK jmc@ deraadt@ millert@, unless i hear objections. Fabio Scotoni wrote on Tue, Apr 23, 2019 at 05:19:48PM +0200: > I've adjusted the updated man page (inline) .Dt and .Nm to match this. > > However, I would advise you not to. > rcs(5) may be surprising for anyone coming from the other > implementations of single-file version control systems: > - GNU RCS ships rcsfile(5), and > - all versions of SCCS ship sccsfile(4) (even Schily and BitKeeper). Ouch. I didn't intend to suggest changing the filename. Sorry, that was a mistake on my part. I only meant to suggest directories where the file should live. I assume those who responded positively are OK with /usr/src/usr.bin/rcs/rcsfile.5 /usr/share/man/man5/rcsfile.5 as well. I think you are right that it is better to stick to the filename used traditionally. >> At the top of the file, a Copyright notice and license is missing. >> Please conisder using /usr/share/misc/license.template with the >> comment markers changed to .\" . > I intentionally skipped on adding the copyright notice and license > because I felt it would be presumptuous -- I wanted to keep open > the option of a copyright assignment to an OpenBSD contributor open > in case the OpenBSD project wished for it, just in case. If you weren't a lawyer, i would feel tempted to suspect that you misunderstood how Copyright works in general. As things stand, i may be misunderstanding what you intend to say here. Either way, i suspect you misunderstand how ownership of intellectual property works in OpenBSD in particular. By international law (Berne Convention), you cannot give away Copyright. Copyright always resides with the natural person who is the original creator of a Work. It is inalienable. The "Copyright" line is merely a statement of fact. Is it correct that you wrote all the text in the file? If so, your name needs to be there. According to the Berne Convention, even though you cannot give away the moral rights that constitute the core of Copyright, you can give away the economic rights arising from a Work. Some publishers of software (for example the FSF) ask for that. But in OpenBSD, we strongly encourage original authors to retain all economic rights arising from their Works, and simply grant a standardized license themselves, such that OpenBSD can distribute the Works. The OpenBSD project doesn't own any rights and isn't even a legal entity. > Added in the updated man page inline, > including the $OpenBSD$ keyword at the top. Thank you for granting the license on your Work. > Your suggestions and/or subsequent in-tree changes may warrant > a copyright notice of your own. Technically, if anybody changes a file in a way adding a non-trivial amount of text - even one or two sentences will usally be sufficient, arguably even a few words if they are important and difficult to come up with -, they own Copyright on the changes and can make it known that they do by adding a "Copyright" line of their own to the file header. That isn't required though, the Copyright exists whether or not it is stated. To keep file headers concise and because tracking all Copyright-worthy changes in file headers is tedious, we tend to only add such additional lines for massive contributions, or if an authors cares unusually strongly about their contributions to a particular file. Yours, Ingo Index: Makefile === RCS file: /cvs/src/usr.bin/rcs/Makefile,v retrieving revision 1.40 diff -u -p -r1.40 Makefile --- Makefile15 Oct 2010 08:44:12 - 1.40 +++ Makefile23 Apr 2019 17:04:55 - @@ -1,7 +1,9 @@ # $OpenBSD: Makefile,v 1.40 2010/10/15 08:44:12 tobias Exp $ PROG= rcs -MAN= ci.1 co.1 ident.1 merge.1 rcs.1 rcsclean.1 rcsdiff.1 rcsmerge.1 rlog.1 +MAN= ci.1 co.1 ident.1 merge.1 \ + rcs.1 rcsclean.1 rcsdiff.1 rcsmerge.1 rlog.1 \ + rcsfile.5 SRCS= ci.c co.c ident.c merge.c rcsclean.c rcsdiff.c rcsmerge.c rcsparse.c \ rcsprog.c rlog.c rcsutil.c buf.c date.y diff.c diff3.c rcs.c rcsnum.c \ Index: rcsfile.5 === RCS file: rcsfile.5 diff -N rcsfile.5 --- /dev/null 1 Jan 1970 00:00:00 - +++ rcsfile.5 23 Apr 2019 17:04:55 - @@ -0,0 +1,154 @@ +.\" $OpenBSD$ +.\" +.\" Copyright (c) 2019 Fabio Scotoni +.\" +.\" Permission to use, copy, modify, and distribute this software for any +.\" purpose with or without fee is hereby granted, provided that the above +.\" copyright notice and this permission notice appear in all copies. +.\" +.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES +.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF +.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR +.\" ANY SPECIAL, DIRECT, INDIRECT, OR
Re: new man page: rcsfile.5
Hi Fabio, Fabio Scotoni wrote on Tue, Apr 23, 2019 at 07:46:31PM +0200: > On 4/23/19 7:15 PM, Ingo Schwarze wrote: >> Is it correct that you wrote all the text in the file? >> If so, your name needs to be there. > It is indeed correct that I wrote all the text in the file (except, by > definition, your suggestions). Thanks for confirming. (Misunderstandings about authorship would be bad.) >> Index: rcsfile.5 [...] >> +.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. >> +.\" > Nitpick: The other man pages in the usr.bin/rcs/ tree do not have a > trailing .\" -- except for rcs(1). > You may wish to make this consistent this one way or another, > likely in a separate commit. I put the .\" here because i guess that's the more usual way in OpenBSD, but it's not important enough for changing it in existing pages. [...] >> +.Bl -tag -width Ds >> +.It Ic date Ar . Ar mm . Ar dd . Ar HH . Ar MM . Ar SS > Are you sure you want to do this without Pf/Ns? > You may have inadvertently introduced spaces here and Oops, indeed, thanks for catching that. Fixed in my tree: .It Ic date Ar . Ns Ar mm . Ns Ar dd . Ns Ar HH . Ns Ar MM . Ns Ar SS > Maybe this should be [YY]YY because two-digit years do exist. I think the Oo Ar YY Oc Ns Ar YY . which that would require would be even more complicated than what is there now. The can be considered a placeholder, and the text already makes it clear that the actual value can be shorter. Yours, Ingo
Re: new man page: rcsfile.5
Hi Fabio, Fabio Scotoni wrote on Wed, Apr 24, 2019 at 06:14:07AM +0200: > On 4/23/19 9:30 PM, Ingo Schwarze wrote: >> Fabio Scotoni wrote: >>> Ingo Schwarze wrote: >>>> +.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. >>>> +.\" >>> Nitpick: The other man pages in the usr.bin/rcs/ tree do not have a >>> trailing .\" -- except for rcs(1). >>> You may wish to make this consistent this one way or another, >>> likely in a separate commit. >> I put the .\" here because i guess that's the more usual way in >> OpenBSD, but it's not important enough for changing it in existing >> pages. > Duly noted. > I wonder if it might be worth noting that somewhere, though I'm not sure > where. It is neither possible to document such minor details (which i would not even call a style convention) nor desirable to attempt to, lest we drown in unimportant rules. I do document some recommendations that are below the threshold of warranting inclusion into manual pages on https://mandoc.bsd.lv/mdoc/ but this particular detail is way below the threshold for inclusion even in that extended documentation. If people do it differently, it doesn't really matter. > With these changes, rcsfile.5 looks okay to me now. This for writing and checking the page. I just committed it. Yours, Ingo
Re: [diff] events.html - add 2 more BSDCan 2019 videos
Hi Ross, Ross L Richardson wrote on Sat, Jun 22, 2019 at 02:17:26PM +1000: > Probably in acceptable form :-) Committed, thanks. Ingo > Index: events.html > === > RCS file: /cvs/www/events.html,v > retrieving revision 1.1174 > diff -u -p -r1.1174 events.html > --- events.html 18 Jun 2019 15:01:52 - 1.1174 > +++ events.html 22 Jun 2019 04:09:17 - > @@ -68,7 +68,9 @@ May 15-18, 2019, Ottawa, Canada. > (http://bhyvecon.org/;>bhyvecon Ottawa 2019) > Bob Beck - > https://github.com/bob-beck/libtls/blob/master/TUTORIAL.md;>libtls > for beginners conference tutorial > -Theo Buehler - Design and verification of the TLS 1.3 handshake state > machine in LibreSSL (slides) > +Theo Buehler - Design and verification of the TLS 1.3 handshake state > + machine in LibreSSL (slides, > + https://www.youtube.com/watch?v=MCVIBwGOwNY;>video) > Florian Obser - https://man.openbsd.org/unwind.8;>unwind(8) >a privilege-separated, validating DNS recursive nameserver for every laptop >(slides, > @@ -82,7 +84,8 @@ May 15-18, 2019, Ottawa, Canada. >(slides, >https://www.youtube.com/watch?v=s6rAXaHylFM;>video) > Bob Beck > -Unveil in OpenBSD > + Unveil in OpenBSD > + (https://www.youtube.com/watch?v=gvmGfpMgny4;>video) > Jan Klemkow - Network booted OpenBSD Workstations >(slides, >https://www.youtube.com/watch?v=kFqHXfWEB4o;>video)
Re: [diff] www/libressl/papers.html - add video link
Hi Ross, Ross L Richardson wrote on Sat, Jun 22, 2019 at 02:24:46PM +1000: > The following corresponds with the events.html patch... Almost... Committed with the correct YT ID MCVIBwGOwNY instead of gvmGfpMgny4 which is beck@'s unveil(2) talk. Thanks anyway, Ingo > Index: papers.html > === > RCS file: /cvs/www/libressl/papers.html,v > retrieving revision 1.16 > diff -u -p -r1.16 papers.html > --- papers.html 13 Jun 2019 07:34:36 - 1.16 > +++ papers.html 22 Jun 2019 04:20:59 - > @@ -19,7 +19,7 @@ Presentations and Papers > > https://www.openbsd.org/papers/bsdcan2019-tls13.pdf;>Design and > verification of the TLS 1.3 handshake state machine in LibreSSL > by Theo Buehler > - > + ( href="https://www.youtube.com/watch?v=gvmGfpMgny4;>video) > > > Presentation: FSec 2015
Re: add pkg-config files for readline, editline, ncurses
Hi Stephen, Stephen Gregoratto wrote on Fri, Jul 26, 2019 at 01:23:21AM +1000: > mrsh[1], a cross-platform shell, can use readline in interactive mode. > It's configure script detects the presence of readline using > pkg-config(1). Thus, this patch adds a pkg-config file for our readline. It is not obvious to me whether or not that is desirable. It depends on which version of readline mrsh requires. If mrsh builds and is usable with our base system readline (which is quite old), what you propose may help. On the other hand, if it requires a newer readline or doesn't work well with the base readline, adding a library dependency on ports readline might be better. > I just copied over the generate_pkgconfig.sh script/make rules found in > other libraries and edited to fit. > > I also added pkg-config files for libedit and libcurses, I think that ought to be a completely separate change because a program either uses base libedit or base readline or ports readline but never more than one of these. I know relatively little about pkg-config(1); the reason i'm speaking up here is that nobody else spoke up so far and that i occasionally worked on libedit in the past. Now, as far as i understand (take this with a grain of salt), the effect of adding a pkg-config file for libedit would possibly be that additional ports might pick up our base libedit that now think that no libedit is available and consequently compile without it. That might improve some ports - and/or break others. So i suspect that adding libedit.pc to base would probably require a bulk build before commit to make sure nothing breaks. Also - even though it seems less likely that anything in the base system looks at *.pc files - it would probably require a complete make build & release of base and xenocara. For adding readline.pc, the same caveats apply, only even more so. I suspect that the risk of breaking ports by adding readline.pc to base is larger than for editline because probaly more ports use readline than editline and those that do are probably more likely to use pkg-config(1) and those that do both are probably more likely to be unhappy with our base readline, whereas ports looking for editline probably have a better chance to be happy with our base editline. As far as i understand, adding editline.pc does not necessarily require also adding ncurses.pc, nor does it require Requires: lines. So far, we don't have a single non-empty Requires: line in base even though libssl depends on libcrypto. Again, the effect of adding ncurses.pc to base might be that some ports suddenly pick up our base curses, for good or ill. If there is a benefit to that, i suspect it might make sense to test that change independently from libedit.pc. I don't think we want a bulk build with three combined changes and lots of fallout, because then we would be forced to figure out which fallout was caused by which part of the changes, which sounds like gratuitious additional pain. > since libedit requires linking with libcurses. I started getting > build errors before I realised that there are a couple editline > libraries floating around, each differing slightly with each other: *If* we want editline.pc, then i expect we probably just want one, for our own editline. Additional ones would only make sense to me in the context of a port of one of the other editline implementations. But i don't really see the point of such a port. If some port does not work with our editline but does work with a different editline, we should probably either fix our editline or patch the port to work with our editline, depending on what exactly to port is doing. > Some Linux systems (e.g. Arch) generate pkg-config files for the > different "types" of ncurses: > > /usr/lib/pkgconfig/ncurses++.pc -> ncurses++w.pc > /usr/lib/pkgconfig/ncurses++w.pc > /usr/lib/pkgconfig/ncurses.pc -> ncursesw.pc > /usr/lib/pkgconfig/ncursesw.pc > /usr/lib/pkgconfig/tic.pc -> ncursesw.pc > /usr/lib/pkgconfig/tinfo.pc -> ncursesw.pc > > Not sure if this is something I should do here. I doubt it. In any case, the base system should contain one *curses*.pc at most, i think, because we only have one curses in base. Any other *curses*.pc would be a matter of ports - and like for editline, i doubt that having a port of a different version of curses makes sense. That said, even though you are changing the base system, i suspect discussing this kind of patches may be more fruitful on ports@ because that's where all the ports developers are that will be affected by such changes. I suggest that you decide which of the three changes you care about most, then send a patch to ports@ with only that one change, explaining what exactly it is supposed to improve (and how you tested that it indeed does), and what exactly you already tested in addition to that (for example, building which ports that use pkg-config(1) and the library in question), asking for feedback. The feedback you
less(1): utility function to step one multibyte character to the left
Hi, the file less/line.c currently contains three copies of code to step one multibyte character to the left in a char * buffer, and cleaning up the file less/cmdbuf.c will require similar functionality in the future. So let's introduce a new function for that purpose. Keep it as similar as possible to the ANSI C function mbtowc(3). Then again, there is no need to handle the case of the input char * being NULL; for resetting, people should use mbtowc(3) itself, not the new function. The logic in less/cmdbuf.c is so contorted that cleaning that up will still be challenging, but this at least provides one piece to help with the puzzle. Behaviour may change in LC_CTYPE=C mode if linebuf[curr-1] contains an explicit '\0' byte. I think in that case, jumping out of the backc() function, just like it is already done for UTF-8 mode, i.e. not trying to back up across the NUL byte, is more robust behaviour. Either way, linebuf[] is not supposed to contain NUL bytes except in -r mode, and -r mode is almost unusable in the first place. No other functional change is intended. Minus 16 lines of code, by the way... OK? Ingo Index: charset.c === RCS file: /cvs/src/usr.bin/less/charset.c,v retrieving revision 1.24 diff -u -p -r1.24 charset.c --- charset.c 15 May 2019 19:36:20 - 1.24 +++ charset.c 30 Aug 2019 13:02:01 - @@ -146,6 +146,27 @@ init_charset(void) } /* + * Like mbtowc(3), except that it converts the multibyte character + * preceding ps rather than the one starting at ps. + */ +int +mbtowc_left(wchar_t *pwc, const char *ps, size_t psz) +{ + size_t sz = 0; + int len; + + do { + if (++sz > psz) + return -1; + } while (utf_mode && IS_UTF8_TRAIL(ps[-sz])); + if ((len = mbtowc(pwc, ps - sz, sz)) == -1) { + (void)mbtowc(NULL, NULL, 0); + return -1; + } + return len == sz || (len == 0 && sz == 1) ? len : -1; +} + +/* * Is a given character a "control" character? */ static int Index: funcs.h === RCS file: /cvs/src/usr.bin/less/funcs.h,v retrieving revision 1.23 diff -u -p -r1.23 funcs.h --- funcs.h 15 May 2019 19:36:20 - 1.23 +++ funcs.h 30 Aug 2019 13:02:02 - @@ -55,6 +55,7 @@ void ch_set_eof(void); void ch_init(int, int); void ch_close(void); int ch_getflags(void); +int mbtowc_left(wchar_t *, const char *, size_t); void init_charset(void); char *prchar(LWCHAR); char *prutfchar(LWCHAR); Index: line.c === RCS file: /cvs/src/usr.bin/less/line.c,v retrieving revision 1.31 diff -u -p -r1.31 line.c --- line.c 15 May 2019 19:06:01 - 1.31 +++ line.c 30 Aug 2019 13:02:02 - @@ -437,44 +437,20 @@ backc(void) wchar_t ch, prev_ch; int i, len, width; - i = curr - 1; - if (utf_mode) { - while (i >= lmargin && IS_UTF8_TRAIL(linebuf[i])) - i--; - } - if (i < lmargin) + if ((len = mbtowc_left(, linebuf + curr, curr)) <= 0) return (0); - if (utf_mode) { - len = mbtowc(, linebuf + i, curr - i); - if (len == -1 || i + len < curr) { - (void)mbtowc(NULL, NULL, MB_CUR_MAX); - return (0); - } - } else - ch = linebuf[i]; + curr -= len; /* This assumes that there is no '\b' in linebuf. */ - while (curr > lmargin && column > lmargin && - (!(attr[curr - 1] & (AT_ANSI|AT_BINARY { - curr = i--; - if (utf_mode) { - while (i >= lmargin && IS_UTF8_TRAIL(linebuf[i])) - i--; - } - if (i < lmargin) + while (curr >= lmargin && column > lmargin && + !(attr[curr] & (AT_ANSI|AT_BINARY))) { + if ((len = mbtowc_left(_ch, linebuf + curr, curr)) <= 0) prev_ch = L'\0'; - else if (utf_mode) { - len = mbtowc(_ch, linebuf + i, curr - i); - if (len == -1 || i + len < curr) { - (void)mbtowc(NULL, NULL, MB_CUR_MAX); - prev_ch = L'\0'; - } - } else - prev_ch = linebuf[i]; width = pwidth(ch, attr[curr], prev_ch); column -= width; if (width > 0) return (1); + curr -= len; if (prev_ch == L'\0') return (0); ch = prev_ch; @@ -554,21 +530,8 @@ store_char(LWCHAR ch, char a, char *rep, } if (w == -1) { wchar_t prev_ch; - - if
Re: delete ligature support for Arabic "la" from the less(1) command line
Hello Mohammadreza, Mohammadreza Abdollahzadeh wrote on Sun, Sep 01, 2019 at 09:40:16AM +0430: > Persian is my native language and I think that the major problem that > all RTL (Right-To-Left) languages like Persian and Arabic currentlly suffer > from is the lack of BiDi (Bidirectionality) support in console and terminal > environment like xterm(1). KDE konsole(1) support bidi and that's why it > show ligatures correctly. > I think any attempt to fix such problems must first start with adding bidi > support to xterm and other terminal environment. Thank you for your feedback! If i understand correctly, xterm(1) does indeed have that problem. I prepared a test file that contains, in this order, - some Latin characters - the Arabic word "la" ("no"), i.e. first LAM, then ALEF - some more Latin characters - the Arabic word "al" ("the"), i.e. first ALEF, then LAM - some final Latin characters And indeed, xterm(1) does not respect the writing direction of the individual words. When cat(1)'ing the file to stdout, both xterm(1) and konsole(1) show all the words from left to right, but *inside* each word, konsole(1) uses the correct writing direction: right to left for Arabic and left to right for Latin. For example, in the Arabic word "al", konsole(1) correctly shows the ALEF right of the LAM, whereas xterm(1) wrongly shows the ALEF left of the LAM. I'm not entirely sure this has much to do with ligatures, though. What matters for building ligatures is only the logical ordering, the ordering in *time* so to speak, i.e. what comes before and what comes after. LAM before ALEF has to become the ligature glyph "al", whereas ALEF before LAM remains two glyphs. Technically, the question of ordering in space, whether glyphs are painted onto the screen right to left or left to right, only comes into play after characters have already been combined into glyphs. Actually, now that you bring up the topic, i see another situation where less(1) causes an issue. Let's use konsole(1) and not xterm(1) such that we get the correct writing direction, and let's put the word "al" onto the screen. No ligature here, so that part of the topic is suspended for a moment. Now let's slowly scroll right in one-column steps. All is fine as long as the word "al" is completely visible on screen. But when the final letter LAM of "al" is in the last (leftmost) column of the screen and you scroll right one more column, something weird happens, even in konsole(1). You would expect the final letter LAM to scroll off screen first and the initial letter ALEF to remain on the screen for a little longer. Instead, less(1) incorrectly thinks the *initial* letter of the word scrolls off screen first, and it tells xterm(1) to display the ALEF in the leftmost column of the screen while the LAM just went off-screen. That looks weird because there is no word in that text beginning with ALEF. This means that being able to properly view Arabic or Farsi text with the default OpenBSD terminal emulator and parser would require 1. bidi support in xterm(1) to render Farsi words with the correct writing direction 2. ligature support in xterm(1) to correctly connect letters 3. bidi support in less(1) to correctly scroll parts of words on and off screen, horizontally 4. ligature support in less(1) for correct columnation As far as i understand, you are saying that the extremely fragmentary support for item 4 which we happen to have right now is not really useful without items 1-3, and even when using konsole(1), which does have items 1 and 2, implementing item 3 before item 4 would make sense because item 3 is more importrant. So my understanding is that you are not objecting to the patch because the fragmentary support for item 4 is practically useless in isolation. The following is not related to this patch, but i think it makes sense to mention it here: regarding the future, i think items 1 and 3 are much easier to support than items 2 and 4 because bidi support, if i understand correctly, only needs one bit of information per character because it only needs to know whether the character is part of a right to left or left to right script, so the complexity on the libc level, where we want complexity least of all places, is comparable to other boolean character properties like those listed in the iswalnum(3) manual page. Realistically, though, bidi support would still be a large project, and i don't think it makes sense to tackle it any time soon. Ligature support feels much worse than bidi support because the mapping required is not merely character -> boolean but (character + character) -> character, which is more complicated than even the (character + character) -> -1/0/+1 mapping required for collation support - and we decided that we don't want collation support in libc because it would cause excessive complexity. Admittedly, collations are strongly locale-dependent, while i'm not sure ligatures are
Re: libedit: Does not run input command in vi mode
Hello Masato-san, Masato Asou wrote on Tue, Sep 03, 2019 at 02:39:11PM +0900: > Does not run input command by vi editor with vi mode. > > I do the following: > > 1. set vi mode. >$ echo "bind -v" > ~/.editrc > > 2. launch /usr/bin/ftp command. >$ ftp > > 3. launch vi editor with ESC + v. >ftp> ESC + v > > 4. input "help" in vi editor. >i + help + ESC + :wq > > 5. then 'help' command does not run. > > I fix this problem with following patch. This fix is come from NetBSD > lib/libedit/vi.c 1.46 and 1.47. > > ok? Indeed, limiting the length of the command created with vi(1) to the length of the string the command line contained when vi(1) was started is wrong. Both code inspection and testing confirm that this is a bug and that your patch is the correct fix. IK schwarze@; please commit. Thank you, Ingo > Index: vi.c > === > RCS file: /cvs/src/lib/libedit/vi.c,v > retrieving revision 1.27 > diff -u -p -r1.27 vi.c > --- vi.c 3 Sep 2019 02:28:25 - 1.27 > +++ vi.c 3 Sep 2019 05:34:31 - > @@ -1058,12 +1058,12 @@ vi_histedit(EditLine *el, wint_t c __att > while (waitpid(pid, , 0) != pid) > continue; > lseek(fd, (off_t)0, SEEK_SET); > - st = read(fd, cp, TMP_BUFSIZ); > + st = read(fd, cp, TMP_BUFSIZ - 1); > if (st > 0) { > - len = (size_t)(el->el_line.lastchar - > - el->el_line.buffer); > + cp[st] = '\0'; > + len = (size_t)(el->el_line.limit - el->el_line.buffer); > len = mbstowcs(el->el_line.buffer, cp, len); > - if (len > 0 && el->el_line.buffer[len -1] == '\n') > + if (len > 0 && el->el_line.buffer[len - 1] == '\n') > --len; > } > else
Re: typo in man page of sysupgrade option -s
Hi Florian, Florian Obser wrote on Fri, Aug 23, 2019 at 02:43:22PM +0200: > This is not a typo as such, it describes the default. It is a bit > awkward since it uses the same text for -r and -s. I think we can just > remove it from -s: The following seems more straightforward. I don't feel strongly about it, though; you patch is OK with me, too. Yours, Ingo Index: sysupgrade.8 === RCS file: /cvs/src/usr.sbin/sysupgrade/sysupgrade.8,v retrieving revision 1.8 diff -u -r1.8 sysupgrade.8 --- sysupgrade.89 May 2019 21:09:37 - 1.8 +++ sysupgrade.823 Aug 2019 12:54:38 - @@ -62,16 +62,10 @@ but do not reboot. .It Fl r Upgrade to the next release. -The default is to find out if the system is running a release or a snapshot. -In case of release -.Nm -downloads the next release. +This is the default if the system is currently running a release. .It Fl s Upgrade to a snapshot. -The default is to find out if the system is running a release or a snapshot. -In case of release -.Nm -downloads the next release. +This is the default if the system is currently running a snapshot. .El .Sh FILES .Bl -tag -width "/home/_sysupgrade" -compact
less(1): replace the last step_char(-1)
Hi, The command line handling code in less/cmdbuf.c is very complicated. >From the top level function cmd_char(), the stack can go down nine levels before finally reaching the bottom level function cmd_step_common(). One of the functions traversed during that descent is the recursive function cmd_repaint(), and the call tree has many different branches. UTF-8 handling occurs both in the top level function and in the bottom level function, and also at some places in between. So when cleaning up the UTF-8 handling in here, i'm trying to proceed very slowly and in minimal steps. Given that i have little hope of finding a system for testing each and every possible code path, my plan is to make sure that individual changes do not change behaviour locally (or only in intended ways). So, lets get to the first, small step. Right now, only one place remains where the function step_char() is called with dir = -1, i.e. moving the cursor leftward: from cmd_step_left() near the bottom of the call stack. That function is used when moving the cursor left by one character (with or without erase), by one word, or all the way back to the prompt. It is also used to delete leftover characters after repainting the line when the new text is shorter than the old text. And it is also used when scrolling forward. Given the complexity of the code, there may be even more situations. The obvious replacement for step_char(-1) is the new function mbtowc_left() that i recently introduced. Using that allows dropping the "dir" argument from step_char(). Of course, the plan still is to ultimately get rid fo step_char() altogether. Let's make sure the new behaviour makes sense: * When at the beginning of the buffer, do not move, and use L'\0' as a replacement for the character that isn't there. Same behaviour as before. * When there is no valid UTF-8 character to the left (return value -1), back up one byte and use L'\0'. This _is_ a change of behaviour. Current behaviour is to back up across all UTF-8 continuation bytes (no matter how many) until finding a byte that is not a continuation byte, then take the length given in that byte at face value (even if invalid) and construct a codepoint using that number of bytes (even if that implies ignoring additional garbage bytes in between, and even it it means extending the character that is supposedly to the left of the cursor to bytes that are to the right of the current cursor position). I believe making multibyte steps only for valid characters, and stepping byte-by-byte across invalid bytes, is safer than the reckless jumping done right now. Then again, the practical difference may be small (if any) because i'm not quite sure how to get invalid bytes into the command buffer in the first place. * When there is a NUL byte to the left (return value 0), back up by this one byte, and use it as L'\0'. Same behaviour as before. Tested by typing UTF-8 into the command buffer, moving the cursor left and right across it (with and without deleting), inserting UTF-8 before it, and scrolling the UTF-8 off the screen both to the right and to the left. OK? Ingo Index: charset.c === RCS file: /cvs/src/usr.bin/less/charset.c,v retrieving revision 1.26 diff -u -p -r1.26 charset.c --- charset.c 2 Sep 2019 14:07:45 - 1.26 +++ charset.c 2 Sep 2019 15:00:50 - @@ -353,7 +353,7 @@ get_wchar(const char *p) * Step forward or backward one character in a string. */ LWCHAR -step_char(char **pp, int dir, char *limit) +step_char(char **pp, char *limit) { LWCHAR ch; int len; @@ -361,11 +361,8 @@ step_char(char **pp, int dir, char *limi if (!utf_mode) { /* It's easy if chars are one byte. */ - if (dir > 0) - ch = (LWCHAR) ((p < limit) ? *p++ : 0); - else - ch = (LWCHAR) ((p > limit) ? *--p : 0); - } else if (dir > 0) { + ch = (LWCHAR) ((p < limit) ? *p++ : 0); + } else { len = utf_len(*p); if (p + len > limit) { ch = 0; @@ -374,13 +371,6 @@ step_char(char **pp, int dir, char *limi ch = get_wchar(p); p += len; } - } else { - while (p > limit && IS_UTF8_TRAIL(p[-1])) - p--; - if (p > limit) - ch = get_wchar(--p); - else - ch = 0; } *pp = p; return (ch); Index: cmdbuf.c === RCS file: /cvs/src/usr.bin/less/cmdbuf.c,v retrieving revision 1.20 diff -u -p -r1.20 cmdbuf.c --- cmdbuf.c2 Sep 2019 14:07:45 - 1.20 +++ cmdbuf.c2 Sep 2019 15:00:51 - @@ -141,7 +141,7 @@ len_cmdbuf(void)
delete ligature support for Arabic "la" from the less(1) command line
Hi, i have to admit that i am neither able to speak nor to write nor to understand the Arabic language nor the Arabic script, but here is my current, probably incomplete understanding of what our less(1) program is trying to do with Arabic ligatures. If somebody is reading this who is able to read and write Arabic or an Indian language heavily using ligatures, feedback is highly welcome. Arabic is a cursive script, which means that when writing Arabic, characters do not map 1:1 to glyphs. Instead, there are rules about how adjacent characters attach to each other, forming ligatures. As an extremely simple example, consider the Arabic adverb "la", which means the same as the English adverb "no". It consists of the two letters U+0644 LAM and U+0627 ALEF, the LAM appearing before (i.e. to the right of) the ALEF. However, you do not write both letters separately. Instead, the ALEF leans forward (to the left) and attaches to the LAM, forming the glyph U+FEFB, ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM. When displayed in a fixed width font, that ligature only occupies a single display column just like any other Arabic or Latin glyph. The LAM WITH ALEF glyph is not a double-width glyph like Japanese or Chinese characters typically are. So, when this happens, you have four bytes of UTF-8 forming two Unicode characters, and *together*, these two characters occupy only one single display column. Note that in the default configuration, our xterm(1) is not able to display Arabic characters at all. But even when you run xterm -fa arabic or xterm -fa fixed which uses FreeType support instead of the default X toolkit font support, such that xterm(1) does become able to display single Arabic characters, it still displays the word "la" incorrectly, failing to generate the required ligature and instead displaying the two characters LAM and ALEF separately. So i installed konsole-18.12.0p1 for testing (which pulls in ridiculous amounts of dependencies, dozens of them, but oh well, i guess support for advanced Unicode features isn't trivial). The konsole(1) program does display the word "la" correctly, as a ligature. Now, running less(1) inside konsole(1), i found that columnation is already subtly broken. As long as the "la" ligature is visible on screen, all is fine. Now scroll to the right until the "la" appears in the first screen column. Then scroll one more column to the right by pressing "1 RIGHTARROW". Now you see *half* the ligature, i.e. an isolated ALEF, in the first column of the screen, even though the Arabic word does not contain an isolated ALEF. Besides, we just attempted to scroll the "la" off screen, so the ALEF now appears in the column one to the right of where the "la" should actually be, and all the rest of the line is shifted one column to the right, too, so columnation is now off by one. Scrolling back left, columnation recovers to correct display. I strongly suspect i broke that during my previous UTF-8 cleanup work on less(1). However, LAM WITH ALEF is literally the only ligature that less(1) supports, together with three variations (with MADDA above, with HAMZA above, and with HAMZA below). But there are hundreds of ligatures in Arabic, see https://www.unicode.org/charts/PDF/UFB50.pdf https://www.unicode.org/charts/PDF/UFE70.pdf I have no idea how many of those work in konsole(1) - but i'm sure none of those, except the four LAM WITH ALEF discussed here, work with less(1), so i think support for LAM WITH ALEF provided no value in the first place. The way it is implemented, with an ad-hoc table inside less(1) of character combinations that form ligatures, is just wrong and not sustainable by any stretch of the imagination, i think. On top of that, how characters combine in Arabic is strongly context dependent; even the syllable "la" forms a different ligature depending on whether it is isolated or at the end of a longer word, and none of the context dependencies are implemented in less(1) anyway. And finally, people say the situation in many Indian languages is even more dire than in Arabic, so what our less(1) tries to do is almost certainly completely useless for those languages, even if we would expand the ad-hoc table. So, i propose to delete support for combining characters into ligatures from our less(1): at this point, it is only used for typing at the less prompt anyway (and not for the file displayed), only for Arabic, and only for the single ligature "la". If we ever want better ligature support in the future, i think we would have to make a fresh start anyway - and i think there are many other things to do before that. Note that this only removes support for combining characters into ligatures that can also stand on their own; support for purely combining accents like U+300 COMBINING GRAVE ACCENT and U+3099 COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK remains intact. OK? Ingo Index: charset.c
Re: remove chroot(2) from spamd(8)
Hi, Theo de Raadt wrote on Wed, Jul 31, 2019 at 09:48:57AM -0600: > Ricardo Mestre wrote: >> By now we are already confident that pledge(2) "just works(tm)" >> and that it can be used to effectively remove filesystem access. >> >> That being said, in spamd(8) when I pledge(2)d it the main priv process got >> "stdio inet" which means there's no fs access at all so calling >> chroot(2)/chdir(2) here doesn't make sense anymore. Just remove them. > I'm fine with removing chroot+chdir where unveil or pledge provide the > same (or better) isolation. > > But let's do it program by program, carefully considering the case. > > I need to mention this change makes such programs unsafe to operate > in portable environments which lack pledge/unveil. An extra safety > belt is being removed we don't need, but other people need that > safety belt a lot. That doesn't change my view we should make > the programs simpler without chroot; the impact is for downstreams > to handle. Maybe put a comment above the pledge() call? Something like: /* * When porting this program to a platform lacking pledge(2), * don't forget to at least properly chroot(2) the child instead. */ The spamd(8) utility does look like a plausible candidate for porting, but i'm not sure how familiar the porters are with OpenBSD ways, so the removal of chroot(2) might easily cause misunderstandings. Consider e.g. projects like http://greyd.org/ https://aur.tuna.tsinghua.edu.cn/packages/greyd/ which appear to have very little connection to OpenBSD. Yours, Ingo
Re: remove chroot(2) from spamd(8)
Hi, Ricardo Mestre wrote on Wed, Jul 31, 2019 at 07:41:08PM +0100: > On 11:22 Wed 31 Jul , Theo de Raadt wrote: >> Ingo Schwarze wrote: >>> /* >>> * When porting this program to a platform lacking pledge(2), >>> * don't forget to at least properly chroot(2) the child instead. >>> */ >> But I really don't see the benefit of such lines of text. >> Outsiders won't read the text, and it wastes our screen real estate. > Regarding downstream I'm "not too worried" since if you follow us either > you are already using OpenBSD or you should apply best policies already, Fair enough. I'm hardly the right person to provide an OK here, but i have no objection. Yours, Ingo
Re: printf(1) man page has a small omission
Hi Andras, please do not cross-post on OpenBSD lists, choose whatever list fits best. I trimmed bugs@ for this followup. On Mon, Jun 3, 2019 at 2:12 PM Andras Farkas wrote: > https://man.openbsd.org/man1/printf.1 > The section on the b format (%b) neglects to mention that for that > format, it's \0num rather than \num > Because of the way OpenBSD's printf is made, both \0num and \num work > for %b, but, \0num is more correct and portable when using printf(1)'s > %b format. > https://pubs.opengroup.org/onlinepubs/9699919799/utilities/printf.html I agree that deserves describing, just in case some system ever implements the standard and only the standard. Andras Farkas wrote on Mon, Jul 29, 2019 at 11:54:18PM -0400: > I have a diff attached which fixes the man page. I don't quite agree with your patch. In practice, both \0num and \num work; i inspected the code of FreeBSD and NetBSD which both appear to support \num, too, even though they don't document it, and i tested on Linux, and on Solaris 9, 10, and 11, and both forms work everywhere: $ printf '%b\n' '\0176x' ~x $ printf '%b\n' '\176x' ~x So here is an alternative patch. OK? Ingo Index: printf.1 === RCS file: /cvs/src/usr.bin/printf/printf.1,v retrieving revision 1.32 diff -u -r1.32 printf.1 --- printf.12 Jun 2019 06:16:37 - 1.32 +++ printf.11 Aug 2019 21:53:11 - @@ -319,6 +319,15 @@ Characters from the string .Ar argument are printed with backslash-escape sequences expanded. +In the +.Ar argument , +ASCII characters can be octally encoded either as +.Cm \e0 Ns Ar num +or as +.Cm \e Ns Ar num +like in the +.Ar format +string. If the .Ar argument contains the special escape sequence @@ -373,7 +382,17 @@ .Ev LC_ALL Ns =C were set. .Pp -The escape sequences \ee and \e' are extensions to that specification. +The escape sequences +.Cm \ee +and +.Cm \e' , +as well as omitting the leading digit +.Cm 0 +from +.Cm \e0 Ns Ar num +octal escape sequences in +.Cm %b +arguments, are extensions to that specification. .Sh HISTORY The .Nm
Re: Remove some sigvec Xr's
Hi Philip, Philip Guenther wrote on Thu, Jul 25, 2019 at 07:21:48PM -0900: > Hmm: sh(1) and ksh(1) have *nothing* from sections 2 or 3 in their SEE > ALSO. That doesn't seem like a wrong choice, Indeed. Jason generally discourages linking from section 1 to sections 2 and 3, arguing that neither interactive use of a program nor shell programming ought to require worrying about C interfaces. There are exceptions, for example to avoid duplicating lists, e.g. pointing from sysctl(8) to sysctl(2) or from kill(1) to sigaction(2). > albeit inconsistent with csh(1). My feeling is csh(1) is the odd one out here. The amount of cross references from csh(1) to section 2 looks excessive. For the umask builtin command, i kind of see the point of pointing to section 2 - but to chmod(2) where the bits are explained, umask(2) being less relevant, and it would also seem more useful close to the explanation of the umask builtin rather than below SEE ALSO. > Part of me feels like _if_ they're going to mention umask(2), > setrlimit(2), and sigaction(2), then they should mention chdir(2), > as the other classic "must be in the shell" syscall. The question to ask is: is it likely that somebody who decided to look up csh(1) will also benefit from reading chdir(2), in the same context? I don't quite see yet why that might help. I'd rather move into the opposite direction, see below. By the way, i suspect that the reason why the .Xr to ksh(1) is missing is that ksh(1) and sh(1) historically used to be the same manual page, until Jason rewrote a clean version of sh(1) from scratch a few years ago. OK? Ingo Index: csh.1 === RCS file: /cvs/src/bin/csh/csh.1,v retrieving revision 1.82 diff -u -r1.82 csh.1 --- csh.1 26 Jul 2019 12:08:18 - 1.82 +++ csh.1 28 Jul 2019 23:07:13 - @@ -2219,6 +2219,9 @@ the mask are 002 giving all access to the group and read and execute access to others or 022 giving all access except write access for users in the group or others. +The +.Xr chmod 2 +manual page provides the complete list of bits that can be set. .Pp .It Ic unalias Ar pattern All aliases whose names match the specified pattern are discarded. @@ -2702,16 +2705,8 @@ .Dq ~name .El .Sh SEE ALSO +.Xr ksh 1 , .Xr sh 1 , -.Xr access 2 , -.Xr execve 2 , -.Xr fork 2 , -.Xr pipe 2 , -.Xr setrlimit 2 , -.Xr sigaction 2 , -.Xr umask 2 , -.Xr wait 2 , -.Xr killpg 3 , .Xr tty 4 , .Xr environ 7 , .Xr script 7
Re: SIGSEGV in libedit
Hello Yasuoka-san, YASUOKA Masahiko wrote on Mon, Aug 05, 2019 at 04:20:41PM +0900: > This is the diff which fixes the problem by replacing malloc by calloc. > > Initialize the line buffer by zero when allocation. This fixes the > problem a crash happens after the window size change. Actually, only the two "b[i] = calloc(..." changes are needed. The two "b = calloc(..." are pointless because b[] is already explicitly and completely initialized four lines below. Then again, even though i doubt that we still regard NetBSD as "upstream" for libedit, we do still compare code occasionally, so avoiding gratuitious differences still makes sense. In this case, the "b = calloc(..." is harmless precisely because the initializarion has absolutely no effect, being completely overwritten four lines below. Performance is irrelevant here because this function is called quite rarely and certainly not in a loop. So the diff is OK schwarze@ as is. Thanks for tracking this down and finding the best fix. Yours, Ingo > Index: lib/libedit/terminal.c > === > RCS file: /cvs/src/lib/libedit/terminal.c,v > retrieving revision 1.18 > diff -u -p -r1.18 terminal.c > --- lib/libedit/terminal.c12 Apr 2017 18:24:37 - 1.18 > +++ lib/libedit/terminal.c5 Aug 2019 07:13:08 - > @@ -413,11 +413,11 @@ terminal_alloc_display(EditLine *el) > wchar_t **b; > coord_t *c = >el_terminal.t_size; > > - b = reallocarray(NULL, c->v + 1, sizeof(*b)); > + b = calloc(c->v + 1, sizeof(*b)); > if (b == NULL) > goto done; > for (i = 0; i < c->v; i++) { > - b[i] = reallocarray(NULL, c->h + 1, sizeof(**b)); > + b[i] = calloc(c->h + 1, sizeof(**b)); > if (b[i] == NULL) { > while (--i >= 0) > free(b[i]); > @@ -428,11 +428,11 @@ terminal_alloc_display(EditLine *el) > b[c->v] = NULL; > el->el_display = b; > > - b = reallocarray(NULL, c->v + 1, sizeof(*b)); > + b = calloc(c->v + 1, sizeof(*b)); > if (b == NULL) > goto done; > for (i = 0; i < c->v; i++) { > - b[i] = reallocarray(NULL, c->h + 1, sizeof(**b)); > + b[i] = calloc(c->h + 1, sizeof(**b)); > if (b[i] == NULL) { > while (--i >= 0) > free(b[i]);
Re: SIGSEGV in libedit
Hello Yasuoka-san, YASUOKA Masahiko wrote on Thu, Aug 01, 2019 at 08:42:35PM +0900: > I noticed the upstream NetBSD recently replaced almost all malloc(3)s > by calloc(3) in libedit. > > https://github.com/NetBSD/src/commit/b91b3c48e0edb116bd797586430cb426b575d717 > > This also fixes the problem. I'll create a diff which does the same > thing. Might i suggest that you first send a diff changing only the one place required to fix the bug asou@ reported? That can easily be reviewed, and i expect we can get it in quickly, and it allows a useful commit message explaining why exactly it is needed. Regarding all the other places in the library: it is true that in many situations, zeroing buffers when allocating them provides additional safety. But there may be exceptions; in some cases, it may hide or rarely even cause bugs. In any case, i hate it when people go "i have no idea whether it is needed and what it does, but let's just change this globally anyway". Even if it's about something as seemingly unconspicious as zeroing buffers. The NetBSD commit message provides no indication that the changed code was inspected for consequences of the change, and as far as i know Christos (admittedly, i only met him two or three times and didn't talk all *that* much to him) i guess it might indeed be his style to commit changes without actually inspecting the code. So i think for committing to OpenBSD, each function changed needs to be inspected and it needs to be confirmed that zeroing each buffer in question does not cause new problems or hide other bugs. That may be somewhat painful work, libedit code is not very audit friendly, but i don't think cutting corners is a good idea. In general, code quality in libedit is somewhat below OpenBSD quality standards, so the time spent auditing it is certainly not wasted but spent at a place needing it. Also, the code quality implies that zeroing some additional buffers likely improves security at least at some of the places - if it is checked properly. Yours, Ingo
Re: less -S does not "truncate" lines
Hi George, George Brown wrote on Sat, Jul 20, 2019 at 01:29:37PM +0100: > When viewing a file with "less -S" that has lines longer than $COLUMNS > said lines are simply not wrapped. The contents of said lines is still > available, one simply needs to scroll horizontally. I would have > expected the contents displayed within less to be as if I had run > something like "cut -c 1-$COLUMNS file". > > I came across this as I was wondering why "less -FS" was still > paginating on inputs with long lines that would have otherwise fit on > the screen. > > I have observed this behaviour on the following platforms. > > * OpenBSD 6.5 > * OpenBSD 6.5 - current (snapshot 2019-07-19) > * FreeBSD 12 > * Ubunutu 19.04 > * macOS 10.14.5 > > Given the consistent behaviour I would think it is my expectation that > is wrong rather than the observed result. > > It could be argued this is a "visual truncation" from the current view. > However in my mind the word "truncation" has a very specific meaning of > dropping a remainder such that it is no longer available at all. As such > I would suggest an alteration to the description in the man page. > > For reference the GNU less man page is the only one that doesn't use the > word "truncate", instead describing it as "chopped rather than folded". Both of the words "truncate" and "chop" seem more likely to be understood as permanant deletion rather than temporary hiding, so both are best avoided. I committed the following patch which is shorter and simpler, and which also describes the option in a positive rather than a negative way. Yours, Ingo CVSROOT:/cvs Module name:src Changes by: schwa...@cvs.openbsd.org2019/07/20 07:21:10 Modified files: usr.bin/less : less.1 Log message: Correct misleading description of -S; problem reported by George Brown <321 dot george at gmail dot com> on tech@. Index: less.1 === RCS file: /cvs/src/usr.bin/less/less.1,v retrieving revision 1.54 retrieving revision 1.55 diff -u -r1.54 -r1.55 --- less.1 26 May 2019 01:16:09 - 1.54 +++ less.1 20 Jul 2019 13:21:09 - 1.55 @@ -499,10 +499,8 @@ Thus, various display problems may result, such as long lines being split in the wrong place. .It Fl S | -chop-long-lines -Causes lines longer than the screen width to be -chopped (truncated) rather than wrapped. -That is, the portion of a long line that does not fit in -the screen width is not shown. +Display only the beginning of lines that exceed the screen width. +The remainder can be shown by scrolling horizontally. The default is to wrap long lines; that is, display the remainder on the next line. .It Fl s | -squeeze-blank-lines
Re: [patch] Re-add 'proc' to vi(1) when running in secure mode.
Hi, Bryan Steele wrote on Fri, Jul 19, 2019 at 06:14:56PM -0400: > On Sat, Jul 20, 2019 at 12:03:03AM +0200, Jesper Wallin wrote: >> On Fri, Jul 19, 2019 at 05:14:03PM -0400, Bryan Steele wrote: >>> I suspect that in secure/-S mode, the :pre[serve] should either be >>> disabled, or modified to stop calling sendmail. The mail it is sending >>> is purely advisory, and should be easy to disable. See common/recover.c. >> Oh, you're right. A bit ironic that I didn't notice the exec violation >> due to the fork being permitted now. Thanks for pointing this out! >> Scrap my old patch, here's a better proposal: > ok brynet@ No, that patch is not OK either. It breaks :pre in -S mode because rcv_mailfile() is a blantant misnomer that actually has two purposes: 1. Create the mail file to indicate that there is something recoverable 2. and then actually send the mail. While step 2 must be skipped, step 1 for creating the recover.* file is still needed, or "vi -r" will later complain "vi: No files to recover" even though a vi.* file exists in /tmp/vi.recover/. Yes, i freely admit that the design is horribly contorted. So here is a better patch avoiding that problem. It is also safer because it is easier to see that fork(2) can no longer be reached. Otherwise, you would need to understand that even though rcv_init() calls rcv_mailfile() and rcv_mailfile() calls rcv_email(), fork(2) cannot be reached in that way because isync is 0 in rcv_mailfile(). Yours, Ingo P.S. See below the patch for my analysis of the code, which you may or may not find helpful while verifying the patch. Index: common/recover.c === RCS file: /cvs/src/usr.bin/vi/common/recover.c,v retrieving revision 1.29 diff -u -p -r1.29 recover.c --- common/recover.c10 Nov 2017 18:25:48 - 1.29 +++ common/recover.c21 Jul 2019 15:45:59 - @@ -821,6 +821,15 @@ rcv_email(SCR *sp, int fd) struct stat sb; pid_t pid; + /* +* In secure mode, our pledge(2) includes neither "proc" +* nor "exec". So simply skip sending the mail. +* Later vi -r still works because rcv_mailfile() +* already did all the necessary setup. +*/ + if (O_ISSET(sp, O_SECURE)) + return; + if (_PATH_SENDMAIL[0] != '/' || stat(_PATH_SENDMAIL, ) == -1) msgq_str(sp, M_SYSERR, _PATH_SENDMAIL, "not sending email: %s"); There are three call paths to rcv_mailfile() and rcv_email(): * rcv_init() -> rcv_mailfile() purpose: set up .rcv_fd and .rcv_mpath for locking purposes and in case of later dying from a signal calls to rcv_init() are always safe because isync == 0, i.e. rcv_email() is never called * rcv_sync() -> rcv_email() directly purpose: emergency saving while dying from SIGTERM; RCV_EMAIL is not set in any other situation so the direct call to rcv_email() must be neutered note: RCV_SNAPSHOT is NOT set in this case * rcv_sync() -> rcv_mailfile() -> rcv_email() purpose: manual "pre" command RCV_SNAPSHOT is not set in any other situation note: RCV_EMAIL is NOT set in this case here, the rcv_email() inside rcv_mailfile() must be neutered
Re: make msgsnd(2) more posix
Hi Moritz, Moritz Buhl wrote on Thu, Jul 18, 2019 at 01:38:49PM +0200: > bluhm@ wrote: >> Moritz, can you create a man page ERRORS diff? > Is the attached diff ok? Yes, committed with some additional tweaks. Thank you, Ingo CVSROOT:/cvs Module name:src Changes by: schwa...@cvs.openbsd.org2019/07/18 06:57:12 Modified files: lib/libc/sys : msgsnd.2 Log message: State that mtype < 1 causes EINVAL as required by POSIX and as implemented by OpenBSD since sysv_msg.c rev. 1.35. Diff from Moritz Buhl requested by bluhm@. While here, add STANDARDS, improve HISTORY, and use the customary .Fa for struct fields rather than .Va. Index: msgsnd.2 === RCS file: /cvs/src/lib/libc/sys/msgsnd.2,v retrieving revision 1.19 diff -u -r1.19 msgsnd.2 --- msgsnd.210 Dec 2014 19:19:00 - 1.19 +++ msgsnd.218 Jul 2019 12:57:04 - @@ -53,10 +53,10 @@ char mtext[1]; /* body of message */ .Ed .Pp -.Va mtype +.Fa mtype is an integer greater than 0 that can be used for selecting messages (see .Xr msgrcv 2 ) ; -.Va mtext +.Fa mtext is an array of .Fa msgsz bytes, with a size between 0 and that of the system limit @@ -65,7 +65,7 @@ If the number of bytes already on the message queue plus .Fa msgsz is bigger than the maximum number of bytes on the message queue -.Po Va msg_qbytes , +.Po Fa msg_qbytes , see .Xr msgctl 2 .Pc , @@ -105,16 +105,16 @@ queue is updated in the following way: .Bl -bullet .It -.Va msg_cbytes +.Fa msg_cbytes is incremented by the size of the message. .It -.Va msg_qnum +.Fa msg_qnum is incremented by 1. .It -.Va msg_lspid +.Fa msg_lspid is set to the pid of the calling process. .It -.Va msg_stime +.Fa msg_stime is set to the current time. .El .Sh RETURN VALUES @@ -127,9 +127,12 @@ .Fa msqid is not a valid message queue identifier. .Pp +.Fa mtype +is less than 1. +.Pp .Fa msgsz is greater than -.Va msg_qbytes . +.Fa msg_qbytes . .It Bq Er EACCES The calling process does not have write access to the message queue. .It Bq Er EAGAIN @@ -153,6 +156,13 @@ .Xr msgctl 2 , .Xr msgget 2 , .Xr msgrcv 2 +.Sh STANDARDS +The +.Fn msgsnd +function conforms to the X/Open System Interfaces option of +.St -p1003.1-2008 . .Sh HISTORY -Message queues appeared in -.At V.1 . +Message queues first appeared in +.At V.1 +and have been available since +.Nx 1.0 .
Re: Diff to stop using reserved words for smtpd.conf(5) examples
Hi Gilles, Gilles Chehade wrote on Tue, Jul 23, 2019 at 08:27:06AM +0200: > On Mon, Jul 22, 2019 at 05:05:01PM -0400, Kurt Mosiejczuk wrote: >> This is a diff for that changes the example smtpd.conf and smtpd.conf.5 >> so that it no longer uses words that are parts of the configuration >> syntax as labels for actions. A large chunk of my delay to a release >> with the new smtpd.conf configuration syntax was my confusion with the >> examples. Even writing this diff I realized that the quotes were only >> necessary in the examples because configuration grammar was being >> reused as labels. > I'll let ingo and jmc comment on this, I think the rationale for > using the quotes and the reserved words was precisely to show we > should use the quotes at these places I do agree with showing quotes around action names and other user-selected strings and identifiers. I think it helps understanding by making keywords easier to distinguish from other, replaceable strings. Even if in some cases, the syntax does not require the quotes. > and that reserved words in quotes were acceptable. The smtpd.conf(5) manual explains that explicitly, and i think it may make sense to show an example for it, even though i'm unsure whether the practice of re-using reserved words as user-defined identifiers should be recommended. But Theo's argument that it can be confusing to explain too many different aspects in the same example makes sense to me, in particular when the explanation of the example doesn't even mention all the aspects being demonstrated. Also, Kurt's experience confirms that this kind of confusion can indeed arise. > To be honest, ever since we made these examples, no one has ever > mailed me with a broken config due to a reserved keyword whereas > it used to be the case every now and then before. So, don't remove the quotes. Avoiding clashes that *require* quoting in most of the examples as Kurt suggests might make sense. I don't feel strongly about it either way. Also, giving one example (maybe at the end?) that explicitly shows and mentions the possibility of using a quoted reserved word as a user-selected string might make sense, too. Or maybe not, if we do not want to encourage it? Do we? Not sure, no strong feelings about that point either. I think there is a lot of room here for you to decide what you consider good and robust style for writing this particular configuration file. Yours, Ingo
Re: Diff to stop using reserved words for smtpd.conf(5) examples
Hi Kurt, Kurt Mosiejczuk wrote on Wed, Jul 24, 2019 at 09:28:26AM -0400: > On Wed, Jul 24, 2019 at 02:17:47PM +0200, Klemens Nanni wrote: >> On Wed, Jul 24, 2019 at 01:32:52PM +0200, Gilles Chehade wrote: >>> Well I think we should remove the reserved keywords as suggested by Kurt >>> but keep the quotes in all examples to make it very explicit that we are >>> expecting a string literal at this point. >> I concur. > Here's a diff that does that. Looks good to me. Ingo > Index: etc/mail/smtpd.conf > === > RCS file: /cvs/src/etc/mail/smtpd.conf,v > retrieving revision 1.11 > diff -u -p -r1.11 smtpd.conf > --- etc/mail/smtpd.conf 4 Jun 2018 21:10:58 - 1.11 > +++ etc/mail/smtpd.conf 24 Jul 2019 13:20:51 - > @@ -9,11 +9,11 @@ table aliases file:/etc/mail/aliases > # > listen on lo0 > > -action "local" mbox alias > -action "relay" relay > +action "local_mail" mbox alias > +action "outbound" relay > > # Uncomment the following to accept external mail for domain "example.org" > # > -# match from any for domain "example.org" action "local" > -match for local action "local" > -match for any action "relay" > +# match from any for domain "example.org" action "local_mail" > +match for local action "local_mail" > +match for any action "outbound" > Index: usr.sbin/smtpd/smtpd.conf.5 > === > RCS file: /cvs/src/usr.sbin/smtpd/smtpd.conf.5,v > retrieving revision 1.210 > diff -u -p -r1.210 smtpd.conf.5 > --- usr.sbin/smtpd/smtpd.conf.5 22 Dec 2018 08:54:02 - 1.210 > +++ usr.sbin/smtpd/smtpd.conf.5 24 Jul 2019 13:20:51 - > @@ -871,12 +871,12 @@ table secrets file:/etc/mail/secrets > > listen on lo0 > > -action "local" mbox alias > -action "relay" relay host smtp+tls://b...@smtp.example.com \e > +action "local_mail" mbox alias > +action "outbound" relay host smtp+tls://b...@smtp.example.com \e > auth > > -match for local action "local" > -match for any action "relay" > +match for local action "local_mail" > +match for any action "outbound" > .Ed > .Pp > In this second example, > @@ -908,12 +908,12 @@ listen on egress tls pki mail.example.co > > action mda_with_aliases mda "/path/to/mda \-f \-" alias > action mda_without_aliases mda "/path/to/mda \-f \-" > -action "relay" relay > +action "outbound" relay > > match for local action mda_with_aliases > match from any for domain example.com action mda_without_aliases > -match for any action "relay" > -match auth from any for any action "relay" > +match for any action "outbound" > +match auth from any for any action "outbound" > .Ed > .Pp > For sites that wish to sign messages using DKIM, the > @@ -929,13 +929,13 @@ table aliases file:/etc/mail/aliases > listen on lo0 > listen on lo0 port 10028 tag DKIM > > -action "mbox" mbox alias > -action "relay" relay > -action relay_dkim relay host smtp://127.0.0.1:10027 > - > -match for local action "mbox" > -match tag DKIM for any action "relay" > -match for any action relay_dkim > +action "local_mail" mbox alias > +action "outbound" relay > +action "relay_dkim" relay host smtp://127.0.0.1:10027 > + > +match for local action "local_mail" > +match tag DKIM for any action "outbound" > +match for any action "relay_dkim" > .Ed > .Pp > Sites that accept non-local messages may be able to cut down on the > @@ -952,14 +952,14 @@ table other-relays file:/etc/mail/other- > listen on lo0 > listen on egress > > -action "mbox" mbox alias > -action "relay" relay > +action "local_mail" mbox alias > +action "outbound" relay > > -match for local action "mbox" > -match for any action "relay" > +match for local action "local_mail" > +match for any action "outbound" > match !from src mail\-from "@example.com" for any \e >reject > -match from any for domain example.com action "mbox" > +match from any for domain example.com action "local_mail" > .Ed > .Sh SEE ALSO > .Xr mailer.conf 5 ,
Re: ctags.1: Xr vgrind
Hi, Jason McIntyre wrote on Mon, Jul 15, 2019 at 06:43:09PM +0100: > On Mon, Jul 15, 2019 at 04:56:29PM +0200, Klemens Nanni wrote: >> This tool comes from the textproc/vgrind port so base does not have it, >> is that the reason we do not reference it? >> >> I think it is still useful, so here's the Xr addition. > we pretty much try to keep xrs only to base pages in base pages. > yes, there are exceptions, but they are few and far between. > i'm not keen to start adding xrs to pages other people don;t have. I tend to agree with Jason; besides, it would also cause dead hyperlinks on man.openbsd.org. The following may be useful, though. This kind of reference occurs rarely enough that i'm not sure whether we have a standard wording for it. Yours, Ingo Index: ctags.1 === RCS file: /cvs/src/usr.bin/ctags/ctags.1,v retrieving revision 1.33 diff -u -p -r1.33 ctags.1 --- ctags.1 31 Dec 2015 14:01:26 - 1.33 +++ ctags.1 15 Jul 2019 19:33:36 - @@ -171,7 +171,8 @@ default output tags file Duplicate objects are not considered errors. .Sh SEE ALSO .Xr mg 1 , -.Xr vi 1 +.Xr vi 1 , +and the textproc/vgrind port .Sh STANDARDS The .Nm
Re: ctags.1: Xr vgrind
Hi Jason, Jason McIntyre wrote on Mon, Jul 15, 2019 at 09:04:23PM +0100: > On Mon, Jul 15, 2019 at 09:35:26PM +0200, Ingo Schwarze wrote: >> Jason McIntyre wrote on Mon, Jul 15, 2019 at 06:43:09PM +0100: >>> On Mon, Jul 15, 2019 at 04:56:29PM +0200, Klemens Nanni wrote: >>>> This tool comes from the textproc/vgrind port so base does not >>>> have it, is that the reason we do not reference it? >>>> I think it is still useful, so here's the Xr addition. >>> we pretty much try to keep xrs only to base pages in base pages. >>> yes, there are exceptions, but they are few and far between. >>> i'm not keen to start adding xrs to pages other people don;t have. >> I tend to agree with Jason; besides, it would also cause dead >> hyperlinks on man.openbsd.org. >> The following may be useful, though. This kind of reference occurs >> rarely enough that i'm not sure whether we have a standard wording >> for it. > i don;t like it, to be honest. can't you just reference the port > in the main body, if it's really needed? > > ...expected by the textproc/vgrind port is produced... Indeed, that's even better because it is really only relevant in the context of the ctags -v option. OK? Ingo Index: ctags.1 === RCS file: /cvs/src/usr.bin/ctags/ctags.1,v retrieving revision 1.33 diff -u -r1.33 ctags.1 --- ctags.1 31 Dec 2015 14:01:26 - 1.33 +++ ctags.1 15 Jul 2019 20:39:45 - @@ -92,8 +92,9 @@ references to them are regenerated, keeping only the other values in the file. .It Fl v -An index of the form expected by vgrind -is produced on the standard output. +An index of the form expected by the +.Pa textproc/vgrind +port is produced on the standard output. This listing contains the object name, file name, and page number (assuming 64 line pages). Since the output will be sorted into lexicographic order,
Re: [patch] Re-add 'proc' to vi(1) when running in secure mode.
Hi, Bryan Steele wrote on Sun, Jul 21, 2019 at 01:53:49PM -0400: > On Sat, Jul 20, 2019 at 12:03:03AM +0200, Jesper Wallin wrote: >> Oh, you're right. A bit ironic that I didn't notice the exec violation >> due to the fork being permitted now. Thanks for pointing this out! >> Scrap my old patch, here's a better proposal: [ ... and later, on a different patch ] > I somehow missed that. Indeed, moving the check into rcv_email before > fork makes more sense. Sorry for jumping the gun. Some observations, in decreasing order of importance: 1. Jesper, thanks for finding and reporting the regression. It is fixed now. 2. Bryan, thanks for suggesting the direction how to fix it. 3. Jesper, including a patch according to the best of your understanding is always welcome. Even if it turns out to be a bad patch, because often even a bad patch helps to understand what the OP thinks the problem might be: the saying is, bad patches trigger good ones. 4. Merely because something different eventually gets committed does not necessarily mean the original idea was bad. For many problems, there is more than one solution, and it is often a matter of judgement or taste which one is better. Jesper, in this case, re-adding "proc exec" would indeed have been an alternative solution - even though preventing "exec" is kind of the whole point of the -S option. 5. It might seem obvious, but after writing a patch and before sending it out, testing it makes sense. The easy part is testing that it actually solves the original problem. The slightly harder part is testing that is also solves similar problems (in this case, when catching SIGTERM). The hardest part is testing that it causes no regressions. 6. The history of this thread (the original commit of rev. 1.29, the initial suggestion to re-add "proc" only, the suggestion to skip rcv_mailfile()) proves once again that testing is surprisingly hard in practice. Most definitely, i also botched testing for many of the commits that i did in the past, and i also got many OKs for broken patches that i posted prematurely. It happens, even when trying to be careful. 7. Rarely used options rarely get tested. They attracks bugs like rotting fruits attract flies. So think twice before adding a new option to a program. Yours, Ingo CVSROOT:/cvs Module name:src Changes by: schwa...@cvs.openbsd.org2019/07/22 06:39:02 Modified files: usr.bin/vi/common: recover.c Log message: In secure mode (-S), skip sending mail when executing the :pre[serve] command or when dying from SIGTERM. This way, creating the recovery file works again without re-adding "proc exec" to the pledge(2). As reported by Jesper Wallin , this got broken by common/main.c rev. 1.29 (Nov 19, 2015). The general direction of the fix was suggested by brynet@. OK brynet@ and no opposition when shown on tech@
Re: [patch] Re-add 'proc' to vi(1) when running in secure mode.
Hi Jesper, Jesper Wallin wrote on Mon, Jul 22, 2019 at 06:09:03PM +0200: > On Mon, Jul 22, 2019 at 03:23:16PM +0200, Ingo Schwarze wrote: >> 3. Jesper, including a patch according to the best of your >> understanding is always welcome. Even if it turns out to be a >> bad patch, because often even a bad patch helps to understand >> what the OP thinks the problem might be: the saying is, bad >> patches trigger good ones. > I'm sorry for late reply, I've been busy/offline the last few days. No problem at all. > The lessons I take with me from this: > 1. Give myself more time to fully understand both the issue itself > and what the code actually does, before trying to fix it. > 2. Test, test with ktrace, test again and then run more tests. ;-) Sounds reasonable! Testing is important and often neglected. But make sure that doesn't cause bugs to not get reported at all because the process causes too much work or takes too long. :) When running out of time, in particular for bugs that seem important or when it is unclear if and when you might find more time to look at the issue, sending a preliminary report or a preliminary patch with a remark like "i'm trying to work on a patch" or "i only did rudimentary testing so far and ran out of time" might make sense to avoid needless delays. Yours, Ingo
Re: ascii(7): add a reference to the ANSI X3.4
Hello, Masanori Ogino wrote on Tue, Sep 17, 2019 at 11:49:17AM +0900: > Unlike utf8(7), ascii(7) does not contain a reference to its standard. > This patch just adds the reference. Thanks for pointing out the omission, i committed the following version. Yours, Ingo CVSROOT:/cvs Module name:src Changes by: schwa...@cvs.openbsd.org2019/09/21 00:04:22 Modified files: share/man/man7 : ascii.7 Log message: add STANDARDS section; similar to a diff sent in by , but using the up-to-date reference rather than one from 30 years ago; OK jmc@ Index: ascii.7 === RCS file: /cvs/src/share/man/man7/ascii.7,v retrieving revision 1.9 diff -u -r1.9 ascii.7 --- ascii.7 30 May 2017 15:06:00 - 1.9 +++ ascii.7 21 Sep 2019 01:48:58 - @@ -102,6 +102,13 @@ 112 p 113 q 114 r 115 s 116 t 117 u 118 v 119 w 120 x 121 y 122 z 123 { 124 | 125 } 126 ~ 127 del .Ed +.Sh STANDARDS +.Rs +.%T Information Systems - Coded Character Sets - 7-Bit American National\ + Standard Code for Information Interchange (7-Bit ASCII) +.%R INCITS 4-1986[R2017] +.%Q InterNational Committee for Information Technology Standards +.Re .Sh HISTORY An .Nm > Index: share/man/man7/ascii.7 > === > RCS file: /cvs/src/share/man/man7/ascii.7,v > retrieving revision 1.9 > diff -u -r1.9 ascii.7 > --- share/man/man7/ascii.7 30 May 2017 15:06:00 - 1.9 > +++ share/man/man7/ascii.7 17 Sep 2019 02:19:10 - > @@ -102,6 +102,13 @@ > 112 p 113 q 114 r 115 s 116 t 117 u 118 v 119 w > 120 x 121 y 122 z 123 { 124 | 125 } 126 ~ 127 del > .Ed > +.Sh STANDARDS > +.Rs > +.%A American National Standards Institute > +.%D 1986 > +.%R ANSI X3.4 > +.%T Coded Character Set - 7-bit American Standard Code for > Information Interchange > +.Re > .Sh HISTORY > An > .Nm
Re: timeout.9: cite our sources
Hi Scott & Jason, Jason McIntyre wrote on Sat, Nov 02, 2019 at 06:16:49PM +: > On Sat, Nov 02, 2019 at 12:32:30PM -0500, Scott Cheloha wrote: >> Cite the paper describing the timing wheel. PDF here: >> >> http://www.cs.columbia.edu/~nahum/w6998/papers/ton97-timing-wheels.pdf >> >> The authors have an older paper describing the same idea, from 1987: >> >> http://www.cs.columbia.edu/~nahum/w6998/papers/sosp87-timing-wheels.pdf >> >> but the 1997 version contains additional information gathered over ten >> years of praxis. The typesetting is also way nicer in 1997. >> >> Once question about the .Rs block: why is the issue number rendered >> before the volume number? Isn't it usually the other way around? >> IIRC the issue is a subset of a given volume. > the ordering of these blocks is kind of set in stone. so mandoc will do > what groff does. i guess a reordering will involve persuading other > people that it makes sense. That is true. Then again, it does not seem unlikely that we could persuade people in this case. Whether volume and issue are used together at all may depend on the discipline (for example, in high energay physics, it appears that usually only volume and page numbers are used for citing journals). But *if* both are used together, i suspect you are right that the other order makes more sense. If you think it is worth it, tell me off-list and i'll post a trivial patch to change the order to gr...@gnu.org. Changing the order in mandoc is also trivial and would be done at the same time if the groff folks like the idea. Whatever that leads to, given both .%V and .%N only render a bare number, which certainly shouldn't be changed because i suspect using only one at a time may be the dominant use case, either order is not very clear when using both together. So i suggest quoting the numbers in exactly the same way as given in the original paper: .%V vol. 5 .%N no. 6 >> Oh, another question: is it standard to mention the city of publication >> for an academic journal in a citation? ToN is published from Piscataway, >> NJ, fwiw. > i don;t know, but i think the amount of info used in these blocks is > determined by the person adding it - how much info is useful? for rfcs, > for example, we settled on a standard blurb that both gave credit and > allowed readers to find the docs. whether you need all the refs you've > given i guess it decided by you (i think often a title suffices). I agree with Jason. For journals, in particular for well-known ones, the city of publication is often omitted because it is not really needed for getting a copy. In this case, i would probably omit it. For books, the city is often provided in citations, almost as a matter of convention. For more obscure publications, it may even be important. >> ok? > no objection from me. Same here, but i recommend the change explained above. Also, you are deleting information from the source code comment that would probably be misplaced in the manual: "Scheme 7". You might wish to leave that piece of information in the code unless it happens to be incorrect or irrelevant. Yours, Ingo >> Index: share/man/man9/timeout.9 >> === >> RCS file: /cvs/src/share/man/man9/timeout.9,v >> retrieving revision 1.46 >> diff -u -p -r1.46 timeout.9 >> --- share/man/man9/timeout.9 14 Apr 2019 08:51:31 - 1.46 >> +++ share/man/man9/timeout.9 2 Nov 2019 17:14:48 - >> @@ -282,3 +282,15 @@ These functions are implemented in the f >> .Xr splclock 9 , >> .Xr tsleep 9 , >> .Xr tvtohz 9 >> +.Rs >> +.%A George Varghese >> +.%A Anthony Lauck >> +.%B Hashed and hierarchical timing wheels: efficient data structures for \ >> +implementing a timer facility >> +.%I IEEE/ACM >> +.%J Transactions on Networking >> +.%V 5 >> +.%N 6 >> +.%P pp. 824\(en834 >> +.%D December 1997 >> +.Re >> Index: sys/kern/kern_timeout.c >> === >> RCS file: /cvs/src/sys/kern/kern_timeout.c,v >> retrieving revision 1.60 >> diff -u -p -r1.60 kern_timeout.c >> --- sys/kern/kern_timeout.c 2 Nov 2019 16:56:17 - 1.60 >> +++ sys/kern/kern_timeout.c 2 Nov 2019 17:14:48 - >> @@ -45,9 +45,7 @@ >> /* >> * Timeouts are kept in a hierarchical timing wheel. The to_time is the >> value >> * of the global variable "ticks" when the timeout should be called. There >> are >> - * four levels with 256 buckets each. See 'Scheme 7' in >> - * "Hashed and Hierarchical Timing Wheels: Efficient Data Structures for >> - * Implementing a Timer Facility" by George Varghese and Tony Lauck. >> + * four levels with 256 buckets each. >> */ >> #define BUCKETS 1024 >> #define WHEELSIZE 256