Re: bwfm(4): show modulation type for the various chipsets

2022-04-23 Thread Ingo Schwarze
Hi Stuart,

Stuart Henderson wrote on Sat, Apr 23, 2022 at 06:52:46PM +0100:

> saves time if you want to ignore 11n-only devices. ok?

Adding useful information is good in general, but i can't really
comment on the content.

If you add a column to a -column list, usually you also want to update
the .Bl -column line.  Unless there are specific reasons to do it
otherwise, the .Bl -column line should usually repeat the longest
string in each column, for example:

  -.Bl -column "Chipset" "Spectrum" "MIMO" "Bus" -offset 6n
  +.Bl -column BCM43236 2GHz/5GHz Type MIMO SDMMC/USB -offset 6n

Both groff and mandoc use the number of columns specified in
the .Bl -column line to pick a visually pleasing spacing between
columns (in this case, 4n without updating the .Bl -column line
or 3n after updating it; in the case at hand, the effect is partially
compensated by the .Bl -column containing shorter strings rather
than the longest, which seems a bit confusing :).

While there, i suggest dropping the useless quoting.

Yours,
  Ingo


> Index: share/man/man4/bwfm.4
> ===
> RCS file: /cvs/src/share/man/man4/bwfm.4,v
> retrieving revision 1.16
> diff -u -p -r1.16 bwfm.4
> --- share/man/man4/bwfm.4 5 Jan 2022 17:39:24 -   1.16
> +++ share/man/man4/bwfm.4 23 Apr 2022 17:51:24 -
> @@ -35,29 +35,29 @@ as well as the bus attachments recognize
>  .Nm
>  driver:
>  .Bl -column "Chipset" "Spectrum" "MIMO" "Bus" -offset 6n
> -.It Em Chipset Ta Em Spectrum Ta Em MIMO Ta Em Bus
> -.It BCM43143 Ta 2GHz Ta 1x1 Ta SDMMC/USB
> -.It BCM43236 Ta 2GHz/5GHz Ta 2x2 Ta USB
> -.It BCM4324 Ta  2GHz/5GHz Ta 2x2 Ta SDMMC
> -.It BCM43242 Ta 2GHz/5GHz Ta 2x2 Ta USB
> -.It BCM4329 Ta  2GHz/5GHz Ta 2x2 Ta SDMMC
> -.It BCM4330 Ta  2GHz/5GHz Ta 2x2 Ta SDMMC
> -.It BCM4334 Ta  2GHz/5GHz Ta 2x2 Ta SDMMC
> -.It BCM43340 Ta 2GHz/5GHz Ta 1x1 Ta SDMMC
> -.It BCM43341 Ta 2GHz/5GHz Ta 1x1 Ta SDMMC
> -.It BCM4335 Ta  2GHz/5GHz Ta 1x1 Ta SDMMC
> -.It BCM43362 Ta 2GHz Ta 1x1 Ta SDMMC
> -.It BCM43364 Ta 2GHz Ta 1x1 Ta SDMMC
> -.It BCM4339 Ta  2GHz/5GHz Ta 1x1 Ta SDMMC
> -.It BCM43430 Ta 2GHz Ta 1x1 Ta SDMMC
> -.It BCM43455 Ta  2GHz/5GHz Ta 1x1 Ta SDMMC
> -.It BCM43456 Ta  2GHz/5GHz Ta 2x2 Ta SDMMC
> -.It BCM4350 Ta 2GHz/5GHz Ta 2x2 Ta PCI
> -.It BCM4354 Ta  2GHz/5GHz Ta 2x2 Ta SDMMC
> -.It BCM4356 Ta 2GHz/5GHz Ta 2x2 Ta PCI/SDMMC
> -.It BCM43569 Ta 2GHz/5GHz Ta 2x2 Ta USB
> -.It BCM43602 Ta 2GHz/5GHz Ta 3x3 Ta PCI
> -.It BCM4371 Ta 2GHz/5GHz Ta 2x2 Ta PCI
> +.It Em Chipset Ta Em Spectrum Ta Em Type Ta Em MIMO Ta Em Bus
> +.It BCM43143 Ta 2GHz Ta 11n Ta 1x1 Ta SDMMC/USB
> +.It BCM43236 Ta 2GHz/5GHz Ta 11n Ta 2x2 Ta USB
> +.It BCM4324 Ta  2GHz/5GHz Ta 11n Ta 2x2 Ta SDMMC
> +.It BCM43242 Ta 2GHz/5GHz Ta 11n Ta 2x2 Ta USB
> +.It BCM4329 Ta  2GHz/5GHz Ta 11n Ta 2x2 Ta SDMMC
> +.It BCM4330 Ta  2GHz/5GHz Ta 11n Ta 2x2 Ta SDMMC
> +.It BCM4334 Ta  2GHz/5GHz Ta 11n Ta 2x2 Ta SDMMC
> +.It BCM43340 Ta 2GHz/5GHz Ta 11n Ta 1x1 Ta SDMMC
> +.It BCM43341 Ta 2GHz/5GHz Ta 11n Ta 1x1 Ta SDMMC
> +.It BCM4335 Ta  2GHz/5GHz Ta 11ac Ta 1x1 Ta SDMMC
> +.It BCM43362 Ta 2GHz Ta 11n Ta 1x1 Ta SDMMC
> +.It BCM43364 Ta 2GHz Ta 11n Ta 1x1 Ta SDMMC
> +.It BCM4339 Ta  2GHz/5GHz Ta 11ac Ta 1x1 Ta SDMMC
> +.It BCM43430 Ta 2GHz Ta 11n Ta 1x1 Ta SDMMC
> +.It BCM43455 Ta  2GHz/5GHz Ta 11ac Ta 1x1 Ta SDMMC
> +.It BCM43456 Ta  2GHz/5GHz Ta 11ac Ta 2x2 Ta SDMMC
> +.It BCM4350 Ta 2GHz/5GHz Ta 11ac Ta 2x2 Ta PCI
> +.It BCM4354 Ta  2GHz/5GHz Ta 11ac Ta 2x2 Ta SDMMC
> +.It BCM4356 Ta 2GHz/5GHz Ta 11ac Ta 2x2 Ta PCI/SDMMC
> +.It BCM43569 Ta 2GHz/5GHz Ta 11ac Ta 2x2 Ta USB
> +.It BCM43602 Ta 2GHz/5GHz Ta 11ac Ta 3x3 Ta PCI
> +.It BCM4371 Ta 2GHz/5GHz Ta 11ac Ta 2x2 Ta PCI
>  .El
>  .Pp
>  These are the modes the



Re: Security support status of xnf(4) and xbf(4)

2022-03-26 Thread Ingo Schwarze
Hi Demi Marie,

Demi Marie Obenour wrote on Fri, Mar 25, 2022 at 12:13:59PM -0400:

> Linux’s netfront and blkfront drivers recently had a security
> vulnerability (XSA-396) that allowed a malicious backend to potentially
> compromise them.  In follow-up audits, I found that OpenBSD’s xnf(4)
> currently trusts the backend domain.  I reported this privately to Theo
> de Raadt, who indicated that OpenBSD does not consider this to be a
> security concern.
> 
> This is obviously a valid position for the OpenBSD project to take, but
> it is surprising to some (such as myself) from the broader Xen
> ecosystem.  Standard practice in the Xen world is that bugs in frontends
> that allow a malicious backend to cause mischief *are* considered
> security bugs unless there is explicit documentation to the contrary.

I do not have the slightest idea what you are talking about here,
so i am not commenting on any of the above.

> As such, I believe this deserves to be noted in xnf(4) and xbf(4)’s man
> pages.

I can't comment whether this needs mentioning in the manual either,
but others will likely answer that question when you send a patch.

> If the OpenBSD project agrees, I am willing to write a patch,
> but I have no experience with mandoc

You do not need experience with mandoc(1) to write a manual page patch.
To get the formatting right, looking at the mdoc(7) manual page ought
to be sufficient.

But don't worry too much if you have no experience with mdoc(7).
The main job of people submitting manual page patches is to get
the content, wording, and placement of the text right.  The
formatting can easily be tweaked if needed.

> so it might take a few tries.

That's not a problem; the saying here goes "bad patches trigger good
ones".

Yours,
  Ingo



Re: __read_mostly

2022-03-20 Thread Ingo Schwarze
Hi,

>> A downside of this is that it becomes easier to guess the addresses
>> of the tagged variables.

> No kidding.  It partly undoes the effort of KARL.  

I don't feel qualified to comment on the patch,
but i can't resist mentioning that i still love
tedu@'s classical dictum "attack mitigation countermeasures"
which he coined during the aftermath of the heartbleed debacle.

Yours,
  Ingo



Re: mandoc.1: update example to reflect current options

2022-02-08 Thread Ingo Schwarze
Hello Anders,

Anders Damsgaard wrote on Tue, Feb 08, 2022 at 08:36:11AM +0100:

> The mandoc(1) option alias -l for -a was removed from the documentation
> in revision 1.107:
> 
> https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/usr.bin/mandoc/mandoc.1.diff?r1=1.106=1.107
> 
> The -l option still works, but this diff makes the example consistent
> with the current option description.

Thanks, committed.

Both are correct and reliable, but i agree -a is easier to understand
whereas understanding why -l works is a bit convoluted.

Yours,
  Ingo



Log message:
In the first example, use "mandoc -a" directly rather "mandoc -l".

It feels more natural to me to use -a directly when asking mandoc(1)
to use a pager.  The reason that "mandoc -l" does exactly the same
as "mandoc -a" is that "mandoc" is essentially "man -lc", so the -a
implied by -l negates the -c and the -l has no effect because it is
already the default for mandoc(1).

The more usual command for doing the same is "man -l foo.1 bar.1 ..."
but that's off-topic for the mandoc(1) manual page.

Patch on tech@ from Anders Damsgaard .


> Index: usr.bin/mandoc/mandoc.1
> ===
> RCS file: /cvs/src/usr.bin/mandoc/mandoc.1,v
> retrieving revision 1.180
> diff -u -p -u -p -r1.180 mandoc.1
> --- usr.bin/mandoc/mandoc.1   6 Feb 2022 00:29:03 -   1.180
> +++ usr.bin/mandoc/mandoc.1   8 Feb 2022 07:33:42 -
> @@ -735,7 +735,7 @@ output mode implies
>   .Sh EXAMPLES
>   To page manuals to the terminal:
>   .Pp
> -.Dl $ mandoc -l mandoc.1 man.1 apropos.1 makewhatis.8
> +.Dl $ mandoc -a mandoc.1 man.1 apropos.1 makewhatis.8
>   .Pp
>   To produce HTML manuals with
>   .Pa /usr/share/misc/mandoc.css



Re: rev(1): pull MB_CUR_MAX out of the hot loop

2022-01-12 Thread Ingo Schwarze
Hi Martijn,

Martijn van Duren wrote on Sun, Jan 09, 2022 at 08:09:20AM +0100:

> I fully agree with your reasoning and also prefer this one over the
> previous two diff.
> 
> OK martijn@

Thanks for checking; committed.
Also thanks to cheloha@ for his work on this issue
and to milllert@ for his feedback.
  Ingo


> On Sun, 2022-01-09 at 00:45 +0100, Ingo Schwarze wrote:

>> Index: rev.c
>> ===
>> RCS file: /cvs/src/usr.bin/rev/rev.c,v
>> retrieving revision 1.13
>> diff -u -p -r1.13 rev.c
>> --- rev.c10 Apr 2016 17:06:52 -  1.13
>> +++ rev.c8 Jan 2022 23:19:46 -
>> @@ -46,13 +46,14 @@ void usage(void);
>>  int
>>  main(int argc, char *argv[])
>>  {
>> -char *filename, *p = NULL, *t, *u;
>> +char *filename, *p = NULL, *t, *te, *u;
>>  FILE *fp;
>>  ssize_t len;
>>  size_t ps = 0;
>> -int ch, rval;
>> +int ch, multibyte, rval;
>>  
>>  setlocale(LC_CTYPE, "");
>> +multibyte = MB_CUR_MAX > 1;
>>  
>>  if (pledge("stdio rpath", NULL) == -1)
>>  err(1, "pledge");
>> @@ -83,14 +84,16 @@ main(int argc, char *argv[])
>>  if (p[len - 1] == '\n')
>>  --len;
>>  for (t = p + len - 1; t >= p; --t) {
>> -if (isu8cont(*t))
>> -continue;
>> -u = t;
>> -do {
>> -putchar(*u);
>> -} while (isu8cont(*(++u)));
>> +te = t;
>> +if (multibyte)
>> +while (t > p && isu8cont(*t))
>> +--t;
>> +for (u = t; u <= te; ++u)
>> +if (putchar(*u) == EOF)
>> +err(1, "stdout");
>>  }
>> -putchar('\n');
>> +if (putchar('\n') == EOF)
>> +err(1, "stdout");
>>  }
>>  if (ferror(fp)) {
>>  warn("%s", filename);
>> @@ -104,7 +107,7 @@ main(int argc, char *argv[])
>>  int
>>  isu8cont(unsigned char c)
>>  {
>> -return MB_CUR_MAX > 1 && (c & (0x80 | 0x40)) == 0x80;
>> +return (c & (0x80 | 0x40)) == 0x80;
>>  }
>>  
>>  void



Re: rev(1): pull MB_CUR_MAX out of the hot loop

2022-01-08 Thread Ingo Schwarze
Oopsie...

Ingo Schwarze wrote on Sun, Jan 09, 2022 at 12:45:27AM +0100:

> # Demonstate broken error handling in the old code.
> 
>$ ./obj/wrap /obin/rev   
>$ echo $?  
>   42

Stupid me, this is what i meant, of course:

   $ ./obj/wrap /oldbin/rev
   $ echo $?
  0

The point is that the -current rev(1) silently ignores write errors
*and returns indicating success* even though it produced no output or
incomplete output.  But i can't show that if i mistype the path to
the copy of the old binary that i saved earlier...

Yours,
  Ingo



Re: rev(1): pull MB_CUR_MAX out of the hot loop

2022-01-08 Thread Ingo Schwarze
Hi,

Martijn van Duren wrote on Sat, Jan 08, 2022 at 08:30:20AM +0100:

> Why not go for the following diff?
> It has a comparable speed increase, but without the added complexity
> of a second inner loop.

Actually, i like the idea of not duplicating the loop, in cases where
that is possible without a noticeable performance cost.

I admit i OK'ed the original UTF-8 diff almost six years ago,
but looking at the code again, i now dislike how it is testing
every byte twice for no apparent reason.  Even worse, i regard
the lack of I/O error checking on write operations as a bug.
Note that it does make sense to proceed to the next file on *read*
errors, whereas a *write* error should better be fatal.

For those reason, and because i prefer versions of isu8cont()
that do not inspect the locale, i propose the following patch
instead.

As an aside, i tried using fwrite(3) instead of the manual
putchar(*u) loop, but that is almost exactly as slow as repeatedly
calling MB_CUR_MAX, probably due to either locking or orientation
setting or some aspect of iov handling or buffering or more than
one of those; i refrained from profiling it.

 - 8< - schnipp - >8 - 8< - schnapp - >8 -

Index: rev.c
===
RCS file: /cvs/src/usr.bin/rev/rev.c,v
retrieving revision 1.13
diff -u -p -r1.13 rev.c
--- rev.c   10 Apr 2016 17:06:52 -  1.13
+++ rev.c   8 Jan 2022 23:19:46 -
@@ -46,13 +46,14 @@ void usage(void);
 int
 main(int argc, char *argv[])
 {
-   char *filename, *p = NULL, *t, *u;
+   char *filename, *p = NULL, *t, *te, *u;
FILE *fp;
ssize_t len;
size_t ps = 0;
-   int ch, rval;
+   int ch, multibyte, rval;
 
setlocale(LC_CTYPE, "");
+   multibyte = MB_CUR_MAX > 1;
 
if (pledge("stdio rpath", NULL) == -1)
err(1, "pledge");
@@ -83,14 +84,16 @@ main(int argc, char *argv[])
if (p[len - 1] == '\n')
--len;
for (t = p + len - 1; t >= p; --t) {
-   if (isu8cont(*t))
-   continue;
-   u = t;
-   do {
-   putchar(*u);
-   } while (isu8cont(*(++u)));
+   te = t;
+   if (multibyte)
+   while (t > p && isu8cont(*t))
+   --t;
+   for (u = t; u <= te; ++u)
+   if (putchar(*u) == EOF)
+   err(1, "stdout");
}
-   putchar('\n');
+   if (putchar('\n') == EOF)
+   err(1, "stdout");
}
if (ferror(fp)) {
warn("%s", filename);
@@ -104,7 +107,7 @@ main(int argc, char *argv[])
 int
 isu8cont(unsigned char c)
 {
-   return MB_CUR_MAX > 1 && (c & (0x80 | 0x40)) == 0x80;
+   return (c & (0x80 | 0x40)) == 0x80;
 }
 
 void

 - 8< - schnipp - >8 - 8< - schnapp - >8 -

The rest of this mail contains test reports.
These test reports use two test programs:

 - 8< - schnipp - >8 - 8< - schnapp - >8 -

/* makeutf8.c */

#include 
#include 
#include 
#include 
#include 

int
main(void)
{
FILE*in, *out;
char*line = NULL;
size_t   linesize = 0;
unsigned int wc, wcl = 0;

if (setlocale(LC_CTYPE, "en_US.UTF-8") == NULL)
err(1, "setlocale");

if ((in = fopen("/usr/src/gnu/usr.bin/perl/lib/unicore/"
"UnicodeData.txt", "r")) == NULL)
err(1, "in");
if ((out = fopen("utf8.txt", "w")) == NULL)
err(1, "out");

while (getline(, , in) != -1) {
if (sscanf(line, "%x;", ) < 1)
err(1, "scanf: %s", line);
if (wc > wcl + 1)
if (fputwc(L'\n', out) == WEOF)
err(1, "write EOL");
if (fputwc(wc, out) == WEOF) {
if (errno == EILSEQ)
warn("write U+%04x", wc);
else
err(1, "write U+%04x", wc);
}
wcl = wc;
}
if (ferror(in))
errx(1, "read error");
if (fputwc(L'\n', out) == WEOF)
err(1, "write final EOL");

fclose(out);
fclose(in);
return 0;
}

 - 8< - schnipp - >8 - 8< - schnapp - >8 -

/* wrap.c */

#include 

int
main(int argc, char *argv[])
{

Re: rev(1): pull MB_CUR_MAX out of the hot loop

2022-01-07 Thread Ingo Schwarze
Hi Scott,

thank you for spending quite some work on our small utility programs,
and sorry for failing to follow up as often as i should.

Scott Cheloha wrote on Fri, Jan 07, 2022 at 09:45:41AM -0600:

> In rev(1), we call MB_CUR_MAX for every byte in the input stream.
> This is extremely expensive.

That's actually quite ridiculous.

The reason why MB_CUR_MAX is so expensive is -- multi-thread support.
That is of course completely pointless for small utility programs,
on any operating system.  And on OpenBSD in particular, it is
mostly useless even for large application programs that actually
use multiple threads because we basically only support one single
locale in the first place.

So i had a brief look at the implementation of MB_CUR_MAX in our libc
to see whether it could be made less expensive, but i don't see a
simple way without potentially breaking some application programs
(for example, consider a third-party webserver that wants to run
with a POSIX locale in one thread but with a UTF-8 locale in
another thread).

Apart from that, even if it could be made less expensive, avoiding
to call it in tight loops might still be valuable in the interest
of portability.  On other systems, it must necessarily be expensive:
that web server might want to run one thread with UTF-8, one with
UTF-16, one with UTF-32, and one with Shift JIS...

So i agree with jour general idea to improve rev(1) and other
small utility programs that might have similar issues.

> It is much cheaper to call it once per line and use a simpler loop
> (see the inlined patch below) if the current locale doesn't handle
> multibyte characters:

[useful measurements snipped]

> CC schwarze@ to double-check I'm not misunderstanding MB_CUR_MAX.
> I'm under the impression the return value cannot change unless
> we call setlocale(3).

Mostly correct, except that uselocale(3) may also change it.
But of course, we don't call uselocale(3) in small utility programs.

> ok?

I think your patch can still be improved a bit before committing.


millert@ wrote:

> Why not just store the result of MB_CUR_MAX in a variable and use
> that?  It's value is not going to change during execution of the
> program.  This made a big difference in sort(1) once upon a time.

I agree with that.

Even with your patch, rev(1) would still be slow for large files
consisting of very short lines.


> Index: rev.c
> ===
> RCS file: /cvs/src/usr.bin/rev/rev.c,v
> retrieving revision 1.13
> diff -u -p -r1.13 rev.c
> --- rev.c 10 Apr 2016 17:06:52 -  1.13
> +++ rev.c 7 Jan 2022 15:38:39 -
> @@ -82,13 +82,18 @@ main(int argc, char *argv[])
>   while ((len = getline(, , fp)) != -1) {
>   if (p[len - 1] == '\n')
>   --len;
> - for (t = p + len - 1; t >= p; --t) {
> - if (isu8cont(*t))
> - continue;
> - u = t;
> - do {
> - putchar(*u);
> - } while (isu8cont(*(++u)));
> + if (MB_CUR_MAX == 1) {
> + for (t = p + len - 1; t >= p; t--)
> + putchar(*t);
> + } else {
> + for (t = p + len - 1; t >= p; --t) {

I dislike how you use t-- in one loop but --t in the other.
It made me wonder for a moment whether that needs to be different
depending on the character set.  Wouldn't it be less confusing
to pick one of the forms and stick to it?

> + if (isu8cont(*t))
> + continue;
> + u = t;
> + do {
> + putchar(*u);
> + } while (isu8cont(*(++u)));
> + }
>   }
>   putchar('\n');
>   }
> @@ -104,7 +109,7 @@ main(int argc, char *argv[])
>  int
>  isu8cont(unsigned char c)
>  {
> - return MB_CUR_MAX > 1 && (c & (0x80 | 0x40)) == 0x80;
> + return (c & (0x80 | 0x40)) == 0x80;
>  }

The isu8cont() function is a semi-standard function used in many
OpenBSD utility programs, originally designed by tedu@, i think,
and it has proven highly useful to get UTF-8 support going.

But back in the day, i think we failed to realize that MB_CUR_MAX
is expensive.  In that sense, it is probably somewhat ill-designed:
it is typically called in tight loops, so i guess there aren't
many use cases where the original version is really that good.

Given that ksh(1) and ypldap(8) already use versions of this
function identical to your version, i'm OK with your suggestion
of keeping the name while changing the semantics of the function,

Re: rpki-client: check ipAddrBlock and autonomousSysNum for criticality

2021-12-25 Thread Ingo Schwarze
Hi Claudio,

Claudio Jeker wrote on Sat, Dec 25, 2021 at 05:48:53PM +0100:
> On Sat, Dec 25, 2021 at 11:36:50AM +0100, Theo Buehler wrote:

>> These extensions MUST be marked critical by the sections of the spec
>> mentioned in the cryptowarnx(). That's determined by the ASN1_BOOLEAN
>> that is extracted and ignored after the FIXME a few lines below each of
>> the two hunks. Rather than getting the info from there, it's easier to
>> use an API call that checks what was already parsed by d2i_X509().

> I like this a lot. OK claudio@

Merely to avoid a misunderstanding:
I have no oponion whatsoever on this rpki-client patch.

> I would love to get rid of X509_V_FLAG_IGNORE_CRITICAL and use a callback
> to ensure the right extensions are critical but I never managed to
> understand how the X509_verify_cert() callback actually works.
> Documentation seems to be non-existent.

Not exactly non-existent, but admittedly very poor indeed.

There is X509_STORE_CTX_set_verify_cb(3), mostly written by
Dr. Stephen Henson in 2009.  But it does not really explain how the
callback is used and instead only provides four examples.

The reasons for this being so badly documented are as follows:

 1. X.509 PKI Path Validation as specified in RFC 5280 is ridiculously
complicated no matter the implementation.

 2. Our API design and implementation was inherited from OpenSSL,
and everybody knows what that means.

 3. Our documentation was inherited from OpenSSL, and everybody
knows what that means.

 4. Beck@ has embarked on an epic quest to mitigate parts of the
consequences of item 2.  Consequently, the surrounding airs
have been saturated with large amounts of dust for quite
some time now.

 5. While parts of that dust have arguably started to settle,
the air still feels somewhat dusty to me, and i have been
fearing that spending the several days of work required
to throughly improve this particular part of the documentation
might end up as working straight for the waste paper basket
once beck@ achieves his next major steps forward.

So given that there are literally hundreds of other public API functions
in libcrypto that are still completely undocumented (as opposed to the
verification callback being documented very badly), i so far refrained
from even trying to attack that particular problem...

Yours,
  Ingo



Re: document patch(1) not supporting binary diffs

2021-12-21 Thread Ingo Schwarze
Hi Theo,

Theo de Raadt wrote on Mon, Dec 20, 2021 at 10:37:00AM -0700:
> Ingo Schwarze  wrote:

>> The patch(1) manual talks about "lines" throughout,
>> and for binary files, a concept of "lines" does not even exist.

> That is a bit strong.  Some utilities designed for "text files" have no
> problem being both 8-bit clean and very-long-line capable.  For example,
> you can use emacs to edit a /bsd.  Add a space at the start of the file,
> save it.  re-edit the file, delete the space, and you get the same
> result.  I changed the internals of mg decades ago to also be 8-bit
> clean + very-long-line capable.

Yes, i agree such extensions are often useful.

> I think in POSIX this concept of "text
> file" is very weakly defined -- there are a variety of line-length
> issues, weird issues relating to \n and \r translation, obvious
> inability to handle embeded NUL etc, etc etc, and even end-of-file
> newline termination.
> 
> It is a bit of a copout to say "text file", when the truth really is is
> "imperfect content handling".  OTOH, adding perfect handling in many unix
> programs would be basically impossible, the complexity would be pretty high.
> 
> It may even be that the patch inputfile format cannot handle arbitrary data.
> It certainly has newline followed by '+', '-', or ' ' and ending with '\n'
> as context tracking, so it might be impossible.
 
>> The problem arises from the manual failing to mention that patch(1)
>> operates on text files.

> Yes, I am OK with it saying so.

>> Many standard utilities operate on text files
>> only, the concept of a text-file is both well-known and defined by
>> POSIX, so there is no need to re-explain what that means in individual
>> pages.

> Nope, I think POSIX fails to create a good definition.

Admittedly, the POSIX definition is a bit weak in some respects,
but all the same, it is usable as a minmium requirement that many
standard utilities have in common.

> Look at the
> recent commit to uniq.c, you even commented on it.  patch can handle files
> without trailing \n, right?   If the POSIX definition was real, patch would
> probably need to fail upon those files, creating other problems.

If POSIX says "The foobar utility operates on text files", that does not
mean that it requires foobar to fail if the input is not a text file.
It merely means that the behaviour is unspecified in that case.

It may occasionally be useful to add statements similar to the following
to the STANDARDS section:

  As an extension to that specification,

  foobar can handle files containing NUL bytes
or
  foobar can handle files containing bytes that do not form characters
or
  foobar can handle lines longer than LINE_MAX bytes
or
  foobar can handle files lacking the terminating newline character
  on the last line

Only in cases where such an extension is useful for some practical
purpose and we are willing to maintain it, of course.

Yours,
  Ingo



Re: document patch(1) not supporting binary diffs

2021-12-21 Thread Ingo Schwarze
Jason McIntyre wrote on Mon, Dec 20, 2021 at 04:13:19PM +:

> i'm ok with your diff

Thanks for checking.

> but it is slightly misleading in context of
> reading from stdin. i suppose that is no biggie.

I think i see why you say so.

Speaking *formally*, what we usually do is describing what the utility
does by default in the first sentence, then explain how command line
options modify that behaviour below "The options are as follows".
We could do that here, somewhat like this:

  patch reads a text file called the "patch file" from standard input,
  containing any of the four forms ... yadayada ...
  and applies those differences to an original text file ...

Besides, usually, we would *not* show a SYNOPSIS line

  patch  it would be simpler to just s/patch file/text file/ maybe?

True, that would be simpler and not outright wrong.  But in normal
operation, patch(1) deals with many text files, so i worry that
people might get confused as to which text file we are talking about.
It seems valuable for the first sentence to make it clear that one
among all these files is special and to introduce the term "patch file"
or ".Ar patchfile".

So to first get the issue reported by chrisz@ out of the way, i
committed the patch as-is.

We could say

  .Nm
  takes a text file called the
  .Sq patch file
  containing any of the four forms ...

Do you think that is better, like in the patch below?

Yours,
  Ingo


Index: patch.1
===
RCS file: /cvs/src/usr.bin/patch/patch.1,v
retrieving revision 1.34
diff -u -r1.34 patch.1
--- patch.1 21 Dec 2021 08:07:20 -  1.34
+++ patch.1 21 Dec 2021 08:37:54 -
@@ -47,8 +47,8 @@
 .Pf \*(Lt Ar patchfile
 .Sh DESCRIPTION
 .Nm
-takes the text file
-.Ar patchfile
+takes a text file called the
+.Dq patch file
 containing any of the four forms of difference
 listing produced by the
 .Xr diff 1
@@ -56,7 +56,7 @@
 producing a patched version.
 If
 .Ar patchfile
-is omitted, or is a hyphen, the patch will be read from the standard input.
+is omitted or is a hyphen, the patch file is read from the standard input.
 .Pp
 .Nm
 will attempt to determine the type of the diff listing, unless overruled by a



Re: document patch(1) not supporting binary diffs

2021-12-20 Thread Ingo Schwarze
Hi Christopher,

Christopher Zimmermann wrote on Mon, Dec 20, 2021 at 04:01:49PM +0100:

> base patch cannot work with diffs of binary files. It might help to say 
> so in the manpage since other implementations do support this (ab)use of 
> patch. OK?

I agree you are pointing out a slight problem with the manual page,
but i dislike your patch for two reasons:

This is not a bug, but a fundamental design decision.
The patch(1) manual talks about "lines" throughout,
and for binary files, a concept of "lines" does not even exist.

And the basic idea of a tool ought to be described in the first
sentence of the description, not as an afterthought at the very end.

The problem arises from the manual failing to mention that patch(1)
operates on text files.  Many standard utilities operate on text files
only, the concept of a text-file is both well-known and defined by
POSIX, so there is no need to re-explain what that means in individual
pages.

Consequently, i think something like the following would be better.

Yours,
  Ingo


Index: patch.1
===
RCS file: /cvs/src/usr.bin/patch/patch.1,v
retrieving revision 1.33
diff -u -r1.33 patch.1
--- patch.1 9 Nov 2021 16:13:40 -   1.33
+++ patch.1 20 Dec 2021 15:42:10 -
@@ -47,10 +47,12 @@
 .Pf \*(Lt Ar patchfile
 .Sh DESCRIPTION
 .Nm
-will take a patch file containing any of the four forms of difference
+takes the text file
+.Ar patchfile
+containing any of the four forms of difference
 listing produced by the
 .Xr diff 1
-program and apply those differences to an original file,
+program and applies those differences to an original text file,
 producing a patched version.
 If
 .Ar patchfile



Re: Fix typo in '}' command in less.1

2021-12-10 Thread Ingo Schwarze
Hi Richard,

Richard Ulmer wrote on Fri, Dec 10, 2021 at 04:04:11PM +0100:

> this is just a minor copy-and-paste error fix for the less(1) man page.

Committed, thank you.

> I also contributed this upstream: https://github.com/gwsw/less/pull/228
> 
> I hope this is the right mailing list for this.

Yes, it is.

Posting patches to tech@ is not required if, for a specific patch,
you know a better destination.  But tech@ is appropriate for all
kinds of patches, even documentation, in particular when you are
unsure where to send them.

Yours,
  Ingo


> Index: less.1
> ===
> RCS file: /cvs/src/usr.bin/less/less.1,v
> retrieving revision 1.58
> diff -u -p -u -r1.58 less.1
> --- less.1  23 Sep 2021 18:46:25 -  1.58
> +++ less.1  10 Dec 2021 14:58:46 -
> @@ -852,7 +852,7 @@ If a right curly bracket appears in the
>  the } command will go to the matching left curly bracket.
>  The matching left curly bracket is positioned on the top
>  line of the screen.
> -If there is more than one right curly bracket on the top line,
> +If there is more than one right curly bracket on the bottom line,
>  a number N may be used to specify the N-th bracket on the line.
>  .It Ic \&(
>  Like {, but applies to parentheses rather than curly brackets.



Re: Explicitly tag commands in vi(1)

2021-11-21 Thread Ingo Schwarze
Hi Simon,

Simon Branch wrote on Sat, Nov 20, 2021 at 03:10:22PM -0800:

> Here's a diff that adds explicit .Tg macros to vi(1),

We don't want that:

   $ man -O tag=Tg mdoc
  [...]
   In most cases, adding a Tg macro would be redundant because mandoc(1)
   is able to automatically tag most definitions.  This macro is intended
   for cases where automatic tagging of a term is unsatisfactory, for
   example if a definition is not tagged automatically (false negative)
   or if places are tagged that do not define the term (false positives).
  [...]

This is a typical example of a .Tg macro that should not be added
because it is redundant:

> +.Tg c
>  .It Fl c Ar cmd

The command

   $ man -O tag=c vi

already works without your patch.

I do not think .Tg should be used to enforce ambiguous tags (like "c"
pointing to both the command line option and the vi(1) "change" command).
On the one hand, adding multiple explicit .Tg macros for the same name
is very noisy.  On the other hand, it is much less useful than having
an unambiguous tag because both "-O tag=c" for terminal output and the
fragment identifier "#c" in HTML output only target the first copy.
Consequently, the cost-to-benefit ratio is bad for ambiguous manual
tagging.

In general, if the same term is defined at multiple places, or if -
like in this case - multiple different terms represented by the same
word are defined in the same file, tagging does not work very well yet.
The basic concept isn't fully worked out for such complicated cases,
and i'm not even convinced designing a solution for that task that
is simple enough to be worth implementing is possible.  As long as
the basic design is an unsolved problem, trying to handle individual
cases by using the .Tg macro is a bad idea.

> so that you can jump to vi or ex commands using -Otag or :t.
> This patch *should* include every command, but I couldn't figure out
> how to tag the '!' and ':!' commands; none of these worked:
> 
>.Tg
>.Tg !
>.Tg !\&
>.Tg "!"

man(1) already writes the "!" tag even without your patch:

   $ grep ^! /tmp/man.*
  /tmp/man.JF8ZO3DibX:! /tmp/man.sZVbqIr51P 729

which alreday targets this paragraph:

  ! argument(s)
  [range] ! argument(s)
 Execute a shell command, or filter lines through a shell command.

But less(1) does not support that, see /usr/src/usr.bin/less/tags.c
line 217:

   /*
* Search the tags file for the desired tag.
*/
   while (fgets(tline, sizeof (tline), f) != NULL) {
if (tline[0] == '!')
/* Skip header of extended format. */
continue;

The ctags(1) format is not perfect.  Some strings exist that it cannot
tag, and strings starting with an exclamation mark are an example of
such strings.  Strings containing space characters are another.

> This patch doesn't tag special keys either (word erase, literal next,
> etc.), because tag names can't contain whitespace and I'd prefer
> consistency with the manpage.  One solution would be to say WERASE and
> LNEXT instead, which is what ksh(1) does, and this also makes it easier
> to lookup what 'word erase' and 'literal next' really are.
> 
>$ man -k Dv=WERASE Dv=LNEXT
>stty(1) - set the options for a terminal device interface
>termios(4) - general terminal line discipline
> 
>$ man 4 termios -Otag=WERASE
>WERASESpecial character on input and is recognized if the ICANON
>  flag is set.  Erases the last word ...

Maybe; but one thing at a time.

We really should not discuss text changes in a thread about tagging.

> If a line is explicitly tagged with the .Tg macro, mandoc removes any
> other automatically-created tags with the same name.  So this patch also
> explicitly tags the 'print', 'number', and 'list' options and the
> [-ceFRrSstvw] flags.
> 
> Comments?  OK?

Not OK.  This is excessive.  The .Tg macro is intended to be used
sparingly, not to be smeared all over the place.

I do not object to marking unambiguos targets that cannot be recognized
automatically.  But you cannot reasonably aim for completeness in this
respect.

Yours,
  Ingo



Re: diff: improve legibility of structs in several manpages

2021-11-20 Thread Ingo Schwarze
Hi Jan,

sorry that i failed to look at this earlier.

Even with your diff, the style is still not completely consistent.

I think that inside .Bd -literal, the best style is exactly the same
as in source code:

structname{
typename;
type*name;
};

where

 *  can be more than one tab
   if some of the types in the struct are long
 * and  can be two spaces if the struct contains a double pointer
 * and  can be reduced to four spaces if space is tight,
   or a different number of spaces in very unusual cases

Even with your diff, some of the structs still indent the type
using eight spaces instead of a tab and some structs still use
multiple spaces instead of .

That said, i think your diff is an improvement and you should commit it
(OK schwarze@), except that i disagree with your change to ktrace(2).
That one ought to use two tabs for  instead of one tab plus eight
spaces and it should use one tab after "struct timespec" instead of
one space.

On Tue, Oct 26, 2021 at 06:56:10PM +0200, Jan Klemkow wrote:

> This diff harmonises the indentation of struct members and comments in
> several manpages.  Also fixes line wraps of comments on 80 column
> terminals.  General uses tabs for general indentation and 4 spaces on
> tight spots.  Also uses extra space to align pointers and non-pointers
> as we do this on certain places in our source.

Yes, i agree with all of that.

Yours,
  Ingo



Re: locale in expr(1)

2021-11-15 Thread Ingo Schwarze
Hi Marc,

Marc Espie wrote on Mon, Nov 15, 2021 at 05:06:23PM +0100:
> On Mon, Nov 15, 2021 at 03:43:47PM +0100, Jan Stary wrote:
>> On Nov 10 18:46:08, h...@stare.cz wrote:
>>> On Nov 10 18:15:44, h...@stare.cz wrote:

 expr(1) says
 
 expr1 {=, >, >=, <, <=, !=} expr2
 
 Returns the results of integer comparison if both arguments
 are decimal integers; otherwise, returns the results of
 string comparison using the locale-specific collation

Yikes.  Yes, that is true in general.  Then again, the OpenBSD
C library intentionally supports only one single collation sequence:
POSIX.

 sequence.  The result of each comparison is 1 if the specified
 relation is true, or 0 if the relation is false.
 
 Looking at expr.c, it boils down to strcoll(), which ignores the locale.
 So the statement is technically true, but there isn't really any
 "locale-specific collation sequence".
 
 Would it be simpler to leave the mention of locale completely out?

Yes, probably, and then add a CAVEATS section warning that on other
operating systems, the collation order may vary depending on the
LC_COLLATE environment variable.

 Or state something similar to what sort(1) or strcoll(3) and other
 string-comparing routines say?

>>> For example,
>>> 
>>>  $ expr č '<' d  
>>>  0
>>> 
>>> Which locale-specific collation sequence determined that?
>>> Byte by byte, it's
>>> 
>>>  c48d  U+00010d  č   LATIN SMALL LETTER C HACEK
>>>  64U+64  d   LATIN SMALL LETTER D
>>> 
>>> and I don't think there is anything more to it.

Yes.

>>> (Although in the Czech alphabet, č comes just before d.)

Fair enough, but adding knowledge about that to the C library
would be quite insane.

> I don't know what Ingo et al plans wrt collation sequence are, but
> it is a relatively innocuous part of locales.

I think i said so several times: i talked to bapt@ when he implemented
LC_COLLATE support in the FreeBSD libc.  To put it mildly, he disagreed
with your opinion.  Or, to be clear, he was swearing because he hated
the resulting complexity, saying it was much worse than he had expected.
And i don't think bapt@ is easily scared.

> More specifically, the ISO committee introduced functions specifically
> to deal with collation sequence, and left the original raw C functions
> (strcmp and the like) completely pristine... so what's missing is mostly
> an engine to compile location tables into something fairly reasonable.
> 
> There is way less of a security risk than, say, LC_CTYPE... ;)

I don't know.  A fairly reliable way to create security risks is
complexity.  Apart from the erratic run time behaviour that is likely to
trip up sysadmins - LC_COLLATE can change the collation sequence even
among ASCII characters - collation support looks like a very effective
way of putting excessive complexity right into the C library.

So, let me say it again, i would argue for rejecting it even if somebody
offered a complete patch implementing it.  The C library is just the
wrong place for handling such matters.

Yours,
  Ingo



Re: Typo in ASN1_STRING_length.3

2021-11-14 Thread Ingo Schwarze
Hi Matthias,

Matthias Schmidt wrote on Sun, Nov 14, 2021 at 10:41:40AM +0100:

> in the recent commit to ASN1_STRING_length.3 a typo sneaked in:

Committed.

Thank you for reading such patches and for reporting the error!

Yours,
  Ingo


> diff 96921f4986cebab89014f4b188cc00426e2e11d8 /usr/src
> blob - 9d8cbf83ce75abecbc8faf0bf1fce674ef2a8abf
> file + lib/libcrypto/man/ASN1_STRING_length.3
> --- lib/libcrypto/man/ASN1_STRING_length.3
> +++ lib/libcrypto/man/ASN1_STRING_length.3
> @@ -346,7 +346,7 @@ returns a number of bytes.
>  and
>  .Fn ASN1_STRING_copy
>  return 1 on success or 0 on failure.
> -They fail is memory allocation fails.
> +They fail if memory allocation fails.
>  .Fn ASN1_STRING_set
>  and
>  .Fn ASN1_OCTET_STRING_set



Re: locale in who(1)

2021-11-10 Thread Ingo Schwarze
Hi,

Martijn van Duren wrote on Wed, Nov 10, 2021 at 02:03:51PM +0100:

> I see no reason to keep it.
> OK martijn@ if anyone wants to commit this.

Done.

> On Wed, 2021-11-10 at 13:37 +0100, Jan Stary wrote:

>> Why does who(1) need to setlocale()?

On some systems, setlocale(LC_TIME) influences strftime(3).
But who(1) uses ctime(3) instead, and

  https://pubs.opengroup.org/onlinepubs/9699919799/functions/ctime.html

explicitly mentions that ctime(3) does not support localized date and
time formats.

Anyway, we clearly do not want localization in such a place,
neither now nor in the future.

Yours,
  Ingo


>> Index: who.c
>> ===
>> RCS file: /cvs/src/usr.bin/who/who.c,v
>> retrieving revision 1.30
>> diff -u -p -r1.30 who.c
>> --- who.c12 Jul 2021 15:09:20 -  1.30
>> +++ who.c10 Nov 2021 12:37:05 -
>> @@ -44,7 +44,6 @@
>>  #include 
>>  #include 
>>  #include 
>> -#include 
>>  
>>  void  output(struct utmp *);
>>  void  output_labels(void);
>> @@ -71,8 +70,6 @@ main(int argc, char *argv[])
>>  FILE *ufp;
>>  char *t;
>>  int c;
>> -
>> -setlocale(LC_ALL, "");
>>  
>>  if (pledge("stdio unveil rpath getpw", NULL) == -1)
>>  err(1, "pledge");



Re: [patch] httpd static gzip compression

2021-11-05 Thread Ingo Schwarze
Hi Theo,

Theo de Raadt wrote on Thu, Nov 04, 2021 at 08:27:47AM -0600:
> prx  wrote:
>> On 2021/11/04 14:21, prx wrote:

>>> The attached patch add support for static gzip compression.
>>> 
>>> In other words, if a client support gzip compression, when "file" is
>>> requested, httpd will check if "file.gz" is avaiable to serve.

>> This diff doesn't compress "on the fly".
>> It's up to the webmaster to compress files **before** serving them.

> Does any other program work this way?

Yes.  The man(1) program does.  At least on the vast majority of
Linux systems (at least those using the man-db implementation
of man(1)), on FreeBSD, and on DragonFly BSD.

Those systems store most manual pages as gzipped man(7) and mdoc(7)
files, and man(1) decompresses them every time a user wants to look
at one of them.  You say "man ls", and what you get is actually
/usr/share/man/man1/ls.1.gz or something like that.

For man(1), that is next to useless because du -sh /usr/share/man =
42.6M uncompressed.  But it has repeatedly caused bugs in the past.
I would love to remove the feature from mandoc, but even though it is
rarely used in OpenBSD (some ports installed gzipped manuals in the
past, but i think the ports tree has been clean now for quite some
time; you might still need the feature when installing software
or unpacking foreign manual page packages without using ports)
it would be a bad idea to remove it because it is too widely used
elsewhere.  Note that even the old BSD man(1) supported it.

> Where you request one filename, and it gives you another?

You did not ask what web servers do, but we are discussing a patch to
a webserver.  For this reason, let me note in passing that even some
otherwise extremely useful sites get it very wrong the other way round:

 $ ftp https://manpages.debian.org/bullseye/coreutils/ls.1.en.gz
Trying 130.89.148.77...
Requesting https://manpages.debian.org/bullseye/coreutils/ls.1.en.gz
100% |**|  8050   00:00
8050 bytes received in 0.00 seconds (11.74 MB/s)
 $ file ls.1.en.gz
ls.1.en.gz: troff or preprocessor input text
 $ grep -A 1 '^.SH NAME' ls.1.en.gz  
.SH NAME
ls \- list directory contents
 $ gunzip ls.1.en.gz
gunzip: ls.1.en.gz: unrecognized file format

> I have a difficult time understanding why gzip has to sneak it's way
> into everything.
> 
> I always prefer software that does precisely what I expect it to do.

Certainly.

I have no strong opinion whether a webserver qualifies as "everything",
though, nor did i look at the diff.  While all manpages are small in the
real world, some web servers may have to store huge amounts of data that
clients might request, so disk space might occasionally be an issue on
a web server even in 2021.  Also, some websites deliver huge amounts of
data to the client even when the user merely asked for some text (not sure
such sites would consider running OpenBSD httpd(8), but whatever :) - when
browsing the web, bandwidth is still occasionally an issue even in 2021,
even though that is a rather absurd fact.

Yours,
  Ingo



Re: uniq: add HISTORY section

2021-11-02 Thread Ingo Schwarze
Hi Todd,

Todd C. Miller wrote on Tue, Nov 02, 2021 at 08:20:47AM -0600:

> uniq appeared in research Unix version 3.

OK schwarze@.

If - like in this case - the indicated release is the first one in any
system to ever contain the feature, and not just the first one among
the direct ancestors of OpenBSD, i prefer the wording "utility first
appeared in", though.  No big deal, your call.

Yours,
  Ingo


> Index: usr.bin/uniq/uniq.1
> ===
> RCS file: /cvs/src/usr.bin/uniq/uniq.1,v
> retrieving revision 1.21
> diff -u -p -u -r1.21 uniq.1
> --- usr.bin/uniq/uniq.1   23 Dec 2017 00:52:33 -  1.21
> +++ usr.bin/uniq/uniq.1   2 Nov 2021 14:14:19 -
> @@ -159,3 +159,8 @@ The historic
>  and
>  .Fl Ns Ar number
>  options have been deprecated but are still supported in this implementation.
> +.Sh HISTORY
> +A
> +.Nm
> +utility appeared in
> +.At v3 .



Re: uniq(1): ignore newline when comparing lines?

2021-11-02 Thread Ingo Schwarze
Hi,

Todd C. Miller wrote on Tue, Nov 02, 2021 at 08:12:00AM -0600:
> On Mon, 01 Nov 2021 21:04:54 -0500, Scott Cheloha wrote:

>> Yes it would be simpler.  However I didn't want to start changing the
>> input -- which we currently don't do -- without discussing it.
>>
>> The standard says we should "write one copy of each input line on the
>> output." So, if we are being strict, we don't add a newline that isn't
>> there, because that isn't what we read.  Any other interpretation
>> requires handwaving about what an "input line" even is.

> The System V version of uniq actually ignores the last line if it
> doesn't end in a newline.  For example:
> 
> $ printf "bar\nfoo\nfool" | uniq
> bar
> foo
> 
> AIX, Solaris, and HP-UX still exhibit this behavior.  What happens
> is that the gline() function (which reads the line) returns non-zero
> when it hits EOF, discarding any input in that line.  Interestingly,
> it does realloc the line buffer as needed to handle long lines.
> 
> So really, this is a corner case where you can't count on consistent
> behavior among implementations and we just need to do what we think
> is best.  If you prefer we retain the existing behavior wrt a final
> line without a newline that is OK with me.

Just a quick comment without looking at the code:

POSIX says on
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/uniq.html :

  INPUT FILES
The input file shall be a text file.

By definition, a file that does not end in a newline character
is *NOT* a text file.  Consequently, what uniq(1) does in this case
is unspecified.

My personal opinion is that in general, if utilities are specified to
process text files and the terminating newline is missing, it is best to
behave as if the mandatory terminating newline were present.  The reason
why i think so is that forgetting the terminating newline is a popular
user error, and silently assuming a terminating newline causes less
surprise with users than silently discarding the incomplete last line.

Most users are not even aware that a file without a terminating newline
is not a text file, so it definitely is less surprising for the vast
majority.  But even those who are aware of that quirk in the definition
of what a text file is (like myself) are regularly surprised by the
consequences...

Yours,
  Ingo



Re: demystify vport(4) in vport(4) and ifconfig(8)

2021-10-29 Thread Ingo Schwarze
Hi Stuart,

Stuart Henderson wrote on Fri, Oct 29, 2021 at 01:59:38PM +0100:
> On 2021/10/29 14:08, Ingo Schwarze wrote:
>> Stuart Henderson wrote on Fri, Oct 29, 2021 at 10:53:41AM +0100:
>>> On 2021/10/28 23:19, Klemens Nanni wrote:
>>>> On Fri, Oct 29, 2021 at 12:57:54AM +0200, Ingo Schwarze wrote:

>>>>>   MANPAGER=firefox man -T html $(ifconfig -C)

>>>> This doesn't work if firefox is already running as the MANPAGER firefox
>>>> process exits immediately after sending the file/link to the running
>>>> process, which causes mandoc to exit after removing the temporary file,
>>>> by which time firefox fails to open the no longer exiting file.

>>> I use mutt_bgrun for things like this

>> I don't see how that can possibly help.

> Oh, of course you're right - I was confused between my helper
> scripts. The one I actually use for html in mutt looks like this,
> so it's not any user for MANPAGER..
> 
> tmp=`mktemp -d /tmp/mutthtml.X` || exit 1
> mhonarc -rcfile ~/.m2h_rcfile -single > $tmp/mail.html
> (chrome $tmp/mail.html 2>/dev/null; sleep 30; rm -r $tmp) &

Indeed, if i added sleep(30) to the place in question
in usr.bin/mandoc/main.c, i would expect quite an outrage...
A protest rally on my front lawn or something like that...

Then again, the man(1) process that is about to exit could maybe fork
a child before exiting.  The parent exiting right away makes sure that
the user gets back their shell prompt instantly after exiting the pager.
The child process, now being in the background, could sleep a few seconds
before deleting the temporary output files, then exit, too.

This does appear to successfully work around the firefox bug.
With the patch below, "MANPAGER=firefox man -T html ..."
now works reliably for me.

It does seem somewhat ugly, though, so i'm not sure whether i should
commit it.  In particular, fork(2)ing merely to go to the background
seems both cumbersome and expensive - admittedly, your script cited
above does the same, but that doesn't imply i have to like that, right?

Is there a cleaner way to let man(1) go into the background and
let the shell offer a fresh prompt while man(1) still sleeps,
without needing to fork(2)?

Yours,
  Ingo


Index: main.c
===
RCS file: /cvs/src/usr.bin/mandoc/main.c,v
retrieving revision 1.262
diff -u -p -r1.262 main.c
--- main.c  4 Oct 2021 21:28:50 -   1.262
+++ main.c  29 Oct 2021 14:12:08 -
@@ -1203,6 +1203,7 @@ woptions(char *arg, enum mandoc_os *os_e
 static void
 run_pager(struct outstate *outst, char *tag_target)
 {
+   const struct timespec delay = { 5, 0 };
int  signum, status;
pid_tman_pgid, tc_pgid;
pid_tpager_pid, wait_pid;
@@ -1245,13 +1246,16 @@ run_pager(struct outstate *outst, char *
if (wait_pid == -1) {
mandoc_msg(MANDOCERR_WAIT, 0, 0,
"%s", strerror(errno));
-   break;
+   return;
}
if (!WIFSTOPPED(status))
break;
 
signum = WSTOPSIG(status);
}
+   if (fork() > 0)
+   _exit(mandoc_msg_getrc());
+   nanosleep(, NULL);
 }
 
 static pid_t



Re: demystify vport(4) in vport(4) and ifconfig(8)

2021-10-29 Thread Ingo Schwarze
Hi Stuart,

Stuart Henderson wrote on Fri, Oct 29, 2021 at 10:53:41AM +0100:
> On 2021/10/28 23:19, Klemens Nanni wrote:
>> On Fri, Oct 29, 2021 at 12:57:54AM +0200, Ingo Schwarze wrote:

>>>   MANPAGER=firefox man -T html $(ifconfig -C)

>> This doesn't work if firefox is already running as the MANPAGER firefox
>> process exits immediately after sending the file/link to the running
>> process, which causes mandoc to exit after removing the temporary file,
>> by which time firefox fails to open the no longer exiting file.
>> 

> I use mutt_bgrun for things like this

I don't see how that can possibly help.  Accoring to
  https://gist.github.com/tabletick/4129818 ,
the mutt_bgrun program essentially does this:

cp "$file" "$tmpfile"
(
"$viewer" $options "$tmpfile"
rm -f "$tmpfile"
rmdir "$tmpdir"
) &

Assume that $file is the temporary HTML file written by man(1),
$tmpfile is the temporary filename (very poorly) generated
by mutt_bgrun, $viewer is firefox and $options is empty.

Then mutt_bgrun is defeated by the firefox bug that i described
in exactly the same way as that bug defeats man(1) itself.
The following sequence of events will sometimes occur:

 1. man(1) writes $file
 2. man(1) calls mutt_bgrun
 3. mutt_bgrun copies $file to $tmpfile
 4. mutt_bgrun spawns a subshell in the background
 5. mutt_bgrun exits
 6. man(1) deletes $file, which is harmless at this point
because $tmpfile still exists
 7. man(1) exits

 8. the mutt_bgrun subshell calls firefox
 9. That temporary firefox process hands the $tmpfile name
to the main running firefox application via IPC.
 10. the mutt_bgrun subshell deletes "$tmpfile"
 11. the mutt_bgrun subshell exits

Then, much later, because firefox is a monster and a slug:

 12. firefox attempts to open "$tmpfile",
 which was already deleted long before

If i understand correctly, mutt_bgrun is designed to solve
exactly the opposite problem we are having here:  When a viewer
(like, for example, xpdf(1)) blocks the formatter (like, for
example, mutt(1) or man(1)) until the user exits the viewer,
then mutt_bgrun does the job of putting the viewer into the
background such that the formatter can proceed before the user
exits the viewer.

But the problem here is that firefox is already doing too much
in the background (specifically, opening the $file, which it
really ought to do in the foreground before returning), not that
it does too little in the background.

Yours,
  Ingo



Re: demystify vport(4) in vport(4) and ifconfig(8)

2021-10-28 Thread Ingo Schwarze
Hi Klemens,

Klemens Nanni wrote on Thu, Oct 28, 2021 at 11:19:30PM +:
> On Fri, Oct 29, 2021 at 12:57:54AM +0200, Ingo Schwarze wrote:

>>   MANPAGER=firefox man -T html $(ifconfig -C)

> This doesn't work if firefox is already running

It is true that it sometimes works and sometimes fails due to a race
condition caused by firefox, but that is kind of orthogonal to benno@'s
question.

> as the MANPAGER firefox process exits immediately after sending the
> file/link to the running process,

I consider that a bug in firefox.  If a program is instructed to open a
file, and in particular a file in /tmp/, it must not exit before actually
opening the file.

> which causes mandoc to exit after removing the temporary file,
> by which time firefox fails to open the no longer exiting file.

Is there a way how man(1) could work around that braindead behaviour
of firefox?  I don't want to commit the following patch - which delays
program exit of man(1) after the pager exits - because the delay is
annoying when using a reasonable pager like less(1).

If someone can come up with an acceptable idea for a workaround,
or if someone can fix firefox, that will be welcome...

Yours,
  Ingo



Index: main.c
===
RCS file: /cvs/src/usr.bin/mandoc/main.c,v
retrieving revision 1.262
diff -u -p -r1.262 main.c
--- main.c  4 Oct 2021 21:28:50 -   1.262
+++ main.c  29 Oct 2021 00:06:24 -
@@ -1203,6 +1203,7 @@ woptions(char *arg, enum mandoc_os *os_e
 static void
 run_pager(struct outstate *outst, char *tag_target)
 {
+   const struct timespec delay = { 1, 0 };
int  signum, status;
pid_tman_pgid, tc_pgid;
pid_tpager_pid, wait_pid;
@@ -1252,6 +1253,7 @@ run_pager(struct outstate *outst, char *
 
signum = WSTOPSIG(status);
}
+   nanosleep(, NULL);
 }
 
 static pid_t



Re: demystify vport(4) in vport(4) and ifconfig(8)

2021-10-28 Thread Ingo Schwarze
Hi,

Sebastian Benoit wrote on Thu, Oct 28, 2021 at 11:05:54PM +0200:

> We tend to forget that there are other output formats for manpages
> as well.  right now, i can got to https://man.openbsd.org/ifconfig and
> jump from there to all of these manpages. With them removed, i can no
> longer do that.
> 
> Not sure if thats a valid consideration though.

I don't think that's a good reason to keep a long, static list in a
manual page *if* that list keeps creating maintenance headaches and
keeps getting outdated.

If information is naturally available from a command line program but
painful to maintain in manual pages, then i don't think it is reasonable
to expect it to be available from https://man.openbsd.org/, and i think
it is good enough if the manual page explains how to get the information
at the command line.

Besides, choose from:

  for name in $(ifconfig -C); do firefox https://man.openbsd.org/$name.4; done

and

  MANPAGER=firefox man -T html $(ifconfig -C)

All that said, don't forget the web equivalent of semantic apropos(1):

  https://man.openbsd.org/?query=Cd~pseudo-device=1

I admit that the URI syntax of web apropos is exceedingly ugly for
historical reasons.  I really ought to implement something like

  https://man.openbsd.org/apropos/Cd~pseudo-device
  https://man.openbsd.org/a/Cd~pseudo-device
  https://apropos.openbsd.org/Cd~pseudo-device

Another day, another task...
Too many pending tasks already...

Yours,
  Ingo



Re: Missing include in sys/device.h

2021-09-28 Thread Ingo Schwarze
Hi Rafael,

Rafael Sadowski wrote on Tue, Sep 28, 2021 at 08:22:06AM +0200:
> On Mon Sep 27, 2021 at 11:20:55PM -0600, Theo de Raadt wrote:

>> Oh, like how about you try compiling a kernel after your proposed diff?
>> 
>> in userland -
>>size_t comes from sys/types.h
>>or a header file which pulls in sys/types.h, and there should be
>>no further new iteration added
>> 
>> in kernel -
>> size_t comes from either sys/param.h or sys/types.h, depending
>> on which one a file includes
>> 
>> What you are suggesting is completely crazy.  It's akin to "all
>> header files should include all other dependent header files, and
>> hey maybe one day we can have a #include_all directive, eh?
>> 
>> Absolutely no way.

> Totally understandable! I have to get used with the userland/kernel
> src-environment. Thanks for the explanation and sorry for the noise.

By the way: some userland programs (but not all userland programs, and
of course not the C library) use the same rule of thumb that "headers
usually do not include headers".

Some will likely regard the following level of documentation excessive,
and i'm certainly not proposing to do this in more places.  But it does
help me personally to not lose track of what provides and uses what
in a program with a large number of deeply nested data structures and
a considerable number of modules that are in part independent of each
other and in part call the internal APIs of the next-lower layer:

  https://mandoc.bsd.lv/man/mandoc_headers.3.html

Making sure that not everything includes everything also helps to
recognize layering violation design errors early on, usually near
the beginning of the implementation phase of a new feature, sometimes
before any code starts being written at all.

Yours,
  Ingo



Re: Unwind + NSD usage question

2021-09-28 Thread Ingo Schwarze
Hello Paul,

Paul de Weerd wrote on Tue, Sep 28, 2021 at 12:44:07PM +0200:

> 'local-data-ptr:' in unbound.conf(5):
> http://man.openbsd.org/unbound.conf#local~2
> http://man.openbsd.org/unbound.conf#local~3

heh, thank you for *both* of these bug reports,
i'm adding them to the mandoc TODO file for now,
but neither is expected to be hard to fix:

- tag.c, tag_put() should not put ASCII_HYPH into the tag file,
  which happens when the tag contains "-" on the input side
  weerd@ 28 Sep 2021 12:44:07 +0200
  loc *  exist *  algo *  size *  imp ***

- tag.c, tag_put() and callers like man_validate.c, check_tag()
  should not mistake "\-" as a word-ending escape sequence but
  instead translate it to plain "-" in the tag name
  weerd@ 28 Sep 2021 12:44:07 +0200
  loc **  exist *  algo *  size *  imp ***

Yours,
  Ingo



Re: OpenBSD `ftp` port for MSVC (Windows)

2021-09-05 Thread Ingo Schwarze
Hi Samuel,

first, note that this is off-topic on tech@.  If you see a need to
continue the thread, please move it to misc@.

Samuel Marks wrote on Sun, Sep 05, 2021 at 02:10:14PM -0400:

> WiP, basically just started this:
> https://github.com/offscale/libacquire/tree/423b751/libacquire/openbsd_ftp

I feel unsure what you are trying to achieve.  The README.md in that
directory is so terse that i don't really understand what the point is.

> What:
> - My project allows for switching between the HTTPS interface provided
> by your OS, and libcurl's HTTPS interface.

I don't really understand what that is supposed to mean either.

> Why OpenBSD `ftp` port:
> - I don't have a "pure" socket solution, and OpenBSD does not have a C
> interface version of their `ftp` (like, `fetch` from FreeBSD does, for
> example)
> - SLOC is short
> - It's from OpenBSD which gives certain trust for protocol conformity,
> safety, security, lower rates of change, and all-in-all quality

None of that applies to our current implementation of the ftp(1) client.
Yes, like everything in OpenBSD, it did receive some auditing, some
maintenance, and some feature additions helping security in the past.

But all the same, our -current ftp(1) client consists of ancient,
terribly ill-designed, technically totally outdated and obsolete
code.  So much so that developers felt it was time to throw it all
away and write a completely new ftp(1) program from scratch.  One
developer started an effort to do so, and that new program can
already do most of what ftp(1) should, but before it was completely
ready for commit, the effort unfortunately somehow petered out, so
we still have the old shit in the tree.

> - Once the MSVC port is done, the port to other OSs should be
> relatively trivial

Porting OpenBSD software to other systems is often a good idea.
Worthy examples include OpenSSH, OpenBGPD, OpenNTPD, OpenSMTPD,
OpenIKED, LibreSSL, and there are likely many more.

But OpenBSD ftp(1) is most definitely not a worthy target for
porting.

Just look at this utter mess:

 $ cd /usr/src/usr.bin/ftp/
 $ grep jmp *.c
cmds.c: (void)setjmp(jabort);
cmds.c: (void)setjmp(jabort);
cmds.c: (void)setjmp(jabort);
cmds.c:jmp_buf abortprox;
cmds.c: longjmp(abortprox, 1);
cmds.c: if (setjmp(abortprox)) {
fetch.c:static jmp_buf  httpabort;
fetch.c:if (setjmp(httpabort)) {
fetch.c:if (setjmp(httpabort)) {
fetch.c:longjmp(httpabort, 1);
fetch.c:if (setjmp(toplevel)) {
ftp.c:jmp_buf   ptabort;
ftp.c:  longjmp(ptabort, 1);
ftp.c:jmp_buf   sendabort;
ftp.c:  longjmp(sendabort, 1);
ftp.c:  if (setjmp(sendabort)) {
ftp.c:  if (setjmp(sendabort))
ftp.c:jmp_buf   recvabort;
ftp.c:  longjmp(recvabort, 1);
ftp.c:  if (setjmp(recvabort)) {
ftp.c:  if (setjmp(recvabort))
ftp.c:  longjmp(ptabort, 1);
ftp.c:  if (setjmp(ptabort))
ftp.c:jmp_buf forceabort;
ftp.c:  longjmp(forceabort, 1);
ftp.c:  if (cout == NULL || setjmp(forceabort)) {
main.c:jmp_buf  toplevel;
main.c: if (setjmp(toplevel))
main.c: top = setjmp(toplevel) == 0;
main.c: longjmp(toplevel, 1);
small.c:jmp_buf jabort;
small.c:longjmp(jabort, 1);
small.c:longjmp(jabort, 1);
small.c:(void)setjmp(jabort);

Or look at this in small.c:

/* XXX - Signal race. */
/* ARGSUSED */
void
mabort(int signo)
{
int save_errno = errno;

alarmtimer(0);
(void) write(fileno(ttyout), "\n\r", 2);
#ifndef SMALL
if (mflag && fromatty) {
/* XXX signal race, crazy unbelievable stdio misuse */
if (confirm(mname, NULL)) {
errno = save_errno;
longjmp(jabort, 1);
}
}
#endif /* !SMALL */
mflag = 0;
errno = save_errno;
longjmp(jabort, 1);
}

Then look at util.c what confirm() is.  The word "unbelievable" is an
accurate description of the situation.

Or look at intr() in main.c.  On top of calling longjmp(3), that
signal handler also calls alarmtimer() in util.c, which happily
does multiple non-atomic stores into a struct, then calls the
function setitimer(2) which is not async-signal-safe either.

Or the updateprogressmeter() signal handler in util.c.  That one
calls progressmeter() in the same file, which is just as bad as
confirm().

We have a policy to mark broken code with /* XXX */ in the tree when
it is found and whoever discovers the breakage has no time to fix
it right away.  But ftp(1) is so bad that i feel like i don't even
have the time to mark stuff as broken.  No matter where i look,
almost all of it is broken.

I'm usually a great fan of fixing broken code by step-by-step
refactoring, even if it is broken in quite serious ways.  But that
is totally out of the question for OpenBSD ftp(1).  That code is
so thoroughly ill-designed throughout that starting over from scratch
is the only viable option.

> So, want me to finish working on this?

It sounds like a waste of time to me.

> - And 

Re: man sed(1) diff

2021-09-05 Thread Ingo Schwarze
Hello,

i see where Ian's confusion is coming from, even though arguably,
the existing text is accurate.  But it is not a good idea to insert
exceptions as parenthetic remarks in the middle of an enumeration
of steps that is already somewhat long and complicated.

I think it is better to explain the special rules for D processing
in the paragraph describing D (surprise, surprise) rather than in
the middle of the general description of what a "cycle" is.


Consider the following minimal example:

   $ printf 'axb' | sed -n 'y/x/\n/;s/^b/c/;P;D'
   a
   c

The D deletes "a\n" from the pattern space.
After that, the next cycle is entered with "b" in the pattern space,
without copying a new line into the pattern space.


OK?
  Ingo


Index: sed.1
===
RCS file: /cvs/src/usr.bin/sed/sed.1,v
retrieving revision 1.48
diff -u -r1.48 sed.1
--- sed.1   17 Mar 2016 05:27:10 -  1.48
+++ sed.1   5 Sep 2021 11:46:48 -
@@ -140,9 +140,6 @@
 cyclically copies a line of input, not including its terminating newline
 character, into a
 .Em pattern space ,
-(unless there is something left after a
-.Ic D
-function),
 applies all of the commands with addresses that select that pattern space,
 copies the pattern space to the standard output, appending a newline, and
 deletes the pattern space.
@@ -331,7 +328,8 @@
 Delete the pattern space and start the next cycle.
 .It [2addr] Ns Ic D
 Delete the initial segment of the pattern space through the first
-newline character and start the next cycle.
+newline character and start the next cycle without copying the next
+line of input into the pattern space.
 .It [2addr] Ns Ic g
 Replace the contents of the pattern space with the contents of the
 hold space.



Re: timeout: Prettify man page and usage

2021-09-04 Thread Ingo Schwarze
Hi Jason,

Jason McIntyre wrote on Sat, Sep 04, 2021 at 09:47:12PM +0100:

> pretty damning that my ok is on that commit ;)
> i'll try to remember...

Heh.  With the amount of work you are doing - your current commit
count stands at 9113, on average 1.34 per day, during a time of
over eightteen years and seven months (and that does not even include
the many OKs you provided), it would be pretty unreasonable to
expect remembering every single commit...   8-)

Yours,
  Ingo



Re: timeout: Prettify man page and usage

2021-09-04 Thread Ingo Schwarze
Hi Jason,

Jason McIntyre wrote on Fri, Sep 03, 2021 at 02:46:47PM +0100:
> On Fri, Sep 03, 2021 at 03:42:21PM +0200, Ingo Schwarze wrote:
>> Theo de Raadt wrote on Thu, Sep 02, 2021 at 09:57:11AM -0600:

>>> I think we should list shorts, and longs which have no shorts.

>> I agree, and i think we arrived at the same conclusion in the past.
>> 
>> It applies to both usage() and SYNOPSIS, and ideally, both should
>> match, except maybe in very unusual cases.

> sure. but grep tripped me up, and it's always the page i think of with
> long options. why did we leave --context in SYNOPSIS?

I admit the grep(1) case is slightly confusing, but it is explained
here why it is somewhat special:

  https://cvsweb.openbsd.org/src/usr.bin/grep/grep.1#rev1.47

Yours,
  Ingo



Re: mark getsubopt(3) as part of POSIX

2021-09-03 Thread Ingo Schwarze
Hi Theo,

as you sent it, your patch is misleading since our manual page describes
two features that are not required by POSIX.

So i propose the larger patch shown below instead.

While here, clarify what "null-terminated" means.  It matters for
this page in particular because the page talks about both NUL bytes
and NULL pointers in different places.

Also, failing to state that the original input string is modified
feels like a serious omission to me.  By the way, these replacements
are indeed required by POSIX.  The strsep(3) below SEE ALSO provides
a rather weak hint, but i regard that as insufficient.

OK?
  Ingo


Index: getsubopt.3
===
RCS file: /cvs/src/lib/libc/stdlib/getsubopt.3,v
retrieving revision 1.14
diff -u -r1.14 getsubopt.3
--- getsubopt.3 15 Nov 2014 14:41:02 -  1.14
+++ getsubopt.3 3 Sep 2021 14:38:28 -
@@ -55,7 +55,9 @@
 is a pointer to a pointer to the string.
 The argument
 .Fa tokens
-is a pointer to a null-terminated array of pointers to strings.
+is a pointer to a
+.Dv NULL Ns -terminated
+array of pointers to strings.
 .Pp
 The
 .Fn getsubopt
@@ -79,6 +81,11 @@
 .Fa optionp
 will be set to point to the start of the next token in the string,
 or the NUL at the end of the string if no more tokens are present.
+The comma, space, or tab character ending the token just parsed,
+and the equal sign separating name and value if any, are replaced
+with NUL bytes in the original
+.Pf * Fa optionp
+input string.
 The external variable
 .Fa suboptarg
 will be set to point to the start of the current token, or
@@ -138,6 +145,16 @@
 .Sh SEE ALSO
 .Xr getopt 3 ,
 .Xr strsep 3
+.Sh STANDARDS
+The
+.Fn getsubopt
+function conforms to
+.St -p1003.1-2008 .
+.Pp
+Allowing space and tab characters to separate tokens
+and the external variable
+.Va suboptarg
+are extensions to that standard.
 .Sh HISTORY
 The
 .Fn getsubopt



Re: timeout: Prettify man page and usage

2021-09-03 Thread Ingo Schwarze
Hi,

Theo de Raadt wrote on Thu, Sep 02, 2021 at 09:57:11AM -0600:

> I think we should list shorts, and longs which have no shorts.

I agree, and i think we arrived at the same conclusion in the past.

It applies to both usage() and SYNOPSIS, and ideally, both should
match, except maybe in very unusual cases.

Yours,
  Ingo



Re: rm.1: remove extraneous word

2021-09-03 Thread Ingo Schwarze
Hi Jason,

Jason McIntyre wrote on Fri, Sep 03, 2021 at 07:47:19AM +0100:
> On Thu, Sep 02, 2021 at 11:10:54PM +0100, Jason McIntyre wrote:

>> i wonder if it was originally an attempt to not quote posix
>> (or posix attempting to not quote bsd). posix refers to removing
>> "directory entries", which seems more natural.

Quite to the contrary.

A wording "removes the entries" first appeared in AT System III
UNIX (1980).  I first see the specific wording "remove directory
entries" in Version 10 AT UNIX in December 1989 and it was then
also used by Cynthia Livingston on June 11, 1990 when she converted
the page from man(7) to mdoc(7).  So this predates POSIX.2 (1992).
Not sure what XPG 3 said in 1989.

The current wording of the .Nd and the first sentence of the DESCRIPTION
was committed by Keith Bostic on August 12, 1990 with this commit message:

  new version of rm from scratch and the POSIX.2 description

I'm surprised that Keith referred to POSIX.2 in 1990 even though
the original POSIX.2 is usually quoted as "IEEE Std 1003.2-1992".
I know that POSIX.2 drafts were being worked on in 1991, but
apparently some were already in circulation in the summer of 1990.

> on the other hand, the phrase "non-directory type files" is pretty
> awful. posix is clearer i think, sticking to "directory entries
> specified by each file argument".we could also use this: "directory
> entries specified on the command line". but that would feel like
> deliberately avoiding the term "file", which is clear and simple.
> 
> just using "non-directory files" is also weird. i mean, you can very
> much remove directory files.

It is standard practice that the first paragraph of the DESCRIPTION
describes default behaviour, then the option list describes
modifications of this behaviour, in this case that -d and -r also
remove directories.  So regarding the content, there is nothing to fix.

If you think the wording is awful, you could say

   The rm utility attempts to remove the files specified on the
   command line.  By default, specifying a directory is an error.
   If the permissions ...

or something similar.  Since this is about wording, i would say it
is your call.

In any case, i agree with Theo that the .Nd should not be changed.

Yours,
  Ingo



Re: Atomic signal flags for vi(1)

2021-09-02 Thread Ingo Schwarze
Hi Tim,

trondd wrote on Wed, Sep 01, 2021 at 08:46:28PM -0400:
> Ingo Schwarze  wrote:
>> Ingo Schwarze wrote on Wed, Sep 01, 2021 at 04:38:51PM +0200:

>>> Note that the h_hup() and h_term() signal handlers are still unsafe
>>> after this commit because they also set the "killersig" (how fitting!)
>>> field in a global struct.

>> I like it when fixing two bugs only amounts to minus: minus 14 LOC,
>> 1 function, 1 global variable, 2 automatic variables, 1 struct member,
>> 9 pointer dereferences, 1 #define, and minus 1 #undef.
>> 
>> The argument that only one single GS exists applies unchanged.

> Great.  This is the direction I was going to go in with it.  I hadn't
> thought of collapsing h_hup an h_term, though.
> 
> This makes sense as I understand the signal/event handling and a quick
> test through the signals shows no difference in behavior.

Thank you (and martijn@) for reviewing and testing.
This is now committed.

> Should h_term() and cl_sigterm be named something more general to
> indicate that they also handle SIGHUP?  Or is it good enough since it
> includes all the signals that 'term'inate vi?  It's not hard to follow
> as-is. 

That is what i was thinking: why re-paint a bike shed if the existing
paint is already pretty and not yet falling off?

Yours,
  Ingo


>> Index: cl/cl.h
>> ===
>> RCS file: /cvs/src/usr.bin/vi/cl/cl.h,v
>> retrieving revision 1.11
>> diff -u -p -r1.11 cl.h
>> --- cl/cl.h  1 Sep 2021 14:28:15 -   1.11
>> +++ cl/cl.h  1 Sep 2021 17:15:34 -
>> @@ -11,7 +11,6 @@
>>   *  @(#)cl.h10.19 (Berkeley) 9/24/96
>>   */
>>  
>> -extern  volatile sig_atomic_t cl_sighup;
>>  extern  volatile sig_atomic_t cl_sigint;
>>  extern  volatile sig_atomic_t cl_sigterm;
>>  extern  volatile sig_atomic_t cl_sigwinch;
>> @@ -31,7 +30,6 @@ typedef struct _cl_private {
>>  char*rmso, *smso;   /* Inverse video terminal strings. */
>>  char*smcup, *rmcup; /* Terminal start/stop strings. */
>>  
>> -int  killersig; /* Killer signal. */
>>  #define INDX_HUP0
>>  #define INDX_INT1
>>  #define INDX_TERM   2
>> Index: cl/cl_funcs.c
>> ===
>> RCS file: /cvs/src/usr.bin/vi/cl/cl_funcs.c,v
>> retrieving revision 1.21
>> diff -u -p -r1.21 cl_funcs.c
>> --- cl/cl_funcs.c1 Sep 2021 14:28:15 -   1.21
>> +++ cl/cl_funcs.c1 Sep 2021 17:15:34 -
>> @@ -437,7 +437,7 @@ cl_refresh(SCR *sp, int repaint)
>>   * If we received a killer signal, we're done, there's no point
>>   * in refreshing the screen.
>>   */
>> -if (clp->killersig)
>> +if (cl_sigterm)
>>  return (0);
>>  
>>  /*
>> @@ -582,7 +582,7 @@ cl_suspend(SCR *sp, int *allowedp)
>>   * unchanged.  In addition, the terminal has already been reset
>>   * correctly, so leave it alone.
>>   */
>> -if (clp->killersig) {
>> +if (cl_sigterm) {
>>  F_CLR(clp, CL_SCR_EX_INIT | CL_SCR_VI_INIT);
>>  return (0);
>>  }
>> Index: cl/cl_main.c
>> ===
>> RCS file: /cvs/src/usr.bin/vi/cl/cl_main.c,v
>> retrieving revision 1.34
>> diff -u -p -r1.34 cl_main.c
>> --- cl/cl_main.c 1 Sep 2021 14:28:15 -   1.34
>> +++ cl/cl_main.c 1 Sep 2021 17:15:34 -
>> @@ -33,7 +33,6 @@
>>  
>>  GS *__global_list;  /* GLOBAL: List of screens. */
>>  
>> -volatile sig_atomic_t cl_sighup;
>>  volatile sig_atomic_t cl_sigint;
>>  volatile sig_atomic_t cl_sigterm;
>>  volatile sig_atomic_t cl_sigwinch;
>> @@ -120,9 +119,9 @@ main(int argc, char *argv[])
>>  }
>>  
>>  /* If a killer signal arrived, pretend we just got it. */
>> -if (clp->killersig) {
>> -(void)signal(clp->killersig, SIG_DFL);
>> -(void)kill(getpid(), clp->killersig);
>> +if (cl_sigterm) {
>> +(void)signal(cl_sigterm, SIG_DFL);
>> +(void)kill(getpid(), cl_sigterm);
>>  /* NOTREACHED */
>>  }
>>  
>> @@ -215,17 +214,6 @@ term_init(char *ttype)
>>  }
>>  }
>>  
>> -#define GLOBAL_CLP \
>> -CL_PRIVATE *clp = GCLP(__global_list);
>> -static void
>> -h_hup(int signo)
>> -{
>> -GLOBAL

Re: timeout: execvp(2) return is always an error

2021-09-02 Thread Ingo Schwarze
Hi Sebastien,

Sebastien Marie wrote on Thu, Sep 02, 2021 at 09:09:43AM +0200:

> If execvp(2) returns, it is always an error: there is no need to check
> if the return value is -1. Just unconditionally call err(3).
> 
> Comments or OK ?

OK schwarze@
  Ingo


>  timeout: execvp(2) should not return except on error

"should not" is confusing, "does not" would be clearer.

> --- usr.bin/timeout/timeout.c
> +++ usr.bin/timeout/timeout.c
> @@ -165,7 +165,7 @@ main(int argc, char **argv)
>   int ch;
>   unsigned long   i;
>   int foreground = 0, preserve = 0;
> - int error, pstat, status;
> + int pstat, status;
>   int killsig = SIGTERM;
>   pid_t   pgid = 0, pid, cpid = 0;
>   double  first_kill;
> @@ -251,9 +251,8 @@ main(int argc, char **argv)
>   signal(SIGTTIN, SIG_DFL);
>   signal(SIGTTOU, SIG_DFL);
>  
> - error = execvp(argv[0], argv);
> - if (error == -1)
> - err(1, "execvp");
> + execvp(argv[0], argv);
> + err(1, "execvp");
>   }
>  
>   if (pledge("stdio", NULL) == -1)



Re: Atomic signal flags for vi(1)

2021-09-01 Thread Ingo Schwarze
Hi,

Ingo Schwarze wrote on Wed, Sep 01, 2021 at 04:38:51PM +0200:

> Note that the h_hup() and h_term() signal handlers are still unsafe
> after this commit because they also set the "killersig" (how fitting!)
> field in a global struct.

I like it when fixing two bugs only amounts to minus: minus 14 LOC,
1 function, 1 global variable, 2 automatic variables, 1 struct member,
9 pointer dereferences, 1 #define, and minus 1 #undef.

The argument that only one single GS exists applies unchanged.

OK?
  Ingo


Index: cl/cl.h
===
RCS file: /cvs/src/usr.bin/vi/cl/cl.h,v
retrieving revision 1.11
diff -u -p -r1.11 cl.h
--- cl/cl.h 1 Sep 2021 14:28:15 -   1.11
+++ cl/cl.h 1 Sep 2021 17:15:34 -
@@ -11,7 +11,6 @@
  * @(#)cl.h10.19 (Berkeley) 9/24/96
  */
 
-extern volatile sig_atomic_t cl_sighup;
 extern volatile sig_atomic_t cl_sigint;
 extern volatile sig_atomic_t cl_sigterm;
 extern volatile sig_atomic_t cl_sigwinch;
@@ -31,7 +30,6 @@ typedef struct _cl_private {
char*rmso, *smso;   /* Inverse video terminal strings. */
char*smcup, *rmcup; /* Terminal start/stop strings. */
 
-   int  killersig; /* Killer signal. */
 #defineINDX_HUP0
 #defineINDX_INT1
 #defineINDX_TERM   2
Index: cl/cl_funcs.c
===
RCS file: /cvs/src/usr.bin/vi/cl/cl_funcs.c,v
retrieving revision 1.21
diff -u -p -r1.21 cl_funcs.c
--- cl/cl_funcs.c   1 Sep 2021 14:28:15 -   1.21
+++ cl/cl_funcs.c   1 Sep 2021 17:15:34 -
@@ -437,7 +437,7 @@ cl_refresh(SCR *sp, int repaint)
 * If we received a killer signal, we're done, there's no point
 * in refreshing the screen.
 */
-   if (clp->killersig)
+   if (cl_sigterm)
return (0);
 
/*
@@ -582,7 +582,7 @@ cl_suspend(SCR *sp, int *allowedp)
 * unchanged.  In addition, the terminal has already been reset
 * correctly, so leave it alone.
 */
-   if (clp->killersig) {
+   if (cl_sigterm) {
F_CLR(clp, CL_SCR_EX_INIT | CL_SCR_VI_INIT);
return (0);
}
Index: cl/cl_main.c
===
RCS file: /cvs/src/usr.bin/vi/cl/cl_main.c,v
retrieving revision 1.34
diff -u -p -r1.34 cl_main.c
--- cl/cl_main.c1 Sep 2021 14:28:15 -   1.34
+++ cl/cl_main.c1 Sep 2021 17:15:34 -
@@ -33,7 +33,6 @@
 
 GS *__global_list; /* GLOBAL: List of screens. */
 
-volatile sig_atomic_t cl_sighup;
 volatile sig_atomic_t cl_sigint;
 volatile sig_atomic_t cl_sigterm;
 volatile sig_atomic_t cl_sigwinch;
@@ -120,9 +119,9 @@ main(int argc, char *argv[])
}
 
/* If a killer signal arrived, pretend we just got it. */
-   if (clp->killersig) {
-   (void)signal(clp->killersig, SIG_DFL);
-   (void)kill(getpid(), clp->killersig);
+   if (cl_sigterm) {
+   (void)signal(cl_sigterm, SIG_DFL);
+   (void)kill(getpid(), cl_sigterm);
/* NOTREACHED */
}
 
@@ -215,17 +214,6 @@ term_init(char *ttype)
}
 }
 
-#defineGLOBAL_CLP \
-   CL_PRIVATE *clp = GCLP(__global_list);
-static void
-h_hup(int signo)
-{
-   GLOBAL_CLP;
-
-   cl_sighup = 1;
-   clp->killersig = SIGHUP;
-}
-
 static void
 h_int(int signo)
 {
@@ -235,10 +223,7 @@ h_int(int signo)
 static void
 h_term(int signo)
 {
-   GLOBAL_CLP;
-
-   cl_sigterm = 1;
-   clp->killersig = SIGTERM;
+   cl_sigterm = signo;
 }
 
 static void
@@ -246,7 +231,6 @@ h_winch(int signo)
 {
cl_sigwinch = 1;
 }
-#undef GLOBAL_CLP
 
 /*
  * sig_init --
@@ -261,20 +245,19 @@ sig_init(GS *gp, SCR *sp)
 
clp = GCLP(gp);
 
-   cl_sighup = 0;
cl_sigint = 0;
cl_sigterm = 0;
cl_sigwinch = 0;
 
if (sp == NULL) {
-   if (setsig(SIGHUP, >oact[INDX_HUP], h_hup) ||
+   if (setsig(SIGHUP, >oact[INDX_HUP], h_term) ||
setsig(SIGINT, >oact[INDX_INT], h_int) ||
setsig(SIGTERM, >oact[INDX_TERM], h_term) ||
setsig(SIGWINCH, >oact[INDX_WINCH], h_winch)
)
err(1, NULL);
} else
-   if (setsig(SIGHUP, NULL, h_hup) ||
+   if (setsig(SIGHUP, NULL, h_term) ||
setsig(SIGINT, NULL, h_int) ||
setsig(SIGTERM, NULL, h_term) ||
setsig(SIGWINCH, NULL, h_winch)
Index: cl/cl_read.c
===
RCS file: /cvs/src/usr.bin/vi/cl/cl_read.c,v
retrieving revision 1.22
diff -u -p -r1.22 cl_read.c
--- cl/cl_read.c1 Sep 2021 14:28:15 -   1.22
+++ cl/cl_r

Re: Atomic signal flags for vi(1)

2021-09-01 Thread Ingo Schwarze
Hi Tim,

trondd wrote on Tue, Aug 24, 2021 at 07:45:33PM -0400:
> "Theo de Raadt"  wrote:
 
>> +h_alrm(int signo)
>> +{
>> +   GLOBAL_CLP;
>> +
>> +   F_SET(clp, CL_SIGALRM);
>> 
>> F_SET is |=, which is not atomic.
>> 
>> This is unsafe.  Safe signal handlers need to make single stores to
>> atomic-sized variables, which tend to be int-sized, easier to declare
>> as sig_atomic_t.  Most often these are global scope with the following
>> type:
>> 
>>  volatile sig_atomic_t variable
>> 
>> I can see you copied an existing practice.  Sadly all the other
>> signal handlers are also broken in the same way.
>> 
>> The flag bits should be replaced with seperate sig_atomic_t variables.

> Ok.  Took a swing at converting the signal handling to sig_atomic_t flags.
> No additional changes added other than removing a redundant check of the
> flags in cl_read.c.  Seemed to be a pretty straight-forward conversion and
> I haven't found any change in behavior.

Committed, thank you.
  Ingo


CVSROOT:/cvs
Module name:src
Changes by: schwa...@cvs.openbsd.org2021/09/01 08:28:15

Modified files:
usr.bin/vi/cl  : cl.h cl_funcs.c cl_main.c cl_read.c 

Log message:
As a first step towards safe signal handling, improve the h_int()
and h_winch() signal handlers to make one single store to a
sig_atomic_t variable.  Note that the h_hup() and h_term() signal
handlers are still unsafe after this commit because they also set
the "killersig" (how fitting!) field in a global struct.

Despite storing information in static global variables rather than
in structs passed around as arguments, this patch does not cause a
change in behaviour because there is always exactly one GS object,
initialized using gs_init() called from the top of main(), and
screen_init() stores a pointer to this one and only GS object in
the .gp member of each and every SCR object.  Talk about useless
abstraction...

Problem pointed out by deraadt@.
Patch from Tim  on tech@.
OK deraadt@.


> Index: cl/cl.h
> ===
> RCS file: /cvs/src/usr.bin/vi/cl/cl.h,v
> retrieving revision 1.10
> diff -u -p -r1.10 cl.h
> --- cl/cl.h   27 May 2016 09:18:11 -  1.10
> +++ cl/cl.h   24 Aug 2021 23:34:27 -
> @@ -11,6 +11,11 @@
>   *   @(#)cl.h10.19 (Berkeley) 9/24/96
>   */
>  
> +extern   volatile sig_atomic_t cl_sighup;
> +extern   volatile sig_atomic_t cl_sigint;
> +extern   volatile sig_atomic_t cl_sigterm;
> +extern   volatile sig_atomic_t cl_sigwinch;
> +
>  typedef struct _cl_private {
>   CHAR_T   ibuf[256]; /* Input keys. */
>  
> @@ -45,11 +50,7 @@ typedef struct _cl_private {
>  #define  CL_RENAME_OK0x0004  /* User wants the windows renamed. */
>  #define  CL_SCR_EX_INIT  0x0008  /* Ex screen initialized. */
>  #define  CL_SCR_VI_INIT  0x0010  /* Vi screen initialized. */
> -#define  CL_SIGHUP   0x0020  /* SIGHUP arrived. */
> -#define  CL_SIGINT   0x0040  /* SIGINT arrived. */
> -#define  CL_SIGTERM  0x0080  /* SIGTERM arrived. */
> -#define  CL_SIGWINCH 0x0100  /* SIGWINCH arrived. */
> -#define  CL_STDIN_TTY0x0200  /* Talking to a terminal. */
> +#define  CL_STDIN_TTY0x0020  /* Talking to a terminal. */
>   u_int32_t flags;
>  } CL_PRIVATE;
>  
> Index: cl/cl_funcs.c
> ===
> RCS file: /cvs/src/usr.bin/vi/cl/cl_funcs.c,v
> retrieving revision 1.20
> diff -u -p -r1.20 cl_funcs.c
> --- cl/cl_funcs.c 27 May 2016 09:18:11 -  1.20
> +++ cl/cl_funcs.c 24 Aug 2021 23:34:27 -
> @@ -601,7 +601,7 @@ cl_suspend(SCR *sp, int *allowedp)
>   if (cl_ssize(sp, 1, NULL, NULL, ))
>   return (1);
>   if (changed)
> - F_SET(CLP(sp), CL_SIGWINCH);
> + cl_sigwinch = 1;
>  
>   return (0);
>  }
> Index: cl/cl_main.c
> ===
> RCS file: /cvs/src/usr.bin/vi/cl/cl_main.c,v
> retrieving revision 1.33
> diff -u -p -r1.33 cl_main.c
> --- cl/cl_main.c  5 May 2016 20:36:41 -   1.33
> +++ cl/cl_main.c  24 Aug 2021 23:34:27 -
> @@ -33,6 +33,11 @@
>  
>  GS *__global_list;   /* GLOBAL: List of screens. */
>  
> +volatile sig_atomic_t cl_sighup;
> +volatile sig_atomic_t cl_sigint;
> +volatile sig_atomic_t cl_sigterm;
> +volatile sig_atomic_t cl_sigwinch;
> +
>  static void cl_func_std(GS *);
>  static CL_PRIVATE *cl_init(GS *);
>  static GS  *gs_init(void);
> @@ -217,16 +222,14 @@ h_hup(int signo)
>  {
>   GLOBAL_CLP;
>  
> - F_SET(clp, CL_SIGHUP);
> + cl_sighup = 1;
>   clp->killersig = SIGHUP;
>  }
>  
>  static void
>  h_int(int signo)
>  {
> - GLOBAL_CLP;
> -
> - F_SET(clp, CL_SIGINT);
> + cl_sigint = 1;
>  }
>  
>  static void
> @@ -234,16 +237,14 @@ h_term(int signo)
>  {
>   GLOBAL_CLP;
>  
> - F_SET(clp, 

Re: averse to lisp in base?

2021-08-29 Thread Ingo Schwarze
Hi,

Tomasz Rola wrote on Sun, Aug 29, 2021 at 08:21:03PM +0200:
> On Sun, Aug 29, 2021 at 03:27:27AM +0200, mayur...@kathe.in wrote:

>> Would the core team consider including a minimalist lisp in the base?
>> e.g. http://t3x.org/klisp/index.html

[...]
> If I would want to propose any Lisp into the base system, I would
> point at s9fes (Scheme 9 from Empty Space), also by Nils Holm.

As a general rule of thumb:  if some program

 * is maintained by a person who is not an OpenBSD developer,
 * and/or is maintained in a repository outside of cvs.openbsd.org,
 * and is not required for building or running anything
   contained in the base system,

then it is usually better maintained in ports than in base.

For users, typing "doas pkg_add my_loveliest_lisp" once is negligible
effort, so it really doesn't matter for them.

For developers, regularly synching a piece of third-party software
into base is quite tedious and wastes considerable working time,
whereas keeping a port up to date causes orders of magnitude less
work.

For example, synching to base is a royal pain for essential stuff
like clang and Perl and drm.  It causes non-trivial work even for
smaller stuff like unbound and nsd.  In several caaes, it was such
a nightmare that developers just gave up in those cases - for
example, just think of nginx and sqlite.

In any case, this thread is wildly off-topic on tech@.  If you have
additional questions related to this topic, please start a new
thread on misc@, or post a port or a port update to ports@.

Yours,
  Ingo



Re: Atomic signal flags for vi(1)

2021-08-29 Thread Ingo Schwarze
Hi Theo,

Theo de Raadt wrote on Sun, Aug 29, 2021 at 11:38:18AM -0600:
> Ingo Schwarze wrote:

>> *If* more than one GS object ever existed and/or the .gp pointers
>> in different SCR objects could point to different GS objects, this
>> patch might change behaviour.

> If such multiple GS condition ever existed, since signals are (global),
> the handler is only indicating a signal has happened.  And the code
> observing non-zero of these signal-indicating variables would be
> responsible to determine which event it applies to.

Absolutely, that is the sane way to code such stuff.

However, we have all seen signal handlers making complicated decisions
by navigating complex, non-trivial data structures, even in our own
tree.  Of course, those need to be fixed.  But before changing the
code, i still need to figure out whether the seemingly non-trivial
data structure is indeed non-trivial, and which higher-level code
must be made smarter to preserve the functionality, or whether the
apparent complexity merely hides the fact that the data is actually
trivial, as i think it is in this case.

> It is too hard to do this kind of work safely/atomically in a
> signal handler.

100% correct.  And yet, people do try to, and then it even works.
Well, kind of.  Most of the time.  Until it suddenly doesn't.  And
when it suddenly doesn't, people are used to blaming cosmic muons
(and yes, there is indeed about one of those per square centimeter
per minute, on average) rather than thinking something might be
wrong with their code.

Yours,
  Ingo



Re: Atomic signal flags for vi(1)

2021-08-29 Thread Ingo Schwarze
Hi,

Theo de Raadt wrote on Sun, Aug 29, 2021 at 02:54:57AM -0600:

> This does look better.
> 
> I appreciate that you are fixing this underlying problem first, before
> overlaying your timer diff.

Indeed.

> Is this working for the vi crowd?

*If* more than one GS object ever existed and/or the .gp pointers
in different SCR objects could point to different GS objects, this
patch might change behaviour.  So the hardest part about veryfying
it is to make sure that only one single GS object exists at any
given time and all .gp pointers in all SCR objects always point to
that one GS object.  GS and [.>]gs are used at hundreds of places,
but i only managed to find a single place where GS is instantiated
(gs_init() in cl/cl_main.c, which is called inly once at the very
beginning of main()), and only a single place where the .gp member
of SCR is ever changed, in screen_init() in common/screen.c, which
is called from a handful of places, but always with a pointer to
that one particular GS object.

So yes, i do think changing members of GS to static globals is safe
and testing with a single screen is sufficient.  After the patch,
 * SIGHUP prints "Hangup" and exits as expected;
 * SIGINT prints "Interrupted" in the status line and exits insert
   mode, but stays in vi(1);
 * SIGTERM prints "Terminated" and exits as expected;
 * window size changes work as expected both in the forground and
   in the background while suspended with Ctrl-Z, and sending
   SIGWINCH manually has no effect, as expected.

Note that the patch is incomplete.  The signal handler functions
h_hup() and h_term() still aren't safe because they still write to
GCLP(__global_list)->killersig.  But this patch performs one
well-defined step and is hard enough to audit already, so let's get
it committed, then take care of killersig in a second step.

So if someone wants to commit, this patch is OK schwarze@.

Alternatively, provide an OK and tell me to commit.

Be careful that the original author of this patch signed as "Tim"
but is not tim@ for all i can tell.

Yours,
  Ingo


> trondd  wrote:
> 
> > "Theo de Raadt"  wrote:
> > 
> > > +h_alrm(int signo)
> > > +{
> > > +   GLOBAL_CLP;
> > > +
> > > +   F_SET(clp, CL_SIGALRM);
> > > 
> > > F_SET is |=, which is not atomic.
> > > 
> > > This is unsafe.  Safe signal handlers need to make single stores to
> > > atomic-sized variables, which tend to be int-sized, easier to declare
> > > as sig_atomic_t.  Most often these are global scope with the following
> > > type:
> > > 
> > >   volatile sig_atomic_t variable
> > > 
> > > I can see you copied an existing practice.  Sadly all the other
> > > signal handlers are also broken in the same way.
> > > 
> > > The flag bits should be replaced with seperate sig_atomic_t variables.
> > 
> > Ok.  Took a swing at converting the signal handling to sig_atomic_t flags.
> > No additional changes added other than removing a redundant check of the
> > flags in cl_read.c.  Seemed to be a pretty straight-forward conversion and
> > I haven't found any change in behavior.
> > 
> > Tim.
> > 
> > 
> > Index: cl/cl.h
> > ===
> > RCS file: /cvs/src/usr.bin/vi/cl/cl.h,v
> > retrieving revision 1.10
> > diff -u -p -r1.10 cl.h
> > --- cl/cl.h 27 May 2016 09:18:11 -  1.10
> > +++ cl/cl.h 24 Aug 2021 23:34:27 -
> > @@ -11,6 +11,11 @@
> >   * @(#)cl.h10.19 (Berkeley) 9/24/96
> >   */
> >  
> > +extern volatile sig_atomic_t cl_sighup;
> > +extern volatile sig_atomic_t cl_sigint;
> > +extern volatile sig_atomic_t cl_sigterm;
> > +extern volatile sig_atomic_t cl_sigwinch;
> > +
> >  typedef struct _cl_private {
> > CHAR_T   ibuf[256]; /* Input keys. */
> >  
> > @@ -45,11 +50,7 @@ typedef struct _cl_private {
> >  #defineCL_RENAME_OK0x0004  /* User wants the windows renamed. */
> >  #defineCL_SCR_EX_INIT  0x0008  /* Ex screen initialized. */
> >  #defineCL_SCR_VI_INIT  0x0010  /* Vi screen initialized. */
> > -#defineCL_SIGHUP   0x0020  /* SIGHUP arrived. */
> > -#defineCL_SIGINT   0x0040  /* SIGINT arrived. */
> > -#defineCL_SIGTERM  0x0080  /* SIGTERM arrived. */
> > -#defineCL_SIGWINCH 0x0100  /* SIGWINCH arrived. */
> > -#defineCL_STDIN_TTY0x0200  /* Talking to a terminal. */
> > +#defineCL_STDIN_TTY0x0020  /* Talking to a terminal. */
> > u_int32_t flags;
> >  } CL_PRIVATE;
> >  
> > Index: cl/cl_funcs.c
> > ===
> > RCS file: /cvs/src/usr.bin/vi/cl/cl_funcs.c,v
> > retrieving revision 1.20
> > diff -u -p -r1.20 cl_funcs.c
> > --- cl/cl_funcs.c   27 May 2016 09:18:11 -  1.20
> > +++ cl/cl_funcs.c   24 Aug 2021 23:34:27 -
> > @@ -601,7 +601,7 @@ cl_suspend(SCR *sp, int *allowedp)
> > if (cl_ssize(sp, 1, NULL, NULL, ))
> > return (1);
> > if (changed)
> > -   F_SET(CLP(sp), CL_SIGWINCH);
> > +  

Re: allow KARL with config(8)'d kernels

2021-08-29 Thread Ingo Schwarze
Hi,

Theo de Raadt wrote on Sun, Aug 29, 2021 at 07:15:34AM -0600:

> I am not thrilled with the name "kernel.conf".
> It does not seem intuitively discoverable.

What would be a canonical name?

It is a command file for config(8).

Note that the "config-file" for config is something else, and that
other thing is also called a "kernel configuration file", and it
is located in places like /sys/arch/luna88k/conf/GENERIC iiuc.

For command files, we normally use the name of the programming
language as a suffix (*.sh, *.pl, *.py, *.awk, *.sed, ...).  But
here, the domain specific language is really config(8), and *.config
does not work as a suffix because it is next to indistinguishable
from *.conf, and we just saw this is *not* a configuration but a
command file.

The root cause of the terminologic mess here is that the choice of
the name config(8) for the kernel configuration program was too
generic in the first place, but we are almost exactly 40 years late
for fixing that.

One - admittedly completely unUNIXy - way would be to invent a long,
descriptive name like /etc/kernel.config.commands or even /bsd.config.cmd
in the root rather than the /etc directory, which is more discoverable
because it is right next to the kernel itself.  The UNIXy way
is not necessarily better: Ken would have invented a very short,
pronouncable name that is not an English word but similar to one,
like /etc/sycoc (for SYstem COnfiguration Commands).

Is /bsd.config.cmd good enough?

Yours,
  Ingo

> Paul de Weerd  wrote:

>> Index: distrib/miniroot/install.sub
[...]
>>  chroot /mnt /bin/ksh -e -c "cd ${_kernel_dir#/mnt}/$_kernel; \
>> -make newbsd; make newinstall"
>> +make newbsd; \
>> +[ -e /etc/kernel.conf ] && \
>> +config -e -c /etc/kernel.conf -f bsd; \
>> +make newinstall"
>>  ) >/dev/null 2>&1 && echo " done." || echo " failed."



Re: cal(1): Clean up mutually exclusive options

2021-08-16 Thread Ingo Schwarze
Hi Theo,

Theo de Raadt wrote on Mon, Aug 16, 2021 at 08:15:03AM -0600:
> Jason McIntyre  wrote:

>> well, in those cases i think the authors shared the viewpoint that
>> mutually exclusive means you can;t mix them but in this case it is
>> essentially not an error to do so, and so documented it that way.
>> 
>> maybe it is more helpful to think of "mutually exclusive" as "causes
>> an error", but i am not sure.

Probably you are right that my description of "mutually exclusive"
as "it errors out" was too narrow and that in some situations,
understanding it in a broader sense of "the effect of combining
them is unspecified" or even broader, "combining them provides no
useful functionality" may be adequate.

Still, "mutually exclusive" is not the same concept as "one overrides
the other".

> I agree with you that programs should not neccessary error out.
> 
> I appreciate that you pointed at ls(1) before, which has so many options
> I think we should look there for examples.
> 
>  -A  List all entries except for `.' and `..'.  Always set for the
>  superuser.
> 
>  -a  Include directory entries whose names begin with a dot (`.').
> 
> These two options are not described as mutually exclusive, except, a
> mutually exclusive action is described.
> 
> How do -Aa and -aA behave?  Try it.

Just like -a alone.  That is obviously reasonable because both
options enlarge the set of names listed, and the set provided by
-a is a strict superset of the set provided by -A.

> Does POSIX take a position?

Not as far as i see.

POSIX does explicitly document them as mutually exclusive by
having [-a | -A] in the SYNOPSIS, but what happens when both are
provided appears to remain unspecified.

> If POSIX doesn't take a position, why should we?

We probably shouldn't.  Since behaviour of -aA is unspecied, our
behviour is not only reasonable but also conforming.  Strictly
speaking, it is undocumented, but easy to guess and certainly
not surprising.

> Should we break scripts that accidentally use both options?

Absolutely not.

Even if our page explicitly said "-a and -A are mutually exclusive",
that should only be regarded as an indication that the combination
*might* error out, not that it actually does - i failed to make
that clear in my previous mail.

> Argument composition does happen in scripting and interactive use.
> 
> Should we search for places to make POSIX programs exit(1) if people
> use them in the newly-defined incorrect way?

No.

But conversely, we shouldn't attempt to hunt down all cases either
where programs currently error out when given mutually exlusive
options and make them all override each other.  In many cases, it
is better to not jump to conclusions when the user didn't make it
clear what they wanted.

Unsurprisingly, sane behaviour depends on the individual program
and options.

> If we decide to not cause ls(1) -a and -A to exit, then I think we
> should not make other programs exit either.

Sounds reasonable.

> And then probably follow jmc's lead that "mutually exclusive" either
> in text, or in interpreted text, doesn't neccessarily describe an exit
> action.

That's certainly the current situation.  In some pages, it is used for
combinations that error out, in some for options that override each
other, and when left unqualified, it can mean "the effect of combining
is not considered valid syntax and may have an unspecified effect".

The latter is often fine (like for ls -aA) and more rarely weird
(like for jot -b foo -w bar 3 97).  When it is weird, improving it
may be worthwhile.

Yours,
  Ingo



Re: cal(1): Clean up mutually exclusive options

2021-08-16 Thread Ingo Schwarze
Hi Jason,

Jason McIntyre wrote on Mon, Aug 16, 2021 at 12:02:13PM +0100:

> when i wrote my mail, i failed to understand that "overrides earlier"
> was really just another way of saying "mutually exclusive".

That is incorrect.

This is what "mutually exclusive" means (without further qualification):

   $ frob -f -u
  usage: frob [-f | -u]

This is what "overrides earlier" means:

   $ frob -f -u
  output from successful unfrobnification

   $ frob -u -f
  output from successful frobnification

> i don;t find it as clear, and i don;t hugely like it, but i guess
> it's just my preference.
> 
> i also dislike the sentence structure of "Overrides earlier -f."
> although it's understandable, it reads like there's a word or phrase
> missing.

Yes, it is an elliptic wording.

> Overrides earlier instances of -f. Overrides previous -f options.

Both are still elliptic.

The intended meaning of the ellipsis is:

  [This option] overrides [any] earlier -f [options].

Should we prefer the non-elliptic wording?

> i just find it clearer when we just say -x and -y are mutually exclusive.

OK, let's spell this out in full.

I'm aware of two ways to document "mutually exclusive":

  -f   Frobnicate.  Mutually exclusive with -u.
  -u   Unfrobnicate.  Mutually exclusive with -f.

Or:

  -f   Frobnicate.
  -u   Unfrobnicate.
  [...]
  The options -f and -u are mutually exclusive.

I have no strong preference either way because if a user specifies both,
they get the usage() message, so the error is likely to be discovered
either way.


I'm aware of two ways to document "overrides":

  -f   Frobnicate.  Overrides earlier -u.
  -u   Unfrobnicate.  Overrides earlier -f.

Or:

  -f   Frobnicate.
  -u   Unfrobnicate.
  [...]
  The options -f and -u override each other [and the action
  is determined by the last one specified].

I prefer the the former because the latter is easy to miss (in
addition to being longer), and there is no error message to tell
the user that *something* is going on.


Some pages use

  The options -f and -u are mutually exclusive and override each other.

as a synonym for

  The options -f and -u override each other.

That may have caused your confusion.  Arguably, it might be better
to avoid the somewhat redundant and potentially confusing wording
"are mutually exclusive and override each other".  Do you agree?

Yours,
  Ingo



Re: cal(1): Clean up mutually exclusive options

2021-08-16 Thread Ingo Schwarze
Hi Theo,

Theo de Raadt wrote on Sun, Aug 15, 2021 at 05:42:13PM -0600:

> Is it really reasonable that we should strictly follow a non-applicable
> standard in such rarely used non-standard utilities,

Not strictly, no.  Usefulness needs to be considered in individual
cases.

There is value in not *gratuitiously* violating POSIX utility syntax
guidelines though, for two reasons:

 1. They make sense independently of the fact that they are in POSIX.

 2. Following this (small) set of (relatively unintrusive) guidelines
helps with designing a harmonious user experience all across
the system.  On the one hand, it helps avoiding surprises.
On the other hand, it saves the user from having to learn
completely new ways of constructing syntax for every new command.

But this is unrelated to martijn@'s cal(1) diff.  POSIX provides
no arguments for or against that diff as far as i'm aware.

Yours,
  Ingo



Re: cal(1): Clean up mutually exclusive options

2021-08-16 Thread Ingo Schwarze
Hi Jason,

Jason McIntyre wrote on Sun, Aug 15, 2021 at 11:30:05PM +0100:

> can't we take a stance that where options override each other,
> the last one wins?

Yes, that is possible.

Cases exist where one option overrides another and order does not
matter - for example, "lock -n -t -1" is the same as just "lock -n",
even though -t comes after -n.

But if options override *each other*, that means the last one wins
unless otherwise stated.

A utility can also be designed such that some options override each
other and the first one wins, but usually, i wouldn't consider that
ideal UI design in a command line utility.  In configuration files,
on the other hand, both "first one wins" and "last one wins" exist.
For that reason, nothing much seems wrong with being explicit that
the last one wins, where that is the case.

> and then not document this fact every time. or at least document
> exceptions only?
> 
> ...and continue to document where options are mutually exclusive? 
> 
> also the text "overrides earlier" is unclear.

In which sense do you consider

  -f   Frobnicate.  Overrides earlier -u.
  -l   [...]
  -n   [...]
  -u   Unfrobnicate.  Overrides earlier -f.

unclear?  Options exclusively appear on the command line, so i don't
see how "earlier" or "later" could possibly refer to anything else
than the position on the command line.

Note that this seems only loosely related to martijn@'s diff, which
i think is undesirable for other, stronger reasons.

Yours,
  Ingo



Re: cal(1): Clean up mutually exclusive options

2021-08-16 Thread Ingo Schwarze
Hi Martijn,

Martijn van Duren wrote on Sun, Aug 15, 2021 at 11:40:49PM +0200:
> To quote schwarze in the jot mutually exclusive thread:
> On Fri, 2021-08-13 at 11:48 +0200, Ingo Schwarze wrote:

>> In this case, even though this is not a POSIX command, POSIX utility
>> convention 12.2.11 is pertinent:
>> 
>>   The order of different options relative to one another should not
>>   matter, unless the options are documented as mutually-exclusive
>>   and such an option is documented to override any incompatible
>>   options preceding it.  If an option that has option-arguments is
>>   repeated, the option and option-argument combinations should be
>>   interpreted in the order specified on the command line.

> This is also violated by cal(1)

You seem to misunderstand.

The cal(1) utility does *not* violate this syntax guideline.
It neither has options where the order on the command line matters
nor options that take arguments, so this guideline simply makes
no recommendation whatsoever for cal(1).

> (and maybe others, but this one came
> up first). Diff below should fix this.

I think the diff is not OK.

It changes established and documented behaviour of a program, and
you provide no explanation why the new behaviour would be more
useful than the old one.  Also, considering whether this might or
might not cause compatibility issues would would be required.

Compatibility issues may be unlikely because it merely defines new
behaviour for something that is now a syntax error.  That alone does
not make the change useful, though.

The change itself seems illogical and confusing to me.  If a user
says "i want days numbered consecutively from Jan 1 *and* week
numbers displayed in addition", why should the first half of the
request be ignored?  And worse, *silently* ignored?  Both halfs of
the request are unrelated to each other, so both one overriding
the other and their effect depending on their order would seem
surprising and hardly useful.

Also, allowing additional syntax makes the user interface more
complex, which should not be done without a very good reason.

Finally, even *if* we would decide to make the UI more complex and
allow -jw and -wj (i see no pressing need because -j does not seem
all that useful in the first place, and needing both at the same
time would be an even less common task), then the logical behaviour
would be to simply honour both at the the same time.  So your change
not only makes the UI more complex without a good reason.  It also
blocks syntax space that could, in the future, possibly be assigned
to more useful functionality.

I admit there are no strong logical reasons why these two options
have to mutually exclusive.  That's seems merely an implementation
choice to keep the command and documentation simpler.  Nothing is
wrong with having some (not usually needed) functionality unimplemented
in a program.  In this case, this is even properly documented.

To summarize, i oppose the diff as a whole, and i see not parts in
it that could be salvaged.

Yours,
  Ingo


> Index: cal.1
> ===
> RCS file: /cvs/src/usr.bin/cal/cal.1,v
> retrieving revision 1.31
> diff -u -p -r1.31 cal.1
> --- cal.1 27 Nov 2016 10:37:22 -  1.31
> +++ cal.1 15 Aug 2021 21:39:28 -
> @@ -53,11 +53,8 @@ The options are as follows:
>  .Bl -tag -width Ds
>  .It Fl j
>  Display Julian dates (days one-based, numbered from January 1).
> -The options
> -.Fl j
> -and
> -.Fl w
> -are mutually exclusive.
> +Overrides earlier
> +.Fl w .
>  .It Fl m
>  Display weeks starting on Monday instead of Sunday.
>  .It Fl w
> @@ -65,11 +62,8 @@ Display week numbers in the month displa
>  If
>  .Fl m
>  is specified the ISO week format is assumed.
> -The options
> -.Fl j
> -and
> -.Fl w
> -are mutually exclusive.
> +Overrides earlier
> +.Fl j .
>  .It Fl y
>  Display a calendar for the current year.
>  .El
> Index: cal.c
> ===
> RCS file: /cvs/src/usr.bin/cal/cal.c,v
> retrieving revision 1.30
> diff -u -p -r1.30 cal.c
> --- cal.c 9 Oct 2015 01:37:06 -   1.30
> +++ cal.c 15 Aug 2021 21:39:28 -
> @@ -158,12 +158,14 @@ main(int argc, char *argv[])
>   switch(ch) {
>   case 'j':
>   julian = 1;
> + wflag = 0;
>   break;
>   case 'm':
>   mflag = 1;
>   break;
>   case 'w':
>   wflag = 1;
> + julian = 0;
>   break;
>   case 'y':
>   yflag = 1;
> @@ -174,9 +176,6 @@ main(int argc, char *argv[])
>   }
>   argc -= optind;
>   argv += optind;
> -
> - if (julian && wflag)
> - usage();
>  
>   day_headings = DAY_HEADINGS_S;
>   sep1752 = sep1752s;



Re: fresh prompt after Ctrl-C in cdio(1)

2021-08-13 Thread Ingo Schwarze
Hi,

Christian Weisgerber wrote on Thu, Aug 12, 2021 at 06:31:56PM +0200:
> Ingo Schwarze:

>> deraadt@ recently pointed out to me in private mail that it is good
>> for usability if interactive programs providing line editing
>> functionality are consistent what they do with Ctrl-C, ideally
>> discard the current input line and provide a fresh prompt.
>> 
>> So i propose to do the same in cdio(1), which currently just exits
>> on Ctrl-C.
>> 
>> OK?

> That looks correct and works fine for me.  ok naddy@

Committed for now, thanks for checking.

> (I still have a USB CD drive and I use it from time to time to play
> audio CDs with cdio.  I had completely forgotten that cdio supports
> editline, though.  The cdio interactive mode is pretty useless since
> interrupting a running command aborts the program.)

Any user of this program with a sufficient knowledge of signal
handling is welcome to design, implement, and test signal handling
that remains active during other parts of the execution of the
cdio(1) program.  The committed patch provides a starting point.

Yours,
  Ingo



Re: jot(1): make -b -c and -w mutually exclusive

2021-08-13 Thread Ingo Schwarze
Hi Martijn,

Martijn van Duren wrote on Fri, Aug 13, 2021 at 08:51:42AM +0200:

> So here's the first one:
> - When -b is set, but followed by -w it doesn't remove the boring flag
>   and the -w is interpreted as a -b string

I agree that can't be quite right.

> - -c is defined as "This is an abbreviation for -w %c.", but adding a
>   -w option itself results in:
>   $ jot -cw"bla " 5 97
>   bla a
>   bla b
>   bla c
>   bla d
>   bla e
> 
>   instead of
> 
>   $ jot -w"bla " 5 97 
>   bla 97
>   bla 98
>   bla 99
>   bla 100
>   bla 101

I agree that cannot be right either.

It was even listed in my personal BUGS.txt file for the utility.

> - -b always overwrites -c.

That isnt necessarily wrong, ...

> tb already agrees that these options should be mutually exlusive, so
> here's the accompanying code. I choose to go for the last option wins
> appraoch, similar to ls, df, du, ...

 ... but i agree "mutually exclusive, overriding" is potentially
more useful.

In this case, even though this is not a POSIX command, POSIX utility
convention 12.2.11 is pertinent:

  The order of different options relative to one another should not
  matter, unless the options are documented as mutually-exclusive
  and such an option is documented to override any incompatible
  options preceding it.  If an option that has option-arguments is
  repeated, the option and option-argument combinations should be
  interpreted in the order specified on the command line.

In its current state, the program violates that.  With your patch,
it conforms.

> Regress seems to pass.
> 
> OK?

OK schwarze@.

Consider putting the declaration of "bool word" alongside the other
"static bool" in file scope.  I admit this one is only used in main,
but so are finalnl, infinity, and randomize, and physicists like
symmetry.

I would prefer to keep the documentation of overriding together
with each option, making it harder to overlook and even reducing
the total amount of text.

That is, append

 * "Overrides earlier -b, -c, and -w." to the -b and -w list entries
 * "Overrides earlier -b and -w." to the -c list entry.

OK either way, though.

Yours,
  Ingo


> Index: jot.c
> ===
> RCS file: /cvs/src/usr.bin/jot/jot.c,v
> retrieving revision 1.51
> diff -u -p -r1.51 jot.c
> --- jot.c 30 Jul 2021 02:47:37 -  1.51
> +++ jot.c 13 Aug 2021 06:50:42 -
> @@ -86,6 +86,7 @@ main(int argc, char *argv[])
>   int n = 0;
>   int ch;
>   const char  *errstr;
> + boolword;
>  
>   if (pledge("stdio", NULL) == -1)
>   err(1, "pledge");
> @@ -94,10 +95,13 @@ main(int argc, char *argv[])
>   switch (ch) {
>   case 'b':
>   boring = true;
> + chardata = word = false;
>   format = optarg;
>   break;
>   case 'c':
>   chardata = true;
> + boring = word = false;
> + format = "";
>   break;
>   case 'n':
>   finalnl = false;
> @@ -115,6 +119,8 @@ main(int argc, char *argv[])
>   sepstring = optarg;
>   break;
>   case 'w':
> + word = true;
> + boring = chardata = false;
>   format = optarg;
>   break;
>   default:



Re: fresh prompt after Ctrl-C in cdio(1)

2021-08-12 Thread Ingo Schwarze
Hi Martijn,

Martijn van Duren wrote on Thu, Aug 12, 2021 at 04:37:24PM +0200:

> Maybe I've used cdio once or twice and I don't have a cd-player at hand
> (at least connected to my workstation) to test this. So purely from code
> inspection: You set the signal handler before entering el_gets and you
> ignore it right after.

More precisely, i restore default signal handling right after, which
is "terminate process" for SIGINT.  In most situations, i dislike
ignoring SIGINT.

> Wouldn't this imply that if you do something like "cdplay" at the prompt
> and while you wait for the cd to spin you hit ^C the application just
> exits?

Yes.

> If so, wouldn't it make more sense to install the signal handler once
> and let open_cd() handle EINTR as well and just return to the prompt?

Maybe.  The question whether cdio(1) should allow Ctrl-C to be used to
interrupt more operations, and not just command line editing, and return
to the interactive prompt in that case is certainly legitimate.

Then again, my point is to unify line editing across different
programs, not to consider overall signal handling in cdio(1).
As long as we only touch line editing in cdio(1), that goal can be
reached and the risk of regressions is easier to control.  Extending
signal handling over larger swaths of code requires reviewing more
code and making sure what Ctrl-C does is sane in all code paths in
all of that code.

For example, looking at the function run(), i see that most commands
start by calling open_cd(), but not all do.  A few do not call it at
all (e.g. CMD_SET, CMD_HELP), in which case we would have to make sure
that the signal handler does not remain installed longer than intended,
a few do some other work before calling it (e.g. CMD_DEVICE).

So i would prefer to leave the broader question of signal handling
in cdio(1), outside the specific domain of command line editing,
to people who are more familiar with the code and usage of cdio(1).

If somebody wants to develop, test and propose a patch with a wider
scope, for example a global signal handler that can interupt any
operation - such a patch would have to make sure cleanup is done
as required after all operations that might get aborted -, i'm not
opposed to that, but it don't feel like developing such a patch
myself at this point.

Yours,
  Ingo



fresh prompt after Ctrl-C in cdio(1)

2021-08-12 Thread Ingo Schwarze
Hi,

deraadt@ recently pointed out to me in private mail that it is good
for usability if interactive programs providing line editing
functionality are consistent what they do with Ctrl-C, ideally
discard the current input line and provide a fresh prompt.
At least that is what bc(1), ftp(1), sftp(1) and all shells do.

So i propose to do the same in cdio(1), which currently just exits
on Ctrl-C.

Sending to tech@ because i'm not quite sure which developer owns cdio(1),
or who might be interested.  According to the CVS logs, naddy@ was the
last one to change user-visible functionality, and before that, there
was nothing but bug fixes and cleanup since 2010.

OK?
  Ingo

P.S.
Note that the current "!siz" is fragile; el_gets(3) sets it to -1
on error.  I'm polishing that while here.

P.P.S.
buf = (char *)el_gets(...) is ugly and potentially dangerous, the
manual page explicitly warns to not do that, but that's a job for
another day.

P.P.P.S.
Insects are the most successful class of animals on earth.
Even in OpenBSD, it is rarely possible to look at code without
finding something unrelated that could be improved, too.


Index: cdio.c
===
RCS file: /cvs/src/usr.bin/cdio/cdio.c,v
retrieving revision 1.80
diff -u -p -r1.80 cdio.c
--- cdio.c  18 Jan 2021 00:44:00 -  1.80
+++ cdio.c  12 Aug 2021 13:36:38 -
@@ -63,6 +63,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -158,6 +159,7 @@ int verbose = 1;
 intmsf = 1;
 const char *cddb_host;
 char   **track_names;
+volatile sig_atomic_t signo;
 
 EditLine   *el = NULL; /* line-editing structure */
 History*hist = NULL;   /* line-editing history */
@@ -179,6 +181,7 @@ int pstatus(char *arg);
 intplay_next(char *arg);
 intplay_prev(char *arg);
 intplay_same(char *arg);
+void   sigint_handler(int);
 char   *input(int *);
 char   *prompt(void);
 void   prtrack(struct cd_toc_entry *e, int lastflag, char *name);
@@ -1499,18 +1502,36 @@ status(int *trk, int *min, int *sec, int
return s.data->header.audio_status;
 }
 
+void
+sigint_handler(int signo_arg)
+{
+   signo = signo_arg;
+}
+
 char *
 input(int *cmd)
 {
+   struct sigaction sa;
char *buf;
int siz = 0;
char *p;
HistEvent hev;
 
+   memset(, 0, sizeof(sa));
do {
-   if ((buf = (char *) el_gets(el, )) == NULL || !siz) {
-   *cmd = CMD_QUIT;
+   signo = 0;
+   sa.sa_handler = sigint_handler;
+   if (sigaction(SIGINT, , NULL) == -1)
+   err(1, "sigaction");
+   buf = (char *)el_gets(el, );
+   sa.sa_handler = SIG_DFL;
+   if (sigaction(SIGINT, , NULL) == -1)
+   err(1, "sigaction");
+   if (buf == NULL || siz <= 0) {
fprintf(stderr, "\r\n");
+   if (siz < 0 && errno == EINTR && signo == SIGINT)
+   continue;
+   *cmd = CMD_QUIT;
return (0);
}
if (strlen(buf) > 1)



libedit: stop playing hopeless games with FIONBIO

2021-08-11 Thread Ingo Schwarze
Hi,

in its current state, the editline(3) library is completely unfit
to work with non-blocking I/O.  Neither the API nor the implementation
are even designed for that purpose.

I do have some candidate ideas to minimally extend the API - without
adding any new functions - to also work with non-blocking I/O, but
frankly, i don't see any sane use case for it yet.  Why would anyone
desire to mix non-blocking I/O and interactive line editing with the
editline(3) library on the same file descriptor?

Consequently, i propose that we document the facts up front and
simply delete some code that isn't going to do any good in practice.
For programs not using non-blocking I/O on the editline input file
descriptor, this patch makes no difference.  Programs that do set
FIONBIO on a file descriptor and then call editline(3) on it anyway
can't possibly work as intended as far as i can see.  Either setting
FIONBIO is needless in the first place, or el_gets(3) clearing it
will ruin the rest of the program's functionality.

Do any of you have any doubts, and if so, which ones?

Or is anybody willing to OK this patch?


If soembody comes up with a sane use case for mixing FIONBIO with
editline(3) at a later point in time, we can still consider my plans
to make that work.  In that case, deleting the junk deleted by this
patch would be required as the first step anyway.


I'm appending a simple test program at the end.

Before the patch:

  nbio is on
  ?test
  the user said: test
  nbio is off/* oops, what is this? */

After the patch:

  nbio is on
  ?progname: el_gets: Resource temporarily unavailable

I think the latter makes more sense and is less surprising.

Yours,
  Ingo


Index: editline.3
===
RCS file: /cvs/src/lib/libedit/editline.3,v
retrieving revision 1.46
diff -u -p -r1.46 editline.3
--- editline.3  22 May 2016 22:08:42 -  1.46
+++ editline.3  11 Aug 2021 16:20:00 -
@@ -169,6 +169,20 @@ Programs should be linked with
 .Pp
 The
 .Nm
+library is designed to work with blocking I/O only.
+If the
+.Dv FIONBIO
+.Xr ioctl 2
+is set on
+.Ar fin ,
+.Fn el_gets
+and
+.Fn el_wgets
+will almost certainly fail with
+.Ar EAGAIN .
+.Pp
+The
+.Nm
 library respects the
 .Ev LC_CTYPE
 locale set by the application program and never uses
Index: read.c
===
RCS file: /cvs/src/lib/libedit/read.c,v
retrieving revision 1.47
diff -u -p -r1.47 read.c
--- read.c  11 Aug 2021 15:13:46 -  1.47
+++ read.c  11 Aug 2021 16:20:00 -
@@ -66,7 +66,6 @@ struct el_read_t {
int  read_errno;
 };
 
-static int read__fixio(int, int);
 static int read_char(EditLine *, wchar_t *);
 static int read_getcmd(EditLine *, el_action_t *, wchar_t *);
 static voidread_clearmacros(struct macros *);
@@ -132,26 +131,6 @@ el_read_getfn(struct el_read_t *el_read)
 }
 
 
-/* read__fixio():
- * Try to recover from a read error
- */
-static int
-read__fixio(int fd, int e)
-{
-   int zero = 0;
-
-   switch (e) {
-   case EAGAIN:
-   if (ioctl(fd, FIONBIO, ) == -1)
-   return -1;
-   return 0;
-
-   default:
-   return -1;
-   }
-}
-
-
 /* el_push():
  * Push a macro
  */
@@ -231,15 +210,12 @@ static int
 read_char(EditLine *el, wchar_t *cp)
 {
ssize_t num_read;
-   int tried = 0;
char cbuf[MB_LEN_MAX];
int cbp = 0;
-   int save_errno = errno;
 
  again:
el->el_signal->sig_no = 0;
while ((num_read = read(el->el_infd, cbuf + cbp, 1)) == -1) {
-   int e = errno;
if (errno == EINTR) {
switch (el->el_signal->sig_no) {
case SIGCONT:
@@ -252,14 +228,8 @@ read_char(EditLine *el, wchar_t *cp)
break;
}
}
-   if (!tried && read__fixio(el->el_infd, e) == 0) {
-   errno = save_errno;
-   tried = 1;
-   } else {
-   errno = e;
-   *cp = L'\0';
-   return -1;
-   }
+   *cp = L'\0';
+   return -1;
}
 
/* Test for EOF */


 - 8< - schnipp - >8 - 8< - schnapp - >8 -


#include 
#include 
#include 
#include 
#include 

int
main(void)
{
EditLine*el;
const char  *line;
int  len;
int  flags = O_NONBLOCK;

if (fcntl(STDIN_FILENO, F_SETFL, ) == -1)
err(1, "fcntl(F_SETFL)");
flags = fcntl(STDIN_FILENO, F_GETFL);
printf("nbio is %s\n", flags & O_NONBLOCK ? "on" : "off");
if ((el = el_init("test", stdin, stdout, stderr)) == NULL)
err(1, "el_init");
if ((line = el_gets(el, )) == 

Re: date -j and seconds since the Epoch

2021-08-11 Thread Ingo Schwarze
Hi,

> ok gerhard@

Thanks for reporting, for the initial patch, and for checking the
final one.  This now committed.

Yours,
  Ingo


>> Index: date.c
>> ===
>> RCS file: /cvs/src/bin/date/date.c,v
>> retrieving revision 1.56
>> diff -u -p -r1.56 date.c
>> --- date.c   8 Aug 2019 02:17:51 -   1.56
>> +++ date.c   6 Aug 2021 17:48:01 -
>> @@ -219,7 +219,11 @@ setthetime(char *p, const char *pformat)
>>  }
>>   
>>  /* convert broken-down time to UTC clock time */
>> -if ((tval = mktime(lt)) == -1)
>> +if (pformat != NULL && strstr(pformat, "%s") != NULL)
>> +tval = timegm(lt);
>> +else
>> +tval = mktime(lt);
>> +if (tval == -1)
>>  errx(1, "specified date is outside allowed range");
>>   
>>  if (jflag)



Re: libedit: stop ignoring SIGINT

2021-08-09 Thread Ingo Schwarze
Hi Martijn,

Martijn van Duren wrote on Mon, Aug 09, 2021 at 02:15:42PM +0200:

> If we're stripping down fixio to a single function, why not
> inline the whole thing?

I deliberately tried to:

 1. Keep patches that change behaviour as small as possible to make
review as simple as possible for people who fear they might be
affected but are not specifically interested in editline(3).

 2. Make sure that reorg / cleanup patches do not change behaviour.

> Also, as far as my brain explains things to me, it could theoretically
> be possible that the signal handler kicks in right after we return a -1
> from read(2), making it possible to get something like an EIO and
> entering the sig switch statement. Assuming that a signal handler
> doesn't clobber our errno (which libedit doesn't do, but who knows what
> bad code is out there) we should probably only check the signal type
> when we know we have an EINTR.

Yes, that looks like a race condition bug that so far escaped my
attention.  But it seems unrelated to what should happen with signals
in general, so can we keep fixing that bug separate?

> Finally, I don't understand why we only have a single retry on EAGAIN?

Not having *any* retries inside read_char() would look like a worthy
long-term goal to me, but i'm not yet completely sure that can be
reached.  I would prefer to steer into the direction of fewer magic
retries rather than more of them.  Either way, EAGAIN seems unrelated
to SIGINT, so i'd prefer to keep topics separate.

> If the application keeps resetting the FIONBIO in such a furious way
> that it happens more then once in a single call (no idea how that could
> happen) it might be an uphill battle, but we just burn away cpu cycles.
> It is not and should probably not be treated like something fatal.

Actually, if the application sets FIONBIO at all (even once), chances
are quite low in the first place that stuff works as inteded by the
author of the application.  So i certainly wouldn't worry about an
application setting FIONBIO in a loop, we have significantly worse
problems than that.  But again, that's a separate topic.

> Diff below does this. (minus 27 LoC)

I do not in general object to cleaning this code up, and getting rid
of read__fixio() indeed seems to be a long-term goal.  But i hope
we will be able to remove all the (mostly broken) functionality
from read__fixio() in a step-by-step manner, and once the function
is empty, it will fade away without having to disturb the code in
read_char().  Either way - separate topic...

The most important short-term goal seems to fix sftp(1), including
the steps required to get that done in a clean way.

> The following ports use libedit (there might be a better way of
> finding them, but this works):

Hmm, thanks, that list feels useful.

> So if we decide which of our interpretations should take precedence
> it might be a good idea to put it into snaps for a while.

I don't think so in this case.  Let's not over-use the feature of
putting stuff in snaps.  I think that should be reserved for stuff
that is quite important and somewhat urgent and can't easily be
tested in a less disruptive way.  But here, testing a program is
quite feasible once you know which program to test.

Besides, *if* this patch causes a change in behaviour of a port,
the most likely change is that a program that now ignores SIGINT
exits on SIGINT afterwards.  That may be worth investigating and
making a decision on in each individual case, but it's not a
super-critical change in behaviour that might require testing
in snaps.

Yours,
  Ingo



libedit: stop ignoring SIGINT

2021-08-09 Thread Ingo Schwarze
Hi,

as mentioned earlier, deraadt@ reported that sftp(1) ignores Ctrl-C.
Fixing that without longjmp(3) requires making editline(3) better
behaved.

Currently, when read(2) from the terminal gets interrupted by a
signal, editline(3) ignores the (first) signal and unconditionally
calls read(2) a second time.  That seems wrong: if the user hits
Ctrl-C, it is sane to assume that they meant it, not that they
want it ignored.

The following patch causes el_gets(3) and el_wgets(3) to return
failure when read(2)ing from the terminal is interrupted by a
signal other than SIGCONT or SIGWINCH.  That allows the calling
program to decide what to do, usually either exit the program or
provide a fresh prompt to the user.

If i know how to grep(1), the following programs in the base system
use -ledit:

 * bc(1)
   It behaves well with the patch: Ctrl-C discards the current
   input line and starts a new input line.
   The reason why this already works even without the patch
   is that bc(1) does very scary stuff inside the signal handler:
   pass a file-global EditLine pointer on the heap to el_line(3)
   and access fields inside the returned struct.  Needless to
   say that no signal handler should do such things...

 * cdio(1)
   Behaviour is acceptable and unchanged with the patch:
   Ctrl-C exits cdio(1).

 * ftp(1)
   It behaves well with the patch: Ctrl-C discards the current
   input line and provides a fresh prompt.
   The reason why it already works without the patch is that ftp(1)
   uses setjmp(3)/longjmp(3) to forcefully grab back control
   from el_gets(3) without waiting for it to return.

 * lldb(1)
   It misbehaves with or without the patch and ignores Ctrl-C.
   I freely admit that i don't feel too enthusiastic about
   debugging that beast.

 * sftp(1)
   Behaviour is improved with the patch: Ctrl-C now exits sftp(1).
   If desired, i can supply a very simple follow-up patch to sftp.c
   to instead behave like ftp(1) and bc(1), i.e. discard the
   current input line and provide a fresh prompt.
   Neither doing undue work in the signal handler nor longjmp(3)
   will be required for that (if this patch gets committed).

 * bgplgsh(8), fsdb(8)
   I have no idea how to test those.  Does anyone think that testing
   either of them would be required?

Regarding the patch below, note that differentiating EINTR behaviour
by signal number is not needed because the calling code in read_char()
already handles SIGCONT and SIGWINCH, so those two never arrive in
read__fixio() in the first place.

Also note that deraadt@ pointed out in private mail to me that the
fact that read__fixio() clears FIONBIO is probably a symptom of
botched editline(3) API design.  That might be worth fixing, too,
as far as that is feasible, but it is unrelated to the sftp(1)
Ctrl-C issue; let's address one topic at a time.

Any feedback is welcome.

Yours,
  Ingo

P.S.
I decided to Cc: tech@ again because this patch might affect
even people who are not specifically interested in editline(3),
and i have no intention to cause unpleasant surprises.


Index: read.c
===
RCS file: /cvs/src/lib/libedit/read.c,v
retrieving revision 1.45
diff -u -U15 -r1.45 read.c
--- read.c  9 Aug 2021 09:11:26 -   1.45
+++ read.c  9 Aug 2021 10:00:18 -
@@ -134,22 +134,19 @@
 
 /* read__fixio():
  * Try to recover from a read error
  */
 static int
 read__fixio(int fd, int e)
 {
int zero = 0;
 
switch (e) {
case EAGAIN:
if (ioctl(fd, FIONBIO, ) == -1)
return -1;
return 0;
 
-   case EINTR:
-   return 0;
-
default:
return -1;
}
 }



Re: libedit: read__fixio cleanup

2021-08-09 Thread Ingo Schwarze
Hi Martijn,

Martijn van Duren wrote on Mon, Aug 09, 2021 at 11:04:35AM +0200:

> Btw, would there be any benefit to declare zero as const in this
> context?

I dont think so.

At least according to the ioctl(2) manual page, FIONBIO expects
an int * argument, not a const int *.

Yours,
  Ingo



libedit: read__fixio cleanup

2021-08-08 Thread Ingo Schwarze
Hi,

deraadt@ recently reported to me that the editline(3) library, while
line editing is active - for example inside el_gets(3) - ignores
the first SIGINT it receives, for example the the first Ctrl-C the
user presses.  I consider that a bug in the editline(3) library.
Some programs, for example our old ftp(1) implementation, work
around the bug in a horrible way by using setjmp(3)/longjmp(3).

The root cause of the problem is in libedit/read.c, in the interaction
of the functions read_char() and read__fixio().  Before fixing the bug
can reasonably be considered, the function read__fixio() direly needs
cleanup.  As it stands, it is utterly unreadable.

So here is a patch to make it clear what the function does, with
no functional change intended yet (-37 +5 LOC).

There will be one or more follow-up patches.  If you want to receive
them, please reply to me, i won't necessarily send them all to
tech@.

I see some value in avoiding gratuitious divergence from NetBSD,
but getting rid of this horrible mess is not gratuitious.

Rationale:
 * Do not mark an argument as unused that is actually used.
 * errno cannot possibly be -1.  Even is it were, treating it as
   EAGAIN makes no sense, treating it as the most severe error
   imaginable makes more sense to me.
 * We define EWOULDBLOCK to be the same as EAGAIN, so no need
   to handle it separately.
 * No need to #define TRY_AGAIN to use it just once.
 * Don't do the same thing twice.  We do support the FIONBIO ioctl(2),
   so the the indirection using the F_GETFL fcntl(2) can be deleted.
 * No need to play confusing games with the "e" variable.
   Just return -1 for error or 0 for success in a straightforward
   manner.

OK?
  Ingo

P.S.
I also considered whether this FIONBIO dance should better be done
at the initialization stage rather than after EAGAIN already happened.
But maybe not.  This is a library.  The application program might
set the fd to non-blocking mode at any time and then call el_gets(3)
again, in which case the library needs to restore blocking I/O to
do its work.


Index: read.c
===
RCS file: /cvs/src/lib/libedit/read.c,v
retrieving revision 1.44
diff -u -p -r1.44 read.c
--- read.c  25 May 2016 09:36:21 -  1.44
+++ read.c  8 Aug 2021 10:30:06 -
@@ -39,9 +39,10 @@
  * read.c: Clean this junk up! This is horrible code.
  *Terminal read functions
  */
+#include 
+
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -134,55 +135,16 @@ el_read_getfn(struct el_read_t *el_read)
 /* read__fixio():
  * Try to recover from a read error
  */
-/* ARGSUSED */
 static int
-read__fixio(int fd __attribute__((__unused__)), int e)
+read__fixio(int fd, int e)
 {
+   int zero = 0;
 
switch (e) {
-   case -1:/* Make sure that the code is reachable */
-
-#ifdef EWOULDBLOCK
-   case EWOULDBLOCK:
-#ifndef TRY_AGAIN
-#define TRY_AGAIN
-#endif
-#endif /* EWOULDBLOCK */
-
-#if defined(POSIX) && defined(EAGAIN)
-#if defined(EWOULDBLOCK) && EWOULDBLOCK != EAGAIN
case EAGAIN:
-#ifndef TRY_AGAIN
-#define TRY_AGAIN
-#endif
-#endif /* EWOULDBLOCK && EWOULDBLOCK != EAGAIN */
-#endif /* POSIX && EAGAIN */
-
-   e = 0;
-#ifdef TRY_AGAIN
-#if defined(F_SETFL) && defined(O_NDELAY)
-   if ((e = fcntl(fd, F_GETFL)) == -1)
+   if (ioctl(fd, FIONBIO, ) == -1)
return -1;
-
-   if (fcntl(fd, F_SETFL, e & ~O_NDELAY) == -1)
-   return -1;
-   else
-   e = 1;
-#endif /* F_SETFL && O_NDELAY */
-
-#ifdef FIONBIO
-   {
-   int zero = 0;
-
-   if (ioctl(fd, FIONBIO, ) == -1)
-   return -1;
-   else
-   e = 1;
-   }
-#endif /* FIONBIO */
-
-#endif /* TRY_AGAIN */
-   return e ? 0 : -1;
+   return 0;
 
case EINTR:
return 0;



Re: less(1): refreshing file of size 0 results in file being treated as a pipe

2021-08-07 Thread Ingo Schwarze
Hi,

thank you for reporting this bug and for providing a patch to fix it.
I just committed your patch.

Also thanks to tb@ and deraadt@ for cross-checking the patch.

Yours,
  Ingo


user wrote on Thu, Aug 05, 2021 at 12:43:21AM -0500:

> Oops, forgot that OpenBSD doesn't have ! capability in less. Instead of !echo 
> a > % and !echo b > %, run 
> $ echo a > /tmp/test
> Press h and q in less to reload the file
> $ echo b > /tmp/test
> Press h and q in less to reload the file
> 
> On Thu, Aug 05, 2021 at 12:37:00AM -0500, user wrote:
> > Bug Reproduction:
> > $ touch /tmp/test
> > $ less /tmp/test
> > Press r
> > Run !echo a > %
> > Run !echo b > %
> > Output:
> > a
> > b
> > 
> > On Fri, Jul 23, 2021 at 11:15:59AM -0500, user wrote:
> > > Less contains a hack to force files of size 0 to become non-seekable in 
> > > order to workaround a linux kernel bug. 
> > > 
> > > When the file becomes non-seekable any further reads from the file are 
> > > appended rather than overwriting the original contents of the file.
> > > 
> > > diff --git ch.c ch.c
> > > index 1a679767a42..d7c0aa34e90 100644
> > > --- ch.c
> > > +++ ch.c
> > > @@ -643,19 +643,6 @@ ch_flush(void)
> > > ch_block = 0; /* ch_fpos / LBUFSIZE; */
> > > ch_offset = 0; /* ch_fpos % LBUFSIZE; */
> > > 
> > > -#if 1
> > > -   /*
> > > -* This is a kludge to workaround a Linux kernel bug: files in
> > > -* /proc have a size of 0 according to fstat() but have readable
> > > -* data.  They are sometimes, but not always, seekable.
> > > -* Force them to be non-seekable here.
> > > -*/
> > > -   if (ch_fsize == 0) {
> > > -   ch_fsize = -1;
> > > -   ch_flags &= ~CH_CANSEEK;
> > > -   }
> > > -#endif
> > > -
> > > if (lseek(ch_file, (off_t)0, SEEK_SET) == (off_t)-1) {
> > > /*
> > >  * Warning only; even if the seek fails for some reason



Re: date -j and seconds since the Epoch

2021-08-06 Thread Ingo Schwarze
Hi,

sorry for the afterthought, i just noticed that the Subject: line
is misleading.  This patch has nothing to do with -j.  If -j is not
specified, this patch changes the value passed to adjtime(2) or
settimeofday(2), which arguably matters even more than something
merely printed on stdout.

However, the value being passed to adjtime(2) or settimeofday(2)
right now is just as wrong as the one printed, if -f %s is used,
so i think the patch is correct for that case, too.

Yours,
  Ingo



Re: date -j and seconds since the Epoch

2021-08-06 Thread Ingo Schwarze
Hi Gerhard and Bryan,

Gerhard Roth wrote on Mon, Aug 02, 2021 at 10:36:05AM +0200:

> Bryan Vyhmeister found a strange behavior in date(1):
>
>   # date -f %s -j 1627519989
>   Thu Jul 29 01:53:09 PDT 2021
>   # date -u -f %s -j 1627519989
>   Thu Jul 29 00:53:09 UTC 2021
> 
> Looks like PDT is GMT-1, which of course is wrong.
> 
> The problem arises from the -f option. The argument of date(1) is passed
> to strptime(3). Normally, this will return a broken down time in the
> local timezone.

This claim confused me somewhat at first.  I think a more accurate
statement would be that strptime(3) does not use the TZ at all.
It merely parses the string and fills the fields in struct tm.
The question whether the caller had any particular time zone in
mind does not even arise here.

Since date(1) says:

  ENVIRONMENT
 TZ   The time zone to use when parsing or displaying dates.  [...]

the command

   $ date -f %H:%M -j 9:00

is correct to essentially just echo "09:00" back at you
because date(1) is specified to use the same time zone
for parsing and printing.

> But the '%s' format makes an exception and returns a
> date in UTC.

That is indeed true.

So for %s the time zone used for parsing is necessarily different,
while the time zone used for printing is still the $TZ specified
by the user (or the /etc/localtime specified by the sysadmin).

So i think your approach of using timegm(3) for %s and mktime(3)
otherwise is essentially correct.

However, a format string can contain more characters than just 
a single conversion specification.  For example, somebody might
wish to parse an input file containing a line

  SSE=1627519989

and legitimately say

  $ date -f SSE=%s -j $(grep SSE= input_file)

which still yields the wrong result even with your patch.

Even worse, what are the admittedly weird commands

  $ date -f %s:%H -j 1627519989:15
  $ date -f %H:%s -j 15:1627519989
  $ date -f %H:%s:%m -j 15:1627519989:03

supposed to do?  Apparently, data from later conversions is supposed
to override data from earlier ones, so should the last conversion
that is related to the the time zone win, i.e. %s:%H use mktime(3)
but %H:%s and %H:%s:%m gmtime(3)?  I would argue that is excessive
complexity for little benefit, if any.

One might also argue that %s:%H 1627519989:09 is more usefully
interpreted as "9 o'clock UTC on the day containing 1627519989"
than "9 o'clock in the local TZ on the day containing 1627519989".
In addition to using a consistent input time zone, the former also
avoids fun with crossing the date line that the latter might cause.
The former is also easier to implement, see the patch below.

What do people think?

> The patch below isn't very beautiful, but fixes the problem:
> 
>   # date -f %s -j 1627519989
>   Wed Jul 28 17:53:09 PDT 2021

P.S.
Your patch was mangled (one missing blank line and spurious spaces
inserted before tabs on some lines).  Lately, it is becoming annoying
that even very experienced developers no longer seem able to reliably
send email containing patches that actually apply...  :-(


Index: date.c
===
RCS file: /cvs/src/bin/date/date.c,v
retrieving revision 1.56
diff -u -p -r1.56 date.c
--- date.c  8 Aug 2019 02:17:51 -   1.56
+++ date.c  6 Aug 2021 17:48:01 -
@@ -219,7 +219,11 @@ setthetime(char *p, const char *pformat)
}
 
/* convert broken-down time to UTC clock time */
-   if ((tval = mktime(lt)) == -1)
+   if (pformat != NULL && strstr(pformat, "%s") != NULL)
+   tval = timegm(lt);
+   else
+   tval = mktime(lt);
+   if (tval == -1)
errx(1, "specified date is outside allowed range");
 
if (jflag)



Re: less(1): refreshing file of size 0 results in file being treated as a pipe

2021-08-06 Thread Ingo Schwarze
Hi,

after quite some head-scratching, i consider the following bug report
legitimate:

user wrote on Thu, Aug 05, 2021 at 12:43:21AM -0500:
> On Thu, Aug 05, 2021 at 12:37:00AM -0500, user wrote:
>> On Fri, Jul 23, 2021 at 11:15:59AM -0500, user wrote:

>>> Less contains a hack to force files of size 0 to become non-seekable
>>> in order to workaround a linux kernel bug. 

I'm inserting a few words into the next sentence to make it clearer
what it is trying to say:

>>> When the file becomes non-seekable any further reads from the file
>>> are appended

  to the temporary buffer in which less(1) holds the content
  of the file

>>> rather than overwriting the original contents of the file

  in that buffer.

>> Bug Reproduction:

   $ rm -rf /tmp/test

>> $ touch /tmp/test
>> $ less /tmp/test
>> Press r

> $ echo a > /tmp/test# my comment: from a different shell
> Press h and q in less to reload the file
> $ echo b > /tmp/test
> Press h and q in less to reload the file

Now less(1) shows the following on the screen because it thinks
that would be the current content of the file:

>> a
>> b

That is wrong.  Instead, it should show the actual file content,
which is just:

  b

I think the proposed patch makes sense and should be committed:
File size has nothing to do with whether a file is seekable,
so i don't think it can cause regressions.
Also, it fixes the bug described above.

Any developer willing to provide an OK?
Alternatively, commit yourself with OK schwarze@.

I'm attaching the patch again because the OP mangled it (tabs
replaced by spaces) and it did not apply.

Yours,
  Ingo


Index: ch.c
===
RCS file: /cvs/src/usr.bin/less/ch.c,v
retrieving revision 1.21
diff -u -p -r1.21 ch.c
--- ch.c3 Sep 2019 23:08:42 -   1.21
+++ ch.c6 Aug 2021 16:14:29 -
@@ -643,19 +643,6 @@ ch_flush(void)
ch_block = 0; /* ch_fpos / LBUFSIZE; */
ch_offset = 0; /* ch_fpos % LBUFSIZE; */
 
-#if 1
-   /*
-* This is a kludge to workaround a Linux kernel bug: files in
-* /proc have a size of 0 according to fstat() but have readable
-* data.  They are sometimes, but not always, seekable.
-* Force them to be non-seekable here.
-*/
-   if (ch_fsize == 0) {
-   ch_fsize = -1;
-   ch_flags &= ~CH_CANSEEK;
-   }
-#endif
-
if (lseek(ch_file, (off_t)0, SEEK_SET) == (off_t)-1) {
/*
 * Warning only; even if the seek fails for some reason,



Re: time.3: miscellaneous cleanup and rewrites

2021-08-05 Thread Ingo Schwarze
Hi Scott,

since i see this wasn't committed yet, it is OK schwarze@, too.

Consider leaving the .Nd alone since both jmc@ and millert@ recommended
that, and use "That version counted the time" in the HISTORY section
as recommended by jmc@.

I think you should indeed remove the sentence about gettimeofday(2)
from the DESCRIPTION because millert@ has a point that it's an
implementation detail, and even if considered interesting, it's
already mentioned in the HISTORY section.

Yours,
  Ingo


Scott Cheloha wrote on Fri, Jul 30, 2021 at 01:01:21PM -0500:

> Index: time.3
> ===
> RCS file: /cvs/src/lib/libc/gen/time.3,v
> retrieving revision 1.16
> diff -u -p -r1.16 time.3
> --- time.329 Jan 2015 01:46:30 -  1.16
> +++ time.330 Jul 2021 17:59:37 -
> @@ -36,52 +36,54 @@
>  .Os
>  .Sh NAME
>  .Nm time
> -.Nd get time of day
> +.Nd get the time of day
>  .Sh SYNOPSIS
>  .In time.h
>  .Ft time_t
> -.Fn time "time_t *tloc"
> +.Fn time "time_t *now"
>  .Sh DESCRIPTION
>  The
>  .Fn time
> -function returns the value of time in seconds since 0 hours, 0 minutes,
> -0 seconds, January 1, 1970, Coordinated Universal Time (UTC).
> +function returns the the number of seconds elapsed since
> +Jan 1 1970 00:00:00 UTC.
> +This value is also written to
> +.Fa now
> +unless
> +.Fa now
> +is
> +.Dv NULL .
>  .Pp
> -A copy of the time value may be saved to the area indicated by the
> -pointer
> -.Fa tloc .
> -If
> -.Fa tloc
> -is a null pointer, no value is stored.
> +This version of
> +.Fn time
> +is implemented with
> +.Xr gettimeofday 2 .
>  .Sh RETURN VALUES
> -Upon successful completion,
> +The
>  .Fn time
> -returns the value of time.
> -Otherwise a value of
> -.Po Fa time_t Pc Ns -1
> -is returned and the global variable
> -.Va errno
> -is set to indicate the error.
> -.Sh ERRORS
> -The following error codes may be set in
> -.Va errno :
> -.Bl -tag -width Er
> -.It Bq Er EFAULT
> -An argument address referenced invalid memory.
> -.El
> +function is always successful,
> +and no return value is reserved to indicate an error.
>  .Sh SEE ALSO
>  .Xr clock_gettime 2 ,
>  .Xr gettimeofday 2 ,
>  .Xr ctime 3
> +.Sh STANDARDS
> +The
> +.Fn time
> +function conforms to
> +.St -p1003.1-2008 .
>  .Sh HISTORY
>  A
>  .Fn time
>  system call first appeared in
> -.At v1
> -and used to return time in sixtieths of a second in 32 bits,
> -which was to guarantee a crisis every 2.26 years.
> -Since
> -.At v6 ,
> -.Fn time
> -scale was changed to seconds, extending the pre-crisis stagnation
> -period up to a total of 68 years.
> +.At v1 .
> +This version counted time in sixtieths of a second with a 32-bit return 
> value,
> +ensuring an integer overflow crisis every 2.26 years.
> +In
> +.At v6
> +the granularity of the return value was reduced to whole seconds,
> +delaying the aforementioned crisis until 2038.
> +In
> +.Bx 4.1c
> +the function was moved out of the kernel into the C standard library and
> +reimplemented with
> +.Xr gettimeofday 2 .



Re: Add versioned lib to system perl's @INC for non-packaged modules

2021-08-05 Thread Ingo Schwarze
Hi,

thanks to sthen@ for providing more background!

so the bottom line seems to be that, once the code changes are
tested, committed, and prove stable and once people come round to
it, moving the site manual page directories out of /usr/local/man/
and making them versioned in exactly the same way as the sitelib
directories may be a reasonable option.

Even if the main effect of versioning them might be to make it more
obvious for people what went wrong in case things get mixed up.

Yours,
  Ingo



Re: Add versioned lib to system perl's @INC for non-packaged modules

2021-08-04 Thread Ingo Schwarze
Hi Andrew,

Andrew Fresh wrote on Fri, Jul 30, 2021 at 05:34:40PM -0700:
> On Sun, May 16, 2021 at 03:30:39PM -0700, Andrew Hewus Fresh wrote:

>> There do appear to be some annoyances with still shared directories for
>> man pages, in that if you install a CPAN module and then attempt to
>> pkg_add it, it complains because "files already exist" in the man
>> directory.  We could separate the cpan man pages, or I think un-setting
>> the site_lib man*dir will mean cpan won't instal them.  I haven't yet
>> tested that, but I'm not sure what would be best there.

If i understand correctly, this is all about making installation of
un-ported Perl modules work better and reducing clashes between doing
that on the one hand and between package installation with pkg_add(1)
on the other hand.

Who will profit most from that?  I guess very experienced Perl users
and porters who play around a lot with many different Perl modules
and with very large Perl programs, in particular when evaluating
whether those large programs are worth porting.  Not having manual
pages installed for dependencies while trying to get a complicated
program to work and while trying to understand it seems like a
disservice to me.  So "un-setting the site_lib man*dir" doesn't
strike me as particularly convincing on first sight.

I guess ordinary users are less likely to use this route, they are more
likely to just use pkg_add(1).  But if they do, for whatever reason,
not getting manual pages is not good for them, either.

So i think in the long term, installing those manual pages into their
own manpath outside /ust/local/man/ would seem desirable to me.
I'm not saying it needs to be done in the same commit; sorting that
out can certainly be a later step.


You suggest to make the sitelib directory versioned.  Consequently,
i guess that you want the following process to work:

 1. install base and arbitrary numbers of packages
 2. install lots of modules and programs from CPAN for evaluation
 3. install the next Perl version outside base to figure out what
needs to be done to ultimately move base to it
 4. install various modules for the new Perl from step 3
for testing purposes

If i got that right, then the new man directory needs to be versioned
just like the new sitelib directory, or the manual pages from steps 2
and 4 will clash just like the libraries from steps 2 and 4 would.
The price to pay, of course, is that you need to edit man.conf(5)
for each new Perl you install.  And not just you, the base system
Perl maintainer, but all users installing stuff from CPAN.

On the other hand, if working with two different Perl versions in
parallel is not a requirement (for example because people do testing
like in steps 3 and 4 on separate machines?), then i don't see why
either the sitelib or the new man dir needs to be versioned at all,
making everything simpler.

Yours,
  Ingo



Re: sleep.3: miscellaneous cleanup and rewrites

2021-07-29 Thread Ingo Schwarze
Hi Scott,

Scott Cheloha wrote on Tue, Jul 27, 2021 at 05:15:09PM -0500:
> On Mon, Jul 26, 2021 at 10:37:03AM -0600, Todd C. Miller wrote:
>> On Mon, 26 Jul 2021 07:55:00 -0500, Scott Cheloha wrote:

>>> Still wondering whether we need an Errors section to mention that
>>> sleep(3) can set errno.

>> POSIX doesn't define any errors.  Of the errors returned by nanosleep,
>> only EINTR is really possible.  I guess we could save & restore
>> errno in sleep.c but I don't see any implementations that actually
>> do that.  I only checked illumos, glibc and musl though.

> We definitely don't want to save/restore.
> 
> I meant more like a simple acknowledgement for the programmer.
> Knowing what can set errno is helpful.
> 
> We do it for other library functions where some part of the
> functionality is implemented with a system call.  In other manpages we
> say something like:
> 
> The
> .Fn foo
> function may fail and set
> .Va errno
> for any of the errors specified for the
> .Fn bar 2
> routine.
> 
> See, e.g., `man 3 nice` or `man 3 fopen`.
> 
> Then again, there's really only one error.

That would sound like overkill to me, especially since millert@ pointed
out that only EINTR is relevant.

> Maybe we should just
> duplicate the information in sleep.3.
> 
> What about this?
> 
> The
> .Fn sleep
> function sets
> .Va errno
> to
> .Dv EINTR
> if it is interrupted by the delivery of a signal.

I like that: it seems correct, complete, and concise.

OK schwarze@ for the patch as a whole.

Feel free to use or discard the two suggestions provided in-line
below as you see fit.

Yours,
  Ingo


> Index: sleep.3
> ===
> RCS file: /cvs/src/lib/libc/gen/sleep.3,v
> retrieving revision 1.16
> diff -u -p -r1.16 sleep.3
> --- sleep.3   8 Feb 2020 01:09:57 -   1.16
> +++ sleep.3   27 Jul 2021 22:13:22 -
> @@ -32,7 +32,7 @@
>  .Os
>  .Sh NAME
>  .Nm sleep
> -.Nd suspend process execution for interval measured in seconds
> +.Nd suspend execution for an interval of seconds
>  .Sh SYNOPSIS
>  .In unistd.h
>  .Ft unsigned int
> @@ -40,41 +40,46 @@
>  .Sh DESCRIPTION
>  The
>  .Fn sleep
> -function suspends execution of the calling process until either
> +function suspends execution until at least the given number of

In nanosleep(2), you say

  ... suspends execution of the calling thread ...

Would that be useful here, too?

>  .Fa seconds
> -seconds have elapsed or a signal is delivered to the process and its
> -action is to invoke a signal-catching function or to terminate the
> -process.
> -The suspension time may be longer than requested due to the
> -scheduling of other activity by the system.
> +have elapsed or an unmasked signal is delivered to the calling thread.
>  .Pp
> -This function is implemented using
> -.Xr nanosleep 2
> -by pausing for
> -.Fa seconds
> -seconds or until a signal occurs.
> -Consequently, in this implementation,
> -sleeping has no effect on the state of process timers,
> -and there is no special handling for
> -.Dv SIGALRM .
> +This version of
> +.Fn sleep
> +is implemented with
> +.Xr nanosleep 2 ,
> +so delivery of any unmasked signal will terminate the sleep early,
> +even if
> +.Dv SA_RESTART
> +is set with
> +.Xr sigaction 2
> +for the interrupting signal.
>  .Sh RETURN VALUES
> -If the
> +If
> +.Fn sleep
> +sleeps for the full count of
> +.Fa seconds
> +it returns 0.
> +Otherwise,
> +.Fn sleep
> +returns the number of seconds remaining from the original request.
> +.Sh ERRORS
> +The
>  .Fn sleep
> -function returns because the requested time has elapsed, the value
> -returned will be zero.
> -If the
> -.Fn sleep
> -function returns due to the delivery of a signal, the value returned
> -will be the unslept amount (the request time minus the time actually
> -slept) in seconds.
> +function sets
> +.Va errno
> +to
> +.Dv EINTR
> +if it is interrupted by the delivery of a signal.
>  .Sh SEE ALSO
>  .Xr sleep 1 ,
> -.Xr nanosleep 2
> +.Xr nanosleep 2 ,
> +.Xr sigaction 2
>  .Sh STANDARDS
>  The
>  .Fn sleep
>  function conforms to
> -.St -p1003.1-90 .
> +.St -p1003.1-2008 .

I would consider it useful to add this sentence here:

Setting
.Va errno
is an extension to that specification.



Re: nanosleep.2: miscellaneous cleanup

2021-07-21 Thread Ingo Schwarze
Hi Scott,

Scott Cheloha wrote on Wed, Jul 21, 2021 at 11:02:00AM -0500:

[ EFAULT ]
> Given deraadt@'s response I'm just going to leave the existing
> language.

Fine with me.

> I guess I will need to dig into it a bit.  Finding the text of the
> really early documents, prior to SUSv2, is tricky.

I think you are right.  I recently tried to find an online copy
of XPG4.2 / SUSv1 (1994) and did not succeed.  What you want here
is even one step further back: POSIX.1 (1988), POSIX.1b (1993),
POSIX.1c (1995), POSIX.2 (1992), XPG4 (1992), see mdoc(7) for an
overview.

> One additional change:
> - Escape the minus sign for "-1".

I like that, using \- for mathematical minus is certainly correct in
roff(7) in general and perfectly acceptable in manual pages, too.
Our mandoc_char(7) page explains that we don't require it, but there
is a risk that groff(1) might start requiring it in the next release.
I tried to push back, but without success so far.

It seems you somewhere lost the s/one billion/1000 million/ tweak
that you advertised earlier.

Either way, OK schwarze@.

Yours,
  Ingo


> Index: nanosleep.2
> ===
> RCS file: /cvs/src/lib/libc/sys/nanosleep.2,v
> retrieving revision 1.16
> diff -u -p -r1.16 nanosleep.2
> --- nanosleep.2   31 Dec 2018 18:54:00 -  1.16
> +++ nanosleep.2   21 Jul 2021 15:57:55 -
> @@ -41,53 +41,71 @@
>  .Ft int
>  .Fn nanosleep "const struct timespec *timeout" "struct timespec *remainder"
>  .Sh DESCRIPTION
> +The
>  .Fn nanosleep
> -suspends execution of the calling process for the duration specified by
> +function suspends execution of the calling thread for at least the
> +duration specified by
>  .Fa timeout .
> -An unmasked signal will cause it to terminate the sleep early,
> -regardless of the
> +Delivery of an unmasked signal to the calling thread terminates the
> +sleep early,
> +even if
>  .Dv SA_RESTART
> -value on the interrupting signal.
> +is set with
> +.Xr sigaction 2
> +for the interrupting signal.
>  .Sh RETURN VALUES
> -If the
> +If
>  .Fn nanosleep
> -function returns because the requested duration has elapsed, the value
> -returned will be zero.
> +sleeps the full
> +.Fa timeout
> +without interruption it returns 0.
> +Unless
> +.Fa remainder
> +is
> +.Dv NULL ,
> +it is set to zero.
>  .Pp
> -If the
> +If
>  .Fn nanosleep
> -function returns due to the delivery of a signal, the value returned
> -will be \-1, and the global variable
> +is interrupted by a signal it returns \-1 and the global variable
>  .Va errno
> -will be set to indicate the interruption.
> -If
> +is set to
> +.Dv EINTR .
> +Unless
>  .Fa remainder
> -is non-null, the timespec structure it references is updated to contain the
> -unslept amount (the requested duration minus the duration actually slept).
> -.Sh ERRORS
> -If any of the following conditions occur, the
> +is
> +.Dv NULL ,
> +it is set to the unslept portion of the
> +.Fa timeout .
> +.Pp
> +Otherwise,
>  .Fn nanosleep
> -function shall return \-1 and set
> +returns \-1 and the global variable
>  .Va errno
> -to the corresponding value.
> +is set to indicate the error.
> +.Sh ERRORS
> +.Fn nanosleep
> +will fail if:
>  .Bl -tag -width Er
>  .It Bq Er EINTR
> -.Fn nanosleep
> -was interrupted by the delivery of a signal.
> +The call is interrupted by the delivery of a signal.
>  .It Bq Er EINVAL
>  .Fa timeout
> -specified a nanosecond value less than zero or greater than or equal to
> -1000 million,
> +specifies a nanosecond value less than zero or greater than or equal to
> +one billion,
>  or a second value less than zero.
>  .It Bq Er EFAULT
> -Either
>  .Fa timeout
> -or
> -.Fa remainder
>  points to memory that is not a valid part of the process address space.
> +.It Bq Er EFAULT
> +.Fa remainder
> +is not
> +.Dv NULL
> +and points to memory that is not a valid part of the process address space.
>  .El
>  .Sh SEE ALSO
>  .Xr sleep 1 ,
> +.Xr sigaction 2 ,
>  .Xr sleep 3
>  .Sh STANDARDS
>  The
> @@ -97,17 +115,23 @@ function conforms to
>  .Sh HISTORY
>  The predecessor of this system call,
>  .Fn sleep ,
> -appeared in
> -.At v3 ,
> -but was removed when
> +first appeared in
> +.At v3 .
> +It was removed in
> +.At v7
> +and replaced with a C library implementation based on
>  .Xr alarm 3
> -was introduced into
> -.At v7 .
> +and
> +.Xr signal 3 .
> +.Pp
>  The
>  .Fn nanosleep
> -system call has been available since
> +function first appeared in
> +.St -p1003.1b-93 .
> +.Pp
> +This implementation of
> +.Fn nanosleep
> +first appeared in
>  .Nx 1.3
>  and was ported to
> -.Ox 2.1
> -and
> -.Fx 3.0 .
> +.Ox 2.1 .



Re: nanosleep.2: miscellaneous cleanup

2021-07-21 Thread Ingo Schwarze
Hi Theo,

Theo de Raadt wrote on Tue, Jul 20, 2021 at 08:14:20PM -0600:
> Ingo Schwarze wrote:

>>  [EFAULT]  foo points outside the process's allocated address space.
>>
>> But i don't really i like that.  The word "allocated" makes me wonder
>> because it sounds too much like malloc(3) for my taste.
>> Usually, pointers to automatic and to static objects are acceptable,
>> too, and are those "allocated"?  Some might say they are not.
>> Besides, "process's" looks awkward.

> Disagree on your dislike of this wording.
> 
> A process has an address space, VM_MIN_ADDRESS to VM_MAXUSER_ADDRESS.
> Parts of this are not mapped, because nothing has allocated backing
> resources.  Allocation happens via stack growth, execve, shm, mmap, etc
> etc etc etc.  Those are all methods for "allocating" within the
> processes' address space.
> 
> Unallocated space generates EFAULT.
> 
> Addresses outside the VM_MIN_ADDRESS to VM_MAXUSER_ADDRESS range also
> generate EFAULT, and this is because backing resources are not permitted
> to be allocated there.  No allocation, thus EFAULT.
> 
> "allocated" is correct terminology, and it has nothing to do with
> malloc.

Fair enough, thank you for explaining.

Given that it is correct terminology, i recommend using it in the
nanosleep(2) manual, too, because it is the most widely used idiom
and it isn't excessively long.

> I don't see how trying to mince words is going to help anyone.
> 
> Unifying all the varients into one form will be quite a task.

Right, that would probably be a make-work project.  Having a wide
variety of wordings is slightly messy, but not enough of a problem
to warrant spending a significant effort on unification.

Yours,
  Ingo

> I don't know if this is the right form to use, but I don't like the
> argument you made.



Re: nanosleep.2: miscellaneous cleanup

2021-07-20 Thread Ingo Schwarze
Hi Scott,

Scott Cheloha wrote on Tue, Jul 20, 2021 at 05:20:16PM -0500:

> The nanosleep.2 page could use some cleanup.  Here's a bunch of fixes,
> rewrites, etc.
> 
> I've included my notes on the changes below.  I have some (mostly
> stylistic) questions in there, too.

Thanks for explaining what you are doing.  I'm snipping what i agree
with as well as what i have no opinion on and trust you on, but that
doesn't mean that mentioning it was useless.

> Should we reference sigaction(2) here alongside SA_RESTART?  e.g.:
> 
>   ... even if SA_RESTART is set for the interrupting signal
>   (see sigaction(2) for further details).

It isn't absolutely required.  On the one hand, people can use
"man -k any=SA_RESTART" to find out what it is.  On the other hand,
people can maybe be expected to know that sigaction(2) is the system
call to configure signal handling, and even if they ognore it, they
will easily find out with "man signal".

Then again, i can see how saying more precisely what SA_RESTART is
might be helpful especially for novice users, if it can be done
without being too much of a distraction.

In general, i dislike both parenthetic remarks and "see ... for details"
unless a more natural wording cannot be found.

Maybe something like:

  ... even if SA_RESTART was set with sigaction(2) for the
  interrupting signal.

Your call, i would say.

> I'm unsure about the correct voice here.  Should nanosleep actively
> return values?  Or should values "be returned"?

Both are acceptable.  In general, and not just in manual pages,
prefer the active voice when both work equally well.

> I'm unclear on whether I need the indent here in the .Bd block.  I
> think it looks better than unindented code, but maybe I'm flouting
> some practice we have.

Using tabs is permitted inside .Bd -literal and .Bd -unfilled.
They should definitely be used for indenting the bodies of for,
while, and if.

Whether you indent the whole thing depends on what looks better.
When the code is relatively narrow, indenting the whole example
is often a good idea.  When the code contains very long lines,
not indenting the whole example may be better.

If you choose to indent, i have no strong prefernce whether you
do it with tabs or with .Bd -literal -offset indent; maybe i very
slightly prefer the latter, because the global indentation is
a formatting choice rather than part of the displayed code.
But i really wouldn't complain about either.

> ERRORS
> - Simplify the opening sentence.  Yeesh, what a mouthful.

Indeed, ERRORS should not repeat the content of RETURN VALUES.

> Unsure if a second EFAULT case for remainder is a good thing here.
> Strictly speaking, NULL is not a valid part of the process address
> space.  Maybe that's too pedantic?

I don't think you can be too pedantic about where NULL is allowed
and where it isn't.  That's a notorious source of bugs, so precision
about NULL is usually a good thing.

> Also, do we have a more standard "boilerplate" sentence for EFAULT?

Judging from "man -l /usr/share/man/man2/*.2", the most common
wording is:

  [EFAULT]  foo points outside the process's allocated address space.

But i don't really i like that.  The word "allocated" makes me wonder
because it sounds too much like malloc(3) for my taste.
Usually, pointers to automatic and to static objects are acceptable,
too, and are those "allocated"?  Some might say they are not.
Besides, "process's" looks awkward.

These occur, too:

  [EFAULT]  The foo parameter is not in a valid part of the user
address space.

  [EFAULT]  foo references invalid memory.

  [EFAULT]  The name parameter specifies an area outside the
process address space.

  [EFAULT]  foo points to an illegal address.

  [EFAULT]  The userspace address uaddr is invalid.

  [EFAULT]  Part of buf points outside the process's allocated
address space.

  [EFAULT]  The buf parameter points to an invalid address.

  [EFAULT]  The argument foo specifies an invalid address.

  [EFAULT]  The foo parameter specified a bad address.

  [EFAULT]  The foo parameter points to memory not in
a valid part of the process address space.

  [EFAULT]  The address specified for foo is invalid.

  [EFAULT]  An argument address referenced invalid memory.

  In errno(2):
  14 EFAULT Bad address. The system detected an invalid address in
attempting to use an argument of a call.

  [EFAULT]  There was an error reading or writing the kevent
structure.

Quite a mess, i would say.

I think trying to explain over and over again what an invalid address
is, in each and every manual page, is neither reasonable nor feasible -
i'm not convinced the rules are really simple given how many types
of valid memory there are: static, stack, heap, ...

So i'd probably settle for a concise, simple wording, and i think
i like this one best:

  [EFAULT]  foo points to an invalid address.

What is valid can easily depend on the function 

Re: ualarm.3: cleanup

2021-07-01 Thread Ingo Schwarze
Hi,

fair enough.  When there are good reasons to not merge manual pages - 
in this case, causing confusion about types and encouraging type
conversion, overflow, and truncation bugs, and alarm(3) and ualarm(3)
not being part of the same subsystem in the first place (if i
understand deraadt@ correctly) don't merge.

Theo de Raadt wrote on Wed, Jun 30, 2021 at 05:24:28PM -0600:

> In short words: "don't intermix".

In that case, i'd just say, in each of the three manual pages:

  Calling more than one of the interfaces setitimer(2), alarm(3),
  and ualarm(3) in the same program results in unspecified behaviour.

Probably best below CAVEATS.

That is precise and concise, and it sounds sufficiently scary.

One could say, in addition, that doing so is likely to cause various
kinds of platform- and implementation-dependent bugs, but i see no
real need.  That's basically just what "unspecified behaviour" means.

Yours,
  Ingo



Re: ualarm.3: cleanup

2021-06-30 Thread Ingo Schwarze
Hi Scott,

Scott Cheloha wrote on Wed, Jun 30, 2021 at 03:15:31PM -0500:

> As it would turn out, there are not one, but *two* simplified
> interfaces to setitimer(2).  We just cleaned up the alarm(3) manpage,
> but did you know there is also a ualarm(3) interface?

I certainly didn't...   :-)

> Luckily it's basically the same interface as alarm(3).  So, I have
> taken the rewritten manpage from alarm.3 and transposed it into
> ualarm.3.  The only details that change are the units (microseconds
> instead of seconds) and the additional input, "interval", which I have
> noted in the DESCRIPTION.

Uh, oh, to my naive ears (naive with respect to matters of time-related
functions), this sounds like a typical case where we likely want one
manual page documenting both alarm(3) and ualarm(3), not two pages.

Basically, the rule of thumb is: if two functions have a very similar
purpose, share most of their properties (including important parts
of the syntax and/or semantics of their arguments, return values,
behaviour, error handling or implementation, such that describing
them separately results in considerable duplication of text, and if
describing them together is beneficial for the user in so far as it
makes it easier to highlight the (few) differences, then describe
them together (unless that grows the number of functions sharing
the same manual page too much, but 2 is certainly not too much).

> All the changes over there apply here re describing what the interface
> actually does.  It basically does the same thing so I see no reason
> not to reuse what I wrote for that page.

Do you see any reason to not put all the information into the alarm(3)
manual page?

Should i give merging the two a try, or do you want to?

> Given that alarm(3), ualarm(3), and setitimer(2) all poke at the same
> underlying per-process timer I think it makes sense to mention all
> three interfaces in the CAVEATS section of both alarm.3 and ualarm.3.
> (A getitimer.2 cleanup is coming, don't fret.)

Can the current wording below CAVEATS, "confusing behavior", be made
more precise, explaining what actually happens rather than merely
calling it "confusing"?  For example, saying that they all override
each other or something like that?  Or saying that behavior is
undefined when more than one of them is used (isn't that what POSIX
says)?

> The only thing I'm unsure about is how to talk about the input limits.
> useconds_t is uint32_t on all platforms.  Is there a more graceful way
> to tell people the limit than the literal number as we do now?  Maybe
> UINT32_MAX?  POSIX defines the type but does not define a USECONDS_MAX
> constant.

I'm confused.  It seems i fail to find useconds_t in POSIX.
I only see suseconds_t there, in , which must be signed,
include [-1, 100], and not be wider than long.  Also, our manual
says ualarm(3) is -xpg4.2.  That's basically -susv1, isn't it?
So i would expect ualarm(3) to appear in modern POSIX, too.
Was it deleted at some point?

I think it would be ideal for the manual page text to state
which values can be used in a *portable* way.  If, for some reason,
that is not desirable, then the main text should say which values
can be used on OpenBSD, and STANDARDS should say that using values
above XYZ is an extension.

Yours,
  Ingo



Re: Replace .Ar macros with .Fa in pledge.2

2021-06-30 Thread Ingo Schwarze
Hi Emil,

Emil Engler wrote on Wed, Jun 30, 2021 at 07:09:27PM +0200:

> The pledge.2 man-page makes use of the incorrect .Ar macro which is
> not intended for manuals in section 2 as .Fa exists for that purpose.
> Similar to 1.18 in /cvs/src/lib/libm/man/sqrt.3

Note that the pledge(2) manual page has special needs in a few respects
and handling it schematically would likely be counter-productive, but
this particular patch is correct and straightforward, focussed, and
unintrusive enough that i just committed it.

Thanks,
  Ingo


> Index: pledge.2
> ===
> RCS file: /cvs/src/lib/libc/sys/pledge.2,v
> retrieving revision 1.60
> diff -u -p -u -p -r1.60 pledge.2
> --- pledge.217 Jul 2020 16:40:26 -  1.60
> +++ pledge.230 Jun 2021 17:02:04 -
> @@ -33,9 +33,9 @@ management, read-write operations on fil
>  and networking.
>  In general, these modes were selected by studying the operation
>  of many programs using libc and other such interfaces, and setting
> -.Ar promises
> +.Fa promises
>  or
> -.Ar execpromises .
> +.Fa execpromises .
>  .Pp
>  Use of
>  .Fn pledge
> @@ -60,7 +60,7 @@ with the
>  flag.
>  .Pp
>  A
> -.Ar promises
> +.Fa promises
>  value of
>  .Qq \&
>  restricts the process to the
> @@ -72,9 +72,9 @@ with another process.
>  Passing
>  .Dv NULL
>  to
> -.Ar promises
> +.Fa promises
>  or
> -.Ar execpromises
> +.Fa execpromises
>  specifies to not change the current value.
>  .Pp
>  Some system calls, when allowed, have restrictions applied to them:
> @@ -136,9 +136,9 @@ and any files below
>  .Pa /usr/share/zoneinfo .
>  .It Fn pledge :
>  Can only reduce permissions for
> -.Ar promises
> +.Fa promises
>  and
> -.Ar execpromises .
> +.Fa execpromises .
>  .It Xr sysctl 2 :
>  A small set of read-only operations are allowed, sufficient to
>  support:
> @@ -150,7 +150,7 @@ and system sensor readings.
>  .El
>  .Pp
>  The
> -.Ar promises
> +.Fa promises
>  argument is specified as a string, with space separated keywords:
>  .Bl -tag -width "prot_exec" -offset indent
>  .It Va stdio
> @@ -464,7 +464,7 @@ Coupled with the
>  .Va proc
>  promise, this allows a process to fork and execute another program.
>  If
> -.Ar execpromises
> +.Fa execpromises
>  has been previously set the new program begins with those promises,
>  unless setuid/setgid bits are set in which case execution is blocked with
>  .Er EACCES .
> @@ -596,12 +596,12 @@ Rather than killing the process upon vio
>  Also when
>  .Fn pledge
>  is called with higher
> -.Ar promises
> +.Fa promises
>  or
> -.Ar execpromises ,
> +.Fa execpromises ,
>  those changes will be ignored and return success.
>  This is useful when a parent enforces
> -.Ar execpromises
> +.Fa execpromises
>  but an execve'd child has a different idea.
>  .El
>  .Sh RETURN VALUES
> @@ -611,12 +611,12 @@ but an execve'd child has a different id
>  will fail if:
>  .Bl -tag -width Er
>  .It Bq Er EFAULT
> -.Ar promises
> +.Fa promises
>  or
> -.Ar execpromises
> +.Fa execpromises
>  points outside the process's allocated address space.
>  .It Bq Er EINVAL
> -.Ar promises
> +.Fa promises
>  is malformed or contains invalid keywords.
>  .It Bq Er EPERM
>  This process is attempting to increase permissions.



Re: More use of mdoc macros in sqrt.3

2021-06-29 Thread Ingo Schwarze
Hi Emil,

Emil Engler wrote on Tue, Jun 29, 2021 at 03:41:31PM +0200:

> This diff inserts an .Fa to the places where it belongs to as well
> as an .Er for EDOM.

This patch is completely correct and i just committed it.

Thank you,
  Ingo


> Index: lib/libm/man/sqrt.3
> ===
> RCS file: /cvs/src/lib/libm/man/sqrt.3,v
> retrieving revision 1.17
> diff -u -p -u -p -r1.17 sqrt.3
> --- lib/libm/man/sqrt.3 8 Feb 2020 01:09:57 -   1.17
> +++ lib/libm/man/sqrt.3 29 Jun 2021 13:36:28 -
> @@ -57,7 +57,7 @@
>  The
>  .Fn cbrt
>  function computes the cube root of
> -.Ar x .
> +.Fa x .
>  The
>  .Fn cbrtf
>  function is a single precision version of
> @@ -70,7 +70,8 @@ function is an extended precision versio
>  The
>  .Fn sqrt
>  function computes
> -the non-negative square root of x.
> +the non-negative square root of
> +.Fa x .
>  The
>  .Fn sqrtf
>  function is a single precision version of
> @@ -80,14 +81,17 @@ The
>  function is an extended precision version of
>  .Fn sqrt .
>  .Sh RETURN VALUES
> -If x is negative,
> +If
> +.Fa x
> +is negative,
>  .Fn sqrt "x" ,
>  .Fn sqrtf "x"
>  and
>  .Fn sqrtl "x"
>  set the global variable
>  .Va errno
> -to EDOM.
> +to
> +.Er EDOM .
>  .Sh HISTORY
>  A
>  .Fn sqrt



Re: mandoc style warning about text lines > 80 bytes

2021-06-27 Thread Ingo Schwarze
Hi Jason,

Jason McIntyre wrote on Sat, Jun 26, 2021 at 06:55:44PM +0100:

> i count mkhybrid as 3rd party. the man page is still old style, and has
> only 2 commits in 20 years. if it is ours, it needs considerable work.

Fair enough, you are probably right about that.

> i have no strong opinions. it actually doesn;t really touch anything i
> do - i never check pages for long lines, though i do shorten them if i'm
> doing other stuff on a page. i guess the question is more if others
> would find it handy to have it flagged, or disruptive.

It appears some consider it potentially useful.

> on balance, i'd be fine with such an addition.

Thanks to all three of you for checking, it is now committed.

Yours,
  Ingo



Re: Patch: ksh: fix input handling for 4 byte UTF-8 sequences

2021-06-27 Thread Ingo Schwarze
Hi Soeren,

Soeren Tempel wrote on Mon, Jun 07, 2021 at 07:02:25PM +0200:

> Nice, wasn't aware that you also had a patch ready.

Yeah, that was due to the fact that we, as developers, often use
the lists but sometimes also send comments and patches privately,
to reduce the mail volume for everybody.  Deciding where to send
comments and patches works out well enough overall, but it does
occasionally cause slight communication gaps, like in this case.
I don't think any general rule can be designed to make it better
(at least not more specific than "think about who needs to see this
and act accordingly"), it's a matter of case-by-case judgement.

> Sounds good to me and also fixes the problem I originally experienced
> with 4 byte UTF-8 sequences.

Great, thanks for reporting and testing, it's now finally fixed.

> BTW: Is there any reason why ksh doesn't use editline for all its line
> editing needs? That would allow handling all these nitty-gritty details
> in a central place.

It would, and in principle, that would be an improvement.
But i think editline(3) code quality is insufficent for use in a
shell.  It's all quite messy and hastily and sloppily written.
I tried to polish some of it in the past, but got distracted,
so editline(3) code is still full of stuff that needs review
and quality improvement.

Actually, i'm a bit scared that sftp(1) uses it.  Then again, i'm not
aware that it caused any major vulnerabilities in the past, and the
OpenSSH developers are not at all reckless people, so i am sure they
know what they are doing.

Yours,
  Ingo



Re: Patch: ksh: fix input handling for 4 byte UTF-8 sequences

2021-06-27 Thread Ingo Schwarze
Hi Jeremie,

Jeremie Courreges-Anglas wrote on Thu, Jun 03, 2021 at 11:17:08PM +0200:
> On Wed, Jun 02 2021, Ingo Schwarze  wrote:

>> I'm also adding a few comments as suggested by jca@.  Parsing of UTF-8
>> is less trivial than one might think, witnessed once again by the fact
>> that i got this code wrong in the first place.

> Thanks for this, though I was mostly interested into a pointer to the
> relevant standard document that was used.

Standards tend to be huge, the manual page is very compact, contains
all the relevant information and already points to the standard for
more details.  So i considered refering to the manual more helpful.

If you see a way to improve the STANDARDS section in the manual page,
you are very welcome.

> Now that the checks match the
> Unicode standard, I'm less confused.  You're now pointing readers to the
> utf8(7) manpage.  Discussion for another diff: should this manpage list
> valid sequences instead of invalid ones?

The idea is that it does both: first it show valid sequences (in the
first table).  Then it explains the constraints, and which bytes are
consequently valid, in the following text.

Finally, it turns the matter around and summarizes by also listing
invalid start bytes and invalid initial pairs.

Showing invalid rather than valid bytes and pairs in the final two
tables seems a good choice to me because there are much fewer invalid
than valid start bytes and initial pairs.

Does that make sense to you, or can you propose a specific diff to
improve the page?

> ok jca@

Thanks for checking the ksh/emacs.c diff and sorry for the ridiculous
delay, it is now finally committed.

Yours,
  Ingo



mandoc style warning about text lines > 80 bytes

2021-06-26 Thread Ingo Schwarze
Hi Jason and Theo,

Jason McIntyre wrote on Tue, Jun 22, 2021 at 06:37:27AM +0100:
> On Tue, Jun 22, 2021 at 04:48:39AM +0200, Theo Buehler wrote:

>> You have two overlong lines as indicated below. I would have thought
>> that mandoc -Tlint complains about that, but apparently it doesn't have
>> such a warning... With those wrapped,
 
> yes, there is no feedback on long lines. although we try to keep the
> source less than 80 width, there are some places where it is not
> possible.
> 
> i'm not sure whether adding a warning would be helpful or disruptive.

Here is a patch implementing such a style warning, leaning very
heavily into the direction of never producing false positives, that
is, not warning about long lines

 - in no-fill mode (e.g., .Bd -literal, .EX, .nf and the like)
 - that start with a dot (normal macro and request lines)
   or with a non-standard control character
 - that start with a space character or with an escape sequence
 - in tbl(7) context
 - in eqn(7) context
 - that do not contain a blank character before column 80

So, this certainly does not find all long lines that can be improved,
but everything it finds ought to be trivial and worthwhile to fix.

There are less than twenty-five offenders below /usr/src/share/man/,
and none of those are false positives.

However, i see over twenty-three thousand offending lines
below /usr/share/man/, the vast majority in Perl manual pages
and considerable amounts in other third-party stuff like GCC
and binutils.
The worst offenders we maintain ourselves are tmux(1) with 15
offending lines, terminfo(5) with 11, mkhybrid(8) with 6, tic(1),
magic(5), sysctl(2), cdio(1), and the rest with three or less
offending lines each, maybe a few dozen offending files all told.

I don't think such a patch would be disruptive.  I expect most of the
third-party manuals it flags already have many other warnings.
Our own stuff does not have large amounts of issues due to Jason's
diligent manual work.

Does it help?  Maybe it could slightly reduce one aspect of Jason's
workload and help developers who want to find their own glitches in
this respect, in particular those usually working on terminals wider
than 80 columns.

There is no need to chase all of these down, but the style warning
might help when editing a page for other reasons.

What do you think?
  Ingo


P.S.
The following script makes it easy to count violations:

mandoc -T lint */*.[1-9]* */*/*.[1-9]* | \
  grep 'longer than' | \
  cut -d : -f 2 | \
  uniq -c | \
  sort -nr


Index: libmandoc.h
===
RCS file: /cvs/src/usr.bin/mandoc/libmandoc.h,v
retrieving revision 1.64
diff -u -p -r1.64 libmandoc.h
--- libmandoc.h 3 Apr 2020 11:34:19 -   1.64
+++ libmandoc.h 26 Jun 2021 16:19:03 -
@@ -73,7 +73,7 @@ void   roff_reset(struct roff *);
 voidroff_man_free(struct roff_man *);
 struct roff_man*roff_man_alloc(struct roff *, const char *, int);
 voidroff_man_reset(struct roff_man *);
-int roff_parseln(struct roff *, int, struct buf *, int *);
+int roff_parseln(struct roff *, int, struct buf *, int *, size_t);
 voidroff_userret(struct roff *);
 voidroff_endparse(struct roff *);
 voidroff_setreg(struct roff *, const char *, int, char);
Index: mandoc.1
===
RCS file: /cvs/src/usr.bin/mandoc/mandoc.1,v
retrieving revision 1.175
diff -u -p -r1.175 mandoc.1
--- mandoc.12 Jun 2021 18:27:36 -   1.175
+++ mandoc.126 Jun 2021 16:19:04 -
@@ -1066,6 +1066,9 @@ An
 request occurs even though the document already switched to no-fill mode
 and did not switch back to fill mode yet.
 It has no effect.
+.It Sy "input text line longer than 80 bytes"
+Consider breaking the input text line
+at one of the blank characters before column 80.
 .It Sy "verbatim \(dq--\(dq, maybe consider using \e(em"
 .Pq mdoc
 Even though the ASCII output device renders an em-dash as
Index: mandoc.h
===
RCS file: /cvs/src/usr.bin/mandoc/mandoc.h,v
retrieving revision 1.212
diff -u -p -r1.212 mandoc.h
--- mandoc.h2 Jun 2021 18:27:37 -   1.212
+++ mandoc.h26 Jun 2021 16:19:04 -
@@ -72,6 +72,7 @@ enum  mandocerr {
MANDOCERR_DELIM_NB, /* no blank before trailing delimiter: macro ... */
MANDOCERR_FI_SKIP, /* fill mode already enabled, skipping: fi */
MANDOCERR_NF_SKIP, /* fill mode already disabled, skipping: nf */
+   MANDOCERR_TEXT_LONG, /* input text line longer than 80 bytes */
MANDOCERR_DASHDASH, /* verbatim "--", maybe consider using \(em */
MANDOCERR_FUNC, /* function name without markup: name() */
MANDOCERR_SPACE_EOL, /* whitespace at end of input line */
Index: mandoc_msg.c
===
RCS file: 

Patch: ksh: fix input handling for 4 byte UTF-8 sequences

2021-06-02 Thread Ingo Schwarze
Hi,

feeling hesitant to commit into ksh without at least one proper OK,
i'm resending this patch here, sorry if i missed private feedback.

What the existing code does:
It tries to make sure that multi-byte UTF-8 characters get passed on by
the shell without fragmentation, not one byte at time.  I heard people
say that some software, for example tmux(1), may sometimes get confused
when receiving a UTF-8 character in a piecemeal manner.

Which problem needs fixing:
Of the four-byte UTF-8 sequences, only a subset is identified by the
existing code.  The other four-byte UTF-8 sequences still get chopped
up resulting in individual bytes being passed on.


I'm also adding a few comments as suggested by jca@.  Parsing of UTF-8
is less trivial than one might think, witnessed once again by the fact
that i got this code wrong in the first place.

I also changed "cc & 0x20" to "cc > 0x9f" and "cc & 0x30" to "cc > 0x8f"
for uniformity and readabilty - UTF-8-parsing is bad enough without
needless micro-optimization, right?


Note that even with the patch below, moving backward and forward
over a blowfish icon on the command line still does not work because
the character is width 2 and the ksh code intentionally does not
use wcwidth(3).  But maybe it improves something in tmux?  Not sure.

Either way, unless it causes regressions, this (or a further improved
version) should go in because what is there is clearly wrong.

OK?
  Ingo


Index: emacs.c
===
RCS file: /cvs/src/bin/ksh/emacs.c,v
retrieving revision 1.87
diff -u -p -r1.87 emacs.c
--- emacs.c 8 May 2020 14:30:42 -   1.87
+++ emacs.c 13 May 2021 18:16:13 -
@@ -1851,11 +1851,17 @@ x_e_getu8(char *buf, int off)
return -1;
buf[off++] = c;
 
-   if (c == 0xf4)
+   /*
+* In the following, comments refer to violations of
+* the inequality tests at the ends of the lines.
+* See the utf8(7) manual page for details.
+*/
+
+   if ((c & 0xf8) == 0xf0 && c < 0xf5)  /* beyond Unicode */
len = 4;
else if ((c & 0xf0) == 0xe0)
len = 3;
-   else if ((c & 0xe0) == 0xc0 && c > 0xc1)
+   else if ((c & 0xe0) == 0xc0 && c > 0xc1)  /* use single byte */
len = 2;
else
len = 1;
@@ -1865,9 +1871,10 @@ x_e_getu8(char *buf, int off)
if (cc == -1)
break;
if (isu8cont(cc) == 0 ||
-   (c == 0xe0 && len == 3 && cc < 0xa0) ||
-   (c == 0xed && len == 3 && cc & 0x20) ||
-   (c == 0xf4 && len == 4 && cc & 0x30)) {
+   (c == 0xe0 && len == 3 && cc < 0xa0) ||  /* use 2 bytes */
+   (c == 0xed && len == 3 && cc > 0x9f) ||  /* surrogates  */
+   (c == 0xf0 && len == 4 && cc < 0x90) ||  /* use 3 bytes */
+   (c == 0xf4 && len == 4 && cc > 0x8f)) {  /* beyond Uni. */
x_e_ungetc(cc);
break;
}



Re: mandoc: -Tlint: search /usr/local/man as well

2021-05-15 Thread Ingo Schwarze
Hi Klemens,

Klemens Nanni wrote on Mon, Apr 05, 2021 at 09:33:13PM +0200:
> On Mon, Apr 05, 2021 at 06:47:58PM +0200, Ingo Schwarze wrote:
>> Klemens Nanni wrote on Sun, Apr 04, 2021 at 03:54:43PM +0200:
>>> On Sun, Apr 04, 2021 at 03:42:03PM +0200, Tim van der Molen wrote:

>>>> Doesn't mandoc -Tlint -Wstyle do what you want?

>>> I don't think so.  Oddly enough, `-Wstyle' does not do what I would
>>> expect:  it omits STYLE messages instead of printing them.

>> It does not.  It only omits BASE messages, which is the lowest message
>> level and the only one below STYLE.  See  `man -O tag=base mandoc`
>> what the purpose of BASE is.
[...]

>>> But it would also suppress legitimate messages, e.g. `.Xr' macros with
>>> misspelled manuals in the base MANPATH:

>> Yes, so far, checking of .Xr targets is only done on the BASE level,
>> not on the STYLE level because i regarded the requirement that .Xr targets
>> exist as a requirement of the OpenBSD base system.  For portable software
>> outside the OpenBSD base system, in the past, i didn't think checking .Xr
>> targets would be helpful.  It appears you now found a use case for that.

> A use case, but not really a requirement, at least not for people
> developing portable software on OpenBSD -- that's where I'm coming from:
> messages telling me installed manuals cannot be found are annoying.
> 
> But I guess that turns into a valid use case as soon as mandoc is used
> on and for another system with different manual paths.

>> Your patch is not OK as it stands because mandoc would no longer help
>> developers working on the base system to find .Xr links pointing
>> outside the base system, and we have a policy that we don't want those.

> Agreed.

>> I guess the right thing to do is to leave "-W all" unchanged (including
>> looking in the base system only) but change "-W style" to also do the .Xr
>> target check, but along the user's manpath.

> That makes sense and the diff below enables check_xr() to pick the right
> manual paths.
> 
> HOWEVER it currently won't print anything because when `-Wstyle' is used
> mandoc_msg(MANDOCERR_XR_BAD, ...) in check_xr() won't print anything as
> MANDOC_XR_BAD is lower than MANDOCERR_STYLE.
> 
> I played swapping those two levels in the `enum mandocerr' but that
> won't quite work (e.g. mandoc_msg() printing the wrong message)...
> 
> Ingo, perhaps you have an idea on how to fix that?

Does the following patch work for you?

Here is what it does:

 * parse the config file not only in man(1) and apropos(1) mode,
   but also in mandoc(1) -T style mode because the full manpath
   is now needed for .Xr checking even when the manual page file
   to be processed does not need to be searched for but is provided
   on the command line 

 * pass through configuration data to the various checking functions
   as far as they now need it;

 * move MANDOCERR_XR_BAD from BASE to STYLE both in the enum (mandoc.h)
   and in the message catalog (mandoc_msg.c)

 * properly document the change

Yours,
  Ingo


Index: main.c
===
RCS file: /cvs/src/usr.bin/mandoc/main.c,v
retrieving revision 1.256
diff -u -p -r1.256 main.c
--- main.c  19 Feb 2021 19:49:49 -  1.256
+++ main.c  15 May 2021 20:04:51 -
@@ -1,6 +1,6 @@
 /* $OpenBSD: main.c,v 1.256 2021/02/19 19:49:49 kn Exp $ */
 /*
- * Copyright (c) 2010-2012, 2014-2020 Ingo Schwarze 
+ * Copyright (c) 2010-2012, 2014-2021 Ingo Schwarze 
  * Copyright (c) 2008-2012 Kristaps Dzonsons 
  * Copyright (c) 2010 Joerg Sonnenberger 
  *
@@ -92,7 +92,7 @@ structoutstate {
 
 int  mandocdb(int, char *[]);
 
-static void  check_xr(void);
+static void  check_xr(struct manpaths *);
 static int   fs_lookup(const struct manpaths *,
size_t ipath, const char *,
const char *, const char *,
@@ -103,7 +103,7 @@ static  int   fs_search(const struct man
 static void  glob_esc(char **, const char *, const char *);
 static void  outdata_alloc(struct outstate *, struct manoutput *);
 static void  parse(struct mparse *, int, const char *,
-   struct outstate *, struct manoutput *);
+   struct outstate *, struct manconf *);
 static void  passthrough(int, int);
 static void  process_onefile(struct mparse *, struct manpage *,
int, struct outstate *, struct manconf *);
@@ -430,7 +430,8 @@ main(int argc, char *argv[])
 
/* Read the configuration file. */
 
-   if (search.argmode != ARG_FILE)
+   if (search.argmod

Re: wcwidth of soft hyphen

2021-04-05 Thread Ingo Schwarze
Hi,

Martijn van Duren wrote on Thu, Apr 01, 2021 at 09:30:36AM +0200:

> When it comes to these discussions I prefer to go back to the standards

I would propose an even more rigorous stance: not only go back to
the standards, but use whatever the Unicode data files (indirectly,
via the Perl modules) parsed by gen_ctype_utf8.pl specify.  Manually
changing properties of individual characters should be restricted
to very rare cases of crystal clear, absolutely unambiguous errors.
When there is the slightest doubt or when there are arguments both
ways, follow the Unicode data files and how Perl interprets them.

We have

  iswcntrl = 1  because UnicodeData.txt has class Cf (format control char)
  iswprint = 1  because the class is neither Cc nor Cs
  wcwidth  = 0  because the class starts with C (control char)

This is also neither obviously nor unambiguously wrong, so it should
not be changed.

The choice of iswcntrl = 1 is most definitely correct because
that's what class Cf says, there can be no doubt about that at all.
Consequently, NetBSD, glibc, and musl are definitely buggy in so far
as they return iswcntrl = 0.

Whether class Cf is always printable is maybe not absolutely clear.
There are arguments both ways.  The stronger argument seems to be
that these format control chars usually appear in the middle of
printable characters and they are printed together with the
surrounding characters.  But maybe the FreeBSD argument that
some of them are sometimes not ptinted and hence iswprint = 0
can also be made, though somewhat dubiously because sometimes
they are printed.  Besides, which property would you use for
deciding printability?  Please, don't resort to deciding that
character-by-character.

Whether all control chars are always width 0 can maybe also be
disputed.  Again, the stronger argument seems to me that they are.
If they weren't, they would not be control characters but alphanumeric,
punctuation, spaces, or special printable characters, none of which
they are.  I say width 1 and 2 require standalone glyphs that are
normally used for the character.  Besides, no operating system
correctly identifies this as a control character and yet gives it
width 1.

I insist that the discussion should remain very strictly formal,
about the properties and classification in the Unicode data files
and nothing else.  If people start arguing about what makes sense
for any particular character, that's already an argument going
astray.


> So going by this phrase the character should not be printed

When formatting a document, for example for printing on paper or
the online equivalent like PostScript or PDF, i agree.  But i
strongly prefer the terminal to always display this character because
the terminal's usual purpose is not nice text formatting for visual
consumption.  It should usually show the full content of strings
or files, be it for inspection or for editing.  Omitting characters
in such contexts sets nasty traps for the person working with the
terminal.

So i say nothing should be changed at all in OpenBSD.

Yes, that means column counting is wrong on the terminal, but that's
a very minor problem, if it's a problem at all, compared to the havoc
that could result from not showing the character on the terminal at
all, and it cannot be fixed without causing worse problems in situations
that matter more.

The bug in NetBSD and Linux should be fixed, but that's off-topic here.

Yours,
  Ingo



Re: mandoc: -Tlint: search /usr/local/man as well

2021-04-05 Thread Ingo Schwarze
Hi,

Klemens Nanni wrote on Sun, Apr 04, 2021 at 03:54:43PM +0200:
> On Sun, Apr 04, 2021 at 03:42:03PM +0200, Tim van der Molen wrote:

>> Doesn't mandoc -Tlint -Wstyle do what you want?

> I don't think so.  Oddly enough, `-Wstyle' does not do what I would
> expect:  it omits STYLE messages instead of printing them.

It does not.  It only omits BASE messages, which is the lowest message
level and the only one below STYLE.  See  `man -O tag=base mandoc`
what the purpose of BASE is.

When i inplemented the BASE level, i first printed these messages
with a BASE tag.  But other developers complained that "BASE" is
unclear and confusing and requested that STYLE be printed instead.
For that reason, both levels print STYLE.  I don't like that, it
is also confusing, just in a different way.  But unless you come
up with a name better than "BASE" that other developers will not
complain about, i don't know how to improve it.

You can still visually distinguish STYLE and BASE messages since the
latter have an (OpenBSD) or (NetBSD) tag at the end:

   $ man -clT lint /co/NetBSD/src/usr.bin/true/true.1   
  man: /co/NetBSD/src/usr.bin/true/true.1:34:4: STYLE: duplicate RCS id:
$NetBSD: true.1,v 1.7 2003/08/07 11:16:48 agc Exp $
  man: /co/NetBSD/src/usr.bin/true/true.1:36:5: STYLE: Mdocdate missing:
Dd June 27, 1991 (OpenBSD)
  man: /co/NetBSD/src/usr.bin/true/true.1: STYLE: RCS id missing: (OpenBSD)

   $ man -cT lint -W netbsd true
  man: /usr/share/man/man1/true.1:35:5: STYLE: Mdocdate found:
Dd $Mdocdate: September 29 2010 $ (NetBSD)
  man: /usr/share/man/man1/true.1: STYLE: RCS id missing: (NetBSD)


>> -Wstyle also suppresses other warnings that aren't interesting for
>> manuals not part of OpenBSD:

Indeed, that's exactly what it was designed for.

> But it would also suppress legitimate messages, e.g. `.Xr' macros with
> misspelled manuals in the base MANPATH:

Yes, so far, checking of .Xr targets is only done on the BASE level,
not on the STYLE level because i regarded the requirement that .Xr targets
exist as a requirement of the OpenBSD base system.  For portable software
outside the OpenBSD base system, in the past, i didn't think checking .Xr
targets would be helpful.  It appears you now found a use case for that.

Your patch is not OK as it stands because mandoc would no longer help
developers working on the base system to find .Xr links pointing
outside the base system, and we have a policy that we don't want those.

I guess the right thing to do is to leave "-W all" unchanged (including
looking in the base system only) but change "-W style" to also do the .Xr
target check, but along the user's manpath.

Yours,
  Ingo



Re: Patch for crypt(3) man page.

2021-01-27 Thread Ingo Schwarze
Hi,

this page is a mess.  It is full of unclear wordings, in some cases
verging incorrect statements.  At the same time, parts of it are wordy.

Here is an attempt to start fixing it.
I refrained from trying to explain $2a$ (as suggested by sthen@) or to
document the missing bcrypt_gensalt(3) in the same patch because this
patch is already quite large.

Rationale:
 * Rename the "salt" argument of bcrypt(3) to "setting" to match crypt(3).
   It is *not* the salt, it is exactly the same for both functions.
 * Avoid talking about "first" and "second" arguments, just use the
   argument names as in every other manual page.
 * Properly document ".Fa setting" using a syntax display.
 * Precisely explain the format of the encoded salt, and mention
   what happens with additional bytes.
 * Mention what happens to excessive bytes in the key.
 * Precisely describe the return value for Blowfish.
 * Below RETURN VALUES, avoid the misleading statement that it
   returns just "the encrypted value"; it actually returns much
   more than that.  What exactly depends on the algorithm, and
   those details were already covered in the algorithm-specific
   section(s) above.  So instead mention what this heap of bytes
   is typically used for.

I hope i got this right; OK?
  Ingo


Index: crypt.3
===
RCS file: /cvs/src/lib/libc/crypt/crypt.3,v
retrieving revision 1.45
diff -u -r1.45 crypt.3
--- crypt.3 6 Apr 2015 20:49:41 -   1.45
+++ crypt.3 27 Jan 2021 21:30:22 -
@@ -49,7 +49,7 @@
 .Ft char *
 .Fn bcrypt_gensalt "u_int8_t log_rounds"
 .Ft char *
-.Fn bcrypt "const char *key" "const char *salt"
+.Fn bcrypt "const char *key" "const char *setting"
 .Sh DESCRIPTION
 These functions are deprecated in favor of
 .Xr crypt_checkpass 3
@@ -62,58 +62,80 @@
 Additional code has been added to deter key search attempts and to use
 stronger hashing algorithms.
 .Pp
-The first argument to
-.Fn crypt
-is a NUL-terminated
-string
-.Fa key ,
+The
+.Fa key
+is a NUL-terminated string,
 typically a user's typed password.
-The second,
-.Fa setting ,
-currently supports a single form.
-If it begins
-with a string character
-.Pq Ql $
-and a number then a different algorithm is used depending on the number.
-At the moment
-.Ql $2
-chooses Blowfish hashing; see below for more information.
+The
+.Fa setting
+argument is required to begin with an algorithm identifier enclosed
+in dollar signs.
+The meaning of the rest of the
+.Fa setting
+argument is algorithm-specific.
+.Pp
+Currently, only
+.Sx Blowfish crypt
+is supported, so
+.Fa setting
+needs to begin with
+.Qq $2b$ .
 .Ss Blowfish crypt
-The Blowfish version of crypt has 128 bits of
-.Fa salt
+The Blowfish version of crypt has 128 bits of salt
 in order to make building dictionaries of common passwords space consuming.
+.Pp
+The format of
+.Fa setting
+is:
+.Pp
+.Sm off
+.D1 $ Cm 2b No $ Ar log_rounds No $ Ar encoded_salt
+.Sm on
+.Pp
+The binary logarithm of the number of rounds
+is specified as a decimal number containing exactly two digits
+in the range 04 to 31, inclusive.
+For example,
+.Qq $2b$08$
+requests 256 rounds.
+.Pp
+The
+.Ar encoded_salt
+is a byte array of at least 22 bytes, with the first 172 bits
+encoding the 128-bit salt in base64 form.
+Any additional bytes are ignored.
+.Pp
 The initial state of the
 Blowfish cipher is expanded using the
-.Fa salt
-and the
-.Fa password
-repeating the process a variable number of rounds, which is encoded in
-the password string.
-The maximum password length is 72.
+salt and the
+.Fa key
+repeating the process the specifieds number of rounds.
+.Pp
+The maximum
+.Fa key
+length is 72; excessive bytes are silently discarded.
 The final Blowfish password entry is created by encrypting the string
-.Pp
-.Dq OrpheanBeholderScryDoubt
-.Pp
+.Qq OrpheanBeholderScryDoubt
 with the Blowfish state 64 times.
 .Pp
-The version number, the logarithm of the number of rounds and
-the concatenation of salt and hashed password are separated by the
-.Ql $
-character.
-An encoded
-.Sq 8
-would specify 256 rounds.
-A valid Blowfish password looks like this:
-.Pp
-.Dq $2b$12$FPWWO2RJ3CK4FINTw0Hi8OiPKJcX653gzSS.jqltHFMxyDmmQ0Hqq .
+The return value contains a copy of
+.Ar setting
+at the beginning, followed by 31 bytes containing the encrypted password
+encoded in base64 form, so the return value is effectively of the type
+.Vt char[60] .
+For example, a valid return value for Blowfish looks like this:
 .Pp
-The whole Blowfish password string is passed as
-.Fa setting
-for interpretation.
+.Dl $2b$12$FPWWO2RJ3CK4FINTw0Hi8OiPKJcX653gzSS.jqltHFMxyDmmQ0Hqq
 .Sh RETURN VALUES
-The function
+The functions
 .Fn crypt
-returns a pointer to the encrypted value on success, and
+and
+.Fn bcrypt
+return a string usable for the
+.Ar password
+field in the
+.Xr master.passwd 5
+file on success or
 .Dv NULL
 on failure.
 .Sh SEE ALSO



Re: man: help pagers recognise HTML files as such

2021-01-26 Thread Ingo Schwarze
Hi Klemens,

Klemens Nanni wrote on Sat, Jan 16, 2021 at 10:31:49AM +0100:

> On rare occasions I'd like to use the following idiom to read manuals in
> browsers, mostly to better readability and navigation of long sections:
> 
>   MANPAGER=netsurf-gtk3 man -Thtml jq
> 
> (jq(1) has lots of examples and HTML is nicer to read when it comes to
> literal input and output for example.)

Indeed, on rare occasions, such idioms are useful for various purposes.

> My problem is that browsers fail to render HTML as such in case the
> temporary file generated by mandoc(1) lack a file extension, i.e.
> `$browser /tmp/man.abc123' will show literal HTML code whereas
> `$browser /tmp/man.abc123.html' will actually render it.

Software deriving the file type from the file name extension is
stupid and un-UNIXy.  Such behaviour also tends to be dangerous
for obvious reasons.  Admittedly, such behaviour is widespread
in the Web context - even httpd(8) derives file types from file
name extensions...

Not all browsers fall into that trap, and some explicitly allow
specifying the file type, for example

   $ MANPAGER="lynx --force_html" man -T html ls

But i agree that lack of such support is maybe not a sufficent reason
to refrain from using a browser one wants to use for other reasons,
so the feature you propose seems useful to me, and i agree with
your general direction.

> HTML is the only output type I've encountered problem with so far.
> PDF/mupdf(1) for example works fine with arbitrary file names:
> 
>   MANPAGER=mupdf man -Tpdf jq
> 
> Diff below makes mandoc produce temporary files with the ".html" suffix
> in case `-Thtml' was specified.

Sounds sound, i see no downside.

> It feels a bit off to do this for HTML only,

That part doesn't seem that bad to me.  Abusing filename extensions
is particularly widespread on the web, so it feels unsurprising
that HTML may need more handholding than other modes.

> but it greatly improves
> accessibility for me and is simple enough.  If need be, the suffix
> handling could be easily extended to cover more output types in the
> future.
> 
> Feedback? Objections? OK?

It's maybe just a bikeshed, but could you put the logic selecting
the filename extension (either "" or ".html") at the place where
term_tag_init() is called?  That (main.c) is the module where the OUTT_
constants are defined, so it's the natural place to make decisions based
on them.  Then just pass a third const char * into term_tag_init().

Or do you see any downside that approach might have?

*If* people ever request the same for PDF, it makes adding that even
easier.

Also, maybe put the new argument in the middle position because it is
only related to the first argument and not to the second.

Finally and KNFly, please refrain from using function calls as
variable initializers.

Yours,
  Ingo



Re: syspatch exit state

2020-12-07 Thread Ingo Schwarze
Hi Antoine,

Antoine Jacoutot wrote on Mon, Dec 07, 2020 at 01:39:30PM +0100:
> On Mon, Dec 07, 2020 at 01:30:55PM +0100, Ingo Schwarze wrote:
>> Antoine Jacoutot wrote on Mon, Dec 07, 2020 at 01:01:27PM +0100:

>>> I just tested this change and it seems to work:
[...]
>> I think a slightly more explicit wording might make such a
>> misunderstanding less likely, for example:
>> 
>>   .Sh EXIT STATUS
>>   .Ex -std syspatch
>>   In particular, 2 indicates that applying patches was requested
>>   but no additional patch was installed.
[...]

> Sure that's fine as well.
> I did change it to your initial proposal;

Thanks, but...
I initially documented what you originally implemented.

You changed what your implementation does, so the documentation
needed to change as well to match the second iteration of the
implementation.

> but careful, since you're a doc master
> I will put whatever you send my way ;-)

Sometimes, i am wrong, so i appreciate it when people read my
suggestions with a critical eye.  Authors usually know their code
better than i do, and the documentation being correct is even
more important than being nicely worded and properly formatted.  ;-)


> Index: syspatch.8
> ===
> RCS file: /cvs/src/usr.sbin/syspatch/syspatch.8,v
> retrieving revision 1.21
> diff -u -p -r1.21 syspatch.8
> --- syspatch.825 Jul 2020 15:45:44 -  1.21
> +++ syspatch.87 Dec 2020 12:39:07 -
> @@ -64,6 +64,8 @@ of installed patches.
>  .El
>  .Sh EXIT STATUS
>  .Ex -std syspatch
> +In particular, 2 indicates that applying patches was requested but no
> +additional patch was installed.
>  .Sh SEE ALSO
>  .Xr signify 1 ,
>  .Xr installurl 5 ,
> Index: syspatch.sh
> ===
> RCS file: /cvs/src/usr.sbin/syspatch/syspatch.sh,v
> retrieving revision 1.166
> diff -u -p -r1.166 syspatch.sh
> --- syspatch.sh   27 Oct 2020 17:42:05 -  1.166
> +++ syspatch.sh   7 Dec 2020 12:39:07 -
> @@ -320,6 +320,7 @@ if ((OPTIND == 1)); then
>   [[ -f ${_D}/rollback.tgz ]] || rm -r ${_D}
>   done
>   _PATCHES=$(ls_missing) # can't use errexit in a for loop
> + [[ -n ${_PATCHES} ]] || exit 2
>   for _PATCH in ${_PATCHES}; do
>   apply_patch ${_OSrev}-${_PATCH}
>   _PATCH_APPLIED=true



Re: syspatch exit state

2020-12-07 Thread Ingo Schwarze
Hello Antoine,

Antoine Jacoutot wrote on Mon, Dec 07, 2020 at 01:01:27PM +0100:

> I just tested this change and it seems to work:

I did not repeat my testing, but here is some quick feedback purely
from code inspection:

The proposed code change makes sense to me.

The proposed manual page text might allow the misconception that
the tool will always exit 2 if no additional patch was installed,
including when -c, -l, or -r was specified.

I think a slightly more explicit wording might make such a
misunderstanding less likely, for example:

  .Sh EXIT STATUS
  .Ex -std syspatch
  In particular, 2 indicates that applying patches was requested
  but no additional patch was installed.

With non-standard meanings of exit codes, i think it is worthwhile
to be as precise as possible.

Yours,
  Ingo


> Index: syspatch.8
> ===
> RCS file: /cvs/src/usr.sbin/syspatch/syspatch.8,v
> retrieving revision 1.21
> diff -u -p -r1.21 syspatch.8
> --- syspatch.825 Jul 2020 15:45:44 -  1.21
> +++ syspatch.87 Dec 2020 11:58:06 -
> @@ -64,6 +64,7 @@ of installed patches.
>  .El
>  .Sh EXIT STATUS
>  .Ex -std syspatch
> +2 indicates that no additional patch was installed.
>  .Sh SEE ALSO
>  .Xr signify 1 ,
>  .Xr installurl 5 ,
> Index: syspatch.sh
> ===
> RCS file: /cvs/src/usr.sbin/syspatch/syspatch.sh,v
> retrieving revision 1.166
> diff -u -p -r1.166 syspatch.sh
> --- syspatch.sh   27 Oct 2020 17:42:05 -  1.166
> +++ syspatch.sh   7 Dec 2020 11:58:06 -
> @@ -320,6 +320,7 @@ if ((OPTIND == 1)); then
>   [[ -f ${_D}/rollback.tgz ]] || rm -r ${_D}
>   done
>   _PATCHES=$(ls_missing) # can't use errexit in a for loop
> + [[ -n ${_PATCHES} ]] || exit 2
>   for _PATCH in ${_PATCHES}; do
>   apply_patch ${_OSrev}-${_PATCH}
>   _PATCH_APPLIED=true



Re: syspatch exit state

2020-12-07 Thread Ingo Schwarze
Hi Antoine,

Antoine Jacoutot wrote on Mon, Dec 07, 2020 at 09:48:36AM +0100:
> On Sun, Dec 06, 2020 at 10:52:37PM +0100, Alexander Hall wrote:
>> On December 6, 2020 8:13:26 PM GMT+01:00, Antoine Jacoutot wrote:
>>> On Sun, Dec 06, 2020 at 05:20:31PM +, Stuart Henderson wrote:
 On 2020/12/06 16:39, Otto Moerbeek wrote:
> On Sun, Dec 06, 2020 at 03:31:19PM +, SW wrote:
>> On 06/12/2020 14:32, Otto Moerbeek wrote:
>>> On Sun, Dec 06, 2020 at 02:19:05PM +, SW wrote:

 I've been looking to have syspatch give me a quick indication
 of whether a reboot is likely to be required. As a quick and
 dirty check, I've just been treating "Were patches applied?"
 as the indicator.
 The following diff will cause syspatch to exit when applying
 patches with status code 0 only if patches were actually applied.
 My biggest concern is that it does cause a change in behaviour,
 so perhaps this either needs making into an option or a different
 approach entirely?
[...]
>>> I don't this is correct since it maks syspatch exit 1 on "no
>>> patches applied".

>> That's precisely the idea- from previous discussion with a couple
>> of people there didn't seem to be an easy (programmatic) way to
>> figure out whether syspatch had done anything or not.

> exit code 1 normally used for error conditions. A system being
> up-to-date is not an error condition. 

>> Doing this would be a bit of a blunt way of handling things, and
>> perhaps it would be better gated behind a flag, but is there a
>> better way to make a scripted update work automatically
>> (including rebooting as necessary)?

 How about the same exit codes as acme-client? They seem fairly
 reasonable - 0=updated, 1=failure, 2=no change.

>>> I wouldn't object to that.

>> So that'd boil down to
>> $_PATCH_APPLIED || exit 4
>> or
>> $_PATCH_APPLIED && exit
>> exit 4
>> ...if the explicit exit feels better instead of just running to the
>> end of the script.
>> But maybe this script prefers some more verbosity... :-)

> Something like this?

I'm on the fence whether regarding an up-to-date system as an error
condition makes sense; i neither like the idea nor object to it.

But if you decide to head into that direction, i suggest to try
and make the documentation more precise.  "No patch is available"
sounds as if the OpenBSD project did not yet provide a single
patch for the release in question.

> Index: syspatch.8
> ===
> RCS file: /cvs/src/usr.sbin/syspatch/syspatch.8,v
> retrieving revision 1.21
> diff -u -p -r1.21 syspatch.8
> --- syspatch.825 Jul 2020 15:45:44 -  1.21
> +++ syspatch.87 Dec 2020 08:47:48 -
> @@ -64,6 +64,7 @@ of installed patches.
>  .El
>  .Sh EXIT STATUS
>  .Ex -std syspatch
> +2 indicates that no patch is available.

To describe your proposed implementation, i think you would have
to say something like:

  2 indicates that no additional patch was installed.

But i doubt that is what you actually intend to do.

>  .Sh SEE ALSO
>  .Xr signify 1 ,
>  .Xr installurl 5 ,
> Index: syspatch.sh
> ===
> RCS file: /cvs/src/usr.sbin/syspatch/syspatch.sh,v
> retrieving revision 1.166
> diff -u -p -r1.166 syspatch.sh
> --- syspatch.sh   27 Oct 2020 17:42:05 -  1.166
> +++ syspatch.sh   7 Dec 2020 08:47:48 -
> @@ -248,7 +248,8 @@ must be run manually to install the new 
>   fi
>   fi
>  
> - ${_PATCH_APPLIED} && echo "Errata can be reviewed under ${_PDIR}"
> + ${_PATCH_APPLIED} && echo "Errata can be reviewed under ${_PDIR}" ||
> + return 2

This doesn't appear to work unless i screwed up my testing.
Even if _PATCH_APPLIED==false, syspatch still exits 0 for me.
The return value of the trap function doesn't appear to affect
the exit status of the script.

If i change "return 2" to "exit 2", (and insert a debugging statement

  echo _PATCH_APPLIED = ${_PATCH_APPLIED}

right before the "${_PATCH_APPLIED} && ..." test), then i see
the following potentially unintended behaviour:

  # /usr/sbin/syspatch -r ; echo $? 
  Reverting patch 007_xmaplen
  _PATCH_APPLIED = false
  2

That means -r would never again succeed?  And now:

  # /usr/sbin/syspatch -c ; echo $? 
  007_xmaplen
  _PATCH_APPLIED = false
  2

  # /usr/sbin/syspatch -l ; echo $? 
  001_bgpd
  002_icmp6
  003_tmux
  004_wg
  005_unwind
  006_rpki
  _PATCH_APPLIED = false
  2

So listing patches is an error when new patches are available?

  # /usr/sbin/syspatch ; echo $?
  Get/Verify syspatch68-007_xmaplen... 100% |*|  4547 KB00:00
  Installing patch 007_xmaplen
  _PATCH_APPLIED = true
  Errata can be reviewed under /var/syspatch
  0

That is the only test result matching what i think you might intend.
But now, this still happens:


Re: sio_open.3: clarify what sio_start() does

2020-11-28 Thread Ingo Schwarze
Hi Alexandre,

Alexandre Ratchov wrote on Fri, Nov 27, 2020 at 07:05:27PM +0100:

> this wording is shorter and more precise and complete.

This looks good mdoc(7)-wise, so go ahead, but please consider
the two nits below when committing.

Yours,
  Ingo


> Index: sio_open.3
> ===
> RCS file: /cvs/src/lib/libsndio/sio_open.3,v
> retrieving revision 1.51
> diff -u -p -r1.51 sio_open.3
> --- sio_open.320 Nov 2020 12:09:45 -  1.51
> +++ sio_open.327 Nov 2020 18:02:16 -
> @@ -387,17 +387,17 @@ bitmasks should always be used.
>  .Ss Starting and stopping the device
>  The
>  .Fn sio_start
> -function puts the device in a waiting state:
> -the device will wait for playback data to be provided
> -(using the
> -.Fn sio_write
> -function).
> -Once enough data is queued to ensure that play buffers
> -will not underrun, actual playback is started automatically.
> -If record mode only is selected, then recording starts
> -immediately.
> +function prepares the devices to start.

As Erico already noticed, i think you need to go back to the singular
here, s/devices/device/.  Otherwise, if changing to the plural is
intentional, a clarification might make sense which other device(s)
apart from .Fa hdl the text intends to talk about.

> +Once the play buffer is full, i.e.

This line should be:

  Once the play buffer is full, i.e.\&

If a dot at the end of an input line is not a full stop,
it needs escaping to prevent excessive horizontal spacing.

> +.Fa sio_par.bufsz
> +samples are queued with
> +.Fn sio_write ,
> +playback starts automatically.
> +If record mode only is selected, then
> +.Fn sio_start
> +starts recording immediately.
>  In full-duplex mode, playback and recording will start
> -synchronously as soon as enough data to play is available.
> +synchronously as soon as the play buffer is full.
>  .Pp
>  The
>  .Fn sio_stop



Re: AUDIORECDEVICE environment variable in sndio lib

2020-11-17 Thread Ingo Schwarze
Hi Solene,

sorry if i misunderstand because i did not fully follow the thread...

Solene Rapenne wrote on Tue, Nov 17, 2020 at 07:36:38PM +0100:

> I added the new variable after MIDIDEVICE in the ENVIRONMENT section
> to keep order of appearance in the document.

Usually, we order the ENVIRONMENT section alphabetically.

Also, on first sight, the text of AUDIODEVICE and AUDIORECDEVICE
seems to cause a contradiction.  The earlier text below "Default
Audio and MIDI devices" appears to resolve the apparent contradiction,
but wouldn't it be better to give a complete definition at one
place?  For example similar to ths:

.It Ev AUDIODEVICE
Audio device descriptor to use for playback when no descriptor is
explicitly specified to a program.  It is also used for recording if
.Ev AUDIORECDEVICE
is unset.
.It Ev AUDIORECDEVICE
Audio device descriptor to use for recording when no descriptor is
explicitly specified to a program.
.It Ev MIDIDEVICE
[... unchanged ...]

And below "Default Audio and MIDI devices" something like:

When no audio device descriptor is provided to a program
or when the reserved word
.Cm default
is passed as the device descriptor,
the program uses the one specified in the
.Ev AUDIODEVICE
environment variable for playback and the one specified in
.Ev AUDIORECDEVICE
for recording, falling back to
.Ev AUDIODEVICE
if the latter is unset.
If the applicable variables are unset, the program first tries
to connect to [...]

Feel free to ignore me if i misunderstood or if you disagree,
or to tweak the wording according to your taste.

Yours,
  Ingo


> Index: sndio.7
> ===
> RCS file: /home/reposync/src/lib/libsndio/sndio.7,v
> retrieving revision 1.24
> diff -u -p -r1.24 sndio.7
> --- sndio.7   18 Jul 2020 05:01:14 -  1.24
> +++ sndio.7   17 Nov 2020 18:32:34 -
> @@ -157,6 +157,10 @@ If it is not set, the program first trie
>  .Li snd/0 .
>  If that fails, it then tries to use
>  .Li rsnd/0 .
> +The
> +.Ev AUDIORECDEVICE
> +environment variable
> +defines a device to use prior to the audio device when recording.
>  .Pp
>  Similarly, if no MIDI descriptor is provided to a program
>  or when the reserved word
> @@ -196,6 +200,9 @@ Audio device descriptor to use
>  when no descriptor is explicitly specified to a program.
>  .It Ev MIDIDEVICE
>  MIDI port descriptor to use
> +when no descriptor is explicitly specified to a program.
> +.It Ev AUDIORECDEVICE
> +Audio recording device descriptor to use
>  when no descriptor is explicitly specified to a program.
>  .El
>  .Pp
> 



Re: Import seq(1) from FreeBSD

2020-11-17 Thread Ingo Schwarze
Hi Todd,

in view of your arguments and sthen@'s OK, i'm also OK with this
going in.  I think a bit of code cleanup and copy editing in the
manual page may be useful afterwards, but that can be done in the
tree, no need for playing patch ping pong.

See below for answers to the individual points.  If you think any
of these can already be trivially handled by your initial commit,
feel free to do so.  Otherwise, we can reconsider after the commit.

Also note that this thing is UTF-8 neutral even though many other
utilities processing separators are not.  The -f, -s, and -t
arguments can contain UTF-8 and it just works.   No need to call
setlocale(3) or inspect LC_*.

Yours,
  Ingo


Todd C. Miller wrote on Mon, Nov 16, 2020 at 10:08:08AM -0700:
> On Mon, 16 Nov 2020 16:14:31 +0100, Ingo Schwarze wrote:
 
>> are you really sure this is a good idea?  The version you sent is
>> wildly incompatible with GNU sed.  So we add a non-standard utility
>> that exhibits different behaviour on different systems even though
>> a standard utility already exists for the purpose?

> I don't think we need to be bug-compatible with GNU seq

That's probably true.

> and characterizing jot as a "standard utility" is simply not accurate.

Oops.  You are right, it's a BSD utility, not a standard utility.
Seems like i got too used to having it around.

>>$ seq 3 -1 10 ; echo $?
>>   seq: needs positive increment
>>   1
>>$ gseq 3 -1 10 ; echo $?
>>   0

> This is not valid usage, you cannot get to 10 from 3 with a negative
> increment.  We could silently exit like GNU seq if that is desirable
> but is silently ignoring a usage error really what we want?

Granted that this can be subsumed under "bug-compatible" and that
the FreeBSD behaviour makes more sense.  Even though:

   $ /usr/local/plan9/bin/seq 3 -1 10 ; echo $?
  0

>>$ seq 3 0 10 ; echo $?   
>>   seq: zero increment
>>   1
>>$ gseq 3 0 10 ; echo $?
>>   gseq: Abort trap (core dumped) 
>>   134

> I get the following:
> gseq: invalid Zero increment value: ‘0’
> Try 'gseq --help' for more information.

How stupid of me.  I'm running the kernel+libc combo poisoning
printf(%n) in writeable memory, so i should really inspect
/var/log/messages when something hits abort(3), or running that
stuff is useless.  You are right, with gnulib inside coreutils
fixed, i see the some output as you do.  So no discrepancy here.

   $ /usr/local/plan9/bin/seq 3 0 10 ; echo $?  
  seq: zero increment
  1

>>$ seq 3 1 ; echo $?
>>   3
>>   2
>>   1
>>   0
>>$ gseq 3 1 ; echo $?
>>   0

> GNU seq uses a default increment of 1 even if first > last.
> Personally, I think using a default increment of -1 makes more sense
> in the above case, but we can easily make this match the GNU behavior
> if we desire.

I fully agree that the FreeBSD behaviour is more useful, even though

   $ /usr/local/plan9/bin/seq 3 1 ; echo $?
  0

If this goes in now, we are probably far enough away from a release
to notice if anything in ports builds depends on the GNU behaviour -
which does not seem very likely.  If the FreeBSD behaviour causes
problems, we can decide what to do later.

>>$ seq -f '%a' 3  
>>   0x1p+0
>>   0x1p+1
>>   0x1.8p+1
>>$ gseq -f '%a' 3
>>   0x8p-3
>>   0x8p-2
>>   0xcp-2

> The BSD seq output is consistent with printf(1), GNU seq is not.
> I'd classify this as a GNU bug.

Fair enough.  Besides, %a is documented as ambiguous.  Strangely,
the GNU form matches the description in our manual page (minimal
length of the mantissa) while our printf(3) output does not (it
seems to maximize the exponent instead, which is the same as
minimizing the *size*, not the *length* of the mantissa).

By the way, Plan 9 does not seem to support -f '%a' at all. 

>>$ seq -f '%i' 3  
>>   seq: invalid format string: `%i'
>>$ gseq -f '%i' 3 
>>   gseq: Abort trap (core dumped) 

> I get:
> gseq: format ‘%i’ has unknown %i directive

Yes, so there is no issue here.

>>$ seq -s / 3 ; echo $?
>>   1/2/3/0
>>$ gseq -s / 3 ; echo $?
>>   1/2/3
>>   0

> The missing newline appears to be a bug in the FreeBSD seq, NetBSD
> seq works correctly.

The more serious problem here is whether the -s argument should
appear after each number or merely between numbers.  That is not
a matter of bug compatibility but a serious difference in behaviour.

Both the FreeBSD and the GNU manual page say "separate", so
maybe the trailing separator in FreeBSD seq can be regarded as a bug.

The GNU behaviour also makes more sense because the FreeBSD behaviour
of "printing after" can simply be achieved by appending the -s argument
to the format string.  

Re: Import seq(1) from FreeBSD

2020-11-16 Thread Ingo Schwarze
Hi Todd,

are you really sure this is a good idea?  The version you sent is
wildly incompatible with GNU sed.  So we add a non-standard utility
that exhibits different behaviour on different systems even though
a standard utility already exists for the purpose?

Is this needed for porting work?  If so, which behavour is more
commonly expected - GNU compatible or FreeBSD compatible?

Would it be picked up by ./configure in ports, and to what effect?

Yours,
  Ingo


   $ seq 3 -1 10 ; echo $?
  seq: needs positive increment
  1
   $ gseq 3 -1 10 ; echo $?
  0

   $ seq 3 0 10 ; echo $?   
  seq: zero increment
  1
   $ gseq 3 0 10 ; echo $?
  gseq: Abort trap (core dumped) 
  134

   $ seq 3 1 ; echo $?
  3
  2
  1
  0
   $ gseq 3 1 ; echo $?
  0

   $ seq -f '%a' 3  
  0x1p+0
  0x1p+1
  0x1.8p+1
   $ gseq -f '%a' 3
  0x8p-3
  0x8p-2
  0xcp-2

   $ seq -f '%i' 3  
  seq: invalid format string: `%i'
   $ gseq -f '%i' 3 
  gseq: Abort trap (core dumped) 

   $ seq -s / 3 ; echo $?
  1/2/3/0
   $ gseq -s / 3 ; echo $?
  1/2/3
  0

   $ seq -s '.\n' 3 ; echo $?
  1.
  2.
  3.
  0
   $ gseq -s '.\n' 3 ; echo $?
  1.\n2.\n3
  0

   $ seq -s / -t '\n' 3 ; echo $? 
  1/2/3/
  0
   $ gseq -s / -t '\n' 3 ; echo $?
  gseq: unknown option -- t
  Try 'gseq --help' for more information.
  1
   $ seq --help  
  seq: unknown option -- help
  usage: seq [-w] [-f format] [-s string] [-t string] [first [incr]] last
   $ gseq --help
  [... prints a novel ...]

   $ seq 0 .5 1
  0
  0.5
  1
   $ gseq 0 .5 1
  0.0
  0.5
  1.0



Re: accton(8) requires a reboot after being enabled

2020-11-03 Thread Ingo Schwarze
Hi Jason,

Jason McIntyre wrote on Mon, Nov 02, 2020 at 05:29:37PM +:

> - adding EXIT STATUS makes sense. i agree.

So i added just the .Sh and .Ex lines.

All the rest (both regarding "file" and "install") seems controversial
and hardly worth have a long discussion, so i dropped all the rest.

Yours,
  Ingo



Re: accton(8) requires a reboot after being enabled

2020-11-02 Thread Ingo Schwarze
Hi Theo,

Theo de Raadt wrote on Fri, Oct 30, 2020 at 12:10:41PM -0600:

> Yes, that diff is a whole bunch of TOCTUO.
> 
> If this was going to be changed, it should be in the kernel.
> 
> But I don't know if it should be changed yet, which is why I asked
> a bunch of questions.
> 
> Stepping back to the man page change, we could decide that accton
> should continue to behave how it does, and this conversation started
> because the man page fell into the trap of documenting the rc scripts
> RATHER than documenting accton.

Given that nobody seems in a rush to patch the kernel, i suggest to
improve the accton(8) manual page for now.  In particular, that
manual page lacks the required EXIT STATUS section, which happens
to be a natural place for mentioning what happens if the file does
not exist.

As usual, an EXAMPLES section is not strictly required, but in this
case, it seems useful to me, to save people from having to inspect
/etc/mtree/special for the recommended file permissions.

OK?
  Ingo


Index: accton.8
===
RCS file: /cvs/src/usr.sbin/accton/accton.8,v
retrieving revision 1.12
diff -u -r1.12 accton.8
--- accton.82 Nov 2020 13:58:44 -   1.12
+++ accton.82 Nov 2020 14:19:43 -
@@ -64,6 +64,17 @@
 .It Pa /var/account/acct
 default accounting file
 .El
+.Sh EXIT STATUS
+.Ex -std
+For example, it is an error if the
+.Ar file
+does not exist.
+.Sh EXAMPLES
+The following commands enable accounting if it was never used before:
+.Bd -literal -offset 4n
+# install -o root -g wheel -m 0644 /dev/null /var/account/acct
+# accton /var/account/acct
+.Ed
 .Sh SEE ALSO
 .Xr lastcomm 1 ,
 .Xr acct 2 ,



Re: accton(8) requires a reboot after being enabled

2020-10-30 Thread Ingo Schwarze
Hi Solene,

Solene Rapenne wrote on Fri, Oct 30, 2020 at 06:34:09PM +0100:

> Following diff changes accton(8) behavior.
> 
> If the file given as parameter doesn't exists, accton will create it.
> Then it will check the ownership and will make it root:wheel if
> it's different.
> 
> I added a check to be sure it's run as root because it's has no use if
> not run as root.

If it naturally runs into privileged system calls anyway and the
error message is comprehensible, that is not necessarily needed.
Currently, the error message seems OK to me:

   $ accton
  accton: Operation not permitted

> I don't often write C, if the logic is good but the C implementation
> wrong I'm open to critic.
> 
> The -f test and touch in /etc/rc won't be needed anymore with this
> change.
> 
> 
> Index: accton.8
> ===
> RCS file: /home/reposync/src/usr.sbin/accton/accton.8,v
> retrieving revision 1.11
> diff -u -p -r1.11 accton.8
> --- accton.8  10 Nov 2019 20:51:53 -  1.11
> +++ accton.8  30 Oct 2020 17:27:36 -
> @@ -36,7 +36,7 @@
>  .Nm accton
>  .Op Ar file
>  .Sh DESCRIPTION
> -With an argument naming an existing
> +With an argument naming a
>  .Ar file ,
>  .Nm
>  causes system accounting information for every process executed

*If* we change accton(8) to create the file if it does not exist,
the manual should probably be more explicit and say that an
existing file is appended to, but that it is created if it does
not exist.

Maybe it should even say something like

  Unlike other implementations, this version of
  .Nm
  creates the
  .Ar file
  if it does not exist.

in order to not set a trap for experienced users.

> Index: accton.c
> ===
> RCS file: /home/reposync/src/usr.sbin/accton/accton.c,v
> retrieving revision 1.8
> diff -u -p -r1.8 accton.c
> --- accton.c  27 Oct 2009 23:59:50 -  1.8
> +++ accton.c  30 Oct 2020 17:26:31 -
> @@ -27,6 +27,7 @@
>   * SUCH DAMAGE.
>   */
>  
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -47,6 +48,10 @@ int
>  main(int argc, char *argv[])
>  {
>   int ch;
> + struct stat info_file;
> +
> + if(getuid() != 0)
> + err(1, "need root privileges");
>  
>   while ((ch = getopt(argc, argv, "")) != -1)
>   switch(ch) {
> @@ -63,6 +68,15 @@ main(int argc, char *argv[])
>   err(1, NULL);
>   break;
>   case 1:
> + if(stat(*argv,_file) != 0)
> + if(fopen(*argv,"w"))

Uh oh, that looks like a TOCTOU race condition to me.

> + if(stat(*argv,_file))

And that looks like another TOCTOU race condition.
Checking the return value of fopen(3) might be better.

Leaking a file descriptor from open(2) is also unusual.
Arguably, it may not be a problem here because the program
promptly exits anyway, but is is not nice for auditors.

Finally, what do you need fopen(3) for?
Wouldn't open(2) be sufficient?

> + err(1, "file %s couldn't be created", 
> *argv);
> +
> + if (info_file.st_uid != 0 || info_file.st_gid != 0)
> + if(chown(*argv, 0, 0) != 0)
> + err(1, "Impossible to fix %s ownership", *argv);
> +

Isn't that yet another TOCTOU race condition?
I guess using fchown(2) might be more appropriate?

Also, fchmod(2) seems to be lacking, or even better, couldn't that
be covered by the third argument to open(2)?

>   if (acct(*argv))
>   err(1, "%s", *argv);

Arguably, there might also be a race condition between userland and
the kernel, but i'm not so sure about kernel land.  Given how system
calls like open(2) work - taking a path, returning an int to avoid
race conditions - wouldn't it be more natural to have the acct(2)
system call create the file on the kernel side if necessary?

Yours,
  Ingo



Re: accton(8) requires a reboot after being enabled

2020-10-30 Thread Ingo Schwarze
Hi Theo,

Theo de Raadt wrote on Fri, Oct 30, 2020 at 09:59:09AM -0600:

> With a careful reading of the current manual page, everything is there
> and it is accurate.
> 
> With an argument naming an existing file
>
> 
> Ok so let's create it with touch.  Probably has wrong permissions.
> But now accton to that file works.  Or enable it and reboot, and now
> disable it and reboot, and the file still exists, so now accton works
> because it is an existing file (with the right permissions I guess).
> 
> So it *IS* working as documented.  It is just a bit weird, because the
> accton command (and system call) do not create the file.
> 
> 
> My problem with these changes is this is the man page for the accton(8)
> command, it documents *the binary*.  The manpage has already been subverted
> talk about rcctl, and about how /etc/rc runs the command.  But the man
> page should first and foremost be about the command, not about /etc/rc
> and rcctl, am i not right?

You are right.

The paragraph about rc.conf.local(8) is short (one sentence, one
line and three half lines) and near the end of the DESCRIPTION,
which is appropriate IMHO.  It feels more prominent than it should
only because the accton(8) command itself is so simple and
straightforward that it can be described exhaustively in three
lines of text.

> For instance, the ntpd manual page has a tiny section about rc.conf.

That paragraph has more or less the same length as the one in accton(8).

> So in conclusion, I think both of you are writing text about the startup
> subsystem, into the wrong manual page.  I think both proposals are skewed.
>
> So questions.
>
> 1 - historically it requires a file to be pre-created.  In the rc scripts,
> this is a touch.  That grabs the umask and ownership of root's run of
> /etc/rc.
> 2 - could we do better, in some way?
> 3 - accton system call does not have a umask, is that where this design came
> from?
> 4 - could we improve upon this?
> 5 - could accton always (attempt to) create the file, without harming
> existing use cases, with proper owner and chmod flags?

Maybe.  It would change behaviour, though.

Somebody might rely on the fact that accton(8) is a NOOP when the
file given as an argument does not exist.  I'm not saying that is
necessarily a good idea, but someone might have done it, for
example because they have to maintain a herd of similar machines,
want accounting on some but not on others, deploy the same
/etc/rc.conf.local to all and create or delete the file as desired.

Having the kernel or accton(8) create the file if it does not exist
would make the tool slightly easier to use for most users.  Is that
worth asking users doing soemthing like the above to change their
deployment?  Maybe, but i'm not sure.  The advantages feel small
either way.

> 6 - or should that be tied to a flag, making it easier to document?

I somewhat dislike that.  Either we regard creating the file as
part of the job.  Then accton(8) ought to do it without a flag, and
people who don't want accounting should not call it with an argument
in the first place.  Or we regard creating and deleting the file
as a separate decision, like we do it now.  Then accton(8) should
not meddle with it, not even as a matter of a flag.  Besides,
creating a root:wheel 0644 file is not rocket science requiring an
additional option in some specialized program to be accomplished.
For that purpose, several adequate and flexible tools already exist,
including install(1).


Triggered by looking at this manual page: Let's trim the ridiculous
HISTORY section.  No, it didn't exist forever.  No, the manual page
isn't new.

A predecessor "gacct(8)" may have existed briefly in v4, but i'm
not wholly convinced.  There is only a skeleton manual page

  https://minnie.tuhs.org/cgi-bin/utree.pl?file=V4/man/manx/gacct.8

not even containing a DESCRIPTION.  I found no code, and it seems
it probably wasn't contained in v5 nor in v6.  So it seems best
to not mention it.

OK?
  Ingo


Index: accton.8
===
RCS file: /cvs/src/usr.sbin/accton/accton.8,v
retrieving revision 1.11
diff -u -p -r1.11 accton.8
--- accton.810 Nov 2019 20:51:53 -  1.11
+++ accton.830 Oct 2020 16:43:55 -
@@ -73,4 +73,5 @@ default accounting file
 .Sh HISTORY
 The
 .Nm
-command has existed nearly forever, but this man page is new.
+command first appeared in
+.At v7 .



Re: [PATCH] Fix link in Porting Guide

2020-10-29 Thread Ingo Schwarze
Hi Martin,

Martin Vahlensieck wrote on Wed, Oct 28, 2020 at 09:02:21PM +0100:

> This refers to the libc function.

Committed, thanks.

> P.S.: I noticed that e.g. sysmerge(8) is mentioned like this but not a
> link. Is this intentional?

Probably not, i added a few more links while there.

Yours,
  Ingo


> Index: faq/ports/guide.html
> ===
> RCS file: /cvs/www/faq/ports/guide.html,v
> retrieving revision 1.91
> diff -u -p -r1.91 guide.html
> --- faq/ports/guide.html  15 Jul 2020 21:52:04 -  1.91
> +++ faq/ports/guide.html  28 Oct 2020 20:01:21 -
> @@ -1320,7 +1320,7 @@ Another very common problem is the   Heed the warnings of the bsd linker about its uses.
>  These must be fixed.
>  This is not quite as simple as s/mktemp/mkstemp/g.
> -Refer to https://man.openbsd.org/mktemp;>mktemp(3) for more
> +Refer to https://man.openbsd.org/mktemp.3;>mktemp(3) for more
>  information.
>  Correct code using mkstemp includes the source to
>  ed or mail.



Re: [PATCH] Mention unsupported stacking in softraid(4)

2020-10-25 Thread Ingo Schwarze
Hi Jeremie and Filippo,

Jeremie Courreges-Anglas wrote on Sun, Oct 25, 2020 at 03:05:04PM +0100:
> On Sun, Oct 25 2020, "Filippo Valsorda"  wrote:

>> Based on the text in faq14.html, but using the manpage language.

> Makes sense.  I'm not sure .Em is useful here, though.

Indeed.  We use .Em very sparingly in manual pages because too much
markup tends to look ugly, because manual pages tend to have quite a
bit of markup from semantical macros already, and because .Em can
sometimes be mistaken for semantical markup rather than stress
emphasis.  So i removed the .Em before i committed this, see the
commit below.

> ok jca@ if someone wants to pick this up, else I'll just commit it in
> a few hours.

Thanks to jca@ for checking.

Yours,
  Ingo


CVSROOT:/cvs
Module name:src
Changes by: schwa...@cvs.openbsd.org2020/10/25 09:33:20

Modified files:
share/man/man4 : softraid.4 

Log message:
mention that stacking disciplines is not supported,
with wording similar to the FAQ;
suggested by Filippo Valsorda ;
tweak and OK jca@


Index: softraid.4
===
RCS file: /cvs/src/share/man/man4/softraid.4,v
retrieving revision 1.42
retrieving revision 1.43
diff -u -r1.42 -r1.43
--- softraid.4  27 Jun 2017 16:02:05 -  1.42
+++ softraid.4  25 Oct 2020 15:33:19 -  1.43
@@ -1,4 +1,4 @@
-.\"$OpenBSD: softraid.4,v 1.42 2017/06/27 16:02:05 tb Exp $
+.\"$OpenBSD: softraid.4,v 1.43 2020/10/25 15:33:19 schwarze Exp $
 .\"
 .\" Copyright (c) 2007 Todd T. Fries   
 .\" Copyright (c) 2007 Marco Peereboom 
@@ -15,7 +15,7 @@
 .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 .\"
-.Dd $Mdocdate: June 27 2017 $
+.Dd $Mdocdate: October 25 2020 $
 .Dt SOFTRAID 4
 .Os
 .Sh NAME
@@ -200,6 +200,9 @@
 .Pp
 The RAID 5 discipline does not initialize parity upon creation, instead parity
 is only updated upon write.
+.Pp
+Stacking disciplines (CRYPTO on top of RAID 1, for example) is not
+supported at this time.
 .Pp
 Currently there is no automated mechanism to recover from failed disks.
 .Pp



Re: diff: fixing a few broken links on 68.html, and other minor things

2020-10-22 Thread Ingo Schwarze
Hi,

Andras Farkas wrote on Wed, Oct 21, 2020 at 09:19:43PM -0400:

> While reading 68.html I noticed some of the man page links pointed to
> the man pages in the wrong section of the manual. (at least, given the
> manual section numbers listed next to them in the 68.html text)
> I decided to fix these.

Thanks, i just committed these four fixes.

By the way, this is why i urge developers to get into the habit
of always writing manual page links in the form

  https://man.openbsd.org/name.section

The two additional keystrokes for the dot and the section number
won't wear out your fingertips, and experience shows that trying
to do it "only when needed" just doesn't work.  Practically everybody
(myself included) occasionally forgets a required number when trying
that, and a habit of checking every link would cause more work than
adding two bytes.

Besides, sometimes new pages appear later that cause a link to
change.  For example, we have editline(3) for a long time, and then
later on, editline(7) appeared, too.  At that point, a link to just
https://man.openbsd.org/editline would have changed target, even
for pages of *past* releases.

> While there, I also made the fixed links point at 6.8 man pages
> rather than -current man pages.

I didn't take these changes: sysctl(2), scsi(4), video(4), drm(4)
are not going away.

> This is important in case functionality mentioned in 68.html
> is changed or removed in a later version of OpenBSD.

No that isn't important, even if in rare cases, old links go dead.
In general, it is useful to have old release pages link to -current
documentation (because the reason for looking at old release
pages is often "when did this feature appear?", and then the new
manual page is more useful for users than an outdated version).
Besides, adding /OpenBSD-X.Y/ makes the links too long and ugly.

> I also turned two mentions of bettertls.com into hyperlinks.

Never send diffs mixing logically unrelated changes.

I committed that separately.

> Diff attached:
> SHA256 (68diff) =
> d16eb33d863866b004d75041e42c24100dd4200864a5b2243913f98ad5d8eaa9

Hashing diffs is useless.  The are scrutinized before commit anyway,
and what matters is whether they are correct, not whether they
contain what you intended.

More importantly, never send diffs as attachments; that's just a
waste of developer time.  Include them directly into the body of
the mail, ideally at the end.  And please don't waste more time
arguing that your mail client can't do that; in that case just get
a decent mail client *before* sending diffs.

> Also, some links on 68.html (like amlpwrc

Hopefully, somebody is going to document that driver eventually,
and then the link will come alive.

> and sftp-client)

That was just an error: sftp-client.c is the source file, but the
program and the manual page are called sftp(1).  Fixed.

[...]
> issues noticed by
> https://validator.w3.org/nu/

Done, too.

> There was also a mention of sshd(1) that should probably be ssh(1) but
> I wasn't sure.  Please check.

Right, that's sshconnect2.c rev. 1.322, order_hostkeyalgs(), called
from ssh_kex2(), called from ssh_login(), called from ssh.c main(),
which is the ssh(1) client, not sshd(8).  Fixed.

Yours,
  Ingo



Re: libexec/security: don't prune mount points

2020-10-11 Thread Ingo Schwarze
Hi Todd,

Todd C. Miller wrote on Wed, Oct 07, 2020 at 09:36:33AM -0600:

> The recent changes to the daily security script will result in it
> not traversing file systems where the parent mount point is mounted
> with options nodev,nosuid but the child is mounted with setuid
> enabled.
> 
> For example, if /var/www is a separate file system that allows
> setuid but /var is mounted with nodev and nosuid, security will not
> traverse /var/www.
> 
> 198976b83d9da70f.e /var ffs rw,nodev,nosuid 1 2
> 198976b83d9da70f.f /var/www ffs rw 1 2
> 
> The simplest solution is to pass the list of file systems to traverse
> to File::Find and use the equivalent of find's -xdev option.
>
> Anyone want to double-check my logic? :-)

OK schwarze@;
feel free to use the two nits below, inline.

The patch also works.
On one of my machines, i have:
/co ffs rw,nodev,nosuid 1 2
/co/destdir ffs rw,noexec,noperm 1 2
The destdir gets checked while the parent /co does not.
Also, i see no regressions, not even with
/usr/ports ffs rw,nodev,nosuid 1 2
/usr/ports/pobj ffs rw,nodev,nosuid,wxallowed 1 2

Yours,
  Ingo


> Index: libexec/security/security
> ===
> RCS file: /cvs/src/libexec/security/security,v
> retrieving revision 1.40
> diff -u -p -u -r1.40 security
> --- libexec/security/security 17 Sep 2020 06:51:06 -  1.40
> +++ libexec/security/security 7 Oct 2020 15:34:14 -
> @@ -530,6 +530,7 @@ sub strmode {
>  
>  sub find_special_files {
>   my %skip;
> + my @fs;

The rest of the file uses the compact notation:

my (%skip, @fs);

>  
>   %skip = map { $_ => 1 } split ' ', $ENV{SUIDSKIP}
>   if $ENV{SUIDSKIP};
> @@ -541,11 +542,11 @@ sub find_special_files {
>   and return;
>   while (<$fh>) {
>   my ($path, $opt) = /\son\s+(.*?)\s+type\s+\w+(.*)/;
> - $skip{$path} = 1 if $path &&
> - ($opt !~ /local/ ||
> -  ($opt =~ /nodev/ && $opt =~ /nosuid/));
> + push(@fs, $path) if $path && $opt =~ /local/ &&

The parentheses are redundant, just

push @fs, $path if ...

is sufficient.

> + !($opt =~ /nodev/ && $opt =~ /nosuid/);
>   }
>   close_or_nag $fh, "mount" or return;
> + return unless @fs;
>  
>   my $setuid_files = {};
>   my $device_files = {};
> @@ -554,14 +555,19 @@ sub find_special_files {
>   File::Find::find({no_chdir => 1, wanted => sub {
>  
>   if ($skip{$_}) {
> - no warnings 'once';
>   $File::Find::prune = 1;
>   return;
>   }
>  
>   my ($dev, $ino, $mode, $nlink, $uid, $gid, $rdev, $size,
>   $atime, $mtime, $ctime, $blksize, $blocks) = lstat;
> - unless (defined $dev) {
> + if (defined $dev) {
> + no warnings 'once';
> + if ($dev != $File::Find::topdev) {
> + $File::Find::prune = 1;
> + return;
> + }
> + } else {
>   nag !$!{ENOENT}, "stat: $_: $!";
>   return;
>   }
> @@ -592,7 +598,7 @@ sub find_special_files {
>   $file->{size}= $size;
>   @$file{qw(wday mon day time year)} =
>   split ' ', localtime $mtime;
> - }}, '/');
> + }}, @fs);
>  
>   nag $uudecode_is_setuid, 'Uudecode is setuid.';
>   return $setuid_files, $device_files;



Re: Make df output more human friendly in daily(8)

2020-10-03 Thread Ingo Schwarze
Hi Daniel,

my OK still stands, except that you went too far in one respect
in the manual page, see below.

Yours,
  Ingo


Daniel Jakots wrote on Fri, Oct 02, 2020 at 03:41:31PM -0400:

> Index: share/man/man8/daily.8
> ===
> RCS file: /cvs/src/share/man/man8/daily.8,v
> retrieving revision 1.28
> diff -u -p -r1.28 daily.8
> --- share/man/man8/daily.826 Jul 2020 13:27:24 -  1.28
> +++ share/man/man8/daily.82 Oct 2020 19:34:47 -
> @@ -114,15 +114,6 @@ Lists any daemons which are enabled in
>  .Xr rc.conf.local 8
>  but which are not actually running.
>  .It
> -Checks disk status.
> -Reports on the amount of disk used/available via
> -.Xr df 1 .

So far, so good.

But this needs to remain:

> -Reports on which file systems need to be dumped via
> -.Xr dump 8 .
> -.It

>From here on, deletion is OK again.

> -Reports networking statistics via
> -.Xr netstat 1 .
> -.It
>  Runs the
>  .Xr calendar 1
>  utility unless the environment variable
> @@ -205,15 +196,6 @@ If set to 1, run
>  with the no-write flag.
>  .It Ev ROOTBACKUP
>  If set to 1, make a backup of the root file system.
> -.It Ev VERBOSESTATUS
> -If set to 0,
> -.Xr df 1 ,
> -.Xr dump 8 ,
> -and
> -.Xr netstat 1
> -are skipped.
> -Consequently, if none of the other commands produce any output,
> -no mail will be sent to root.
>  .El
>  .Pp
>  The following variables can be set in
> @@ -250,9 +232,7 @@ Root
>  .Sh SEE ALSO
>  .Xr calendar 1 ,
>  .Xr crontab 1 ,
> -.Xr df 1 ,
>  .Xr locate 1 ,
> -.Xr netstat 1 ,
>  .Xr rdist 1 ,
>  .Xr whatis 1 ,
>  .Xr crontab 5 ,



Re: Make df output more human friendly in daily(8)

2020-10-02 Thread Ingo Schwarze
Hi Daniel,

Daniel Jakots wrote on Fri, Oct 02, 2020 at 02:16:57PM -0400:
> On Fri, 2 Oct 2020 19:55:53 +0200, Ingo Schwarze wrote:
>> Daniel Jakots wrote on Thu, Oct 01, 2020 at 10:32:31PM -0400:

>>> Currently daily(8) runs `df -ikl`.  

>> By default, it does not.  It only does that if you set VERBOSESTATUS.

> Are you sure? It looks like it does not, *if* you set VERBOSESTATUS to
> 0. (And that's what daily(8) says as well).

Sorry for misremebering this and for not checking it before i posted.
You are right.

>> I would prefer deleting the VERBOSESTATUS parts completely,
>> strictly enforcing the principle "daily(8) only produces output
>> when something unexpected happens", and tell people to use
>> daily.local(8) if they want to run df or netstat.  The code
>> for those two parts is totally trivial and riddled with
>> choices that look like personal preferences, like the one
>> you suggest to change.

> I agree, that would a better change indeed. I think I'll fix my
> problem by setting VERBOSESTATUS to 0 and add what I want to my
> daily.local.
 
>> I dimly remember that some developers wanted to keep VERBOSESTATUS,
>> though (i might misremember), so we'll probably keep it.  If we
>> keep it, i absolutely don't care what it does.  So i'll neither OK
>> this nor object to it.

> Anyone cares about this one way or the other? Here's a diff for it. (If
> we want to go this way, I'll craft a diff for current.html as well).

I certainly like this, and it works for me.

But i think a change like this would need more than one OK,
and you should wait some days such that developers can raise
objections.

Just in case you get sufficient OKs and there are no serious
objections, see below for two suggested tweaks.

Yours,
  Ingo


> Index: ./share/man/man8/daily.8
> ===
> RCS file: /cvs/src/share/man/man8/daily.8,v
> retrieving revision 1.28
> diff -u -p -r1.28 daily.8
> --- ./share/man/man8/daily.8  26 Jul 2020 13:27:24 -
> 1.28 +++ ./share/man/man8/daily.8 2 Oct 2020 18:12:39 -
> @@ -205,15 +205,6 @@ If set to 1, run
>  with the no-write flag.
>  .It Ev ROOTBACKUP
>  If set to 1, make a backup of the root file system.
> -.It Ev VERBOSESTATUS
> -If set to 0,
> -.Xr df 1 ,
> -.Xr dump 8 ,
> -and
> -.Xr netstat 1
> -are skipped.
> -Consequently, if none of the other commands produce any output,
> -no mail will be sent to root.
>  .El
>  .Pp
>  The following variables can be set in

The following would also have to go:

 * part of the list item "Checks disk status"
 * the list item "Reports networking statistics"
 * df(1) and netstat(1) in SEE ALSO

> Index: etc/daily
> ===
> RCS file: /cvs/src/etc/daily,v
> retrieving revision 1.93
> diff -u -p -r1.93 daily
> --- etc/daily 9 Sep 2019 20:02:26 -   1.93
> +++ etc/daily 2 Oct 2020 18:12:39 -
> @@ -137,20 +137,7 @@ next_part "Services that should be runni
>  rcctl ls failed
>  
>  next_part "Checking subsystem status:"

This output line should probably be updated, maybe something like:

   next_part "Backing up filesystems with dump:"

> -if [ "X$VERBOSESTATUS" != X0 ]; then
> - echo ""
> - echo "disks:"
> - df -ikl
> - echo ""
> - dump W
> -else
> - dump w | grep -vB1 ^Dump
> -fi
> -
> -next_part "network:"
> -if [ "X$VERBOSESTATUS" != X0 ]; then
> - netstat -ivn
> -fi
> +dump w | grep -vB1 ^Dump
>  
>  next_part "Running calendar in the background:"
>  if [ "X$CALENDAR" != X0 -a \



Re: Make df output more human friendly in daily(8)

2020-10-02 Thread Ingo Schwarze
Hi,

Daniel Jakots wrote on Thu, Oct 01, 2020 at 10:32:31PM -0400:

> Currently daily(8) runs `df -ikl`.

By default, it does not.  It only does that if you set VERBOSESTATUS.

I would prefer deleting the VERBOSESTATUS parts completely,
strictly enforcing the principle "daily(8) only produces output
when something unexpected happens", and tell people to use
daily.local(8) if they want to run df or netstat.  The code
for those two parts is totally trivial and riddled with
choices that look like personal preferences, like the one
you suggest to change.

I dimly remember that some developers wanted to keep VERBOSESTATUS,
though (i might misremember), so we'll probably keep it.  If we
keep it, i absolutely don't care what it does.  So i'll neither OK
this nor object to it.

Yours,
  Ingo



  1   2   3   4   5   6   7   8   9   10   >