Re: rev(1): pull MB_CUR_MAX out of the hot loop
On Fri, 2022-01-07 at 15:00 -0600, Scott Cheloha wrote: > On Fri, Jan 07, 2022 at 01:43:24PM -0600, Scott Cheloha wrote: > > > > [...] > > > > Like this? > > > > [...] > > Updated: make the for-loop update expressions match. > Why not go for the following diff? It has a comparable speed increase, but without the added complexity of a second inner loop. Either diff is fine by me though, so if you want to go ahead with your diff: OK martijn@ $ export LC_ALL=C $ for i in $(jot 10); do time rev /usr/share/dict/words > /dev/null; done 0m00.17s real 0m00.16s user 0m00.02s system 0m00.17s real 0m00.17s user 0m00.01s system 0m00.17s real 0m00.17s user 0m00.01s system 0m00.17s real 0m00.18s user 0m00.00s system 0m00.17s real 0m00.17s user 0m00.01s system 0m00.20s real 0m00.16s user 0m00.03s system 0m00.23s real 0m00.16s user 0m00.01s system 0m00.17s real 0m00.18s user 0m00.00s system 0m00.17s real 0m00.21s user 0m00.00s system 0m00.17s real 0m00.16s user 0m00.01s system $ for i in $(jot 10); do time ./obj/rev.scott /usr/share/dict/words > /dev/null; done 0m00.04s real 0m00.04s user 0m00.01s system 0m00.04s real 0m00.04s user 0m00.02s system 0m00.06s real 0m00.04s user 0m00.02s system 0m00.05s real 0m00.04s user 0m00.01s system 0m00.04s real 0m00.05s user 0m00.00s system 0m00.04s real 0m00.04s user 0m00.00s system 0m00.04s real 0m00.04s user 0m00.01s system 0m00.04s real 0m00.04s user 0m00.01s system 0m00.04s real 0m00.03s user 0m00.01s system 0m00.04s real 0m00.05s user 0m00.00s system $ for i in $(jot 10); do time ./obj/rev.martijn /usr/share/dict/words > /dev/null; done 0m00.04s real 0m00.04s user 0m00.01s system 0m00.04s real 0m00.04s user 0m00.01s system 0m00.04s real 0m00.04s user 0m00.01s system 0m00.04s real 0m00.03s user 0m00.01s system 0m00.04s real 0m00.04s user 0m00.00s system 0m00.04s real 0m00.06s user 0m00.00s system 0m00.04s real 0m00.04s user 0m00.00s system 0m00.04s real 0m00.05s user 0m00.00s system 0m00.04s real 0m00.04s user 0m00.01s system 0m00.04s real 0m00.04s user 0m00.01s system $ export LC_ALL=en_US.UTF-8 $ for i in $(jot 10); do time rev /usr/share/dict/words > /dev/null; done 0m00.17s real 0m00.17s user 0m00.00s system 0m00.17s real 0m00.18s user 0m00.00s system 0m00.17s real 0m00.17s user 0m00.00s system 0m00.17s real 0m00.18s user 0m00.01s system 0m00.17s real 0m00.17s user 0m00.01s system 0m00.17s real 0m00.18s user 0m00.00s system 0m00.17s real 0m00.17s user 0m00.00s system 0m00.17s real 0m00.17s user 0m00.02s system 0m00.17s real 0m00.17s user 0m00.01s system 0m00.17s real 0m00.18s user 0m00.00s system $ for i in $(jot 10); do time ./obj/rev.scott /usr/share/dict/words > /dev/null; done 0m00.04s real 0m00.04s user 0m00.00s system 0m00.04s real 0m00.04s user 0m00.01s system 0m00.04s real 0m00.05s user 0m00.00s system 0m00.04s real 0m00.04s user 0m00.00s system 0m00.04s real 0m00.04s user 0m00.00s system 0m00.04s real 0m00.04s user 0m00.00s system 0m00.04s real 0m00.04s user 0m00.00s system 0m00.04s real 0m00.04s user 0m00.01s system 0m00.04s real 0m00.04s user 0m00.01s system 0m00.04s real 0m00.04s user 0m00.00s system $ for i in $(jot 10); do time ./obj/rev.martijn /usr/share/dict/words > /dev/null; done 0m00.04s real 0m00.04s user 0m00.01s system 0m00.05s real 0m00.04s user 0m00.00s system 0m00.05s real 0m00.05s user 0m00.00s system 0m00.05s real 0m00.03s user 0m00.02s system 0m00.05s real 0m00.05s user 0m00.00s system 0m00.05s real 0m00.05s user 0m00.00s system 0m00.05s real 0m00.05s user 0m00.00s system 0m00.05s real 0m00.04s user 0m00.01s system 0m00.05s real 0m00.04s user 0m00.01s system 0m00.05s real 0m00.05s user 0m00.00s system martijn@ ps. I don't have your fancy nanotime. Where can I find that? Index: rev.c === RCS file: /cvs/src/usr.bin/rev/rev.c,v retrieving revision 1.13 diff -u -p -r1.13 rev.c --- rev.c 10 Apr 2016 17:06:52 - 1.13 +++ rev.c 8 Jan 2022 07:09:19 - @@ -43,6 +43,8 @@ int isu8cont(unsigned char); void usage(void); +int multibyte; + int main(int argc, char *argv[]) { @@ -53,6 +55,7 @@ main(int argc, char *argv[]) int ch, rval; setlocale(LC_CTYPE, ""); + multibyte = MB_CUR_MAX > 1; if (pledge("stdio rpath",
Re: scramble printf(9) %p
> As the kernel currently contains (at least) 1946 pointers leak by using > printf(9) and %p, it might be more efficient to scramble %p with a > static value (randomly assigned at boot time). Oh come on. That proposal has no security property! Over time we need to delete all such printf's, and log them in a more private fashion. But the proposal you just made goes in the opposite direction, expanding the potential for exposure, by providing easier interfaces which will encourage use.
Re: scramble printf(9) %p
On Fri, Jan 07, 2022 at 12:25:25PM -0700, Theo de Raadt wrote: > > I agree, but the intent is replacing a debugging method with another > > debugging method (hoping it is more useful). The messages showed here > > are the same that the ones which would be shown on the console before > > the diff. > > the thing about the kernel messages, is you see the pointers and you > know to go fix the code. > > The problem with hiding it in some optional-visibility interface, is > that these things may show up without the more paranoid developers > noticing the problem. > > How do we prevent non-#define DEBUG_FOO situations from growing? As the kernel currently contains (at least) 1946 pointers leak by using printf(9) and %p, it might be more efficient to scramble %p with a static value (randomly assigned at boot time). Using xor with such static-random-value will conserve uniqueness and reproductibility of the printf(9) for the kernel live-time. The following diff does it. If the real value is still wanted (but it should be for unmentionable reasons, as even in debug code is seems to be unwanted), a %P format options could be introduced for such case. Alternatively, %P might be used for print the scrambled value, but it will imply reviewing 1946 usage of %p. -- Sebastien Marie diff 9fbb8b8358a4a6c4112136a2bc60d2cee1a7a243 /home/semarie/repos/openbsd/src blob - a7de554db1973d60c53eae927b51febc6f547b8e file + share/man/man9/printf.9 --- share/man/man9/printf.9 +++ share/man/man9/printf.9 @@ -214,6 +214,9 @@ encountered distinguishable by its value being \*(Le 32 or \*(Ge 128 .Pc , or the end of the decoding directive string itself. +.It Li %p +The value showed is anonymized to prevent pointer information leaking. +Uniqueness and reproducibility are preserved for the kernel live-time. .El .Sh RETURN VALUES The blob - 3ca2a9226b57519b65852db170309d169ee9e751 file + sys/kern/init_main.c --- sys/kern/init_main.c +++ sys/kern/init_main.c @@ -173,6 +173,7 @@ main(void *framep) struct pdevinit *pdev; extern struct pdevinit pdevinit[]; extern void disk_init(void); + extern u_long prf_rand; /* * Initialize the current process pointer (curproc) before @@ -213,6 +214,9 @@ main(void *framep) random_start(boothowto & RB_GOODRANDOM);/* Start the flow */ + /* Initialize prf_rand with a random value */ + arc4random_buf(_rand, sizeof(prf_rand)); + /* * Initialize mbuf's. Do this now because we might attempt to * allocate mbufs or mbuf clusters during autoconfiguration. blob - e2ad6cd97b37ad844cddca0af596e862d0ef8060 file + sys/kern/subr_prf.c --- sys/kern/subr_prf.c +++ sys/kern/subr_prf.c @@ -99,6 +99,8 @@ struct mutex kprintf_mutex = extern int log_open; /* subr_log: is /dev/klog open? */ const char *panicstr; /* arg to first call to panic (used as a flag to indicate that panic has already been called). */ +u_long prf_rand; /* printf(9) %p pointer anonymizer */ + #ifdef DDB /* * Enter ddb on panic. @@ -903,7 +905,7 @@ reswitch: switch (ch) { * defined manner.'' * -- ANSI X3J11 */ - _uquad = (u_long)va_arg(ap, void *); + _uquad = (u_long)va_arg(ap, void *) ^ prf_rand; base = HEX; xdigs = "0123456789abcdef"; flags |= HEXPREFIX;
Add sdhc attribute
This adds attribute `sdhc' to sys/conf/files so that non-glue code, such as mvpxa(4), can indicate the sdhc dependency. OK? Index: arch/armv7/marvell/files.marvell === RCS file: src/sys/arch/armv7/marvell/files.marvell,v retrieving revision 1.9 diff -u -p -r1.9 files.marvell --- arch/armv7/marvell/files.marvell9 Jul 2018 09:58:40 - 1.9 +++ arch/armv7/marvell/files.marvell8 Jan 2022 05:18:51 - @@ -32,6 +32,6 @@ device mvpcie: pcibus attach mvpcie at fdt file arch/armv7/marvell/mvpcie.c mvpcie -device mvpxa: sdmmcbus +device mvpxa: sdmmcbus, sdhc attach mvpxa at fdt file arch/armv7/marvell/mvpxa.c mvpxa Index: conf/files === RCS file: src/sys/conf/files,v retrieving revision 1.706 diff -u -p -r1.706 files --- conf/files 11 Nov 2021 10:03:10 - 1.706 +++ conf/files 8 Jan 2022 05:18:52 - @@ -470,6 +470,7 @@ device ccp file dev/ic/ccp.cccp # SDHC SD/MMC controller +define sdhc device sdhc: sdmmcbus file dev/sdmmc/sdhc.csdhc
Re: rev(1): pull MB_CUR_MAX out of the hot loop
On Fri, 07 Jan 2022 15:00:52 -0600, Scott Cheloha wrote: > Updated: make the for-loop update expressions match. Looks good to me. OK millert@ - todd
Re: rev(1): pull MB_CUR_MAX out of the hot loop
On Fri, Jan 07, 2022 at 01:43:24PM -0600, Scott Cheloha wrote: > > [...] > > Like this? > > [...] Updated: make the for-loop update expressions match. Index: rev.c === RCS file: /cvs/src/usr.bin/rev/rev.c,v retrieving revision 1.13 diff -u -p -r1.13 rev.c --- rev.c 10 Apr 2016 17:06:52 - 1.13 +++ rev.c 7 Jan 2022 21:00:11 - @@ -50,9 +50,10 @@ main(int argc, char *argv[]) FILE *fp; ssize_t len; size_t ps = 0; - int ch, rval; + int ch, multibyte, rval; setlocale(LC_CTYPE, ""); + multibyte = MB_CUR_MAX > 1; if (pledge("stdio rpath", NULL) == -1) err(1, "pledge"); @@ -82,13 +83,18 @@ main(int argc, char *argv[]) while ((len = getline(, , fp)) != -1) { if (p[len - 1] == '\n') --len; - for (t = p + len - 1; t >= p; --t) { - if (isu8cont(*t)) - continue; - u = t; - do { - putchar(*u); - } while (isu8cont(*(++u))); + if (multibyte) { + for (t = p + len - 1; t >= p; t--) { + if (isu8cont(*t)) + continue; + u = t; + do { + putchar(*u); + } while (isu8cont(*(++u))); + } + } else { + for (t = p + len - 1; t >= p; t--) + putchar(*t); } putchar('\n'); } @@ -104,7 +110,7 @@ main(int argc, char *argv[]) int isu8cont(unsigned char c) { - return MB_CUR_MAX > 1 && (c & (0x80 | 0x40)) == 0x80; + return (c & (0x80 | 0x40)) == 0x80; } void
Re: patch: add a new ktrace facility for replacing some printf-debug
On 07/01/22(Fri) 10:54, Sebastien Marie wrote: > Hi, > > Debugging some code paths is complex: for example, unveil(2) code is > involved inside VFS, and using DEBUG_UNVEIL means that the kernel is > spamming printf() for all processes using unveil(2) (a lot of > processes) instead of just the interested cases I want to follow. > > So I cooked the following diff to add a KTRLOG() facility to be able > to replace printf()-like debugging with a more process-limited method. I wish you could debug such issue without having to change any kernel code that's why I started btrace(8). You should already be able to filter probes per thread/process, maybe you want to replace some debug printf by static probes. Alternatively you could define DDBPROF and use the kprobe provider which allow you to inspect prologue and epilogue of ELF functions. Maybe btrace(8) is not yet fully functional to debug this particular problem, but improving it should hopefully give us a tool to debug most of the kernel issues without having to write a diff and boot a custom kernel. At least that's the goal.
Re: rev(1): pull MB_CUR_MAX out of the hot loop
On Fri, Jan 07, 2022 at 07:28:30PM +0100, Ingo Schwarze wrote: > > [...] > > Scott Cheloha wrote on Fri, Jan 07, 2022 at 09:45:41AM -0600: > > > In rev(1), we call MB_CUR_MAX for every byte in the input stream. > > This is extremely expensive. > > [...] > > So i agree with jour general idea to improve rev(1) and other > small utility programs that might have similar issues. > > > It is much cheaper to call it once per line and use a simpler loop > > (see the inlined patch below) if the current locale doesn't handle > > multibyte characters: > > [useful measurements snipped] > > > CC schwarze@ to double-check I'm not misunderstanding MB_CUR_MAX. > > I'm under the impression the return value cannot change unless > > we call setlocale(3). > > Mostly correct, except that uselocale(3) may also change it. > But of course, we don't call uselocale(3) in small utility programs. > > > ok? > > I think your patch can still be improved a bit before committing. > > millert@ wrote: > > > Why not just store the result of MB_CUR_MAX in a variable and use > > that? It's value is not going to change during execution of the > > program. This made a big difference in sort(1) once upon a time. > > I agree with that. > > Even with your patch, rev(1) would still be slow for large files > consisting of very short lines. Like this? Index: rev.c === RCS file: /cvs/src/usr.bin/rev/rev.c,v retrieving revision 1.13 diff -u -p -r1.13 rev.c --- rev.c 10 Apr 2016 17:06:52 - 1.13 +++ rev.c 7 Jan 2022 19:41:23 - @@ -50,9 +50,10 @@ main(int argc, char *argv[]) FILE *fp; ssize_t len; size_t ps = 0; - int ch, rval; + int ch, rval, multibyte; setlocale(LC_CTYPE, ""); + multibyte = MB_CUR_MAX > 1; if (pledge("stdio rpath", NULL) == -1) err(1, "pledge"); @@ -82,13 +83,18 @@ main(int argc, char *argv[]) while ((len = getline(, , fp)) != -1) { if (p[len - 1] == '\n') --len; - for (t = p + len - 1; t >= p; --t) { - if (isu8cont(*t)) - continue; - u = t; - do { - putchar(*u); - } while (isu8cont(*(++u))); + if (multibyte) { + for (t = p + len - 1; t >= p; --t) { + if (isu8cont(*t)) + continue; + u = t; + do { + putchar(*u); + } while (isu8cont(*(++u))); + } + } else { + for (t = p + len - 1; t >= p; t--) + putchar(*t); } putchar('\n'); } @@ -104,7 +110,7 @@ main(int argc, char *argv[]) int isu8cont(unsigned char c) { - return MB_CUR_MAX > 1 && (c & (0x80 | 0x40)) == 0x80; + return (c & (0x80 | 0x40)) == 0x80; } void
Re: patch: add a new ktrace facility for replacing some printf-debug
> I agree, but the intent is replacing a debugging method with another > debugging method (hoping it is more useful). The messages showed here > are the same that the ones which would be shown on the console before > the diff. the thing about the kernel messages, is you see the pointers and you know to go fix the code. The problem with hiding it in some optional-visibility interface, is that these things may show up without the more paranoid developers noticing the problem. How do we prevent non-#define DEBUG_FOO situations from growing?
Re: rev(1): pull MB_CUR_MAX out of the hot loop
Hi Scott, thank you for spending quite some work on our small utility programs, and sorry for failing to follow up as often as i should. Scott Cheloha wrote on Fri, Jan 07, 2022 at 09:45:41AM -0600: > In rev(1), we call MB_CUR_MAX for every byte in the input stream. > This is extremely expensive. That's actually quite ridiculous. The reason why MB_CUR_MAX is so expensive is -- multi-thread support. That is of course completely pointless for small utility programs, on any operating system. And on OpenBSD in particular, it is mostly useless even for large application programs that actually use multiple threads because we basically only support one single locale in the first place. So i had a brief look at the implementation of MB_CUR_MAX in our libc to see whether it could be made less expensive, but i don't see a simple way without potentially breaking some application programs (for example, consider a third-party webserver that wants to run with a POSIX locale in one thread but with a UTF-8 locale in another thread). Apart from that, even if it could be made less expensive, avoiding to call it in tight loops might still be valuable in the interest of portability. On other systems, it must necessarily be expensive: that web server might want to run one thread with UTF-8, one with UTF-16, one with UTF-32, and one with Shift JIS... So i agree with jour general idea to improve rev(1) and other small utility programs that might have similar issues. > It is much cheaper to call it once per line and use a simpler loop > (see the inlined patch below) if the current locale doesn't handle > multibyte characters: [useful measurements snipped] > CC schwarze@ to double-check I'm not misunderstanding MB_CUR_MAX. > I'm under the impression the return value cannot change unless > we call setlocale(3). Mostly correct, except that uselocale(3) may also change it. But of course, we don't call uselocale(3) in small utility programs. > ok? I think your patch can still be improved a bit before committing. millert@ wrote: > Why not just store the result of MB_CUR_MAX in a variable and use > that? It's value is not going to change during execution of the > program. This made a big difference in sort(1) once upon a time. I agree with that. Even with your patch, rev(1) would still be slow for large files consisting of very short lines. > Index: rev.c > === > RCS file: /cvs/src/usr.bin/rev/rev.c,v > retrieving revision 1.13 > diff -u -p -r1.13 rev.c > --- rev.c 10 Apr 2016 17:06:52 - 1.13 > +++ rev.c 7 Jan 2022 15:38:39 - > @@ -82,13 +82,18 @@ main(int argc, char *argv[]) > while ((len = getline(, , fp)) != -1) { > if (p[len - 1] == '\n') > --len; > - for (t = p + len - 1; t >= p; --t) { > - if (isu8cont(*t)) > - continue; > - u = t; > - do { > - putchar(*u); > - } while (isu8cont(*(++u))); > + if (MB_CUR_MAX == 1) { > + for (t = p + len - 1; t >= p; t--) > + putchar(*t); > + } else { > + for (t = p + len - 1; t >= p; --t) { I dislike how you use t-- in one loop but --t in the other. It made me wonder for a moment whether that needs to be different depending on the character set. Wouldn't it be less confusing to pick one of the forms and stick to it? > + if (isu8cont(*t)) > + continue; > + u = t; > + do { > + putchar(*u); > + } while (isu8cont(*(++u))); > + } > } > putchar('\n'); > } > @@ -104,7 +109,7 @@ main(int argc, char *argv[]) > int > isu8cont(unsigned char c) > { > - return MB_CUR_MAX > 1 && (c & (0x80 | 0x40)) == 0x80; > + return (c & (0x80 | 0x40)) == 0x80; > } The isu8cont() function is a semi-standard function used in many OpenBSD utility programs, originally designed by tedu@, i think, and it has proven highly useful to get UTF-8 support going. But back in the day, i think we failed to realize that MB_CUR_MAX is expensive. In that sense, it is probably somewhat ill-designed: it is typically called in tight loops, so i guess there aren't many use cases where the original version is really that good. Given that ksh(1) and ypldap(8) already use versions of this function identical to your version, i'm OK with your suggestion of keeping the name while changing the semantics of the function,
Re: WTERMSIG behavior difference
On Fri, 07 Jan 2022 09:29:01 -0800, Greg Steuck wrote: > Ah, so it's our ksh after all. Unless there's a good reason for it to be > different, should we change it to match the rest of the pack behavior? Probably, it is just not the nicest code base to play with ;-) - todd
Re: WTERMSIG behavior difference
"Todd C. Miller" writes: > On Fri, 07 Jan 2022 00:45:47 -0800, Philip Guenther wrote: > >> No where. The difference in behavior is that of 'sh' when signaled. Run >> your test programs under "ktrace -i" and compare the behavioral difference >> of the child 'sh' process after reception of the SIGINT. > > To add to this, the shell uses an exit value of 128 + signal number > when a command is killed by a signal. If you changes your test > code to use a different shell like dash or bash you will see the > same behavior as FreeBSD. Yup, zsh too. > Our ksh may have a missing check for the FMONITOR flag being set > in the exit value handling. Ah, so it's our ksh after all. Unless there's a good reason for it to be different, should we change it to match the rest of the pack behavior? Thanks Greg
Re: rev(1): pull MB_CUR_MAX out of the hot loop
Why not just store the result of MB_CUR_MAX in a variable and use that? It's value is not going to change during execution of the program. This made a big difference in sort(1) once upon a time. - todd
Re: WTERMSIG behavior difference
On Fri, 07 Jan 2022 00:45:47 -0800, Philip Guenther wrote: > No where. The difference in behavior is that of 'sh' when signaled. Run > your test programs under "ktrace -i" and compare the behavioral difference > of the child 'sh' process after reception of the SIGINT. To add to this, the shell uses an exit value of 128 + signal number when a command is killed by a signal. If you changes your test code to use a different shell like dash or bash you will see the same behavior as FreeBSD. Our ksh may have a missing check for the FMONITOR flag being set in the exit value handling. - todd
Re: patch: add a new ktrace facility for replacing some printf-debug
> If this gets added, more care is needed with the messages. For example, > kernel memory addresses should not be shown because they are privileged > information. No kidding. We spent a decade cleaning that up, and don't want to go back to exposing.
rev(1): pull MB_CUR_MAX out of the hot loop
In rev(1), we call MB_CUR_MAX for every byte in the input stream. This is extremely expensive. It is much cheaper to call it once per line and use a simpler loop (see the inlined patch below) if the current locale doesn't handle multibyte characters: # -current rev(1), $ for i in $(jot 10); do nanotime rev /usr/share/dict/words > /dev/null; done 0.136695757 real 0.130 user 0.010 sys 0.075677725 real 0.060 user 0.000 sys 0.075275764 real 0.070 user 0.000 sys 0.075140009 real 0.070 user 0.000 sys 0.075186630 real 0.070 user 0.010 sys 0.075256959 real 0.080 user 0.000 sys 0.076920069 real 0.080 user 0.000 sys 0.075097523 real 0.060 user 0.010 sys 0.075369093 real 0.070 user 0.000 sys 0.075532266 real 0.070 user 0.000 sys # patched rev(1) $ for i in $(jot 10); do nanotime obj/rev /usr/share/dict/words > /dev/null; done 0.068547813 real 0.060 user 0.010 sys 0.022880303 real 0.020 user 0.000 sys 0.022530839 real 0.020 user 0.000 sys 0.022801439 real 0.020 user 0.000 sys 0.022595941 real 0.020 user 0.000 sys 0.022768434 real 0.020 user 0.000 sys 0.022536526 real 0.020 user 0.000 sys 0.022611791 real 0.020 user 0.000 sys 0.022943240 real 0.020 user 0.000 sys 0.022329260 real 0.020 user 0.000 sys Over 3 times as fast as -current rev(1). Multibyte locales also benefit: # patched rev(1), LC_CTYPE=en_US.UTF-8 for i in $(jot 10); do LC_CTYPE=en_US.UTF-8 nanotime obj/rev /usr/share/dict/words >/dev/null; done 0.088514093 real 0.070 user 0.010 sys 0.026025780 real 0.020 user 0.000 sys 0.025384554 real 0.020 user 0.000 sys 0.025385471 real 0.020 user 0.000 sys 0.025474371 real 0.020 user 0.010 sys 0.025685188 real 0.030 user 0.000 sys 0.02518 real 0.030 user 0.000 sys 0.025783925 real 0.020 user 0.000 sys 0.025348339 real 0.020 user 0.000 sys 0.025672734 real 0.020 user 0.010 sys About 3 times as fast as -current rev(1). CC schwarze@ to double-check I'm not misunderstanding MB_CUR_MAX. I'm under the impression the return value cannot change unless we call setlocale(3). ok? Index: rev.c === RCS file: /cvs/src/usr.bin/rev/rev.c,v retrieving revision 1.13 diff -u -p -r1.13 rev.c --- rev.c 10 Apr 2016 17:06:52 - 1.13 +++ rev.c 7 Jan 2022 15:38:39 - @@ -82,13 +82,18 @@ main(int argc, char *argv[]) while ((len = getline(, , fp)) != -1) { if (p[len - 1] == '\n') --len; - for (t = p + len - 1; t >= p; --t) { - if (isu8cont(*t)) - continue; - u = t; - do { - putchar(*u); - } while (isu8cont(*(++u))); + if (MB_CUR_MAX == 1) { + for (t = p + len - 1; t >= p; t--) + putchar(*t); + } else { + for (t = p + len - 1; t >= p; --t) { + if (isu8cont(*t)) + continue; + u = t; + do { + putchar(*u); + } while (isu8cont(*(++u))); + } } putchar('\n'); } @@ -104,7 +109,7 @@ main(int argc, char *argv[]) int isu8cont(unsigned char c) { - return MB_CUR_MAX > 1 && (c & (0x80 | 0x40)) == 0x80; + return (c & (0x80 | 0x40)) == 0x80; } void
Re: patch: add a new ktrace facility for replacing some printf-debug
On Fri, Jan 07, 2022 at 11:06:30AM +, Visa Hankala wrote: > On Fri, Jan 07, 2022 at 10:54:54AM +0100, Sebastien Marie wrote: > > Debugging some code paths is complex: for example, unveil(2) code is > > involved inside VFS, and using DEBUG_UNVEIL means that the kernel is > > spamming printf() for all processes using unveil(2) (a lot of > > processes) instead of just the interested cases I want to follow. > > > > So I cooked the following diff to add a KTRLOG() facility to be able > > to replace printf()-like debugging with a more process-limited method. > > > > From ktrace(2) point of vue, it adds a new KTR_LOG record type, which > > is just a string. And it adds a function ktrlog(struct proc *, const char > > *, ...) > > which is a printf-like function. > > > > The following diff includes unveil(2) conversion from printf-debug > > message to KTRLOG-debug message. > > > > > > kdump output (LOG entries are new): > > > > 10388 a.outCALL unveil(0x149833ac,0x149833c3) > > 10388 a.outSTRU flags="rx" > > 10388 a.outNAMI "/usr/bin/id" > > 10388 a.outLOG "added name id underneath vnode 0xd580c9a0" > > 10388 a.outLOG "unveil: added name id beneath restricted vnode > > 0xd580c9a0, uvcount 6" > > 10388 a.outRET unveil 0 > > 10388 a.outCALL unveil(0,0) > > 10388 a.outRET unveil 0 > > 10388 a.outCALL kbind(0xcf7e4724,12,0x35c420ed57689eb8) > > 10388 a.outRET kbind 0 > > 10388 a.outCALL stat(0x149833ba,0xcf7e47a0) > > 10388 a.outNAMI "/usr/bin" > > 10388 a.outLOG "unveil: component directory match for vnode > > 0xd58537f8" > > 10388 a.outLOG "unveil: no match for vnode 0xd580c9a0" > > 10388 a.outLOG "unveil: matched \"bin\" underneath/at vnode > > 0xd58537f8" > > 10388 a.outSTRU struct stat { dev=3, ino=77952, mode=drwxr-xr-x , > > nlink=2, uid=0<"root">, gid=0<"wheel">, rdev=319024, atime=1641543393<"Jan > > 7 09:16:33 2022">.860011379, mtime=1641542018<"Jan 7 08:53:38 > > 2022">.406865980, ctime=1641542018<"Jan 7 08:53:38 2022">.406865980, > > size=5632, blocks=12, blksize=16384, flags=0x0, gen=0x0 } > > 10388 a.outRET stat 0 > > > > Additionnally, it permits to properly link the message string with the > > syscall involved and the process code path. > > > > > > Does it might be interesting for people ? or should I just keep it > > locally ? > > If this gets added, more care is needed with the messages. For example, > kernel memory addresses should not be shown because they are privileged > information. I agree, but the intent is replacing a debugging method with another debugging method (hoping it is more useful). The messages showed here are the same that the ones which would be shown on the console before the diff. And to see them you need to run a kernel with DEBUG_UNVEIL defined (in kern/kern_unveil.c file). -- Sebastien Marie
Re: patch: add a new ktrace facility for replacing some printf-debug
On Fri, Jan 07, 2022 at 10:54:54AM +0100, Sebastien Marie wrote: > Debugging some code paths is complex: for example, unveil(2) code is > involved inside VFS, and using DEBUG_UNVEIL means that the kernel is > spamming printf() for all processes using unveil(2) (a lot of > processes) instead of just the interested cases I want to follow. > > So I cooked the following diff to add a KTRLOG() facility to be able > to replace printf()-like debugging with a more process-limited method. > > From ktrace(2) point of vue, it adds a new KTR_LOG record type, which > is just a string. And it adds a function ktrlog(struct proc *, const char *, > ...) > which is a printf-like function. > > The following diff includes unveil(2) conversion from printf-debug > message to KTRLOG-debug message. > > > kdump output (LOG entries are new): > > 10388 a.outCALL unveil(0x149833ac,0x149833c3) > 10388 a.outSTRU flags="rx" > 10388 a.outNAMI "/usr/bin/id" > 10388 a.outLOG "added name id underneath vnode 0xd580c9a0" > 10388 a.outLOG "unveil: added name id beneath restricted vnode > 0xd580c9a0, uvcount 6" > 10388 a.outRET unveil 0 > 10388 a.outCALL unveil(0,0) > 10388 a.outRET unveil 0 > 10388 a.outCALL kbind(0xcf7e4724,12,0x35c420ed57689eb8) > 10388 a.outRET kbind 0 > 10388 a.outCALL stat(0x149833ba,0xcf7e47a0) > 10388 a.outNAMI "/usr/bin" > 10388 a.outLOG "unveil: component directory match for vnode 0xd58537f8" > 10388 a.outLOG "unveil: no match for vnode 0xd580c9a0" > 10388 a.outLOG "unveil: matched \"bin\" underneath/at vnode 0xd58537f8" > 10388 a.outSTRU struct stat { dev=3, ino=77952, mode=drwxr-xr-x , > nlink=2, uid=0<"root">, gid=0<"wheel">, rdev=319024, atime=1641543393<"Jan 7 > 09:16:33 2022">.860011379, mtime=1641542018<"Jan 7 08:53:38 > 2022">.406865980, ctime=1641542018<"Jan 7 08:53:38 2022">.406865980, > size=5632, blocks=12, blksize=16384, flags=0x0, gen=0x0 } > 10388 a.outRET stat 0 > > Additionnally, it permits to properly link the message string with the > syscall involved and the process code path. > > > Does it might be interesting for people ? or should I just keep it > locally ? If this gets added, more care is needed with the messages. For example, kernel memory addresses should not be shown because they are privileged information.
patch: add a new ktrace facility for replacing some printf-debug
Hi, Debugging some code paths is complex: for example, unveil(2) code is involved inside VFS, and using DEBUG_UNVEIL means that the kernel is spamming printf() for all processes using unveil(2) (a lot of processes) instead of just the interested cases I want to follow. So I cooked the following diff to add a KTRLOG() facility to be able to replace printf()-like debugging with a more process-limited method. >From ktrace(2) point of vue, it adds a new KTR_LOG record type, which is just a string. And it adds a function ktrlog(struct proc *, const char *, ...) which is a printf-like function. The following diff includes unveil(2) conversion from printf-debug message to KTRLOG-debug message. kdump output (LOG entries are new): 10388 a.outCALL unveil(0x149833ac,0x149833c3) 10388 a.outSTRU flags="rx" 10388 a.outNAMI "/usr/bin/id" 10388 a.outLOG "added name id underneath vnode 0xd580c9a0" 10388 a.outLOG "unveil: added name id beneath restricted vnode 0xd580c9a0, uvcount 6" 10388 a.outRET unveil 0 10388 a.outCALL unveil(0,0) 10388 a.outRET unveil 0 10388 a.outCALL kbind(0xcf7e4724,12,0x35c420ed57689eb8) 10388 a.outRET kbind 0 10388 a.outCALL stat(0x149833ba,0xcf7e47a0) 10388 a.outNAMI "/usr/bin" 10388 a.outLOG "unveil: component directory match for vnode 0xd58537f8" 10388 a.outLOG "unveil: no match for vnode 0xd580c9a0" 10388 a.outLOG "unveil: matched \"bin\" underneath/at vnode 0xd58537f8" 10388 a.outSTRU struct stat { dev=3, ino=77952, mode=drwxr-xr-x , nlink=2, uid=0<"root">, gid=0<"wheel">, rdev=319024, atime=1641543393<"Jan 7 09:16:33 2022">.860011379, mtime=1641542018<"Jan 7 08:53:38 2022">.406865980, ctime=1641542018<"Jan 7 08:53:38 2022">.406865980, size=5632, blocks=12, blksize=16384, flags=0x0, gen=0x0 } 10388 a.outRET stat 0 Additionnally, it permits to properly link the message string with the syscall involved and the process code path. Does it might be interesting for people ? or should I just keep it locally ? Thanks. -- Sebastien Marie diff aab2d6589bc1ae2859ab3800e0ad61fbd5f87fd1 598aa538be37ceddf77c1c352553462362219e2c blob - 23b2af5c88dfbc4858fe14718beed80c0ebb0d48 blob + 23fd09812f14fedc59f0a971507715e21c3b9441 --- sys/kern/kern_ktrace.c +++ sys/kern/kern_ktrace.c @@ -405,6 +405,34 @@ ktrpledge(struct proc *p, int error, uint64_t code, in atomic_clearbits_int(>p_flag, P_INKTR); } +void +ktrlog(struct proc *p, const char *fmt, ...) +{ + struct ktr_header kth; + char buf[1024]; + int n; + va_list ap; + + atomic_setbits_int(>p_flag, P_INKTR); + ktrinitheader(, p, KTR_LOG); + + va_start(ap, fmt); + n = vsnprintf(buf, sizeof(buf), fmt, ap); + va_end(ap); + + if (n >= sizeof(buf)) { + buf[sizeof(buf) - 4] = '.'; + buf[sizeof(buf) - 3] = '.'; + buf[sizeof(buf) - 2] = '.'; + buf[sizeof(buf) - 1] = '\0'; + } + + KERNEL_LOCK(); + ktrwrite(p, , buf, strlen(buf)); + KERNEL_UNLOCK(); + atomic_clearbits_int(>p_flag, P_INKTR); +} + /* Interface and common routines */ int blob - 801c210c113666aacf826b52a6a430e334f0a09d blob + 32d3284ca932ec7a504647dd424502ea8a0a1f7b --- sys/kern/kern_unveil.c +++ sys/kern/kern_unveil.c @@ -55,7 +55,9 @@ struct unveil { u_char uv_flags; }; -/* #define DEBUG_UNVEIL */ +#ifdef KTRACE +/*#define DEBUG_UNVEIL */ +#endif #define UNVEIL_MAX_VNODES 128 #define UNVEIL_MAX_NAMES 128 @@ -104,7 +106,7 @@ unveil_delete_names(struct unveil *uv) } rw_exit_write(>uv_lock); #ifdef DEBUG_UNVEIL - printf("deleted %d names\n", ret); + KTRLOG(curproc, "deleted %d names", ret); #endif return ret; } @@ -121,7 +123,7 @@ unveil_add_name_unlocked(struct unveil *uv, char *name return 0; } #ifdef DEBUG_UNVEIL - printf("added name %s underneath vnode %p\n", name, uv->uv_vp); + KTRLOG(curproc, "added name %s underneath vnode %p", name, uv->uv_vp); #endif return 1; } @@ -145,7 +147,7 @@ unveil_namelookup(struct unveil *uv, char *name) rw_enter_read(>uv_lock); #ifdef DEBUG_UNVEIL - printf("unveil_namelookup: looking up name %s (%p) in vnode %p\n", + KTRLOG(curproc, "unveil_namelookup: looking up name %s (%p) in vnode %p", name, name, uv->uv_vp); #endif @@ -160,10 +162,10 @@ unveil_namelookup(struct unveil *uv, char *name) #ifdef DEBUG_UNVEIL if (ret == NULL) - printf("unveil_namelookup: no match for name %s in vnode %p\n", + KTRLOG(curproc, "unveil_namelookup: no match for name %s in vnode %p", name, uv->uv_vp); else - printf("unveil_namelookup: matched name %s in vnode %p\n", + KTRLOG(curproc, "unveil_namelookup: matched name %s in vnode %p",
Re: WTERMSIG behavior difference
On Thu, Jan 6, 2022 at 10:47 PM Greg Steuck wrote: > I started by debugging a weird test failure in ghc. I narrowed it > down to a simple C program which behaves differently between OpenBSD and > FreeBSD. I stared at the headers of both systems for a bit and still > don't see why on OpenBSD the program prints: > > uname -a ; cc ./a.c && ./a.out > OpenBSD home.nest.cx 7.0 GENERIC.MP#28 amd64 > x 0 pret 130 > > Yet on FreeBSD: > uname -a ; cc ./a.c && ./a.out > FreeBSD ... 13.0-STABLE FreeBSD 13.0-STABLE > x 0 pret -2 > > Where's the signed/unsigned confusion hiding? > No where. The difference in behavior is that of 'sh' when signaled. Run your test programs under "ktrace -i" and compare the behavioral difference of the child 'sh' process after reception of the SIGINT. Philip Guenther